# ckOCR

PHP CLI tool that reads photos of part identification labels and extracts structured data using **Mistral AI OCR**.

Reads images from `input/`, calls the Mistral API, and writes YAML files to `output/` containing the **Serial Number**, **Model Number**, and **Date**.

---

## Requirements

- PHP **8.1 – 8.5** with the `curl` extension enabled (no Composer required)
- A [Mistral AI](https://console.mistral.ai/) account with API access

**Arch Linux / CachyOS** — enable the curl extension after installing PHP:

```bash
sudo pacman -S php
# uncomment "extension=curl" in /etc/php/php.ini (remove the leading ";")
php -m | grep curl   # verify
```

---

## Installation

```bash
git clone <repo-url> ckOCR
cd ckOCR
cp .env.example .env
```

Edit `.env` and insert your Mistral API key:

```env
MISTRAL_API_KEY=your_api_key_here
```

Alternatively, export it as an environment variable:

```bash
export MISTRAL_API_KEY=your_api_key_here
```
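
The lookup of the key can be sketched as follows. This is a minimal illustration, not the code in `ocr.php`: the function names `parseEnvFile` and `resolveApiKey` are hypothetical, and the precedence (exported variable first, `.env` second) is an assumption.

```php
<?php
declare(strict_types=1);

/**
 * Parse simple KEY=value lines from the contents of a .env file.
 * Blank lines, lines starting with '#', and lines without '=' are ignored.
 */
function parseEnvFile(string $contents): array
{
    $vars = [];
    foreach (preg_split('/\R/', $contents) as $line) {
        $line = trim($line);
        if ($line === '' || $line[0] === '#' || !str_contains($line, '=')) {
            continue;
        }
        [$name, $value] = explode('=', $line, 2);
        // Strip whitespace and optional surrounding quotes from the value.
        $vars[trim($name)] = trim(trim($value), "\"'");
    }
    return $vars;
}

/**
 * Assumed lookup order: the real environment first, then the .env values.
 * Returns null when the key is missing or empty in both places.
 */
function resolveApiKey(array $envFileVars): ?string
{
    $fromEnv = getenv('MISTRAL_API_KEY');
    if (is_string($fromEnv) && $fromEnv !== '') {
        return $fromEnv;
    }
    $key = $envFileVars['MISTRAL_API_KEY'] ?? '';
    return $key !== '' ? $key : null;
}
```

A caller would pass `parseEnvFile(file_get_contents('.env'))` to `resolveApiKey()` and stop with `MISTRAL_API_KEY not set` when the result is `null`.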

---

## Usage

Place one or more label photos in the `input/` folder, then run:

```bash
php ocr.php
```

Results are written to `output/` as YAML files — one per image, same filename stem.

### Options

| Flag | Description |
|---|---|
| `--force` | Re-process images that already have an output file |
| `--verbose` | Print the raw OCR text and API request details |
| `--help` | Show usage information |
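
For readers adapting the script, the flag handling could look roughly like this. It is a sketch under assumptions: `parseFlags` is a hypothetical helper, and collecting unrecognised arguments in an `unknown` list is an illustrative choice that the tool may not share.

```php
<?php
declare(strict_types=1);

/**
 * Map command-line arguments to the three documented flags.
 * $argv[0] is the script name and is skipped.
 * Returns the flag states plus any arguments that were not recognised.
 */
function parseFlags(array $argv): array
{
    $flags = ['force' => false, 'verbose' => false, 'help' => false, 'unknown' => []];
    foreach (array_slice($argv, 1) as $arg) {
        $name = str_starts_with($arg, '--') ? substr($arg, 2) : null;
        if ($name !== null && in_array($name, ['force', 'verbose', 'help'], true)) {
            $flags[$name] = true;
        } else {
            // Hypothetical behaviour: remember the argument so the caller can report it.
            $flags['unknown'][] = $arg;
        }
    }
    return $flags;
}
```

PHP's built-in `getopt('', ['force', 'verbose', 'help'])` would cover the same three flags; the pure function above is simply easier to test.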

### Examples

```bash
# Process all new images
php ocr.php

# Re-run everything, show full detail
php ocr.php --force --verbose

# Just see options
php ocr.php --help
```

---

## Input

Supported image formats: **JPG, JPEG, PNG, WebP, GIF**

Maximum file size: **5 MB** per image (Mistral API limit)

```
input/
├── part-label-01.jpg
├── motor-sn.png
└── board-sticker.jpg
```
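
The two input constraints can be checked before any API call is made. The sketch below is illustrative only: `rejectReason` and the message `Unsupported format` are hypothetical, and it assumes "5 MB" means 5 × 1024 × 1024 bytes.

```php
<?php
declare(strict_types=1);

const SUPPORTED_EXTENSIONS = ['jpg', 'jpeg', 'png', 'webp', 'gif'];
const MAX_IMAGE_BYTES = 5 * 1024 * 1024; // assumption: "5 MB" read as 5 MiB

/**
 * Return null when the file is acceptable, or a short reason when it is not.
 * The extension check is case-insensitive.
 */
function rejectReason(string $filename, int $sizeInBytes): ?string
{
    $ext = strtolower(pathinfo($filename, PATHINFO_EXTENSION));
    if (!in_array($ext, SUPPORTED_EXTENSIONS, true)) {
        return 'Unsupported format';
    }
    if ($sizeInBytes > MAX_IMAGE_BYTES) {
        return 'File too large';
    }
    return null;
}
```

In a real run the size would come from `filesize()` on each file found in `input/`.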

---

## Output

Each processed image produces a YAML file in `output/`:

```
output/
├── part-label-01.yaml
├── motor-sn.yaml
└── board-sticker.yaml
```

### YAML structure

```yaml
---
serial_number: SN-20241234
model_number: "XYZ-4K/B"
date: 2024-01
source_file: part-label-01.jpg
processed_at: 2026-03-04 15:30:00
raw_ocr: |
  Full text extracted from the label by the OCR model,
  preserved exactly as returned.
```

| Field | Description |
|---|---|
| `serial_number` | Serial Number — labelled S/N, SN, Serial No., etc. |
| `model_number` | Model or Part Number — labelled Model, M/N, P/N, MPN, etc. |
| `date` | Any date on the label — MFG date, DOM, expiry, etc. |
| `source_file` | Original image filename |
| `processed_at` | Timestamp of processing, formatted `YYYY-MM-DD HH:MM:SS` |
| `raw_ocr` | Full OCR text returned by Mistral before extraction |

Fields not found on the label are written as `null`.
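
Since the tool needs no Composer packages, the YAML can be produced with plain string handling. The following is a rough sketch, not the tool's actual writer: `yamlScalar` and `renderYaml` are hypothetical names, empty strings are assumed to count as "not found", and every string is double-quoted for safety, so the output quotes more values than the example above does.

```php
<?php
declare(strict_types=1);

/** Render a scalar: missing values become null, strings are double-quoted. */
function yamlScalar(?string $value): string
{
    if ($value === null || $value === '') {
        return 'null';
    }
    // Escape backslashes first, then quotes and newlines.
    $escaped = str_replace(['\\', '"', "\n"], ['\\\\', '\\"', '\\n'], $value);
    return '"' . $escaped . '"';
}

/**
 * Build the YAML document for one image.
 * $fields is the decoded extraction result; absent keys are treated as not found.
 */
function renderYaml(array $fields, string $sourceFile, string $processedAt, string $rawOcr): string
{
    $lines = ['---'];
    foreach (['serial_number', 'model_number', 'date'] as $key) {
        $value = $fields[$key] ?? null;
        $lines[] = $key . ': ' . yamlScalar(is_string($value) ? trim($value) : null);
    }
    $lines[] = 'source_file: ' . yamlScalar($sourceFile);
    $lines[] = 'processed_at: ' . yamlScalar($processedAt);
    // Literal block scalar: each OCR line is indented by two spaces.
    $lines[] = 'raw_ocr: |';
    foreach (preg_split('/\R/', $rawOcr) as $ocrLine) {
        $lines[] = '  ' . $ocrLine;
    }
    return implode("\n", $lines) . "\n";
}
```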

---

## How it works

Each image is processed with two API calls:

```
Image file
    │
    ▼
[1] POST /ocr  (mistral-ocr-latest)
    │  base64-encoded image → markdown text
    │
    ▼
[2] POST /chat/completions  (mistral-small-latest)
    │  OCR text + extraction prompt → JSON with the three fields
    │
    ▼
YAML file written to output/
```

1. **OCR step**: the image is base64-encoded and sent to `mistral-ocr-latest`, which returns the full label text as markdown.
2. **Extraction step**: the OCR text is passed to `mistral-small-latest` with a structured prompt. The request sets `response_format` to `{"type": "json_object"}`, so the model returns a JSON object containing `serial_number`, `model_number`, and `date`.
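
The two requests can be sketched with the `curl` extension alone. Treat this as an illustration under assumptions rather than the code in `ocr.php`: the helper names and the prompt wording are hypothetical, and the request and response shapes (`document.image_url` as a data URI, `pages[].markdown`, `choices[0].message.content`) follow Mistral's public API reference as understood here and should be checked against the current documentation.

```php
<?php
declare(strict_types=1);

/** Request body for step 1: the image is embedded as a base64 data URI. */
function buildOcrPayload(string $imageBytes, string $mimeType): array
{
    return [
        'model' => 'mistral-ocr-latest',
        'document' => [
            'type' => 'image_url',
            'image_url' => 'data:' . $mimeType . ';base64,' . base64_encode($imageBytes),
        ],
    ];
}

/** Request body for step 2: OCR text plus an illustrative extraction prompt. */
function buildExtractionPayload(string $ocrText): array
{
    $prompt = 'Extract serial_number, model_number and date from this label text. '
        . 'Reply with a JSON object using exactly those keys; use null when a field is missing.';
    return [
        'model' => 'mistral-small-latest',
        'response_format' => ['type' => 'json_object'],
        'messages' => [
            ['role' => 'system', 'content' => $prompt],
            ['role' => 'user', 'content' => $ocrText],
        ],
    ];
}

/** POST a JSON body to the Mistral API and return the decoded response. */
function postJson(string $path, array $payload, string $apiKey): array
{
    $ch = curl_init('https://api.mistral.ai/v1' . $path);
    curl_setopt_array($ch, [
        CURLOPT_POST => true,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_HTTPHEADER => ['Content-Type: application/json', 'Authorization: Bearer ' . $apiKey],
        CURLOPT_POSTFIELDS => json_encode($payload, JSON_THROW_ON_ERROR),
    ]);
    $body = curl_exec($ch);
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    if ($body === false || $status !== 200) {
        throw new RuntimeException('Mistral API ' . $status);
    }
    return json_decode($body, true, 512, JSON_THROW_ON_ERROR);
}

// Usage (requires network access and a valid key):
// $ocr    = postJson('/ocr', buildOcrPayload(file_get_contents($path), 'image/jpeg'), $apiKey);
// $text   = implode("\n", array_column($ocr['pages'], 'markdown'));
// $chat   = postJson('/chat/completions', buildExtractionPayload($text), $apiKey);
// $fields = json_decode($chat['choices'][0]['message']['content'], true);
```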

Already-processed images are skipped automatically unless `--force` is used.
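
The skip rule comes down to checking whether the matching YAML file already exists. A minimal sketch, with hypothetical helper names `outputPathFor` and `shouldProcess`:

```php
<?php
declare(strict_types=1);

/** Output path for an image: same filename stem, .yaml extension. */
function outputPathFor(string $imagePath, string $outputDir = 'output'): string
{
    return $outputDir . '/' . pathinfo($imagePath, PATHINFO_FILENAME) . '.yaml';
}

/** An image is processed when it has no output file yet, or when --force is set. */
function shouldProcess(string $imagePath, bool $force, string $outputDir = 'output'): bool
{
    return $force || !is_file(outputPathFor($imagePath, $outputDir));
}
```

Because the output name is derived from the stem only, two inputs such as `label.jpg` and `label.png` would map to the same `label.yaml` under this scheme.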

---

## Project structure

```
ckOCR/
├── ocr.php          # Main script
├── .env             # API key (not committed, see .env.example)
├── .env.example     # Template
├── .gitignore
├── input/           # Label photos (test data included)
└── output/          # YAML results (test data included)
```

---

## Troubleshooting

**`MISTRAL_API_KEY not set`**
Set the key in `.env` or export it as an environment variable.

**`Mistral API 401`**
Your API key is invalid or expired. Check it at [console.mistral.ai](https://console.mistral.ai/).

**`File too large`**
Resize the image below 5 MB before placing it in `input/`.

**`No text found`**
The label may be blurry, low-contrast, or too small. Try a clearer photo. The output YAML is still written with `null` fields, so the image is skipped on later runs. To retry, run with `--force --verbose` and inspect the raw OCR output.

**Fields are `null` but text was extracted**
Run with `--verbose` to see the raw OCR text and check whether the label uses non-standard abbreviations. The extraction prompt covers the most common label formats.