# ckOCR

PHP CLI tool that reads photos of part identification labels and extracts structured data using **Mistral AI OCR**.

Reads images from `input/`, calls the Mistral API, and writes YAML files to `output/` containing the **Serial Number**, **Model Number**, and **Date**.

---

## Requirements

- PHP **8.1 – 8.5** with the `curl` extension enabled (no Composer required)
- A [Mistral AI](https://console.mistral.ai/) account with API access

On **Arch Linux / CachyOS**, enable the curl extension after installing PHP:

```bash
sudo pacman -S php
# uncomment "extension=curl" in /etc/php/php.ini
php -m | grep curl   # verify
```

---

## Installation

```bash
git clone ckOCR
cd ckOCR
cp .env.example .env
```

Edit `.env` and insert your Mistral API key:

```env
MISTRAL_API_KEY=your_api_key_here
```

Alternatively, export it as an environment variable:

```bash
export MISTRAL_API_KEY=your_api_key_here
```

---

## Usage

Place one or more label photos in the `input/` folder, then run:

```bash
php ocr.php
```

Results are written to `output/` as YAML files, one per image, with the same filename stem.
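The "same filename stem" rule can be illustrated with a minimal sketch. The helper names `outputPathFor` and `shouldProcess` are hypothetical and are not taken from `ocr.php`; the sketch only assumes that an existing output file marks an image as already processed and that a force flag overrides that check.

```php
<?php
// Hypothetical helper (not from ocr.php): map an input image path
// to the YAML path that shares its filename stem.
function outputPathFor(string $imagePath, string $outputDir = 'output'): string
{
    // PATHINFO_FILENAME returns the file name without directory or extension.
    $stem = pathinfo($imagePath, PATHINFO_FILENAME);
    return rtrim($outputDir, '/') . '/' . $stem . '.yaml';
}

// Hypothetical helper: an image is processed when forced,
// or when no output file exists for it yet.
function shouldProcess(string $imagePath, bool $force, string $outputDir = 'output'): bool
{
    return $force || !file_exists(outputPathFor($imagePath, $outputDir));
}

echo outputPathFor('input/part-label-01.jpg'), PHP_EOL; // output/part-label-01.yaml
```

Because only the stem is used, two inputs such as `label.jpg` and `label.png` would map to the same output file under this assumption, so distinct stems are the safer choice.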
### Options

| Flag | Description |
|---|---|
| `--force` | Re-process images that already have an output file |
| `--verbose` | Print the raw OCR text and API request details |
| `--help` | Show usage information |

### Examples

```bash
# Process all new images
php ocr.php

# Re-run everything, show full detail
php ocr.php --force --verbose

# Just see options
php ocr.php --help
```

---

## Input

Supported image formats: **JPG, JPEG, PNG, WebP, GIF**

Maximum file size: **5 MB** per image (Mistral API limit)

```
input/
├── part-label-01.jpg
├── motor-sn.png
└── board-sticker.jpg
```

---

## Output

Each processed image produces a YAML file in `output/`:

```
output/
├── part-label-01.yaml
├── motor-sn.yaml
└── board-sticker.yaml
```

### YAML structure

```yaml
---
serial_number: SN-20241234
model_number: "XYZ-4K/B"
date: 2024-01
source_file: part-label-01.jpg
processed_at: 2026-03-04 15:30:00
raw_ocr: |
  Full text extracted from the label
  by the OCR model, preserved exactly as returned.
```

| Field | Description |
|---|---|
| `serial_number` | Serial Number (labelled S/N, SN, Serial No., etc.) |
| `model_number` | Model or Part Number (labelled Model, M/N, P/N, MPN, etc.) |
| `date` | Any date on the label (MFG date, DOM, expiry, etc.) |
| `source_file` | Original image filename |
| `processed_at` | Timestamp of processing |
| `raw_ocr` | Full OCR text returned by Mistral before extraction |

Fields not found on the label are written as `null`.

---

## How it works

Processing runs in two API calls per image:

```
Image file
    │
    ▼
[1] POST /ocr  (mistral-ocr-latest)
    │  base64-encoded image → markdown text
    │
    ▼
[2] POST /chat/completions  (mistral-small-latest)
    │  OCR text + extraction prompt → JSON with the three fields
    │
    ▼
YAML file written to output/
```

1. **OCR step**: the image is base64-encoded and sent to `mistral-ocr-latest`, which returns the full label text as markdown.
2. **Extraction step**: the OCR text is passed to `mistral-small-latest` with a structured prompt.
The model returns a JSON object (`response_format: json_object`) containing `serial_number`, `model_number`, and `date`.

Already-processed images are skipped automatically unless `--force` is used.

---

## Project structure

```
ckOCR/
├── ocr.php          # Main script
├── .env             # API key (not committed, see .env.example)
├── .env.example     # Template
├── .gitignore
├── input/           # Label photos (test data included)
└── output/          # YAML results (test data included)
```

---

## Troubleshooting

**`MISTRAL_API_KEY not set`**
Set the key in `.env` or export it as an environment variable.

**`Mistral API 401`**
Your API key is invalid or expired. Check it at [console.mistral.ai](https://console.mistral.ai/).

**`File too large`**
Resize the image below 5 MB before placing it in `input/`.

**`No text found`**
The label may be blurry, low contrast, or too small. Try a clearer photo. The output YAML is still written with `null` fields so the file won't be re-processed accidentally; use `--force --verbose` to retry and inspect the raw OCR output.

**Fields are `null` but text was extracted**
Run with `--verbose` to see the raw OCR text and check whether the label uses non-standard abbreviations. The extraction prompt covers the most common label formats.
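
When debugging `null` fields, it can help to picture what the two requests from "How it works" carry and how a reply might be reduced to the three fields. The following is a rough sketch, not the code in `ocr.php`: the function names are hypothetical, the prompt text is invented for illustration, and the payload shapes (`document`, `image_url`, `messages`) are assumptions based on Mistral's public API documentation. Only the model names and `response_format: json_object` come from this README.

```php
<?php
// Assumed payload for step [1], POST /ocr: the image travels as a base64 data URI.
function buildOcrPayload(string $imageBytes, string $mimeType): array
{
    return [
        'model' => 'mistral-ocr-latest',
        'document' => [
            'type' => 'image_url',
            'image_url' => 'data:' . $mimeType . ';base64,' . base64_encode($imageBytes),
        ],
    ];
}

// Assumed payload for step [2], POST /chat/completions.
// The system prompt is illustrative only; the tool's real prompt is not shown in this README.
function buildExtractionPayload(string $ocrText): array
{
    return [
        'model' => 'mistral-small-latest',
        'response_format' => ['type' => 'json_object'],
        'messages' => [
            ['role' => 'system', 'content' => 'Return a JSON object with the keys '
                . 'serial_number, model_number and date. Use null for missing values.'],
            ['role' => 'user', 'content' => $ocrText],
        ],
    ];
}

// Reduce the model's JSON reply to the three fields; anything missing,
// empty, or not a string becomes null, matching the documented output.
function parseExtraction(string $json): array
{
    $fields = ['serial_number' => null, 'model_number' => null, 'date' => null];
    $decoded = json_decode($json, true);
    if (!is_array($decoded)) {
        return $fields; // invalid JSON: every field stays null
    }
    foreach (array_keys($fields) as $key) {
        $value = $decoded[$key] ?? null;
        if (is_string($value) && trim($value) !== '') {
            $fields[$key] = trim($value);
        }
    }
    return $fields;
}

// serial_number is kept; model_number and date become null.
var_export(parseExtraction('{"serial_number": "SN-20241234", "date": ""}'));
```

Under these assumptions, a label whose abbreviations the prompt does not recognise would still produce valid JSON, just with `null` values, which matches the symptom described above.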