ckOCR
PHP CLI tool that reads photos of part identification labels and extracts structured data using Mistral AI OCR.
Reads images from input/, calls the Mistral API, and writes YAML files to output/ containing the Serial Number, Model Number, and Date.
Requirements
- PHP 8.1 – 8.5 with the curl extension enabled (no Composer required)
- A Mistral AI account with API access
Arch Linux / CachyOS — enable the curl extension after installing PHP:
sudo pacman -S php
# uncomment "extension=curl" in /etc/php/php.ini
php -m | grep curl # verify
Installation
git clone <repo-url> ckOCR
cd ckOCR
cp .env.example .env
Edit .env and insert your Mistral API key:
MISTRAL_API_KEY=your_api_key_here
Alternatively, export it as an environment variable:
export MISTRAL_API_KEY=your_api_key_here
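The key lookup described above could be sketched as follows. This is a minimal illustration, not the code in ocr.php: the function name loadApiKey is hypothetical, and the precedence (environment variable first, then .env) is an assumption that this README does not confirm.

```php
<?php
// Hypothetical helper, not taken from ocr.php.
// Assumption: an exported MISTRAL_API_KEY wins over the .env file.
function loadApiKey(string $envFile): ?string
{
    $fromEnv = getenv('MISTRAL_API_KEY');
    if (is_string($fromEnv) && $fromEnv !== '') {
        return $fromEnv;
    }
    if (!is_readable($envFile)) {
        return null;
    }
    foreach (file($envFile, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES) as $line) {
        $line = trim($line);
        if ($line === '' || $line[0] === '#') {
            continue; // skip blank lines and comments
        }
        // Split on the first "=" only, so values may contain "=".
        [$name, $value] = array_pad(explode('=', $line, 2), 2, '');
        if (trim($name) === 'MISTRAL_API_KEY') {
            // Strip surrounding whitespace, then optional quotes.
            $value = trim(trim($value), "\"'");
            return $value !== '' ? $value : null;
        }
    }
    return null; // caller can report "MISTRAL_API_KEY not set"
}
```

A null return maps naturally onto the "MISTRAL_API_KEY not set" error listed under Troubleshooting.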
Usage
Place one or more label photos in the input/ folder, then run:
php ocr.php
Results are written to output/ as YAML files — one per image, same filename stem.
Options
| Flag | Description |
|---|---|
| --force | Re-process images that already have an output file |
| --verbose | Print the raw OCR text and API request details |
| --help | Show usage information |
Examples
# Process all new images
php ocr.php
# Re-run everything, show full detail
php ocr.php --force --verbose
# Just see options
php ocr.php --help
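For readers curious how the three flags might be handled, here is a small sketch. The parseFlags function is hypothetical and the real ocr.php may parse its arguments differently (for example with getopt); rejecting unknown options is also an assumption.

```php
<?php
// Hypothetical flag parser; the real parsing in ocr.php is not shown in this README.
function parseFlags(array $argv): array
{
    $flags = ['force' => false, 'verbose' => false, 'help' => false];
    // $argv[0] is the script name, so start at index 1.
    foreach (array_slice($argv, 1) as $arg) {
        $name = str_starts_with($arg, '--') ? substr($arg, 2) : '';
        if ($name !== '' && array_key_exists($name, $flags)) {
            $flags[$name] = true;
        } else {
            // Assumption: unknown options are treated as an error.
            throw new InvalidArgumentException("Unknown option: $arg");
        }
    }
    return $flags;
}
```

Calling parseFlags(['ocr.php', '--force', '--verbose']) in this sketch sets force and verbose to true and leaves help false.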
Input
Supported image formats: JPG, JPEG, PNG, WebP, GIF
Maximum file size: 5 MB per image (Mistral API limit)
input/
├── part-label-01.jpg
├── motor-sn.png
└── board-sticker.jpg
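The format and size limits above can be checked before any API call is made. The sketch below is illustrative only: rejectReason and both constants are hypothetical names, and it assumes "5 MB" means 5 × 1024 × 1024 bytes.

```php
<?php
// Hypothetical pre-flight check; names are not taken from ocr.php.
const SUPPORTED_EXTENSIONS = ['jpg', 'jpeg', 'png', 'webp', 'gif'];
const MAX_BYTES = 5 * 1024 * 1024; // assumption: "5 MB" is read as 5 MiB

// Returns null when the file is acceptable, otherwise a short reason.
function rejectReason(string $filename, int $sizeBytes): ?string
{
    $ext = strtolower(pathinfo($filename, PATHINFO_EXTENSION));
    if (!in_array($ext, SUPPORTED_EXTENSIONS, true)) {
        return "unsupported format: .$ext";
    }
    if ($sizeBytes > MAX_BYTES) {
        return 'file too large';
    }
    return null;
}
```

In a real run the size would come from filesize() on the file in input/; passing it as a parameter keeps the check easy to test in isolation.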
Output
Each processed image produces a YAML file in output/:
output/
├── part-label-01.yaml
├── motor-sn.yaml
└── board-sticker.yaml
YAML structure
---
serial_number: SN-20241234
model_number: "XYZ-4K/B"
date: 2024-01
source_file: part-label-01.jpg
processed_at: 2026-03-04 15:30:00
raw_ocr: |
  Full text extracted from the label by the OCR model,
  preserved exactly as returned.
| Field | Description |
|---|---|
| serial_number | Serial Number: labelled S/N, SN, Serial No., etc. |
| model_number | Model or Part Number: labelled Model, M/N, P/N, MPN, etc. |
| date | Any date on the label: MFG date, DOM, expiry, etc. |
| source_file | Original image filename |
| processed_at | Timestamp of processing |
| raw_ocr | Full OCR text returned by Mistral before extraction |
Fields not found on the label are written as null.
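Since no Composer packages are used, the YAML is presumably assembled by hand. The following sketch shows one way to do that; yamlScalar and renderYaml are hypothetical names, and for simplicity it double-quotes every string value, which is stricter than the sample output above. It assumes field values are single-line strings.

```php
<?php
// Hypothetical YAML writer; not the implementation used by ocr.php.
function yamlScalar(?string $value): string
{
    if ($value === null) {
        return 'null'; // missing fields are written as null
    }
    // Escape backslashes and double quotes for a double-quoted YAML scalar.
    return '"' . addcslashes($value, "\\\"") . '"';
}

function renderYaml(array $fields, string $rawOcr): string
{
    $lines = ['---'];
    foreach ($fields as $key => $value) {
        $lines[] = $key . ': ' . yamlScalar($value);
    }
    // Literal block scalar: each OCR line is indented by two spaces.
    $lines[] = 'raw_ocr: |';
    foreach (preg_split('/\R/', rtrim($rawOcr)) as $ocrLine) {
        $lines[] = '  ' . $ocrLine;
    }
    return implode("\n", $lines) . "\n";
}
```

Quoting every string avoids surprises such as a serial number that a YAML parser would otherwise read as a number or a date.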
How it works
Processing runs in two API calls per image:
Image file
│
▼
[1] POST /ocr (mistral-ocr-latest)
│ base64-encoded image → markdown text
│
▼
[2] POST /chat/completions (mistral-small-latest)
│ OCR text + extraction prompt → JSON with the three fields
│
▼
YAML file written to output/
- OCR step: the image is base64-encoded and sent to mistral-ocr-latest, which returns the full label text as markdown.
- Extraction step: the OCR text is passed to mistral-small-latest with a structured prompt. The model returns a JSON object (response_format: json_object) containing serial_number, model_number, and date.
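The two request bodies might look roughly like the sketch below. The function names and the prompt wording are hypothetical, and the payload shapes (a document of type image_url carrying a data URI for /ocr, and a messages array for /chat/completions) reflect my reading of Mistral's public API documentation, so verify them against the current reference. No network call is made here.

```php
<?php
// Hypothetical payload builders; the actual prompt in ocr.php is not shown in this README.
function buildOcrRequest(string $imageBytes, string $mimeType): array
{
    return [
        'model' => 'mistral-ocr-latest',
        'document' => [
            'type' => 'image_url',
            // The image travels as a base64 data URI.
            'image_url' => 'data:' . $mimeType . ';base64,' . base64_encode($imageBytes),
        ],
    ];
}

function buildExtractionRequest(string $ocrText): array
{
    return [
        'model' => 'mistral-small-latest',
        'response_format' => ['type' => 'json_object'],
        'messages' => [
            // Placeholder prompt, written for this example only.
            ['role' => 'system', 'content' => 'Extract serial_number, model_number and date '
                . 'from the label text. Reply with a JSON object; use null for missing fields.'],
            ['role' => 'user', 'content' => $ocrText],
        ],
    ];
}

// Normalise the model's JSON reply: absent or empty fields become null.
function parseExtraction(string $json): array
{
    $data = json_decode($json, true);
    $data = is_array($data) ? $data : [];
    $out = [];
    foreach (['serial_number', 'model_number', 'date'] as $key) {
        $value = $data[$key] ?? null;
        $out[$key] = (is_string($value) && trim($value) !== '') ? trim($value) : null;
    }
    return $out;
}
```

Each array would be passed through json_encode and posted with the curl extension, with the API key in an Authorization: Bearer header.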
Already-processed images are skipped automatically unless --force is used.
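The skip rule can be expressed in a few lines. This is a sketch under the assumption that "already processed" simply means a YAML file with the same filename stem exists in output/; outputPathFor and shouldProcess are hypothetical names.

```php
<?php
// Hypothetical helpers; assumes the skip check is based on the output file existing.
function outputPathFor(string $imagePath, string $outputDir): string
{
    // Same filename stem as the image, with a .yaml extension.
    return rtrim($outputDir, '/') . '/' . pathinfo($imagePath, PATHINFO_FILENAME) . '.yaml';
}

function shouldProcess(string $imagePath, string $outputDir, bool $force): bool
{
    // --force overrides the "output already exists" check.
    return $force || !file_exists(outputPathFor($imagePath, $outputDir));
}
```

One consequence of matching on the stem alone is that part.jpg and part.png would map to the same part.yaml, so distinct stems are the safer choice for input files.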
Project structure
ckOCR/
├── ocr.php # Main script
├── .env # API key (not committed, see .env.example)
├── .env.example # Template
├── .gitignore
├── input/ # Label photos (test data included)
└── output/ # YAML results (test data included)
Troubleshooting
MISTRAL_API_KEY not set
Set the key in .env or export it as an environment variable.
Mistral API 401
Your API key is invalid or expired. Check it at console.mistral.ai.
File too large
Resize the image below 5 MB before placing it in input/.
No text found
The label may be blurry, low contrast, or too small. Try a clearer photo. The output YAML is still written with null fields so the file won't be re-processed accidentally — use --force --verbose to retry and inspect the raw OCR output.
Fields are null but text was extracted
Run with --verbose to see the raw OCR text and check whether the label uses non-standard abbreviations. The extraction prompt covers the most common label formats.