Production-Ready Passport OCR API for PHP
Achieve 99.8%+ extraction accuracy and sub-second latency for ICAO 9303 compliant passports via a direct HTTP POST request.

Why Passport OCR is Difficult
Generic OCR tools like Tesseract fail in production because passports are not simple text documents. Real-world images suffer from variable lighting, glare on holographic overlays, and perspective skew from mobile captures. The core data resides in the Machine Readable Zone (MRZ), which follows the strict ICAO 9303 standard. Parsing this requires precise character segmentation, font recognition, and algorithmic validation of check digits for the passport number, date of birth, and expiry date. Maintaining custom RegEx patterns to parse this data is brittle and expensive, failing silently as new passport revisions are issued globally. This high maintenance burden and low accuracy make open-source tools an unacceptable business risk.
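To make the check-digit step concrete, here is a minimal PHP sketch of the ICAO 9303 calculation. Digits keep their value, letters A-Z map to 10-35, the filler character `<` counts as 0, and each value is multiplied by the repeating weights 7, 3, 1 before the sum is taken modulo 10. The names `mrzCheckDigit` and `mrzFieldIsValid` are illustrative only and are not part of any StructOCR API.

```php
<?php
declare(strict_types=1);

// The position of a character in this string is its MRZ numeric value:
// '0'-'9' => 0-9, 'A'-'Z' => 10-35.
const MRZ_ALPHABET = '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ';

/**
 * Compute the ICAO 9303 check digit for a single MRZ field
 * (for example the passport number, date of birth, or expiry date).
 */
function mrzCheckDigit(string $field): int
{
    if ($field === '') {
        throw new InvalidArgumentException('MRZ field must not be empty');
    }

    $weights = [7, 3, 1];
    $sum = 0;

    foreach (str_split($field) as $i => $char) {
        if ($char === '<') {
            $value = 0; // filler character
        } else {
            $value = strpos(MRZ_ALPHABET, $char);
            if ($value === false) {
                throw new InvalidArgumentException("Invalid MRZ character: $char");
            }
        }
        $sum += $value * $weights[$i % 3];
    }

    return $sum % 10;
}

/**
 * Compare a field against the check digit printed next to it in the MRZ.
 */
function mrzFieldIsValid(string $field, string $checkDigit): bool
{
    return strlen($checkDigit) === 1
        && strpos('0123456789', $checkDigit) !== false
        && mrzCheckDigit($field) === (int) $checkDigit;
}
```

For the document number `L898902C3`, the weighted sum is 316, so `mrzCheckDigit('L898902C3')` returns 6. A hand-rolled parser has to repeat this for each checked field, which is part of the maintenance burden described above.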
Enterprise-Grade Extraction with StructOCR
StructOCR bypasses the fragility of template-based systems. Our API uses pre-trained deep learning models specifically architected for identity documents. Upon receiving an image, our system performs automatic pre-processing, including perspective correction (deskewing), glare removal, and ICAO 9303 compliance checks. Unlike Tesseract, which returns unstructured text lines, StructOCR identifies, parses, and validates every field, including the MRZ checksums. The result is a guaranteed, standardized JSON output that maps directly to your application's data model, eliminating the need for any post-processing logic on your end.
Production Use Cases
- Digital Onboarding (KYC): Reduce drop-off rates by pre-filling user data from passports in under 2 seconds.
- Fraud Prevention: Detect tampered fonts or mismatched MRZ checksums automatically.
- Global Compliance: Handle passports from 200+ jurisdictions without custom rules.
Implementation: Raw PHP (cURL)
The following PHP code demonstrates a complete flow using cURL. It handles image encoding, sets the 'x-api-key' header, and parses both MRZ and Visual Zone (VIZ) data fields.
Prerequisite: PHP 7.4+ with cURL extension
<?php
// Get an API key at https://structocr.com/register
$apiKey = 'YOUR_API_KEY_HERE';
$apiUrl = 'https://api.structocr.com/v1/passport';
$imagePath = 'passport.jpg'; // Supports JPG, PNG, WebP

// 1. Validate File
if (!file_exists($imagePath)) {
    die("Error: File not found at $imagePath");
}

// 2. Encode Image to Base64
$imageData = file_get_contents($imagePath);
if ($imageData === false) {
    die("Error: Could not read $imagePath");
}
$base64Image = base64_encode($imageData);

// 3. Prepare Payload
$payload = json_encode(['img' => $base64Image]);

// 4. Initialize cURL
$ch = curl_init();
curl_setopt_array($ch, [
    CURLOPT_URL => $apiUrl,
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_POST => true,
    CURLOPT_POSTFIELDS => $payload,
    CURLOPT_HTTPHEADER => [
        'Content-Type: application/json',
        'x-api-key: ' . $apiKey, // Required Header
        'Content-Length: ' . strlen($payload)
    ]
]);

// 5. Execute Request
$response = curl_exec($ch);
if ($response === false) {
    // Read the transport error before closing the handle
    $error = curl_error($ch);
    curl_close($ch);
    die('cURL Error: ' . $error);
}
$httpCode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
curl_close($ch);

// 6. Handle Response
$result = json_decode($response, true);
if ($httpCode === 200 && is_array($result) && !empty($result['success'])) {
    $data = $result['data'] ?? [];
    echo "✅ Passport Processed!\n";
    echo "----------------------\n";
    echo "Passport #: " . ($data['passport_number'] ?? 'N/A') . "\n";
    echo "Name: " . ($data['given_names'] ?? '') . " " . ($data['surname'] ?? '') . "\n";
    echo "Nation: " . ($data['nationality'] ?? 'N/A') . " (" . ($data['country_code'] ?? '') . ")\n";
    // Visual Zone Fields (Not available in standard MRZ)
    echo "Birth Place: " . ($data['place_of_birth'] ?? 'N/A') . "\n";
    echo "Issued At: " . ($data['place_of_issue'] ?? 'N/A') . "\n";
    echo "DOB: " . ($data['date_of_birth'] ?? 'N/A') . "\n";
    echo "Expiry: " . ($data['date_of_expiry'] ?? 'N/A') . "\n";
} else {
    echo "❌ Processing Failed (Status $httpCode)\n";
    if (is_array($result) && isset($result['error'])) {
        echo "Error: " . $result['error'] . "\n";
    } else {
        echo $response; // Body was not JSON or had an unexpected shape
    }
}
?>
Technical Specs
- Latency: < 5s (Average)
- Uptime: 98.5% SLA
- Security: AES-256 Encryption & SOC2 Compliant
- Input: JPG, PNG, WebP (Base64 Encoded)
- Max File Size: 4.5MB
- Output: JSON (Structured Data)
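Given the input limits above, it can be worth rejecting unsuitable files before spending a request on them. The following is a rough sketch, not an official client check. It assumes the 4.5MB limit applies to the raw file rather than the Base64 payload and is counted in binary megabytes; neither detail is confirmed here. The function name `preflightProblems` is hypothetical.

```php
<?php
declare(strict_types=1);

// Assumption: 4.5MB means 4.5 * 1024 * 1024 bytes of raw (pre-Base64) image data.
const MAX_IMAGE_BYTES = 4718592;

// Formats listed in the specs; "jpeg" is accepted as an alias of "jpg".
const ALLOWED_EXTENSIONS = ['jpg', 'jpeg', 'png', 'webp'];

/**
 * Return a list of human-readable problems.
 * An empty array means the file passes these basic checks.
 */
function preflightProblems(string $fileName, int $sizeInBytes): array
{
    $problems = [];

    $extension = strtolower(pathinfo($fileName, PATHINFO_EXTENSION));
    if (!in_array($extension, ALLOWED_EXTENSIONS, true)) {
        $problems[] = "Unsupported format: '$extension'";
    }

    if ($sizeInBytes <= 0) {
        $problems[] = 'File is empty';
    } elseif ($sizeInBytes > MAX_IMAGE_BYTES) {
        $problems[] = "File is $sizeInBytes bytes, above the " . MAX_IMAGE_BYTES . ' byte limit';
    }

    return $problems;
}
```

A caller could pass `filesize($imagePath)` as the second argument after confirming it did not return `false`. Note that Base64 encoding grows the payload by roughly a third, so a file near the limit produces a noticeably larger request body.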
Key Features
- Visual Extraction (VIZ): Parses non-MRZ data fields like Place of Birth and Issuing Authority.
- Global Support: Optimized for 195+ countries, handling complex backgrounds and holograms.
- Date Normalization: Returns all dates (Birth, Issue, Expiry) in a standardized YYYY-MM-DD format.
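Because dates arrive as YYYY-MM-DD strings, downstream checks such as "is this passport expired?" need no locale-specific parsing. The sketch below shows one way to do this in PHP. The helper names `parseIsoDate` and `isExpired` are hypothetical, and treating a passport as valid through the whole of its expiry date is an assumption you may need to adjust for your compliance rules.

```php
<?php
declare(strict_types=1);

/**
 * Strictly parse a YYYY-MM-DD string. Returns null for anything else,
 * including impossible dates such as 2023-02-30.
 */
function parseIsoDate(string $value): ?DateTimeImmutable
{
    // The leading "!" resets unparsed fields (time of day) to zero.
    $date = DateTimeImmutable::createFromFormat('!Y-m-d', $value, new DateTimeZone('UTC'));
    if ($date === false) {
        return null;
    }

    // createFromFormat rolls overflowing days into the next month,
    // so require that the value round-trips unchanged.
    return $date->format('Y-m-d') === $value ? $date : null;
}

/**
 * Assumption: the document is still valid on its expiry date itself.
 */
function isExpired(string $dateOfExpiry, DateTimeImmutable $today): bool
{
    if (parseIsoDate($dateOfExpiry) === null) {
        throw new InvalidArgumentException("Not a valid YYYY-MM-DD date: $dateOfExpiry");
    }

    // ISO dates of equal length sort chronologically as plain strings.
    return strcmp($dateOfExpiry, $today->format('Y-m-d')) < 0;
}
```

In application code this might be called as `isExpired($data['date_of_expiry'], new DateTimeImmutable('today'))`.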
Sample JSON Output
StructOCR returns a normalized, predictable JSON object, regardless of the input image angle or quality.
{
"success": true,
"data": {
"type": "passport",
"country_code": "USA",
"nationality": "UNITED STATES",
"passport_number": "E12345678",
"surname": "DOE",
"given_names": "JOHN",
"sex": "M",
"date_of_birth": "1990-01-01",
"place_of_birth": "NEW YORK, USA",
"date_of_issue": "2020-01-01",
"date_of_expiry": "2030-01-01",
"place_of_issue": "PASSPORT AGENCY"
}
}
Frequently Asked Questions
How does StructOCR compare to AWS Textract or Google Vision?
General-purpose services like Textract return raw text lines or key-value pairs, forcing your engineers to write and maintain brittle parsing logic. StructOCR is a specialized API; it understands the structure of a passport, performs internal validation (like MRZ checksums), and returns a clean, structured JSON object with explicitly defined fields like `date_of_birth` and `passport_number`.
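As an illustration of what explicitly defined fields make possible, the response can be mapped onto a small typed object with no text parsing. This is a sketch only: the class `PassportData` is hypothetical, it covers just a few of the fields from the sample output, and the choice of which fields to treat as mandatory is an assumption rather than documented API behavior.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical value object built from the "data" object of a response.
 */
final class PassportData
{
    public string $passportNumber;
    public string $surname;
    public string $givenNames;
    public ?string $dateOfBirth = null;
    public ?string $dateOfExpiry = null;

    private function __construct()
    {
    }

    public static function fromApiResponse(string $json): self
    {
        $decoded = json_decode($json, true);
        if (!is_array($decoded) || empty($decoded['success']) || !is_array($decoded['data'] ?? null)) {
            throw new UnexpectedValueException('Response is not a successful passport result');
        }

        $data = $decoded['data'];

        // Assumption: these three fields are required by the calling application.
        foreach (['passport_number', 'surname', 'given_names'] as $key) {
            if (!isset($data[$key]) || !is_string($data[$key])) {
                throw new UnexpectedValueException("Missing or non-string field: $key");
            }
        }

        $passport = new self();
        $passport->passportNumber = $data['passport_number'];
        $passport->surname = $data['surname'];
        $passport->givenNames = $data['given_names'];
        $passport->dateOfBirth = isset($data['date_of_birth']) ? (string) $data['date_of_birth'] : null;
        $passport->dateOfExpiry = isset($data['date_of_expiry']) ? (string) $data['date_of_expiry'] : null;

        return $passport;
    }

    public function fullName(): string
    {
        return trim($this->givenNames . ' ' . $this->surname);
    }
}
```

Failing loudly on a missing field keeps a partial extraction from silently reaching a KYC record; whether to reject or fall back to manual entry is a decision for the calling application.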
Do you store the uploaded images?
No. We operate on a zero-retention policy for customer data. All uploaded images are processed in-memory and are permanently deleted immediately after the OCR extraction is complete. Nothing is written to disk.
How do you handle blurry or low-quality images?
Our API includes an automatic pre-processing pipeline that performs image enhancement before the OCR engine runs. This includes deskewing, denoising, and contrast correction to maximize accuracy on sub-optimal images captured from mobile devices.
More OCR Tutorials
PHP Driver's License OCR API
High-accuracy PHP API for Driver's License OCR. Parse PDF417 barcodes and extract data directly to a standardized JSON output. Stop fighting Tesseract.
PHP Invoice OCR API
High-accuracy invoice OCR API for PHP. Eliminate manual entry and parse tables into structured JSON output. Integrate via raw HTTP POST.
PHP National ID OCR API
High-accuracy National ID OCR for PHP. Get structured JSON output from ID card images. A superior alternative to Tesseract. No complex PHP SDK needed.
PHP VIN (Vehicle Identification Number) OCR API
Tutorial: How to use the StructOCR PHP Client to extract data from VINs (Vehicle Identification Numbers). Includes code samples and JSON schema.