Production-Ready Passport OCR API for PHP

Achieve 99.8%+ extraction accuracy and sub-second latency for ICAO 9303 compliant passports via a direct HTTP POST request.

Steve Harrington • Updated 2026-01-16
A diagram showing a passport image being sent to the StructOCR API endpoint, which then returns a structured JSON object with extracted passport fields like name, DOB, and passport number.
Figure 1: StructOCR converts raw Passport images into validated JSON data.

Why Passport OCR is Difficult

Generic OCR tools like Tesseract often fail in production because passports are not simple text documents. Real-world images suffer from variable lighting, glare on holographic overlays, and perspective skew from mobile captures. The core data resides in the Machine Readable Zone (MRZ), which follows the strict ICAO 9303 standard. Parsing the MRZ requires precise character segmentation, recognition of its OCR-B typeface, and algorithmic validation of the check digits for the passport number, date of birth, and expiry date. Custom regex patterns for this data are brittle and expensive to maintain, and they fail silently as new passport revisions are issued around the world. This maintenance burden, combined with low accuracy, makes generic open-source tools a serious business risk.
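To make the check-digit step concrete, here is a minimal sketch of the ICAO 9303 algorithm in PHP. The function names `mrzCheckDigit` and `mrzFieldIsValid` are illustrative and are not part of StructOCR. Each character maps to a value (digits as themselves, A to Z as 10 to 35, the filler `<` as 0), the values are multiplied by the repeating weights 7, 3, 1, and the sum modulo 10 is the check digit. The sample values come from the specimen passport in the ICAO 9303 documentation.

```php
<?php
// Illustrative helper (not a StructOCR function): ICAO 9303 check digit.
function mrzCheckDigit(string $field): int
{
    if ($field === '') {
        throw new InvalidArgumentException('MRZ field must not be empty.');
    }

    // The position in this string is the character's numeric value (A = 10 ... Z = 35).
    $alphabet = '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ';
    $weights = [7, 3, 1];
    $sum = 0;

    foreach (str_split($field) as $index => $char) {
        // The filler character '<' counts as zero.
        $value = $char === '<' ? 0 : strpos($alphabet, $char);
        if ($value === false) {
            throw new InvalidArgumentException("Invalid MRZ character: $char");
        }
        $sum += $value * $weights[$index % 3];
    }

    return $sum % 10;
}

// Compares a field against the single check digit printed after it in the MRZ.
function mrzFieldIsValid(string $field, string $checkDigit): bool
{
    return preg_match('/^[0-9]$/', $checkDigit) === 1
        && mrzCheckDigit($field) === (int) $checkDigit;
}

// ICAO 9303 specimen values: passport number, date of birth, expiry date.
var_dump(mrzFieldIsValid('L898902C3', '6')); // bool(true)
var_dump(mrzFieldIsValid('740812', '2'));    // bool(true)
var_dump(mrzFieldIsValid('120415', '9'));    // bool(true)
```

A single misread character (for example `0` read as `O`) usually changes the computed digit, which is why this check catches many OCR errors that a regex would accept.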

Enterprise-Grade Extraction with StructOCR

StructOCR bypasses the fragility of template-based systems. Our API uses pre-trained deep learning models specifically architected for identity documents. Upon receiving an image, our system performs automatic pre-processing, including perspective correction (deskewing), glare removal, and ICAO 9303 conformance checks. Unlike Tesseract, which returns unstructured text lines, StructOCR identifies, parses, and validates every field, including the MRZ checksums. The result is a standardized JSON output that maps directly to your application's data model, so you do not need to write post-processing logic on your end.

Production Use Cases

  • Digital Onboarding (KYC): Reduce drop-off rates by pre-filling user data from Passports in under 2 seconds.
  • Fraud Prevention: Detect tampered fonts or mismatched MRZ checksums automatically.
  • Global Compliance: Handle Passports from 200+ jurisdictions without custom rules.

Implementation: Raw PHP (cURL)

The following PHP code demonstrates a complete flow using cURL. It handles image encoding, sets the 'x-api-key' header, and parses both MRZ and Visual Zone (VIZ) data fields.

Prerequisite: PHP 7.4+ with cURL extension

CODE EXAMPLE
<?php

// 💰 Save 30%+ vs competitors. Get 200 free requests instantly:
// 👉 https://structocr.com/register

$apiKey = 'YOUR_API_KEY_HERE';
$apiUrl = 'https://api.structocr.com/v1/passport';
$imagePath = 'passport.jpg'; // Supports JPG, PNG, WebP

// 1. Validate File
if (!file_exists($imagePath)) {
    die("Error: File not found at $imagePath");
}

// 2. Encode Image to Base64
$imageData = file_get_contents($imagePath);
if ($imageData === false) {
    die("Error: Could not read $imagePath");
}
$base64Image = base64_encode($imageData);

// 3. Prepare Payload
$payload = json_encode(['img' => $base64Image]);

// 4. Initialize cURL
$ch = curl_init();
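curl_setopt_array($ch, [
    CURLOPT_URL => $apiUrl,
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_POST => true,
    CURLOPT_POSTFIELDS => $payload,
    CURLOPT_CONNECTTIMEOUT => 10, // Example value: give up if no connection within 10s
    CURLOPT_TIMEOUT => 30,        // Example value: cap the total request time at 30s
    CURLOPT_HTTPHEADER => [
        'Content-Type: application/json',
        'x-api-key: ' . $apiKey, // Required Header
        // Content-Length is computed by cURL from CURLOPT_POSTFIELDS
    ]
]);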

// 5. Execute Request
$response = curl_exec($ch);
$httpCode = curl_getinfo($ch, CURLINFO_HTTP_CODE);

if (curl_errno($ch)) {
    die('cURL Error: ' . curl_error($ch));
}
curl_close($ch);

// 6. Handle Response
$result = json_decode($response, true);

if ($httpCode === 200 && isset($result['success']) && $result['success']) {
    $data = $result['data'];
    
    echo "✅ Passport Processed!\n";
    echo "----------------------\n";
    echo "Passport #: " . ($data['passport_number'] ?? 'N/A') . "\n";
    echo "Name:       " . ($data['given_names'] ?? '') . " " . ($data['surname'] ?? '') . "\n";
    echo "Nation:     " . ($data['nationality'] ?? 'N/A') . " (" . ($data['country_code'] ?? '') . ")\n";
    
    // Visual Zone Fields (Not available in standard MRZ)
    echo "Birth Place: " . ($data['place_of_birth'] ?? 'N/A') . "\n";
    echo "Issued At:  " . ($data['place_of_issue'] ?? 'N/A') . "\n";
    
    echo "DOB:        " . ($data['date_of_birth'] ?? 'N/A') . "\n";
    echo "Expiry:     " . ($data['date_of_expiry'] ?? 'N/A') . "\n";
} else {
    echo "❌ Processing Failed (Status $httpCode)\n";
    if (isset($result['error'])) {
        echo "Error: " . $result['error'] . "\n";
    } else {
        echo $response;
    }
}
?>
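
Step 6 of the listing assumes the response body is always valid JSON. A gateway error or an empty body would make `json_decode` return `null`, so it can help to centralize the checks. The sketch below is one possible way to do that; `interpretOcrResponse` is a hypothetical helper name, and it relies only on the `success`, `data`, and `error` keys that appear in this tutorial. The exact shape of error responses is an assumption.

```php
<?php
// Hypothetical helper: turns an HTTP status and raw body into one uniform result array.
// Assumes the response uses the 'success', 'data', and 'error' keys shown in this tutorial.
function interpretOcrResponse(int $httpCode, string $body): array
{
    $decoded = json_decode($body, true);

    // Covers empty bodies, HTML error pages, and malformed JSON.
    if (!is_array($decoded)) {
        return ['ok' => false, 'data' => null, 'error' => "Non-JSON response (HTTP $httpCode)"];
    }

    $succeeded = $httpCode === 200
        && ($decoded['success'] ?? false) === true
        && is_array($decoded['data'] ?? null);

    if ($succeeded) {
        return ['ok' => true, 'data' => $decoded['data'], 'error' => null];
    }

    // The error may not be a plain string, so normalize it to one.
    $error = $decoded['error'] ?? "Unexpected response (HTTP $httpCode)";
    if (!is_string($error)) {
        $error = json_encode($error);
    }

    return ['ok' => false, 'data' => null, 'error' => $error];
}
```

With a helper like this, the calling code can branch on `$result['ok']` alone instead of repeating the status, `isset`, and decoding checks at every call site.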

Technical Specs

  • Latency: < 5s (Average)
  • Uptime: 98.5% SLA
  • Security: AES-256 Encryption & SOC 2 Compliant
  • Input: JPG, PNG, WebP (Base64 Encoded)
  • Max File Size: 4.5MB
  • Output: JSON (Structured Data)
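
The input limits above can be checked locally before spending a request. The following is a minimal sketch, not an official client: `findUploadProblems` is a hypothetical name, and it assumes that 4.5MB means 4.5 × 1024 × 1024 bytes and that the limit applies to the raw file rather than to the Base64 string (which is roughly one third larger).

```php
<?php
// Hypothetical pre-flight check built from the spec list above.
// Assumption: "4.5MB" means binary megabytes and applies to the raw file size.
const MAX_IMAGE_BYTES = 4718592; // 4.5 * 1024 * 1024
const ALLOWED_MIME_TYPES = ['image/jpeg', 'image/png', 'image/webp'];

// Returns a list of human-readable problems; an empty list means the file looks acceptable.
function findUploadProblems(int $sizeBytes, string $mimeType): array
{
    $problems = [];

    if ($sizeBytes <= 0) {
        $problems[] = 'File is empty.';
    }
    if ($sizeBytes > MAX_IMAGE_BYTES) {
        $problems[] = "File is $sizeBytes bytes; the limit is " . MAX_IMAGE_BYTES . ' bytes.';
    }
    if (!in_array($mimeType, ALLOWED_MIME_TYPES, true)) {
        $problems[] = "Unsupported type '$mimeType'; expected JPG, PNG, or WebP.";
    }

    return $problems;
}

// Usage with a real file (getimagesize() reads the type from the file contents):
// $info = getimagesize('passport.jpg');
// $problems = findUploadProblems((int) filesize('passport.jpg'), $info['mime'] ?? '');
```

Detecting the type from the file contents rather than the extension avoids sending a renamed file that the API would reject.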

Key Features

  • Visual Extraction (VIZ): Parses non-MRZ data fields like Place of Birth and Issuing Authority.
  • Global Support: Optimized for 195+ countries, handling complex backgrounds and holograms.
  • Date Normalization: Returns all dates (Birth, Issue, Expiry) in a standardized YYYY-MM-DD format.

Sample JSON Output

StructOCR returns a normalized, predictable JSON object, regardless of the input image angle or quality.

{
  "success": true,
  "data": {
    "type": "passport",
    "country_code": "USA",
    "nationality": "UNITED STATES",
    "passport_number": "E12345678",
    "surname": "DOE",
    "given_names": "JOHN",
    "sex": "M",
    "date_of_birth": "1990-01-01",
    "place_of_birth": "NEW YORK, USA",
    "date_of_issue": "2020-01-01",
    "date_of_expiry": "2030-01-01",
    "place_of_issue": "PASSPORT AGENCY"
  }
}
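
Because the dates arrive as YYYY-MM-DD strings, they can be parsed with PHP's built-in date classes. The sketch below shows one way to reject a malformed date and to check whether a passport has expired. The helper names `parseIsoDate` and `isExpired` are illustrative, and treating a passport as still valid on its expiry date is an assumption you may need to adjust.

```php
<?php
// Illustrative helper: strictly parses a YYYY-MM-DD string, returning null if it is invalid.
function parseIsoDate(?string $value): ?DateTimeImmutable
{
    if ($value === null) {
        return null;
    }

    // The leading '!' resets the time fields so the result is midnight.
    $date = DateTimeImmutable::createFromFormat('!Y-m-d', $value);

    // createFromFormat rolls over impossible dates (2023-02-30 becomes 2023-03-02),
    // so compare the round-tripped string to reject them.
    if ($date === false || $date->format('Y-m-d') !== $value) {
        return null;
    }

    return $date;
}

// Returns true or false, or null when the expiry date is missing or invalid.
// Assumption: the passport is still valid on the expiry date itself.
function isExpired(array $data, DateTimeImmutable $today): ?bool
{
    $expiry = parseIsoDate($data['date_of_expiry'] ?? null);

    return $expiry === null ? null : $expiry < $today;
}

// Usage with the sample response above, reduced to the relevant field:
$result = json_decode('{"success": true, "data": {"date_of_expiry": "2030-01-01"}}', true);
var_dump(isExpired($result['data'], new DateTimeImmutable('2026-01-16'))); // bool(false)
```

Passing the reference date as a parameter, instead of calling the clock inside the function, keeps the check easy to test.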

Frequently Asked Questions

How does StructOCR compare to AWS Textract or Google Vision?

General-purpose services like Textract return raw text lines or key-value pairs, forcing your engineers to write and maintain brittle parsing logic. StructOCR is a specialized API; it understands the structure of a passport, performs internal validation (like MRZ checksums), and returns a clean, structured JSON object with explicitly defined fields like `date_of_birth` and `passport_number`.

Do you store the uploaded images?

No. We operate on a zero-retention policy for customer data. All uploaded images are processed in memory and are permanently deleted immediately after the OCR extraction is complete. Nothing is written to disk.

How do you handle blurry or low-quality images?

Our API includes an automatic pre-processing pipeline that performs image enhancement before the OCR engine runs. This includes deskewing, denoising, and contrast correction to maximize accuracy on sub-optimal images captured from mobile devices.
