Direct PHP National ID OCR API: Structured Data from Raw Images
Achieve 99.8% extraction accuracy and sub-1.5 second latency with a single HTTP POST request.

Why National ID OCR is Difficult
Generic OCR tools like Tesseract fail on real-world ID documents due to complexities they are not designed for. Physical cards suffer from photographic glare, shadows, and inconsistent lighting that corrupt character data. Users submit images with skew and rotation, requiring geometric correction before processing. Furthermore, simple text extraction is insufficient; data like MRZ (Machine-Readable Zone) lines contain checksums that must be algorithmically validated, not just read. Maintaining a library of RegEx patterns to parse varying ID formats across jurisdictions is a significant and brittle engineering overhead. These factors combine to make in-house ID processing a low-accuracy, high-maintenance liability.
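To make the checksum point concrete, here is a minimal sketch of the check-digit calculation defined for MRZ fields in ICAO Doc 9303: each character is mapped to a number, multiplied by the repeating weights 7, 3, 1, and the sum is reduced modulo 10. The function names (`mrzCharValue`, `mrzCheckDigit`, `mrzFieldIsValid`) are illustrative assumptions and not part of StructOCR; this is the kind of logic an in-house pipeline would have to maintain on top of raw OCR text.

```php
<?php
// Sketch of the ICAO 9303 check-digit algorithm used in MRZ lines.
// All function names here are hypothetical helpers, not StructOCR functions.

function mrzCharValue(string $char): int
{
    if ($char === '<') {
        return 0;                   // filler character counts as zero
    }
    $code = ord($char);
    if ($code >= 48 && $code <= 57) {
        return $code - 48;          // '0'..'9' keep their numeric value
    }
    if ($code >= 65 && $code <= 90) {
        return $code - 55;          // 'A' = 10 ... 'Z' = 35
    }
    throw new InvalidArgumentException("Invalid MRZ character: $char");
}

function mrzCheckDigit(string $field): int
{
    $weights = [7, 3, 1];           // repeating weight pattern
    $sum = 0;
    foreach (str_split($field) as $i => $char) {
        $sum += mrzCharValue($char) * $weights[$i % 3];
    }
    return $sum % 10;
}

function mrzFieldIsValid(string $field, string $checkDigit): bool
{
    return $checkDigit === (string) mrzCheckDigit($field);
}

var_dump(mrzFieldIsValid('740812', '2')); // bool(true): date field matches its check digit
var_dump(mrzFieldIsValid('740B12', '2')); // bool(false): one misread character breaks it
```

A single misread character (here `0` read as `B`) changes the computed digit, which is why reading the MRZ text alone is not enough to trust it.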
Enterprise-Grade Extraction with StructOCR
StructOCR bypasses the fragility of open-source solutions by using pre-trained Deep Learning models specialized for identity documents. Our API doesn't just read text; it understands the structure of an ID card. Upon receiving an image, our system performs automatic pre-processing, including deskewing, glare removal, and denoising to normalize the input. The cleaned image is then processed by our models to extract specific fields, which are cross-validated and returned in a standardized JSON output. This eliminates the need for manual parsing, RegEx maintenance, and complex image processing pipelines, providing a reliable, single-step solution that consistently outperforms generic OCR engines.
Production Use Cases
- Digital Onboarding (KYC): Reduce drop-off rates by pre-filling user data from National IDs in < 2 seconds.
- Fraud Prevention: Detect tampered fonts or mismatched MRZ checksums automatically.
- Global Compliance: Handle National IDs from 200+ jurisdictions without custom rules.
Implementation: Raw PHP (cURL)
The following PHP code demonstrates a complete flow using cURL. It handles image encoding, sets the required 'x-api-key' header, and parses region-specific fields (CNP, CPF, Address).
Prerequisite: PHP 7.4+ with the cURL extension enabled
<?php
// 💰 Save 30%+ vs competitors. Get 200 free requests instantly:
// 👉 https://structocr.com/register
$apiKey = 'YOUR_API_KEY_HERE';
$apiUrl = 'https://api.structocr.com/v1/national-id';
$imagePath = 'id_card.jpg'; // Supports JPG, PNG, WebP

// 1. Validate File
if (!file_exists($imagePath)) {
    die("Error: File not found at $imagePath");
}

// 2. Encode Image to Base64
$imageData = file_get_contents($imagePath);
$base64Image = base64_encode($imageData);

// 3. Prepare Payload
$payload = json_encode(['img' => $base64Image]);

// 4. Initialize cURL
$ch = curl_init();
curl_setopt_array($ch, [
    CURLOPT_URL => $apiUrl,
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_POST => true,
    CURLOPT_POSTFIELDS => $payload,
    CURLOPT_HTTPHEADER => [
        'Content-Type: application/json',
        'x-api-key: ' . $apiKey, // Required Header
        'Content-Length: ' . strlen($payload)
    ]
]);

// 5. Execute Request
$response = curl_exec($ch);
$httpCode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
if (curl_errno($ch)) {
    die('cURL Error: ' . curl_error($ch));
}
curl_close($ch);

// 6. Handle Response
$result = json_decode($response, true);
if ($httpCode === 200 && isset($result['success']) && $result['success']) {
    $data = $result['data'];
    echo "✅ National ID Processed!\n";
    echo "--------------------------\n";
    echo "Region: " . ($data['country_code'] ?? 'N/A') . "\n";
    echo "Name: " . ($data['given_names'] ?? '') . " " . ($data['surname'] ?? '') . "\n";
    echo "Doc Number: " . ($data['document_number'] ?? 'N/A') . " (Series: " . ($data['card_series'] ?? 'N/A') . ")\n";
    // Critical for verification (CNP, CPF, NIN)
    echo "Personal #: " . ($data['personal_number'] ?? 'N/A') . "\n";
    echo "DOB: " . ($data['date_of_birth'] ?? 'N/A') . "\n";
    echo "Address: " . ($data['address'] ?? 'N/A') . "\n";
} else {
    echo "❌ Processing Failed (Status $httpCode)\n";
    if (isset($result['error'])) {
        echo "Error: " . $result['error'] . "\n";
    } else {
        echo $response;
    }
}
?>
Technical Specs
- Latency: < 5s (Average)
- Uptime: 98.5% SLA
- Security: AES-256 Encryption & SOC2 Compliant
- Input: JPG, PNG, WebP (Base64 Encoded)
- Max File Size: 4.5MB
- Output: JSON (Structured Data)
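The input limits above can be checked on the client before spending a request. The following is a minimal sketch, not an official client: `validateImageMeta` and `validateImageFile` are hypothetical helper names, and it assumes the 4.5MB limit applies to the raw file size with 1MB = 1024 × 1024 bytes (the source does not say whether the limit is measured before or after Base64 encoding).

```php
<?php
// Pre-flight validation sketch based on the listed input specs.
// Assumption: the 4.5MB limit applies to raw file bytes (1MB = 1024 * 1024).

const MAX_IMAGE_BYTES = 4718592; // 4.5 * 1024 * 1024
const ALLOWED_MIME_TYPES = ['image/jpeg', 'image/png', 'image/webp'];

/**
 * Returns a list of problems; an empty array means the image looks acceptable.
 * Hypothetical helper, not part of any StructOCR SDK.
 */
function validateImageMeta(int $sizeInBytes, string $mimeType): array
{
    $problems = [];
    if ($sizeInBytes <= 0) {
        $problems[] = 'File is empty';
    }
    if ($sizeInBytes > MAX_IMAGE_BYTES) {
        $problems[] = 'File exceeds 4.5MB';
    }
    if (!in_array($mimeType, ALLOWED_MIME_TYPES, true)) {
        $problems[] = "Unsupported type: $mimeType";
    }
    return $problems;
}

function validateImageFile(string $path): array
{
    if (!is_file($path) || !is_readable($path)) {
        return ['File not found or not readable'];
    }
    // mime_content_type() requires the fileinfo extension
    $mime = mime_content_type($path);
    return validateImageMeta((int) filesize($path), $mime === false ? 'unknown' : $mime);
}

print_r(validateImageMeta(6000000, 'image/gif')); // two problems: size and type
```

Splitting the metadata check from the file access keeps the rules easy to unit test without real image files.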
Key Features
- Specialized Numbers: Extracts region-specific IDs like CNP (Romania), CPF (Brazil), and NIN (Nigeria).
- Multi-line Addresses: Intelligently reconstructs full addresses from fragmented lines on ID cards.
- Date Normalization: Returns all dates (Birth, Issue, Expiry) in a standardized YYYY-MM-DD format.
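Because dates arrive in one fixed YYYY-MM-DD format, downstream checks such as document expiry need no per-country parsing. The sketch below shows one way to consume them, assuming a decoded `data` array with a `date_of_expiry` key as in the sample response; `parseIsoDate` and `isExpired` are illustrative helper names, not part of StructOCR.

```php
<?php
// Sketch: consuming normalized YYYY-MM-DD dates from a decoded response.
// parseIsoDate() and isExpired() are hypothetical helpers.

function parseIsoDate(?string $value): ?DateTimeImmutable
{
    if ($value === null) {
        return null;
    }
    // "!" resets the time fields to zero so comparisons are date-only
    $date = DateTimeImmutable::createFromFormat('!Y-m-d', $value, new DateTimeZone('UTC'));
    // Round-trip check rejects malformed input and overflow such as 2023-02-30
    if ($date === false || $date->format('Y-m-d') !== $value) {
        return null;
    }
    return $date;
}

/** Returns true/false, or null when the expiry date is missing or unparseable. */
function isExpired(array $data, DateTimeImmutable $today): ?bool
{
    $expiry = parseIsoDate($data['date_of_expiry'] ?? null);
    return $expiry === null ? null : $expiry < $today;
}

$data = ['date_of_birth' => '1992-03-19', 'date_of_expiry' => '2030-05-10'];
$today = new DateTimeImmutable('2025-01-01', new DateTimeZone('UTC'));
var_dump(isExpired($data, $today)); // bool(false)
```

Returning null for a missing or invalid date, rather than false, lets the caller distinguish "still valid" from "could not be determined".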
Sample JSON Output
StructOCR returns a normalized JSON object, regardless of the input image angle or quality. Field names are consistent across all supported document types.
{
"success": true,
"data": {
"type": "national_id",
"country_code": "ROU",
"nationality": "ROMANA",
"document_number": "123456",
"card_series": "KS",
"personal_number": "1920319123456",
"surname": "POPESCU",
"given_names": "ANDREI",
"sex": "M",
"date_of_birth": "1992-03-19",
"place_of_birth": "Jud. CS Mun. Reșița",
"address": "Jud. CS Orș. Bocșa Str. Nucilor Nr. 15",
"date_of_issue": "2020-05-10",
"date_of_expiry": "2030-05-10",
"issuing_authority": "SPCLEP Bocșa"
}
}
Frequently Asked Questions
How does StructOCR compare to AWS Textract or Google Vision?
General-purpose OCR services like Textract or Vision return raw lines of text and their coordinates. You are still responsible for parsing, validating, and structuring that data. StructOCR is a specialized, vertical solution that performs these steps for you, returning a clean JSON object with labeled fields like `surname` and `date_of_birth` directly.
Do you store the uploaded images?
No. Images are processed entirely in-memory and are permanently deleted immediately after the extraction request is completed. We do not persist sensitive customer data on our servers.
How do you handle blurry or low-quality images?
Our API pipeline includes an automatic image enhancement engine that applies denoising, deblurring, and contrast correction before analysis. This significantly improves extraction accuracy on sub-optimal images captured by mobile devices.