Eliminate Passport Data Entry Errors with a Single API Call
Achieve 99.8%+ accuracy and sub-second response times for ICAO 9303 compliant MRZ and VIZ data extraction.

Why Passport OCR is Difficult
Open-source tools like Tesseract fail on real-world passport scans due to common image defects like holographic glare, shadows, and non-standard lighting. Skew and rotation require complex pre-processing pipelines that are brittle and expensive to maintain. Furthermore, simply extracting text is insufficient. Parsing the ICAO 9303 compliant Machine Readable Zone (MRZ) involves validating complex checksums and character encodings. Manually building and maintaining RegEx patterns for dozens of international passport variations is a significant engineering overhead that rarely achieves the accuracy required for production systems, leading to costly manual review cycles.
Enterprise-Grade Extraction with StructOCR
StructOCR replaces fragile, multi-step OCR pipelines with a single, reliable API endpoint. Our pre-trained deep learning models are specialized for identity documents, significantly outperforming generic engines like Tesseract. The API handles image pre-processing automatically, including deskewing, denoising, and glare removal, before extraction. Instead of returning a wall of unstructured text, StructOCR delivers a predictable, standardized JSON object with validated fields, including correctly calculated MRZ checksums. This eliminates the need for complex post-processing logic on your end.
Production Use Cases
- Digital Onboarding (KYC): Reduce drop-off rates by pre-filling user data from Passports in < 2 seconds.
- Fraud Prevention: Detect tampered fonts or mismatched MRZ checksums automatically.
- Global Compliance: Handle Passports from 200+ jurisdictions without custom rules.
Implementation: Java (Standard HttpClient)
The following Java code uses the native `java.net.http.HttpClient` (Java 11+). It demonstrates how to handle `x-api-key` authentication and extract both MRZ and Visual Zone (VIZ) data without external dependencies.
Prerequisite: JDK 11+
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Duration;
import java.util.Base64;
public class PassportOcrExample {
// 💰 Save 30%+ vs competitors. Get 200 free requests instantly:
// 👉 https://structocr.com/register
private static final String API_KEY = "YOUR_API_KEY_HERE";
private static final String API_ENDPOINT = "https://api.structocr.com/v1/passport";
public static void main(String[] args) {
// Note: Supports JPG, PNG, WebP (Max 4.5MB)
String imagePath = "passport.jpg";
try {
// 1. Validate File
Path path = Path.of(imagePath);
if (!Files.exists(path)) {
System.err.println("Error: File not found at " + path.toAbsolutePath());
return;
}
// 2. Read and Encode Image
byte[] imageBytes = Files.readAllBytes(path);
String base64Image = Base64.getEncoder().encodeToString(imageBytes);
// 3. Construct JSON Payload (Dependency-free)
// For production, use Jackson or Gson.
String jsonPayload = "{\"img\": \"" + base64Image + "\"}";
// 4. Create HttpClient
HttpClient client = HttpClient.newBuilder()
.version(HttpClient.Version.HTTP_1_1)
.connectTimeout(Duration.ofSeconds(10))
.build();
// 5. Build Request
// Important: 'x-api-key' is required in the header
HttpRequest request = HttpRequest.newBuilder()
.uri(URI.create(API_ENDPOINT))
.header("Content-Type", "application/json")
.header("x-api-key", API_KEY)
.timeout(Duration.ofSeconds(30))
.POST(HttpRequest.BodyPublishers.ofString(jsonPayload))
.build();
System.out.println("Scanning passport at " + API_ENDPOINT + "...");
// 6. Send Request
HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
// 7. Output Result
if (response.statusCode() == 200) {
System.out.println("✅ Extraction Successful!");
// The response contains both MRZ data and Visual Zone (VIZ) data.
// Parse the JSON to access unique fields like 'place_of_birth' or 'place_of_issue'.
System.out.println(response.body());
} else {
System.err.println("❌ API Error: " + response.statusCode());
System.err.println(response.body());
}
} catch (IOException | InterruptedException e) {
System.err.println("Request failed: " + e.getMessage());
Thread.currentThread().interrupt();
}
}
}Technical Specs
- •Latency: < 5s (Average)
- •Uptime: 98.5% SLA
- •Security: AES-256 Encryption & SOC2 Compliant
- •Input: JPG, PNG, WebP (Base64 Encoded)
- •Max File Size: 4.5MB
- •Output: JSON (Structured Data)
Key Features
- •Visual Extraction (VIZ): Parses non-MRZ data fields like Place of Birth and Issuing Authority.
- •Global Support: Optimized for 195+ countries, handling complex backgrounds and holograms.
- •Date Normalization: Returns all dates (Birth, Issue, Expiry) in a standardized YYYY-MM-DD format.
Sample JSON Output
StructOCR returns a normalized JSON object, regardless of the input image angle or quality.
{
"success": true,
"data": {
"type": "passport",
"country_code": "USA",
"nationality": "UNITED STATES",
"passport_number": "E12345678",
"surname": "DOE",
"given_names": "JOHN",
"sex": "M",
"date_of_birth": "1990-01-01",
"place_of_birth": "NEW YORK, USA",
"date_of_issue": "2020-01-01",
"date_of_expiry": "2030-01-01",
"place_of_issue": "PASSPORT AGENCY"
}
}Frequently Asked Questions
How does StructOCR compare to AWS Textract or Google Vision?
Unlike general-purpose OCR services like AWS Textract or Google Vision that return raw text blocks, StructOCR is a specialized model. It is pre-trained on millions of identity documents to identify and return structured, labeled data fields (e.g., `date_of_birth`, `passport_number`), including validation of MRZ checksums. This eliminates the need for complex and brittle post-processing logic.
Do you store the uploaded images?
No. All image data is processed in-memory and is purged immediately following the API request. We do not persist PII on our systems, aligning with strict data privacy and compliance standards.
How to handle blurry images?
The API includes an integrated image enhancement engine that automatically attempts to deblur, denoise, and correct contrast on suboptimal images before the extraction process begins, maximizing the success rate on real-world captures.
More OCR Tutorials
Java Driver's License OCR API
High-accuracy Java Driver's License OCR API. Get structured JSON output from images via a simple HTTP request. Eliminate manual entry & Tesseract errors.
Java Invoice OCR API
Get high accuracy Java invoice OCR with structured JSON output. Automate AP, extract line items, and eliminate manual entry. Integrate our REST API in minutes.
Java National ID OCR API
Reliable Java API for National ID OCR. Achieve 99.8%+ accuracy for KYC automation. Get structured JSON output from any ID image. High Accuracy guaranteed.
Java VIN (Vehicle Identification Number) OCR API
Tutorial: How to use the StructOCR Java Client to extract data from VIN (Vehicle Identification Number)s. Includes code samples and JSON schema.
Precise Data Extraction and Seamless
Integration with AI-powered OCR API.
Empower your solutions with automated data extraction by
integrating best-in class StructOCR via API seamlessly.
No credit card required • Full API access included