Direct HTTP API for National ID Data Extraction in Java
Achieve 99.8%+ data accuracy and sub-1500ms latency via a Hybrid Vision AI & MRZ Validation engine.

Why National ID OCR is Difficult
Generic OCR engines like Tesseract fail on real-world ID documents due to their variability. The core challenge is not just character recognition, but contextual understanding. Issues include inconsistent lighting causing glare and shadows, variable skew and rotation from mobile captures, and laminated surfaces that distort text. Furthermore, extracting structured data requires parsing complex layouts that differ by country and document version. This leads to brittle, high-maintenance RegEx patterns. Manually parsing and validating Machine-Readable Zone (MRZ) check digits is an additional, error-prone step that open-source tools do not handle out-of-the-box, increasing engineering overhead and reducing data reliability.
Enterprise-Grade Extraction with StructOCR
StructOCR bypasses the limitations of generic OCR through a Hybrid Vision AI & MRZ Validation architecture. Our API first runs an automatic image pre-processing pipeline, which includes perspective correction (deskewing), glare removal, and denoising to normalize the input. The cleaned image is then passed to models trained specifically on millions of identity documents, enabling them to locate and extract specific fields like 'Date of Birth' or 'Document Number' with high precision. Field-level extraction also helps with long identifiers, such as a 14-digit national ID number, where a single misread character invalidates the whole value. Unlike Tesseract, which returns an unstructured block of text, StructOCR delivers a standardized, predictable JSON output with built-in validation for every request, eliminating the need for client-side parsing and maintenance.
Production Use Cases
- Digital Onboarding (KYC): Reduce drop-off rates by pre-filling user data from National IDs in < 2 seconds.
- Fraud Prevention: Detect tampered fonts or mismatched MRZ checksums automatically.
- Global Compliance: Handle National IDs from 200+ jurisdictions without custom rules.
Implementation: Java (Standard HttpClient)
The following code uses the native `java.net.http.HttpClient` (Java 11+). It handles the `x-api-key` authentication and sends the Base64-encoded image without requiring third-party libraries.
Prerequisite: JDK 11+
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Duration;
import java.util.Base64;

public class NationalIdOcrExample {
    // Get an API key at https://structocr.com/register
    private static final String API_KEY = "YOUR_API_KEY_HERE";
    private static final String API_ENDPOINT = "https://api.structocr.com/v1/national-id";

    public static void main(String[] args) {
        // Note: Supports JPG, PNG, WebP (Max 4.5MB)
        String imagePath = "id_card.jpg";
        try {
            // 1. Validate File
            Path path = Path.of(imagePath);
            if (!Files.exists(path)) {
                System.err.println("Error: File not found at " + path.toAbsolutePath());
                return;
            }

            // 2. Read and Encode Image
            byte[] imageBytes = Files.readAllBytes(path);
            String base64Image = Base64.getEncoder().encodeToString(imageBytes);

            // 3. Construct JSON Payload (Dependency-free)
            // Plain concatenation is safe here because Base64 output contains
            // no characters that need JSON escaping.
            // For production, use Jackson or Gson.
            String jsonPayload = "{\"img\": \"" + base64Image + "\"}";

            // 4. Create HttpClient
            HttpClient client = HttpClient.newBuilder()
                    .version(HttpClient.Version.HTTP_1_1)
                    .connectTimeout(Duration.ofSeconds(10))
                    .build();

            // 5. Build Request
            // Important: 'x-api-key' is required in the header
            HttpRequest request = HttpRequest.newBuilder()
                    .uri(URI.create(API_ENDPOINT))
                    .header("Content-Type", "application/json")
                    .header("x-api-key", API_KEY)
                    .timeout(Duration.ofSeconds(30))
                    .POST(HttpRequest.BodyPublishers.ofString(jsonPayload))
                    .build();
            System.out.println("Scanning ID card at " + API_ENDPOINT + "...");

            // 6. Send Request
            HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());

            // 7. Output Result
            if (response.statusCode() == 200) {
                System.out.println("✅ Extraction Successful!");
                // Parse the JSON response here to access specific fields (e.g., 'personal_number')
                // Note: Raw MRZ data (if present) is located inside the 'additional_fields' object.
                System.out.println(response.body());
            } else {
                System.err.println("❌ API Error: " + response.statusCode());
                System.err.println(response.body());
            }
        } catch (IOException e) {
            // Network or file I/O failure
            System.err.println("Request failed: " + e.getMessage());
        } catch (InterruptedException e) {
            // Restore the interrupt flag only when the thread was actually interrupted
            Thread.currentThread().interrupt();
            System.err.println("Request interrupted: " + e.getMessage());
        }
    }
}
Technical Specs
- Latency: < 5s (Average)
- Uptime: 98.5% SLA
- Security: AES-256 Encryption & SOC2 Compliant
- Input: JPG, PNG, WebP (Base64 Encoded)
- Max File Size: 4.5MB
- Output: JSON (Structured Data)
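The input limits listed here can be checked on the client before any bytes are uploaded. The following is a minimal sketch and not part of StructOCR: the class name `ImagePreflight` and its methods are hypothetical, and it assumes that "4.5MB" means 4.5 × 1024 × 1024 bytes measured on the raw file (before Base64 encoding), which the specs do not state.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Locale;
import java.util.Optional;
import java.util.Set;

// Hypothetical helper for illustration; not part of any StructOCR SDK.
class ImagePreflight {
    // Assumption: the 4.5MB limit applies to the raw file size in binary megabytes.
    static final long MAX_BYTES = (long) (4.5 * 1024 * 1024);
    static final Set<String> ALLOWED_EXTENSIONS = Set.of("jpg", "jpeg", "png", "webp");

    /** Returns a description of the problem, or empty if name and size look acceptable. */
    static Optional<String> problem(String fileName, long sizeBytes) {
        String lower = fileName.toLowerCase(Locale.ROOT);
        int dot = lower.lastIndexOf('.');
        String ext = dot < 0 ? "" : lower.substring(dot + 1);
        if (!ALLOWED_EXTENSIONS.contains(ext)) {
            return Optional.of("Unsupported file type: " + fileName);
        }
        if (sizeBytes > MAX_BYTES) {
            return Optional.of("File is " + sizeBytes + " bytes; limit is " + MAX_BYTES);
        }
        return Optional.empty();
    }

    /** Convenience overload that reads the name and size from disk. */
    static Optional<String> problem(Path file) throws IOException {
        return problem(file.getFileName().toString(), Files.size(file));
    }

    public static void main(String[] args) {
        System.out.println(problem("id_card.jpg", 1_200_000L).isEmpty());   // true
        System.out.println(problem("id_card.gif", 1_200_000L).isPresent()); // true
        System.out.println(problem("scan.png", 6_000_000L).isPresent());    // true
    }
}
```

This check only looks at the file extension, not the actual image bytes, so the API may still reject a mislabeled file. Its purpose is to fail fast on obvious mistakes without spending an API call.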
Key Features
- Hybrid VIZ + MRZ AI: Cross-validates data read from the visual zone against the MRZ check digits (TD1/TD2) to guard against hallucinated values.
- Specialized Numbers: Extracts region-specific IDs like CNP (Romania), CPF (Brazil), and NIN (Nigeria).
- Multi-line Addresses: Intelligently reconstructs full addresses from fragmented lines on ID cards.
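For readers who want to see what an MRZ check involves, here is a minimal sketch of the check-digit calculation defined in ICAO Doc 9303: each character is mapped to a number (digits keep their value, A to Z map to 10 to 35, the filler `<` is 0), multiplied by the repeating weights 7, 3, 1, and the sum is taken modulo 10. The class name `MrzCheckDigit` is hypothetical, and this illustrates the standard only; it makes no claim about how StructOCR implements validation internally.

```java
// Illustrative implementation of the ICAO 9303 check digit; names are hypothetical.
class MrzCheckDigit {
    private static final int[] WEIGHTS = {7, 3, 1};

    /** Maps one MRZ character to its numeric value. */
    static int charValue(char c) {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'A' && c <= 'Z') return c - 'A' + 10;
        if (c == '<') return 0;
        throw new IllegalArgumentException("Invalid MRZ character: " + c);
    }

    /** Computes the check digit (0-9) for an MRZ field such as a document number or date. */
    static int compute(String field) {
        int sum = 0;
        for (int i = 0; i < field.length(); i++) {
            sum += charValue(field.charAt(i)) * WEIGHTS[i % 3];
        }
        return sum % 10;
    }

    /** Returns true when the printed check digit agrees with the computed one. */
    static boolean matches(String field, char checkDigit) {
        return checkDigit >= '0' && checkDigit <= '9'
                && compute(field) == checkDigit - '0';
    }

    public static void main(String[] args) {
        // Values taken from the specimen document in ICAO Doc 9303.
        System.out.println(compute("L898902C3"));   // prints 6
        System.out.println(matches("740812", '2')); // prints true
        System.out.println(matches("740812", '5')); // prints false
    }
}
```

A field whose printed check digit does not match the computed one was either misread or altered, which is why comparing MRZ fields with the visual zone is a useful consistency signal.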
Sample JSON Output
StructOCR returns a normalized JSON object containing both Visual Zone (VIZ) extraction and raw Machine-Readable Zone (MRZ) lines.
{
  "success": true,
  "data": {
    "type": "national_id",
    "country_code": "ROU",
    "nationality": "ROMANA",
    "document_number": "123456",
    "card_series": "KS",
    "personal_number": "1920319123456",
    "surname": "POPESCU",
    "given_names": "ANDREI",
    "sex": "M",
    "date_of_birth": "1992-03-19",
    "place_of_birth": "Jud. CS Mun. Reșița",
    "address": "Jud. CS Orș. Bocșa Str. Nucilor Nr. 15",
    "date_of_issue": "2020-05-10",
    "date_of_expiry": "2030-05-10",
    "issuing_authority": "SPCLEP Bocșa",
    "additional_fields": {
      "phone_number": null,
      "tramite_number": null,
      "ejemplar": null,
      "mrz_line_1": "IDROU123456<0<<<<<<<<<<<<<<<<",
      "mrz_line_2": "9203195M3005108ROU19203191234562",
      "mrz_line_3": null
    }
  }
}
Frequently Asked Questions
Do you support Machine Readable Zones (MRZ) on ID cards?
Yes! Our engine natively supports ICAO 9303 standard MRZ formats (TD1/TD2) found on many global ID cards. Our Hybrid architecture extracts both the raw MRZ lines and cross-validates them against the Visual Zone (VIZ) for maximum accuracy.
How does StructOCR compare to AWS Textract or Google Vision?
General-purpose APIs like Textract or Vision return raw, unstructured lines of text and coordinates, leaving the complex task of parsing and validation to your engineers. StructOCR is a specialized API: it returns a structured JSON object with clearly defined fields such as 'surname' and 'date_of_birth', which are already validated (e.g., against MRZ checksums). This removes most post-processing and reduces development time.
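To illustrate what reading those fields can look like, here is a dependency-free sketch that pulls string values out of a response body shaped like the sample output on this page. The class `ResponseFields` and its method are hypothetical. The regex approach is an assumption made to keep the example free of libraries: it ignores nesting, does not unescape JSON escape sequences, and only works when key names are unique in the body. For production, a real JSON parser such as Jackson or Gson is the safer choice.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; a JSON library is preferable in production.
class ResponseFields {
    /**
     * Returns the first string value stored under the given key,
     * or empty when the key is missing or its value is null / not a string.
     */
    static Optional<String> stringField(String json, String key) {
        Pattern p = Pattern.compile(
                "\"" + Pattern.quote(key) + "\"\\s*:\\s*\"((?:[^\"\\\\]|\\\\.)*)\"");
        Matcher m = p.matcher(json);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        // Trimmed copy of the sample response shown on this page.
        String body = "{\"success\": true, \"data\": {"
                + "\"surname\": \"POPESCU\", "
                + "\"date_of_birth\": \"1992-03-19\", "
                + "\"additional_fields\": {\"phone_number\": null}}}";

        System.out.println(stringField(body, "surname").orElse("(missing)"));       // prints POPESCU
        System.out.println(stringField(body, "date_of_birth").orElse("(missing)")); // prints 1992-03-19
        System.out.println(stringField(body, "phone_number").orElse("(missing)"));  // prints (missing)
    }
}
```

Returning `Optional` makes the null-valued entries in `additional_fields` explicit at the call site instead of surfacing later as a `NullPointerException`.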
Do you store the uploaded images?
No. All image data is processed in-memory and is purged immediately after the API request is completed. We do not persist any customer-provided images on our servers.
How does the API handle blurry images?
Our API includes an internal image enhancement engine that automatically attempts to deblur and sharpen images before processing. For best results, we recommend a minimum resolution of 300 DPI, but the system is robust against common mobile camera capture issues.