Direct HTTP API for National ID Data Extraction in Java
Achieve 99.8%+ data accuracy with sub-1500ms latency via a single API call.

Why National ID OCR is Difficult
Generic OCR engines like Tesseract often fail on real-world ID documents because those documents vary so widely. The core challenge is not just character recognition but contextual understanding. Common problems include inconsistent lighting that causes glare and shadows, skew and rotation from mobile captures, and laminated surfaces that distort text. Extracting structured data also means parsing complex layouts that differ by country and document version, which tends to produce brittle, high-maintenance regular expressions. Manually parsing and validating Machine-Readable Zone (MRZ) check digits is an additional, error-prone step that open-source OCR tools do not handle out of the box, which increases engineering overhead and reduces data reliability.
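To make the MRZ step concrete, here is a minimal sketch of the check-digit calculation defined in ICAO Doc 9303 (repeating weights 7, 3, 1; digits count as themselves, A-Z as 10-35, and the filler `<` as 0; the result is the weighted sum modulo 10). The class and method names are illustrative and not part of any StructOCR client, and the sketch only applies to ID cards that actually carry an MRZ.

```java
final class MrzCheckDigit {

    // ICAO Doc 9303 weights, repeated across the field
    private static final int[] WEIGHTS = {7, 3, 1};

    // Maps one MRZ character to its numeric value
    static int charValue(char c) {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'A' && c <= 'Z') return c - 'A' + 10;
        if (c == '<') return 0;
        throw new IllegalArgumentException("Invalid MRZ character: " + c);
    }

    // Computes the check digit (0-9) for a single MRZ field
    static int compute(String field) {
        int sum = 0;
        for (int i = 0; i < field.length(); i++) {
            sum += charValue(field.charAt(i)) * WEIGHTS[i % 3];
        }
        return sum % 10;
    }

    // Compares a field against the check digit printed next to it
    static boolean isValid(String field, char checkDigit) {
        return checkDigit >= '0' && checkDigit <= '9'
                && compute(field) == checkDigit - '0';
    }

    public static void main(String[] args) {
        System.out.println(compute("520727"));          // prints 3
        System.out.println(isValid("L898902C3", '6'));  // prints true
    }
}
```

Even this small routine has to be applied per field (document number, date of birth, date of expiry) and again for the composite check digit, which is where hand-rolled parsers tend to go wrong.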
Enterprise-Grade Extraction with StructOCR
StructOCR bypasses the limitations of generic OCR with specialized, pre-trained deep learning models. The API first runs an automatic image pre-processing pipeline, which includes perspective correction (deskewing), glare removal, and denoising to normalize the input. The cleaned image is then passed to models trained specifically on millions of identity documents, enabling them to locate and extract specific fields like 'Date of Birth' or 'Document Number' with high precision. Unlike Tesseract, which returns an unstructured block of text, StructOCR delivers a standardized, predictable JSON output with built-in validation for every request, eliminating the need for client-side parsing and maintenance.
Production Use Cases
- Digital Onboarding (KYC): Reduce drop-off rates by pre-filling user data from National IDs in < 2 seconds.
- Fraud Prevention: Detect tampered fonts or mismatched MRZ checksums automatically.
- Global Compliance: Handle National IDs from 200+ jurisdictions without custom rules.
Implementation: Java (Standard HttpClient)
The following code uses the native `java.net.http.HttpClient` (Java 11+). It handles the `x-api-key` authentication and sends the Base64-encoded image without requiring third-party libraries.
Prerequisite: JDK 11+
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Duration;
import java.util.Base64;

public class NationalIdOcrExample {

    // Get an API key (includes 200 free requests): https://structocr.com/register
    private static final String API_KEY = "YOUR_API_KEY_HERE";
    private static final String API_ENDPOINT = "https://api.structocr.com/v1/national-id";

    public static void main(String[] args) {
        // Note: Supports JPG, PNG, WebP (Max 4.5MB)
        String imagePath = "id_card.jpg";

        try {
            // 1. Validate File
            Path path = Path.of(imagePath);
            if (!Files.exists(path)) {
                System.err.println("Error: File not found at " + path.toAbsolutePath());
                return;
            }

            // 2. Read and Encode Image
            byte[] imageBytes = Files.readAllBytes(path);
            String base64Image = Base64.getEncoder().encodeToString(imageBytes);

            // 3. Construct JSON Payload (Dependency-free)
            // Base64 output contains no characters that need JSON escaping,
            // so plain concatenation is safe here. For production, use Jackson or Gson.
            String jsonPayload = "{\"img\": \"" + base64Image + "\"}";

            // 4. Create HttpClient
            HttpClient client = HttpClient.newBuilder()
                    .version(HttpClient.Version.HTTP_1_1)
                    .connectTimeout(Duration.ofSeconds(10))
                    .build();

            // 5. Build Request
            // Important: 'x-api-key' is required in the header
            HttpRequest request = HttpRequest.newBuilder()
                    .uri(URI.create(API_ENDPOINT))
                    .header("Content-Type", "application/json")
                    .header("x-api-key", API_KEY)
                    .timeout(Duration.ofSeconds(30))
                    .POST(HttpRequest.BodyPublishers.ofString(jsonPayload))
                    .build();

            System.out.println("Scanning ID card at " + API_ENDPOINT + "...");

            // 6. Send Request
            HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());

            // 7. Output Result
            if (response.statusCode() == 200) {
                System.out.println("✅ Extraction Successful!");
                // Parse the JSON response here to access specific fields like 'personal_number' or 'address'
                System.out.println(response.body());
            } else {
                System.err.println("❌ API Error: " + response.statusCode());
                System.err.println(response.body());
            }
        } catch (IOException e) {
            // Network or file I/O failure
            System.err.println("Request failed: " + e.getMessage());
        } catch (InterruptedException e) {
            // Restore the interrupt flag only when the thread was actually interrupted
            Thread.currentThread().interrupt();
            System.err.println("Request interrupted: " + e.getMessage());
        }
    }
}
Technical Specs
- Latency: < 5s (Average)
- Uptime: 98.5% SLA
- Security: AES-256 Encryption & SOC2 Compliant
- Input: JPG, PNG, WebP (Base64 Encoded)
- Max File Size: 4.5MB
- Output: JSON (Structured Data)
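The input limits above can be checked on the client before an image is uploaded, which avoids spending a request on a file the API would reject. The following is a minimal sketch: the `ImageInputCheck` class is hypothetical, the check relies on the file extension rather than the file contents, and it assumes "4.5MB" means 4.5 × 1024 × 1024 bytes, which the specs do not spell out.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Locale;
import java.util.Optional;
import java.util.Set;

final class ImageInputCheck {

    // Assumption: the 4.5MB limit is measured in binary megabytes (4,718,592 bytes)
    static final long MAX_BYTES = (long) (4.5 * 1024 * 1024);

    private static final Set<String> ALLOWED_EXTENSIONS = Set.of("jpg", "jpeg", "png", "webp");

    // Returns a description of the problem, or empty if the input looks acceptable
    static Optional<String> problem(String fileName, long sizeBytes) {
        int dot = fileName.lastIndexOf('.');
        String ext = dot < 0 ? "" : fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        if (!ALLOWED_EXTENSIONS.contains(ext)) {
            return Optional.of("Unsupported format: " + fileName);
        }
        if (sizeBytes <= 0) {
            return Optional.of("File is empty: " + fileName);
        }
        if (sizeBytes > MAX_BYTES) {
            return Optional.of("File is " + sizeBytes + " bytes, limit is " + MAX_BYTES);
        }
        return Optional.empty();
    }

    // Convenience overload that reads the name and size from disk
    static Optional<String> problem(Path path) throws IOException {
        if (!Files.isRegularFile(path)) {
            return Optional.of("Not a regular file: " + path.toAbsolutePath());
        }
        return problem(path.getFileName().toString(), Files.size(path));
    }
}
```

In the example program, a call such as `ImageInputCheck.problem(path)` could replace the plain `Files.exists` check, printing the returned message and exiting early when one is present.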
Key Features
- Specialized Numbers: Extracts region-specific IDs like CNP (Romania), CPF (Brazil), and NIN (Nigeria).
- Multi-line Addresses: Intelligently reconstructs full addresses from fragmented lines on ID cards.
- Date Normalization: Returns all dates (Birth, Issue, Expiry) in a standardized YYYY-MM-DD format.
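Because dates arrive in YYYY-MM-DD form, they can be handed straight to `java.time.LocalDate.parse`, which accepts that ISO format by default. The sketch below pulls `date_of_expiry` out of a response body and checks whether the document has expired. The `IdDates` class is hypothetical, the response shape is assumed to match the sample output in this article, and the regex-based field lookup is a dependency-free shortcut that does not handle escaped quotes; a JSON library such as Jackson or Gson is the safer choice in production.

```java
import java.time.LocalDate;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class IdDates {

    // Naive lookup of a string-valued JSON field by name (no escaped-quote support)
    static Optional<String> stringField(String json, String name) {
        Pattern p = Pattern.compile("\"" + Pattern.quote(name) + "\"\\s*:\\s*\"([^\"]*)\"");
        Matcher m = p.matcher(json);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    // Treats the document as valid through its expiry date, expired from the day after
    static boolean isExpired(String json, LocalDate today) {
        LocalDate expiry = stringField(json, "date_of_expiry")
                .map(LocalDate::parse)
                .orElseThrow(() -> new IllegalArgumentException("date_of_expiry missing"));
        return expiry.isBefore(today);
    }

    public static void main(String[] args) {
        String body = "{\"success\": true, \"data\": {"
                + "\"date_of_birth\": \"1992-03-19\", "
                + "\"date_of_expiry\": \"2030-05-10\"}}";
        System.out.println(stringField(body, "date_of_birth").orElse("n/a")); // prints 1992-03-19
        System.out.println(isExpired(body, LocalDate.of(2025, 1, 1)));        // prints false
    }
}
```

Passing `today` in as a parameter, rather than calling `LocalDate.now()` inside the method, keeps the check easy to unit test.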
Sample JSON Output
StructOCR returns a normalized JSON object, regardless of the input image angle or quality.
{
"success": true,
"data": {
"type": "national_id",
"country_code": "ROU",
"nationality": "ROMANA",
"document_number": "123456",
"card_series": "KS",
"personal_number": "1920319123456",
"surname": "POPESCU",
"given_names": "ANDREI",
"sex": "M",
"date_of_birth": "1992-03-19",
"place_of_birth": "Jud. CS Mun. Reșița",
"address": "Jud. CS Orș. Bocșa Str. Nucilor Nr. 15",
"date_of_issue": "2020-05-10",
"date_of_expiry": "2030-05-10",
"issuing_authority": "SPCLEP Bocșa"
}
}
Frequently Asked Questions
How does StructOCR compare to AWS Textract or Google Vision?
General-purpose APIs like Textract or Vision return raw, unstructured lines of text and coordinates, leaving the complex task of parsing and validation to your engineers. StructOCR is a specialized API: it returns a structured JSON object with clearly defined fields such as 'surname' and 'date_of_birth', which are already validated for correctness (e.g., MRZ checksums). This eliminates post-processing and reduces development time.
Do you store the uploaded images?
No. All image data is processed in-memory and is purged immediately after the API request is completed. We do not persist any customer-provided images on our servers.
How does the API handle blurry images?
Our API includes an internal image enhancement engine that automatically attempts to deblur and sharpen images before processing. For best results, we recommend a minimum resolution of 300 DPI, but the system is robust against common mobile camera capture issues.