Python SDK for Enterprise-Grade National ID OCR
Achieve 99.8% field-level accuracy and sub-second latency for identity verification and KYC automation.

Why National ID OCR is Difficult
Generic OCR engines such as Tesseract struggle with National IDs for several reasons. Laminate glare, shadows, and non-uniform lighting create artifacts that corrupt character recognition. User-submitted images are often skewed or rotated and need robust preprocessing. Parsing the extracted text is also brittle: it means maintaining complex regex patterns for dozens of ID layouts, which change over time. Hand-written logic to parse the Machine-Readable Zone (MRZ) and validate its check digits is error-prone and adds engineering overhead. Together, these problems lead to high error rates and unsustainable maintenance costs for in-house solutions.
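To give a sense of what hand-rolled MRZ validation involves, here is a minimal sketch of the check-digit calculation defined in ICAO Doc 9303 (repeating weights 7, 3, 1; letters map to 10 through 35; the filler `<` counts as 0). The function names are illustrative and are not part of the StructOCR SDK.

```python
# Sketch of the ICAO Doc 9303 MRZ check-digit rule.
# Function names are hypothetical helpers, not StructOCR APIs.

def mrz_char_value(ch: str) -> int:
    """Map one MRZ character to its numeric value."""
    if ch in "0123456789":
        return int(ch)
    if "A" <= ch <= "Z":
        return ord(ch) - ord("A") + 10  # A=10 ... Z=35
    if ch == "<":
        return 0  # filler character
    raise ValueError(f"invalid MRZ character: {ch!r}")


def mrz_check_digit(field: str) -> int:
    """Weighted sum with repeating weights 7, 3, 1, taken modulo 10."""
    weights = (7, 3, 1)
    total = sum(mrz_char_value(ch) * weights[i % 3] for i, ch in enumerate(field))
    return total % 10


def mrz_field_is_valid(field: str, check: str) -> bool:
    """Compare a field against the check digit printed next to it."""
    return check in tuple("0123456789") and mrz_check_digit(field) == int(check)


if __name__ == "__main__":
    # Document number from the ICAO 9303 specimen; its printed check digit is 6.
    print(mrz_check_digit("L898902C3"))
```

Even this small piece has to be repeated for each checked field and for the composite check digit, which is where in-house implementations tend to accumulate bugs.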
Enterprise-Grade Extraction with StructOCR
StructOCR addresses these challenges with specialized, pre-trained deep learning models. Our API handles the entire pipeline, from automatic image preprocessing (perspective correction, glare removal, and denoising) to data extraction. Unlike Tesseract, which returns unstructured text lines, StructOCR's models are trained specifically on identity documents to locate and identify semantic fields. The result is a standardized, validated JSON object that delivers production-ready data directly to your application, without post-processing or manual data validation.
Production Use Cases
- Digital Onboarding (KYC): Reduce drop-off rates by pre-filling user data from National IDs in < 2 seconds.
- Fraud Prevention: Detect tampered fonts or mismatched MRZ checksums automatically.
- Global Compliance: Handle National IDs from 200+ jurisdictions without custom rules.
Implementation: Python SDK
The official Python SDK abstracts the API complexity. It automatically parses region-specific fields like CNP (Romania), CPF (Brazil), or NIN (Nigeria) into a standardized structure.
Prerequisite: pip install structocr
from structocr import StructOCR

# 💰 Save 30%+ vs competitors. Get 200 free requests instantly:
# 👉 https://structocr.com/register

# Initialize with your API Key
client = StructOCR("YOUR_API_KEY_HERE")

def scan_national_id():
    # Note: Supports JPG, PNG, WebP (Max 4.5MB)
    image_path = "id_card.jpg"
    try:
        print(f"Scanning {image_path}...")
        # The SDK handles file upload and API communication
        result = client.scan_national_id(image_path)
        # Check success flag (SDK returns a dict matching the JSON response)
        if result.get('success'):
            data = result['data']
            print("✅ Extraction Successful!")
            # Basic Identity
            print(f"Region: {data.get('country_code')} (Series: {data.get('card_series')})")
            print(f"Name: {data.get('given_names')} {data.get('surname')}")
            print(f"ID Number: {data.get('document_number')}")
            # Critical Field: Personal Identity Number (CNP/CPF/NIN)
            print(f"Personal #: {data.get('personal_number')}")
            # Demographics
            print(f"DOB: {data.get('date_of_birth')} ({data.get('sex')})")
            print(f"Address: {data.get('address')}")
        else:
            print(f"❌ Extraction Failed: {result.get('error')}")
    except Exception as e:
        # Handle SDK or Network errors
        print(f"An error occurred: {e}")

if __name__ == "__main__":
    scan_national_id()

Technical Specs
- Latency: < 5s (Average)
- Uptime: 98.5% SLA
- Security: AES-256 Encryption & SOC2 Compliant
- Input: JPG, PNG, WebP (File Path)
- Max File Size: 4.5MB
- Output: JSON (Structured Data)
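The input limits above can be checked locally before an upload is attempted. The sketch below is one possible pre-flight check. The helper name `check_image` is hypothetical, and it assumes that "4.5MB" means 4.5 × 1024 × 1024 bytes and that `.jpeg` is accepted alongside `.jpg`; neither detail is confirmed by the specs.

```python
from pathlib import Path
from typing import List

# Assumptions: .jpeg is treated like .jpg, and 1 MB = 1024 * 1024 bytes.
ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp"}
MAX_BYTES = int(4.5 * 1024 * 1024)


def check_image(path: str) -> List[str]:
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    p = Path(path)
    if p.suffix.lower() not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported extension: {p.suffix or '(none)'}")
    if not p.is_file():
        problems.append(f"file not found: {path}")
    elif p.stat().st_size > MAX_BYTES:
        problems.append(f"file too large: {p.stat().st_size} bytes")
    return problems


if __name__ == "__main__":
    issues = check_image("id_card.jpg")
    print(issues or "ready to upload")
```

This only inspects the file name and size. It does not verify that the bytes are a valid image.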
Key Features
- Specialized Numbers: Extracts region-specific IDs like CNP (Romania), CPF (Brazil), and NIN (Nigeria).
- Multi-line Addresses: Intelligently reconstructs full addresses from fragmented lines on ID cards.
- Date Normalization: Returns all dates (Birth, Issue, Expiry) in a standardized YYYY-MM-DD format.
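Because dates arrive as YYYY-MM-DD strings, they can be parsed with the standard library alone. The sketch below shows one way to derive an expiry flag and an age from the `data` dictionary. The field names `date_of_birth` and `date_of_expiry` follow the sample JSON output on this page, the helper names are illustrative, and treating a document as still valid on its expiry date is an assumption.

```python
from datetime import date


def is_expired(data: dict, today: date) -> bool:
    """True once the expiry date has passed (the expiry day itself counts as valid)."""
    expiry = date.fromisoformat(data["date_of_expiry"])
    return expiry < today


def age_on(data: dict, today: date) -> int:
    """Age in whole years on the given day."""
    dob = date.fromisoformat(data["date_of_birth"])
    # Subtract one year if this year's birthday has not happened yet.
    not_yet = (today.month, today.day) < (dob.month, dob.day)
    return today.year - dob.year - int(not_yet)


if __name__ == "__main__":
    sample = {"date_of_birth": "1992-03-19", "date_of_expiry": "2030-05-10"}
    print(is_expired(sample, date.today()), age_on(sample, date.today()))
```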
Sample JSON Output
StructOCR returns a normalized JSON object, regardless of the input image angle or quality.
{
  "success": true,
  "data": {
    "type": "national_id",
    "country_code": "ROU",
    "nationality": "ROMANA",
    "document_number": "123456",
    "card_series": "KS",
    "personal_number": "1920319123456",
    "surname": "POPESCU",
    "given_names": "ANDREI",
    "sex": "M",
    "date_of_birth": "1992-03-19",
    "place_of_birth": "Jud. CS Mun. Reșița",
    "address": "Jud. CS Orș. Bocșa Str. Nucilor Nr. 15",
    "date_of_issue": "2020-05-10",
    "date_of_expiry": "2030-05-10",
    "issuing_authority": "SPCLEP Bocșa"
  }
}

Frequently Asked Questions
How does StructOCR compare to AWS Textract or Google Vision?
General-purpose OCR services like AWS Textract and Google Vision return raw, unstructured text dumps or simple key-value pairs. You are still responsible for parsing, validating, and structuring that data. StructOCR is a specialized model trained exclusively on identity documents. It returns a fully parsed, validated JSON object with predefined fields like `date_of_birth` and `document_number`, eliminating the need for any post-processing logic.
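Since the response uses predefined field names, it can be mapped onto a typed record in a few lines. The sketch below is an illustration, not part of the SDK: the `NationalIdRecord` class is hypothetical, it covers only a subset of the fields shown in the sample JSON output, and it assumes any field may be absent for a given document.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass(frozen=True)
class NationalIdRecord:
    """Hypothetical typed view over a subset of the response fields."""
    country_code: Optional[str]
    document_number: Optional[str]
    personal_number: Optional[str]
    surname: Optional[str]
    given_names: Optional[str]
    date_of_birth: Optional[str]

    @classmethod
    def from_response(cls, response: dict) -> "NationalIdRecord":
        # Fail loudly when the success flag is missing or false.
        if not response.get("success"):
            raise ValueError(f"extraction failed: {response.get('error')}")
        data = response.get("data") or {}
        # Missing fields become None instead of raising KeyError.
        return cls(**{f.name: data.get(f.name) for f in fields(cls)})


if __name__ == "__main__":
    response = {
        "success": True,
        "data": {"country_code": "ROU", "surname": "POPESCU", "given_names": "ANDREI"},
    }
    print(NationalIdRecord.from_response(response))
```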
Do you store the uploaded images?
We do not store customer images. All uploaded files are processed in-memory and are permanently deleted immediately after the OCR extraction process is complete. Your data privacy is paramount.
How does StructOCR handle blurry images?
Our API includes a powerful, automatic image enhancement engine. Before extraction, it performs denoising, deblurring, and contrast correction to maximize the accuracy of results from low-quality or blurry source images.