Python SDK for Enterprise-Grade National ID OCR

Achieve 99.8% field-level accuracy and sub-second latency via a Hybrid Vision AI & MRZ Validation engine.

Steve Harrington | Solutions Consultant • Updated 2026-05-26

Figure 1: StructOCR converts raw National ID images into validated JSON data. The diagram shows a user uploading a photo of a National ID card to the StructOCR API, which performs image preprocessing, data extraction, and validation, then outputs a structured JSON object with key-value pairs such as name, DOB, and document number.

Why National ID OCR is Difficult

Generic OCR engines like Tesseract often fail on National IDs. Laminate glare, shadows, and uneven lighting create artifacts that corrupt character recognition, and user-submitted images are frequently skewed or rotated, so they need robust preprocessing. Parsing the extracted text is also brittle: it means maintaining complex regex patterns for dozens of ID layouts that change over time. Hand-written logic to parse the Machine-Readable Zone (MRZ) and validate its check digits is error-prone and adds significant engineering overhead. Together, these problems lead to high error rates and unsustainable maintenance costs for in-house solutions.
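
To make the MRZ check-digit step concrete, here is a minimal sketch of the ICAO 9303 check-digit calculation: each character is mapped to a number (digits as themselves, A-Z as 10-35, the filler `<` as 0), multiplied by the repeating weights 7, 3, 1, and the sum is taken modulo 10. The function names are illustrative and are not part of the StructOCR SDK.

```python
def mrz_char_value(ch: str) -> int:
    """Map one MRZ character to its numeric value (ICAO 9303)."""
    if "0" <= ch <= "9":
        return int(ch)
    if "A" <= ch <= "Z":
        return ord(ch) - ord("A") + 10
    if ch == "<":
        return 0
    raise ValueError(f"invalid MRZ character: {ch!r}")


def mrz_check_digit(field: str) -> int:
    """Weighted sum with repeating weights 7, 3, 1, taken modulo 10."""
    weights = (7, 3, 1)
    total = sum(mrz_char_value(ch) * weights[i % 3] for i, ch in enumerate(field))
    return total % 10


def field_is_valid(field: str, check_char: str) -> bool:
    """True when check_char is a single digit equal to the computed check digit."""
    if len(check_char) != 1 or not ("0" <= check_char <= "9"):
        return False
    return mrz_check_digit(field) == int(check_char)


if __name__ == "__main__":
    # Date field "520727" (27 July 1952) carries check digit 3.
    print(mrz_check_digit("520727"))
    print(field_is_valid("L898902C3", "6"))
```

Even this small routine has to be applied to the right character ranges for each MRZ format, which is where hand-rolled parsers tend to go wrong.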

Enterprise-Grade Extraction with StructOCR

StructOCR avoids these challenges by using specialized, pre-trained deep learning models. Our ID parsing API handles the entire pipeline, from automatic image preprocessing (perspective correction, glare removal, and denoising) to data extraction. Unlike Tesseract, which returns unstructured text lines, StructOCR's models are trained specifically on identity documents to locate and identify semantic fields, so the output has a consistent structure. The result is a standardized, validated JSON object that needs no post-processing or manual data validation and delivers production-ready data directly to your application.

Production Use Cases

Digital Onboarding (KYC): Reduce drop-off rates by pre-filling user data from National IDs in < 2 seconds.
Fraud Prevention: Detect tampered fonts or mismatched MRZ checksums automatically.
Global Compliance: Handle National IDs from 200+ jurisdictions without custom rules.

Implementation: Python SDK

The official Python SDK abstracts the API complexity. It automatically parses region-specific fields like CNP (Romania), CPF (Brazil), or NIN (Nigeria) into a standardized structure.

Prerequisite: pip install structocr

CODE EXAMPLE

from structocr import StructOCR

# Initialize with your API Key
client = StructOCR("YOUR_API_KEY_HERE")

def scan_national_id():
    # Note: Supports JPG, PNG, WebP (Max 4.5MB)
    image_path = "id_card.jpg"

    try:
        print(f"Scanning {image_path}...")
        
        # The SDK handles file upload and API communication
        result = client.scan_national_id(image_path)

        # Check success flag (SDK returns a dict matching the JSON response)
        if result.get('success'):
            data = result['data']
            print("✅ Extraction Successful!")
            
            # Basic Identity
            print(f"Region:     {data.get('country_code')} (Series: {data.get('card_series')})")
            print(f"Name:       {data.get('given_names')} {data.get('surname')}")
            print(f"ID Number:  {data.get('document_number')}")
            
            # Critical Field: Personal Identity Number (CNP/CPF/NIN)
            print(f"Personal #: {data.get('personal_number')}")
            
            # Demographics
            print(f"DOB:        {data.get('date_of_birth')} ({data.get('sex')})")
            print(f"Address:    {data.get('address')}")
            
            # Extract MRZ Data if available
            additional = data.get('additional_fields', {})
            if additional.get('mrz_line_1'):
                print(f"MRZ Line 1: {additional.get('mrz_line_1')}")
                if additional.get('mrz_line_2'):
                    print(f"MRZ Line 2: {additional.get('mrz_line_2')}")
        else:
            print(f"❌ Extraction Failed: {result.get('error')}")

    except Exception as e:
        # Handle SDK or Network errors
        print(f"An error occurred: {e}")

if __name__ == "__main__":
    scan_national_id()
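
If you prefer typed objects over raw dictionaries, the response can be wrapped in a small record type. The sketch below assumes the response shape shown in the sample JSON on this page (`success`, `data`, ISO-formatted dates). `NationalIdRecord` and `to_record` are hypothetical helper names, not part of the SDK.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class NationalIdRecord:
    """Hypothetical typed view over a subset of the response fields."""
    country_code: Optional[str]
    document_number: Optional[str]
    personal_number: Optional[str]
    full_name: str
    date_of_birth: Optional[date]


def _parse_iso_date(value: Optional[str]) -> Optional[date]:
    """Parse 'YYYY-MM-DD'; return None for missing or malformed values."""
    if not value:
        return None
    try:
        return date.fromisoformat(value)
    except ValueError:
        return None


def to_record(result: dict) -> NationalIdRecord:
    """Convert a response dict into a record, raising if extraction failed."""
    if not result.get("success"):
        raise ValueError(f"extraction failed: {result.get('error')}")
    data = result.get("data") or {}
    names = [data.get("given_names"), data.get("surname")]
    return NationalIdRecord(
        country_code=data.get("country_code"),
        document_number=data.get("document_number"),
        personal_number=data.get("personal_number"),
        full_name=" ".join(n for n in names if n),
        date_of_birth=_parse_iso_date(data.get("date_of_birth")),
    )
```

Calling `to_record(result)` right after `client.scan_national_id(...)` keeps missing-field handling in one place instead of spreading `.get()` calls through the application.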

Technical Specs

•Latency: < 5s (Average)
•Uptime: 98.5% SLA
•Security: AES-256 Encryption & SOC 2 Compliant
•Input: JPG, PNG, WebP (File Path)
•Max File Size: 4.5MB
•Output: JSON (Structured Data)

Key Features

•Hybrid VIZ + MRZ AI: Cross-validates fields read from the Visual Zone against the MRZ and its check digits (TD1/TD2) to catch misreads.
•Specialized Numbers: Extracts region-specific IDs like CNP (Romania), CPF (Brazil), and NIN (Nigeria).
•Multi-line Addresses: Intelligently reconstructs full addresses from fragmented lines on ID cards.

Sample JSON Output

StructOCR returns a normalized JSON object containing both Visual Zone (VIZ) extraction and raw Machine-Readable Zone (MRZ) lines.

{
  "success": true,
  "data": {
    "type": "national_id",
    "country_code": "ROU",
    "nationality": "ROMANA",
    "document_number": "123456",
    "card_series": "KS",
    "personal_number": "1920319123456",
    "surname": "POPESCU",
    "given_names": "ANDREI",
    "sex": "M",
    "date_of_birth": "1992-03-19",
    "place_of_birth": "Jud. CS Mun. Reșița",
    "address": "Jud. CS Orș. Bocșa Str. Nucilor Nr. 15",
    "date_of_issue": "2020-05-10",
    "date_of_expiry": "2030-05-10",
    "issuing_authority": "SPCLEP Bocșa",
    "additional_fields": {
      "phone_number": null,
      "tramite_number": null,
      "ejemplar": null,
      "mrz_line_1": "IDROU123456<0<<<<<<<<<<<<<<<<",
      "mrz_line_2": "9203195M3005108ROU19203191234562",
      "mrz_line_3": null
    }
  }
}
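
Because the response carries both the parsed fields and the raw MRZ lines, a client can run its own consistency check. The sketch below assumes that `mrz_line_2` starts with the date of birth (6 characters), a check digit, the sex, and the expiry date (6 characters), as it does in the sample above. Real documents may use a different layout, so treat the offsets as an assumption. `cross_check` is a hypothetical helper that compares field values only and does not verify check digits.

```python
from typing import Dict, Optional


def iso_to_yymmdd(iso_date: Optional[str]) -> Optional[str]:
    """Convert 'YYYY-MM-DD' to 'YYMMDD'; return None for missing or malformed input."""
    if not iso_date or len(iso_date) != 10:
        return None
    parts = iso_date.split("-")
    if len(parts) != 3:
        return None
    year, month, day = parts
    return year[2:] + month + day


def cross_check(data: dict) -> Dict[str, bool]:
    """Compare parsed fields with the matching slices of mrz_line_2.

    Returns an empty dict when no usable MRZ line is present.
    """
    line = (data.get("additional_fields") or {}).get("mrz_line_2") or ""
    if len(line) < 15:
        return {}
    return {
        "date_of_birth": line[0:6] == iso_to_yymmdd(data.get("date_of_birth")),
        "sex": line[7] == data.get("sex"),
        "date_of_expiry": line[8:14] == iso_to_yymmdd(data.get("date_of_expiry")),
    }
```

A `False` value for any key is a signal to route the document to manual review rather than proof of tampering.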

Frequently Asked Questions

Do you support Machine Readable Zones (MRZ) on ID cards?

Yes! Our engine natively supports ICAO 9303 standard MRZ formats (TD1/TD2) found on many global ID cards. Our Hybrid architecture extracts both the raw MRZ lines and cross-validates them against the Visual Zone (VIZ) for maximum accuracy.

How does StructOCR compare to AWS Textract or Google Vision?

General-purpose OCR services like AWS Textract and Google Vision return raw, unstructured text dumps or simple key-value pairs. You are still responsible for parsing, validating, and structuring that data. StructOCR is a specialized model trained exclusively on identity documents. It returns a fully parsed, validated JSON object with predefined fields like `date_of_birth` and `document_number`, eliminating the need for any post-processing logic.

Do you store the uploaded images?

We do not store customer images. All uploaded files are processed in-memory and are permanently deleted immediately after the OCR extraction process is complete. Your data privacy is paramount.

How are blurry images handled?

Our API includes a powerful, automatic image enhancement engine. Before extraction, it performs denoising, deblurring, and contrast correction to maximize the accuracy of results from low-quality or blurry source images.
