The Definitive Node.js SDK for National ID Data Extraction

Achieve 99.8% extraction accuracy in under 1500ms, eliminating manual data entry and Tesseract errors.

Steve HarringtonUpdated 2026-01-16
A diagram showing the StructOCR process: a raw photo of a National ID card is uploaded, the API automatically crops, enhances, and extracts data, then outputs a structured JSON object.
Figure 1: StructOCR converts raw National ID images into validated JSON data.

Why National ID OCR is Difficult

Standard OCR tools like Tesseract fail on real-world ID documents due to their variability. Glare from holographic overlays, shadows from poor lighting, and skewed or rotated images from mobile captures consistently degrade accuracy. Parsing the Machine-Readable Zone (MRZ) requires not just reading characters, but also validating complex checksums. Furthermore, the layout and security features of National IDs differ significantly across jurisdictions, requiring a constantly updated and fragile set of RegEx patterns for each country. This high maintenance overhead and low reliability make generic OCR unsuitable for production-grade KYC systems.

Enterprise-Grade Extraction with StructOCR

StructOCR replaces brittle, open-source OCR pipelines with a specialized API. Our system leverages pre-trained deep learning models, specifically fine-tuned on millions of global identity documents, to locate and interpret data fields. Unlike Tesseract, our API includes automatic image pre-processing. It performs perspective correction (deskewing), glare removal, and denoising before extraction. This ensures high accuracy even with sub-optimal mobile captures. The result is a standardized, validated JSON output, abstracting away the complexity of parsing different ID layouts.

Production Use Cases

  • Digital Onboarding (KYC): Reduce drop-off rates by pre-filling user data from National IDs in < 2 seconds.
  • Fraud Prevention: Detect tampered fonts or mismatched MRZ checksums automatically.
  • Global Compliance: Handle National IDs from 200+ jurisdictions without custom rules.

Implementation: Node.js SDK

The official Node.js SDK abstracts the complexity of file handling and API requests. It automatically parses region-specific fields like CNP (Romania) or CPF (Brazil).

Prerequisite: npm install structocr

CODE EXAMPLE
const StructOCR = require('structocr');

// 💰 Save 30%+ vs competitors. Get 200 free requests instantly:
// 👉 https://structocr.com/register
// Initialize the client with your API key
const client = new StructOCR('YOUR_API_KEY_HERE');

async function scanNationalId() {
  // Note: Supports JPG, PNG, WebP
  const imagePath = './id_card.jpg';

  try {
    console.log(`Scanning National ID: ${imagePath}...`);

    // The SDK handles file reading and the API call
    const result = await client.scanNationalId(imagePath);

    if (result.success && result.data) {
      const data = result.data;
      console.log('✅ Extraction Successful!');
      
      console.log(`Region:     ${data.country_code} (Series: ${data.card_series})`);
      console.log(`Name:       ${data.given_names} ${data.surname}`);
      console.log(`ID Number:  ${data.document_number}`);
      
      // Specialized field for National Identity Numbers (CNP, CPF, NIN)
      console.log(`Personal #: ${data.personal_number}`);
      
      console.log(`DOB:        ${data.date_of_birth} (${data.sex})`);
      console.log(`Address:    ${data.address}`);

    } else {
      console.error('❌ Extraction Failed:', result.error || 'Unknown Error');
    }
  } catch (error) {
    console.error('An unexpected error occurred:', error.message);
  }
}

scanNationalId();

Technical Specs

  • Latency: < 5s (Average)
  • Uptime: 98.5% SLA
  • Security: AES-256 Encryption & SOC2 Compliant
  • Input: JPG, PNG, WebP (File Path)
  • Max File Size: 4.5MB
  • Output: JSON (Structured Data)

Key Features

  • Specialized Numbers: Extracts region-specific IDs like CNP (Romania), CPF (Brazil), and NIN (Nigeria).
  • Multi-line Addresses: Intelligently reconstructs full addresses from fragmented lines on ID cards.
  • Date Normalization: Returns all dates (Birth, Issue, Expiry) in a standardized YYYY-MM-DD format.

Sample JSON Output

StructOCR returns a normalized JSON object, regardless of the input image angle or quality.

{
  "success": true,
  "data": {
    "type": "national_id",
    "country_code": "ROU",
    "nationality": "ROMANA",
    "document_number": "123456",
    "card_series": "KS",
    "personal_number": "1920319123456",
    "surname": "POPESCU",
    "given_names": "ANDREI",
    "sex": "M",
    "date_of_birth": "1992-03-19",
    "place_of_birth": "Jud. CS Mun. Reșița",
    "address": "Jud. CS Orș. Bocșa Str. Nucilor Nr. 15",
    "date_of_issue": "2020-05-10",
    "date_of_expiry": "2030-05-10",
    "issuing_authority": "SPCLEP Bocșa"
  }
}

Frequently Asked Questions

How does StructOCR compare to AWS Textract or Google Vision?

Generic OCR services like Textract or Vision return an unstructured array of text lines and their coordinates. Developers must build and maintain complex parsers to map this raw text to meaningful fields. StructOCR is a specialized model that directly returns a structured JSON object with pre-identified and validated fields, eliminating all post-processing logic.

Do you store the uploaded images?

No. All image data is processed in-memory and is purged immediately after the extraction process completes. We do not persist customer image data on our servers.

How to handle blurry images?

Our API includes an automated image enhancement engine that runs before the OCR process. It applies deblurring and sharpening algorithms to significantly improve the success rate on low-quality or out-of-focus images from mobile captures.

More OCR Tutorials

Precise Data Extraction and Seamless Integration with AI-powered OCR API.

Empower your solutions with automated data extraction by integrating best-in class StructOCR via API seamlessly.

No credit card required • Full API access included