The Node.js SDK for Mission-Critical Passport OCR

Achieve 99.8%+ accuracy and sub-second latency for parsing ICAO 9303 compliant Machine Readable Zones (MRZ).

Steve Harrington • Updated 2026-01-16
A diagram showing the flow of a passport image being uploaded to the StructOCR API, processed by AI models for deskewing and MRZ detection, and returning structured JSON data.
Figure 1: StructOCR converts raw Passport images into validated JSON data.

Why Passport OCR is Difficult

Implementing reliable passport OCR is non-trivial. Open-source tools like Tesseract often fail on low-quality scans because of glare, shadows, and inconsistent lighting. Passports are frequently photographed at an angle, which requires computationally expensive deskewing and rotation correction. The Machine Readable Zone (MRZ) follows the strict ICAO 9303 standard, which includes check digit validation for fields such as the document number, date of birth, and date of expiry. Hand-maintained regular expressions for parsing this zone are brittle and break on minor variations across jurisdictions. Together, these challenges lead to high error rates, manual review queues, and significant engineering overhead.
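To make the check digit logic concrete, here is a minimal sketch of the ICAO 9303 algorithm: each character is mapped to a number (digits to themselves, A-Z to 10-35, the filler `<` to 0), multiplied by the repeating weights 7, 3, 1, and the sum is taken modulo 10. The function names `mrzCharValue` and `computeCheckDigit` are illustrative and are not part of the StructOCR SDK.

```javascript
// Map a single MRZ character to its numeric value per ICAO 9303.
function mrzCharValue(ch) {
  if (ch === '<') return 0;                                   // filler
  if (ch >= '0' && ch <= '9') return ch.charCodeAt(0) - 48;   // '0'..'9' -> 0..9
  if (ch >= 'A' && ch <= 'Z') return ch.charCodeAt(0) - 55;   // 'A'..'Z' -> 10..35
  throw new Error(`Invalid MRZ character: ${ch}`);
}

// Compute the check digit for one MRZ field using the repeating weights 7, 3, 1.
function computeCheckDigit(field) {
  const weights = [7, 3, 1];
  let sum = 0;
  for (let i = 0; i < field.length; i++) {
    sum += mrzCharValue(field[i]) * weights[i % 3];
  }
  return sum % 10;
}

// Fields taken from the widely reproduced ICAO 9303 specimen MRZ.
console.log(computeCheckDigit('L898902C3')); // 6 (document number)
console.log(computeCheckDigit('740812'));    // 2 (date of birth, YYMMDD)
console.log(computeCheckDigit('120415'));    // 9 (date of expiry, YYMMDD)
```

A real validator would also have to locate each field in the MRZ line, compare the computed digit with the printed one, and cope with OCR confusions such as `O` versus `0`, which is where most of the maintenance effort tends to go.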

Enterprise-Grade Extraction with StructOCR

StructOCR replaces this entire pipeline with a single API call. The service is powered by pre-trained deep learning models optimized for identity documents. The API performs automatic image pre-processing, including perspective correction, denoising, and glare removal, before extraction begins. Unlike Tesseract, which returns unstructured text, StructOCR provides standardized JSON output with validated fields conforming to the ICAO 9303 standard. This eliminates manual parsing and validation and delivers production-ready data directly to your application.

Production Use Cases

  • Digital Onboarding (KYC): Reduce drop-off rates by pre-filling user data from passports in under 2 seconds.
  • Fraud Prevention: Detect tampered fonts or mismatched MRZ checksums automatically.
  • Global Compliance: Handle passports from 200+ jurisdictions without custom rules.

Implementation: Node.js SDK

The official Node.js SDK simplifies passport extraction. It handles file upload and parses both MRZ and Visual Inspection Zone (VIZ) data fields automatically.

Prerequisite: npm install structocr

const StructOCR = require('structocr');

// 💰 Save 30%+ vs competitors. Get 200 free requests instantly:
// 👉 https://structocr.com/register
// Initialize the client with your API key
const client = new StructOCR('YOUR_API_KEY_HERE');

async function scanPassport() {
  // Note: Supports JPG, PNG, WebP (Max 4.5MB)
  const imagePath = './passport.jpg';

  try {
    console.log(`Scanning passport: ${imagePath}...`);

    // The SDK handles file reading and the API call
    const result = await client.scanPassport(imagePath);

    if (result.success && result.data) {
      const data = result.data;
      console.log('✅ Extraction Successful!');
      
      // Basic Identity (MRZ + VIZ)
      console.log(`Passport #: ${data.passport_number}`);
      console.log(`Name:       ${data.given_names} ${data.surname}`);
      console.log(`Nation:     ${data.nationality} (${data.country_code})`);
      
      // Visual Zone Specifics (Unique to StructOCR)
      console.log(`Birth Place:${data.place_of_birth}`);
      console.log(`Issued At:  ${data.place_of_issue}`);
      
      // Dates
      console.log(`DOB:        ${data.date_of_birth} (${data.sex})`);
      console.log(`Expiry:     ${data.date_of_expiry}`);

    } else {
      console.error('❌ Extraction Failed:', result.error || 'Unknown Error');
    }
  } catch (error) {
    console.error('An unexpected error occurred:', error.message);
  }
}

scanPassport();

Technical Specs

  • Latency: < 5s (Average)
  • Uptime: 98.5% SLA
  • Security: AES-256 Encryption & SOC2 Compliant
  • Input: JPG, PNG, WebP (File Path)
  • Max File Size: 4.5MB
  • Output: JSON (Structured Data)
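Since the input limits above are fixed, it can be worth rejecting unsuitable files before spending an API request. The following is a minimal sketch, not part of the SDK: `checkImageInput` and `checkImageFile` are hypothetical helpers, the `.jpeg` extension is assumed to be accepted alongside `.jpg`, and "4.5MB" is assumed to mean 4.5 × 1024 × 1024 bytes (use 4,500,000 if the service counts decimal megabytes).

```javascript
const fs = require('fs');
const path = require('path');

// Assumption: 4.5MB interpreted as binary megabytes.
const MAX_BYTES = 4.5 * 1024 * 1024;
// Assumption: .jpeg is treated the same as .jpg.
const ALLOWED_EXTENSIONS = new Set(['.jpg', '.jpeg', '.png', '.webp']);

// Pure check on a file name and size, so it can be tested without touching disk.
function checkImageInput(fileName, sizeInBytes) {
  const ext = path.extname(fileName).toLowerCase();
  if (!ALLOWED_EXTENSIONS.has(ext)) {
    return { ok: false, reason: `Unsupported file type: ${ext || '(none)'}` };
  }
  if (sizeInBytes > MAX_BYTES) {
    return { ok: false, reason: `File is ${sizeInBytes} bytes, limit is ${MAX_BYTES}` };
  }
  return { ok: true, reason: null };
}

// Convenience wrapper that reads the size from disk (throws if the file is missing).
function checkImageFile(filePath) {
  return checkImageInput(filePath, fs.statSync(filePath).size);
}
```

Calling `checkImageFile('./passport.jpg')` before `client.scanPassport(...)` lets the application show a specific message for oversized or unsupported files. Note that this checks only the extension, not the actual image contents.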

Key Features

  • Visual Extraction (VIZ): Parses non-MRZ data fields like Place of Birth and Issuing Authority.
  • Global Support: Optimized for 195+ countries, handling complex backgrounds and holograms.
  • Date Normalization: Returns all dates (Birth, Issue, Expiry) in a standardized YYYY-MM-DD format.
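Because dates come back as YYYY-MM-DD strings, downstream checks such as passport expiry need no locale-specific parsing. The sketch below is illustrative only: `parseIsoDate` and `isExpired` are hypothetical helpers, and the policy that a passport remains valid through its expiry date (compared in UTC) is an assumption that may differ from your compliance rules.

```javascript
// Parse a strict YYYY-MM-DD string into a Date at UTC midnight.
function parseIsoDate(value) {
  const match = /^(\d{4})-(\d{2})-(\d{2})$/.exec(value);
  if (!match) {
    throw new Error(`Expected YYYY-MM-DD, got: ${value}`);
  }
  const year = Number(match[1]);
  const month = Number(match[2]);
  const day = Number(match[3]);
  const date = new Date(Date.UTC(year, month - 1, day));
  // Date.UTC silently rolls over impossible dates (e.g. Feb 30), so verify the round trip.
  if (date.getUTCMonth() !== month - 1 || date.getUTCDate() !== day) {
    throw new Error(`Not a real calendar date: ${value}`);
  }
  return date;
}

// Assumption: the document is still valid on the expiry date itself.
function isExpired(dateOfExpiry, now = new Date()) {
  const today = Date.UTC(now.getUTCFullYear(), now.getUTCMonth(), now.getUTCDate());
  return parseIsoDate(dateOfExpiry).getTime() < today;
}
```

With a response in hand, `isExpired(result.data.date_of_expiry)` gives a boolean that can gate the onboarding flow.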

Sample JSON Output

StructOCR returns a normalized JSON object, regardless of the input image angle or quality.

{
  "success": true,
  "data": {
    "type": "passport",
    "country_code": "USA",
    "nationality": "UNITED STATES",
    "passport_number": "E12345678",
    "surname": "DOE",
    "given_names": "JOHN",
    "sex": "M",
    "date_of_birth": "1990-01-01",
    "place_of_birth": "NEW YORK, USA",
    "date_of_issue": "2020-01-01",
    "date_of_expiry": "2030-01-01",
    "place_of_issue": "PASSPORT AGENCY"
  }
}
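Before persisting a response like the one above, an application may want to confirm that the fields it relies on are present. The following sketch uses only the field names shown in the sample. Whether every field is guaranteed for every document is not stated here, so treating all of them as required is an assumption, and `findMissingFields` is a hypothetical helper rather than an SDK function.

```javascript
// Field names taken from the sample response; adjust to what your flow needs.
const EXPECTED_FIELDS = [
  'type', 'country_code', 'nationality', 'passport_number',
  'surname', 'given_names', 'sex', 'date_of_birth',
  'place_of_birth', 'date_of_issue', 'date_of_expiry', 'place_of_issue',
];

// Return the names of expected fields that are absent or empty in the response.
function findMissingFields(result) {
  const data = result && result.success === true ? result.data : null;
  if (typeof data !== 'object' || data === null) {
    return EXPECTED_FIELDS.slice(); // nothing usable came back
  }
  return EXPECTED_FIELDS.filter(
    (field) => typeof data[field] !== 'string' || data[field].trim() === ''
  );
}
```

An empty array means the response can be used as-is; a non-empty one can be routed to manual review along with the list of gaps.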

Frequently Asked Questions

How does StructOCR compare to AWS Textract or Google Vision?

Generic OCR services like AWS Textract and Google Vision are designed to extract raw lines of text from an image. They do not understand the document's structure. You are still responsible for parsing those lines, validating checksums, and mapping 'DOE' to a 'surname' field. StructOCR is a specialized API that performs both OCR and data structuring, returning validated, labeled fields like `surname`, `date_of_birth`, and `passport_number` directly.
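To illustrate the parsing work that a generic OCR result leaves to you, here is a minimal sketch that splits the first line of a passport (TD3) MRZ into its parts. In ICAO 9303, that line is 44 characters: a 2-character document code, a 3-character issuing state, and a name field in which `<<` separates the surname from the given names and single `<` characters separate name parts. `parseTd3NameLine` is a hypothetical helper, and it assumes the OCR text is already error-free.

```javascript
// Parse line 1 of a TD3 (passport) MRZ into document code, issuing state, and names.
function parseTd3NameLine(line) {
  if (line.length !== 44) {
    throw new Error('TD3 MRZ lines must be exactly 44 characters');
  }
  // Turn runs of filler characters into single spaces and trim the ends.
  const clean = (s) => s.replace(/<+/g, ' ').trim();

  const nameField = line.slice(5);
  const separator = nameField.indexOf('<<'); // first "<<" ends the surname
  const surname = separator === -1 ? nameField : nameField.slice(0, separator);
  const givenNames = separator === -1 ? '' : nameField.slice(separator + 2);

  return {
    documentCode: clean(line.slice(0, 2)),
    issuingState: clean(line.slice(2, 5)),
    surname: clean(surname),
    givenNames: clean(givenNames),
  };
}

// Name line from the ICAO 9303 specimen, padded with fillers to 44 characters.
const specimenLine = 'P<UTOERIKSSON<<ANNA<MARIA'.padEnd(44, '<');
console.log(parseTd3NameLine(specimenLine));
// { documentCode: 'P', issuingState: 'UTO', surname: 'ERIKSSON', givenNames: 'ANNA MARIA' }
```

This covers only one of the two MRZ lines and none of the checksum validation, truncated-name handling, or OCR error correction that a production parser needs.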

Do you store the uploaded images?

We do not store customer images. All data is processed in-memory (RAM) and is purged immediately after the API request is completed. We are a SOC2 compliant provider.

How does StructOCR handle blurry images?

Our API includes an advanced image pre-processing engine. It automatically attempts to correct blur, adjust contrast, and remove noise before the OCR model analyzes the image, significantly increasing success rates on sub-optimal inputs.

Precise Data Extraction and Seamless Integration with an AI-powered OCR API.

Automate data extraction in your product by integrating the best-in-class StructOCR API.

No credit card required • Full API access included