High-Accuracy Invoice OCR API for Go Applications

Achieve 99.7%+ accuracy on line-item extraction and get structured JSON responses in under 1500ms.

Steve Harrington • Updated 2026-01-19
Figure 1: StructOCR converts raw invoice images into validated JSON data (invoice number, line items, total amount).

Why Invoice OCR is Difficult

Generic OCR tools struggle with invoices because invoice layouts vary so widely. Unlike passports or licenses, invoices follow no standardized format, so table structures, column orders, and field locations differ from vendor to vendor. Parsing them with hand-written rules means maintaining brittle regular expressions that break with each new vendor template. Image quality problems in scans and photos, such as low resolution, skew, and shadows, degrade accuracy further. Open-source engines like Tesseract often misread table rows or fail to associate line items with their totals, which makes them costly to maintain in any serious Accounts Payable automation system.

Enterprise-Grade Extraction with StructOCR

StructOCR uses pre-trained deep learning models fine-tuned on millions of diverse invoice layouts, so no manual templating is needed. When it receives an image, the API first applies automatic pre-processing: deskewing, denoising, and contrast enhancement. The model then identifies and extracts key-value pairs, headers, and complex table data, including multi-line item descriptions. Unlike Tesseract, which returns a raw text dump, StructOCR delivers a predictable, standardized JSON output with validated fields, ready for direct integration into your ERP or accounting software.

Production Use Cases

  • Accounts Payable Automation: Automate your entire AP workflow by ingesting invoices from any source and posting structured data directly to your ERP.
  • Expense Management: Process employee expense reports in real time by extracting vendor, total amount, and line items from submitted receipts and invoices.
  • Supply Chain Auditing: Verify and audit thousands of supplier invoices for compliance and pricing accuracy without manual intervention.

Implementation: Go (Golang)

The following Go program reads an image from disk, Base64-encodes it, posts it to the API, and parses the nested JSON response (line items, financials) into native Go structs.

Prerequisite: Go 1.16+

package main

import (
	"bytes"
	"encoding/base64"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"os"
)

// --- Struct Definitions for JSON Parsing ---
type InvoiceResponse struct {
	Success bool        `json:"success"`
	Data    InvoiceData `json:"data"`
	Error   string      `json:"error,omitempty"`
}

type InvoiceData struct {
	InvoiceNumber string      `json:"invoice_number"`
	Date          string      `json:"date"`
	Currency      string      `json:"currency"`
	Merchant      Merchant    `json:"merchant"`
	Financials    Financials  `json:"financials"`
	LineItems     []LineItem  `json:"line_items"`
}

type Merchant struct {
	Name  string `json:"name"`
	TaxID string `json:"tax_id"`
}

type Financials struct {
	TotalAmount float64 `json:"total_amount"`
	TaxAmount   float64 `json:"tax_amount"`
}

type LineItem struct {
	Description string  `json:"description"`
	Quantity    float64 `json:"quantity"`
	Amount      float64 `json:"amount"`
}

func main() {
	apiURL := "https://api.structocr.com/v1/invoice"
	apiKey := "YOUR_API_KEY_HERE"
	imagePath := "invoice.jpg" // Supports JPG, PNG, WebP (see Technical Specs)

	// 1. Read and Encode Image
	imageBytes, err := os.ReadFile(imagePath)
	if err != nil {
		fmt.Printf("Error reading file: %v\n", err)
		return
	}
	base64Image := base64.StdEncoding.EncodeToString(imageBytes)

	// 2. Prepare Payload
	payload, err := json.Marshal(map[string]string{"img": base64Image})
	if err != nil {
		fmt.Printf("Payload encoding failed: %v\n", err)
		return
	}

	// 3. Create Request
	req, err := http.NewRequest("POST", apiURL, bytes.NewBuffer(payload))
	if err != nil {
		fmt.Printf("Request creation failed: %v\n", err)
		return
	}

	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("x-api-key", apiKey) // Required Header

	// 4. Send Request
	client := &http.Client{}
	resp, err := client.Do(req)
	if err != nil {
		fmt.Printf("Network error: %v\n", err)
		return
	}
	defer resp.Body.Close()

	// 5. Parse Response
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		fmt.Printf("Error reading response body: %v\n", err)
		return
	}

	if resp.StatusCode != http.StatusOK {
		fmt.Printf("API Error (%d): %s\n", resp.StatusCode, string(body))
		return
	}

	var result InvoiceResponse
	if err := json.Unmarshal(body, &result); err != nil {
		fmt.Printf("JSON Parse Error: %v\n", err)
		return
	}

	if result.Success {
		data := result.Data
		fmt.Println("✅ Invoice Extracted Successfully!")
		fmt.Printf("Invoice #: %s (Date: %s)\n", data.InvoiceNumber, data.Date)
		fmt.Printf("Vendor:    %s (Tax ID: %s)\n", data.Merchant.Name, data.Merchant.TaxID)
		fmt.Printf("Total:     %.2f %s\n", data.Financials.TotalAmount, data.Currency)

		fmt.Println("\n--- Line Items ---")
		for _, item := range data.LineItems {
			fmt.Printf("- %s (Qty: %.0f) = %.2f\n", item.Description, item.Quantity, item.Amount)
		}
	} else {
		fmt.Printf("Extraction failed: %s\n", result.Error)
	}
}

Technical Specs

  • Latency: < 5s (Average)
  • Uptime: 98.5% SLA
  • Security: AES-256 Encryption & SOC2 Compliant
  • Input: JPG, PNG, WebP (Base64 Encoded)
  • Max File Size: 4.5MB
  • Output: JSON (Nested Structure)
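
The input limits above can be checked locally before an upload is attempted. The sketch below is one possible pre-flight check, under stated assumptions: `checkImage` is a hypothetical helper, "4.5MB" is read as 4.5 × 10^6 bytes of raw image data (the specs do not say whether the limit applies before or after Base64 encoding), and the format is judged by file extension only.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"path/filepath"
	"strings"
)

// maxBytes is an assumed reading of "4.5MB" as 4.5 * 10^6 bytes of raw image data.
const maxBytes = 4_500_000

// checkImage is a hypothetical pre-flight check run before encoding and upload.
// It accepts the listed formats by file extension and enforces the size limit.
func checkImage(name string, size int) error {
	switch strings.ToLower(filepath.Ext(name)) {
	case ".jpg", ".jpeg", ".png", ".webp":
		// Listed input format; fall through to the size check.
	default:
		return fmt.Errorf("unsupported format %q", filepath.Ext(name))
	}
	if size > maxBytes {
		return fmt.Errorf("image is %d bytes, limit is %d", size, maxBytes)
	}
	return nil
}

func main() {
	fmt.Println(checkImage("invoice.jpg", 1_200_000))
	fmt.Println(checkImage("invoice.tiff", 1_200_000))
	fmt.Println(checkImage("scan.png", 5_000_000))

	// Base64 output is about 4/3 the size of its input, which matters if the
	// limit turns out to apply to the encoded payload rather than the raw file.
	fmt.Println("encoded size at the limit:", base64.StdEncoding.EncodedLen(maxBytes))
}
```

In the main listing, the natural place for such a check is right after `os.ReadFile`, passing `len(imageBytes)` as the size.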

Key Features

  • Table Extraction Engine: Accurately parses complex line items and tables without manual templating.
  • Financial Validation: Cross-validates subtotals, taxes, and grand totals to ensure mathematical accuracy.
  • Vendor Normalization: Automatically identifies merchants and extracts standardized tax IDs (VAT/EIN).
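
Financial Validation is described above as a server-side feature. If you also want to re-check the arithmetic on your side before posting to an ERP, a minimal sketch could look like the following. The field names mirror the sample response (`subtotal`, `unit_price`, `amount`); `checkTotals` and the half-cent tolerance are assumptions for illustration, and invoices with discounts or shipping lines would need extra terms.

```go
package main

import (
	"fmt"
	"math"
)

type LineItem struct {
	Quantity  float64
	UnitPrice float64
	Amount    float64
}

type Financials struct {
	Subtotal    float64
	TaxAmount   float64
	TotalAmount float64
}

// tolerance absorbs floating-point noise; half a cent is an assumed value.
const tolerance = 0.005

func approxEqual(a, b float64) bool { return math.Abs(a-b) <= tolerance }

// checkTotals is a hypothetical client-side re-check of the invoice arithmetic:
// each line, the sum of lines against the subtotal, and subtotal + tax against the total.
func checkTotals(items []LineItem, f Financials) error {
	sum := 0.0
	for i, it := range items {
		if !approxEqual(it.Quantity*it.UnitPrice, it.Amount) {
			return fmt.Errorf("line %d: quantity x unit price does not match amount", i+1)
		}
		sum += it.Amount
	}
	if !approxEqual(sum, f.Subtotal) {
		return fmt.Errorf("line items sum to %.2f, subtotal is %.2f", sum, f.Subtotal)
	}
	if !approxEqual(f.Subtotal+f.TaxAmount, f.TotalAmount) {
		return fmt.Errorf("subtotal + tax is %.2f, total is %.2f", f.Subtotal+f.TaxAmount, f.TotalAmount)
	}
	return nil
}

func main() {
	// Values taken from the sample response: 80 + 20 = 100, plus 10 tax = 110.
	items := []LineItem{
		{Quantity: 1, UnitPrice: 80, Amount: 80},
		{Quantity: 1, UnitPrice: 20, Amount: 20},
	}
	fin := Financials{Subtotal: 100, TaxAmount: 10, TotalAmount: 110}

	if err := checkTotals(items, fin); err != nil {
		fmt.Println("mismatch:", err)
		return
	}
	fmt.Println("totals are consistent")
}
```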

Sample JSON Output

StructOCR returns a normalized JSON object, regardless of the input image angle or quality.

{
  "success": true,
  "data": {
    "type": "invoice",
    "invoice_number": "INV-2026-001",
    "date": "2026-01-15",
    "due_date": "2026-02-15",
    "currency": "USD",
    "merchant": {
      "name": "Amazon Web Services",
      "address": "410 Terry Ave N, Seattle, WA",
      "tax_id": "EIN-12-3456789",
      "iban": null
    },
    "customer": {
      "name": "Acme Corp Inc.",
      "tax_id": "987654321"
    },
    "financials": {
      "subtotal": 100,
      "tax_amount": 10,
      "total_amount": 110
    },
    "line_items": [
      {
        "description": "EC2 Instance Usage",
        "quantity": 1,
        "unit_price": 80,
        "amount": 80
      },
      {
        "description": "S3 Storage",
        "quantity": 1,
        "unit_price": 20,
        "amount": 20
      }
    ]
  }
}
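
The structs in the Go listing earlier on this page cover only part of this response. The sketch below shows one way to decode more of it, including `due_date`, `subtotal`, `unit_price`, and the nullable `iban`. The type names and the `parseInvoice` helper are assumptions for illustration; the JSON keys are taken from the sample above, abbreviated to one line item.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type Merchant struct {
	Name    string  `json:"name"`
	Address string  `json:"address"`
	TaxID   string  `json:"tax_id"`
	IBAN    *string `json:"iban"` // pointer keeps JSON null distinct from ""
}

type Item struct {
	Description string  `json:"description"`
	Quantity    float64 `json:"quantity"`
	UnitPrice   float64 `json:"unit_price"`
	Amount      float64 `json:"amount"`
}

type Invoice struct {
	Type          string   `json:"type"`
	InvoiceNumber string   `json:"invoice_number"`
	Date          string   `json:"date"`
	DueDate       string   `json:"due_date"`
	Currency      string   `json:"currency"`
	Merchant      Merchant `json:"merchant"`
	Financials    struct {
		Subtotal    float64 `json:"subtotal"`
		TaxAmount   float64 `json:"tax_amount"`
		TotalAmount float64 `json:"total_amount"`
	} `json:"financials"`
	LineItems []Item `json:"line_items"`
}

type envelope struct {
	Success bool    `json:"success"`
	Data    Invoice `json:"data"`
	Error   string  `json:"error"`
}

// parseInvoice is a hypothetical helper that decodes a response body and
// turns success=false into a Go error.
func parseInvoice(raw []byte) (Invoice, error) {
	var env envelope
	if err := json.Unmarshal(raw, &env); err != nil {
		return Invoice{}, fmt.Errorf("decode response: %w", err)
	}
	if !env.Success {
		return Invoice{}, fmt.Errorf("extraction failed: %s", env.Error)
	}
	return env.Data, nil
}

// sample is an abbreviated copy of the response shown above.
const sample = `{
  "success": true,
  "data": {
    "type": "invoice",
    "invoice_number": "INV-2026-001",
    "date": "2026-01-15",
    "due_date": "2026-02-15",
    "currency": "USD",
    "merchant": {"name": "Example Vendor", "tax_id": "EIN-12-3456789", "iban": null},
    "financials": {"subtotal": 100, "tax_amount": 10, "total_amount": 110},
    "line_items": [
      {"description": "EC2 Instance Usage", "quantity": 1, "unit_price": 80, "amount": 80}
    ]
  }
}`

func main() {
	inv, err := parseInvoice([]byte(sample))
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Printf("%s due %s, subtotal %.2f %s\n", inv.InvoiceNumber, inv.DueDate, inv.Financials.Subtotal, inv.Currency)
	fmt.Println("IBAN present:", inv.Merchant.IBAN != nil)
}
```

Keys the structs do not declare, such as `customer`, are ignored by `encoding/json`, so the types can be extended field by field as needed.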

Frequently Asked Questions

How does StructOCR compare to AWS Textract or Google Vision?

Unlike general-purpose OCR services that return raw text coordinates, StructOCR is a specialized model fine-tuned on millions of invoices. It returns a structured JSON object with pre-identified fields like `invoice_number`, `line_items`, and `merchant.name`, eliminating the need for complex post-processing and rule-based parsing.

Do you store the uploaded images?

No, images are processed in memory and deleted immediately after extraction. We do not persist any of your source data.

How do you handle blurry or low-quality images?

Our API includes an automatic image pre-processing engine that applies denoising, deskewing, and contrast enhancement to improve OCR accuracy on low-quality or blurry images before the extraction models are run.
