
Zera OCR Reference

OCR accuracy benchmarks by document type, image quality requirements, DPI thresholds, and how Zera OCR handles degraded scans across bank statements, invoices, checks, and financial statements.

★★★★★ 4.9 on Trustpilot | 95%+ OCR accuracy | 99.6% overall extraction accuracy | 150 DPI minimum requirement

⚡ TL;DR — Zera OCR Capabilities

Zera OCR Handles
  • Scanned PDFs (150+ DPI)
  • JPG/PNG image uploads
  • Password-protected PDFs
  • Multi-page documents
  • Rotated/skewed pages (auto-deskew)
  • Low-contrast fax-quality scans
Not Supported
  • Below 100 DPI (insufficient quality)
  • Fully handwritten documents
  • Damaged/torn physical documents
  • Documents in unsupported languages

1. OCR Accuracy by Document Type

Accuracy is measured as field-level extraction match against ground-truth human review on 10,000 documents per type. "Accuracy" means the extracted value exactly matches the correct value — no partial credit.

Document Type | Digital PDF Accuracy | Scanned PDF Accuracy | Image (JPG/PNG) Accuracy
Bank Statements | 99.8% | 96.2% | 95.4%
Invoices | 99.6% | 95.8% | 94.1%
Checks (MICR fields) | 99.1% | 97.3% | 93.7%
Financial Statements | 99.4% | 94.6% | 93.2%
Fax-quality scans (<150 DPI) | N/A | 87.3% | 82.1%

* All scanned document benchmarks at 200 DPI. Fax-quality benchmark at 120 DPI. Data from Q1 2025 internal accuracy study.
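The exact-match definition above can be illustrated with a short sketch. This is not Zera's benchmark code; the function name `field_accuracy` and the sample field names are hypothetical, and the only rule taken from the text is that a field counts as correct solely when the extracted value equals the ground-truth value.

```python
def field_accuracy(extracted: list[dict], truth: list[dict]) -> float:
    """Share of ground-truth fields whose extracted value matches exactly."""
    if len(extracted) != len(truth):
        raise ValueError("extracted and truth must cover the same documents")
    total = 0
    correct = 0
    for got, want in zip(extracted, truth):
        for name, expected in want.items():
            total += 1
            # Exact equality only: a near miss earns no partial credit.
            if got.get(name) == expected:
                correct += 1
    return correct / total if total else 0.0


# Hypothetical single-document sample with one misread digit in the amount.
truth = [{"date": "2025-01-15", "amount": "1204.50", "payee": "ACME LLC"}]
extracted = [{"date": "2025-01-15", "amount": "1204.60", "payee": "ACME LLC"}]
accuracy = field_accuracy(extracted, truth)
print(f"{accuracy:.1%}")
```

Here two of the three fields match, so the misread amount costs a full field even though only one digit differs.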


2. Image Quality Requirements

Zera OCR performs best with clean, high-resolution inputs. These are the minimum and recommended specifications for each input type.

Parameter | Minimum | Recommended | Notes
Resolution (DPI) | 150 DPI | 200–300 DPI | Below 150 DPI, accuracy drops sharply, especially for MICR and small print
Color mode | Grayscale | Grayscale or color | Black-and-white bitonal acceptable for standard print; color helps logo/header detection
File size (per page) | No minimum | 50 KB–3 MB per page | Files over 10 MB per page are auto-compressed before OCR
Page rotation | Any angle | 0° (upright) | Auto-deskew corrects up to ±15° rotation; extreme angles reduce accuracy
Contrast ratio | Low acceptable | High contrast | Low contrast (fax quality) processed but with lower field accuracy
JPG compression | Quality 60+ | Quality 85+ | Heavy compression artifacts degrade character edges; use PNG for critical docs
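A pre-upload check against these thresholds could look like the sketch below. The numeric limits come from the table; the `ScanInfo` type, the `quality_warnings` function, and the warning wording are hypothetical and are not part of any Zera API.

```python
from dataclasses import dataclass
from typing import Optional

# Thresholds copied from the requirements table.
MIN_DPI = 150
RECOMMENDED_DPI = 200
MIN_JPEG_QUALITY = 60
RECOMMENDED_JPEG_QUALITY = 85
MAX_DESKEW_DEGREES = 15.0


@dataclass
class ScanInfo:
    """Hypothetical description of one uploaded page."""
    dpi: int
    jpeg_quality: Optional[int] = None  # None for PDF or PNG input
    skew_degrees: float = 0.0


def quality_warnings(scan: ScanInfo) -> list:
    """Return human-readable warnings; an empty list means the page meets the recommendations."""
    warnings = []
    if scan.dpi < MIN_DPI:
        warnings.append(f"{scan.dpi} DPI is below the {MIN_DPI} DPI minimum")
    elif scan.dpi < RECOMMENDED_DPI:
        warnings.append(f"{scan.dpi} DPI is below the recommended {RECOMMENDED_DPI} DPI")
    if scan.jpeg_quality is not None:
        if scan.jpeg_quality < MIN_JPEG_QUALITY:
            warnings.append(f"JPG quality {scan.jpeg_quality} is below the minimum of {MIN_JPEG_QUALITY}")
        elif scan.jpeg_quality < RECOMMENDED_JPEG_QUALITY:
            warnings.append(f"JPG quality {scan.jpeg_quality} is below the recommended {RECOMMENDED_JPEG_QUALITY}")
    if abs(scan.skew_degrees) > MAX_DESKEW_DEGREES:
        warnings.append(f"Skew of {scan.skew_degrees} degrees exceeds the auto-deskew range")
    return warnings


fax_page = ScanInfo(dpi=120, jpeg_quality=70, skew_degrees=20.0)
for message in quality_warnings(fax_page):
    print(message)
```

Running a check like this before upload is only a convenience; the document states that sub-150 DPI pages are still processed, with a warning flag.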

Upload a Scanned Statement and See the Results

Zera OCR processes scanned PDFs, images, and fax-quality documents in under 30 seconds.

3. Input Format Handling

Zera OCR accepts all common document formats used in accounting workflows. No pre-processing or format conversion is required by the user.

Digital PDFs (text-based)

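PDFs with embedded text are processed via direct text extraction, so no OCR is needed. Accuracy is 99.1–99.8% depending on document type (see the Digital PDF column in the accuracy table). This is the fastest processing path, typically under 5 seconds per page.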

Scanned PDFs (image-based)

PDFs containing scanned images trigger Zera OCR. Each page is rasterized, deskewed, and processed field-by-field. Accuracy 94–97% at 200 DPI.

JPG / JPEG Images

Direct image upload supported. Zera OCR reads EXIF orientation metadata and auto-rotates before processing. Recommended quality 85+ to prevent artifact-related errors.

PNG Images

Lossless PNG format ideal for screenshots and digital-native documents. Supports transparency (flattened to white before OCR). No quality degradation from compression.
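Flattening transparency to white is standard alpha compositing over a white background. The sketch below shows the per-pixel arithmetic for 8-bit RGBA values; the function name is hypothetical, and it assumes straight (non-premultiplied) alpha, which is how PNG stores it. How Zera implements this step internally is not documented here.

```python
def flatten_to_white(pixel):
    """Composite one 8-bit RGBA pixel over a white background and return RGB."""
    r, g, b, a = pixel
    # out = c * alpha + 255 * (1 - alpha), in integer math with rounding.
    return tuple((c * a + 255 * (255 - a) + 127) // 255 for c in (r, g, b))


print(flatten_to_white((0, 0, 0, 0)))      # fully transparent pixel
print(flatten_to_white((0, 0, 0, 255)))    # fully opaque black pixel
print(flatten_to_white((0, 0, 0, 128)))    # half-transparent black pixel
```

Fully transparent regions become white, so text drawn on a transparent canvas keeps its contrast against the background.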

Password-Protected PDFs

Password provided at upload time is used for session-only decryption. Password is never stored, logged, or transmitted beyond the decryption step.

Multi-Page Documents

Multi-page PDFs processed page by page. Pages are stitched into a single transaction dataset after extraction. No page count limit on upload.
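Accepting several formats without user-side conversion implies some form of format detection at upload. One common approach, sketched here as an assumption rather than a description of Zera's implementation, is to inspect the file's leading bytes. The signatures for PDF, JPEG, and PNG are standard; the function name and return labels are hypothetical.

```python
PDF_MAGIC = b"%PDF-"
JPEG_MAGIC = b"\xff\xd8\xff"
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"


def detect_format(header: bytes) -> str:
    """Classify an upload from its first bytes."""
    if header.startswith(PDF_MAGIC):
        # The header alone cannot tell a text-based PDF from a scanned one;
        # that requires checking the pages for an embedded text layer.
        return "pdf"
    if header.startswith(JPEG_MAGIC):
        return "jpeg"
    if header.startswith(PNG_MAGIC):
        return "png"
    return "unsupported"


print(detect_format(b"%PDF-1.7\n"))
print(detect_format(b"\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR"))
```

Checking content rather than the file extension avoids misrouting a renamed file.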


4. OCR Processing Pipeline

Zera OCR does not simply run raw character recognition. It applies a financial document-specific pipeline designed to maximize accuracy on the fields that matter most to accountants.

95%+: Overall OCR accuracy (scanned docs)
97%+: MICR line accuracy on checks at 200 DPI
<30s: Average processing time per document
±15°: Maximum auto-deskew correction angle

Pipeline Steps

Step | Process | Purpose
1. Ingestion | Format detection, password decryption, rasterization | Normalize input to image form for OCR
2. Pre-processing | Deskew, denoising, contrast normalization | Maximize character edge clarity
3. Layout Analysis | Region detection: header, table body, footer, MICR zone | Apply appropriate OCR model per region
4. Field OCR | Character recognition per field with confidence scoring | Extract and score each individual field
5. Arithmetic Validation | Running totals, balance checks, line item sums verified | Catch transposition errors and OCR misreads
6. Output Assembly | Fields merged with extraction schema, low-confidence flags applied | Produce clean, structured output for export
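Step 5 is the part of the pipeline that is easiest to show in code. The sketch below illustrates a running-balance check on a bank statement under assumed inputs: each row is a signed amount plus the balance printed on the statement. The function name and row layout are hypothetical; the source only states that running totals and balances are verified.

```python
from decimal import Decimal


def check_running_balance(opening, rows):
    """Return indexes of rows whose printed balance disagrees with the computed one."""
    mismatches = []
    balance = opening
    for index, (amount, stated_balance) in enumerate(rows):
        if balance + amount != stated_balance:
            mismatches.append(index)
        # Resume from the printed balance so a single misread amount
        # is reported once instead of cascading through later rows.
        balance = stated_balance
    return mismatches


# Hypothetical rows: the second amount was read as -19.53 instead of -19.35.
rows = [
    (Decimal("-250.00"), Decimal("750.00")),
    (Decimal("-19.53"), Decimal("730.65")),
    (Decimal("100.00"), Decimal("830.65")),
]
mismatches = check_running_balance(Decimal("1000.00"), rows)
print(mismatches)
```

Decimal is used instead of float so that cent-level comparisons are exact. Resuming from the printed balance suits misread amounts; a misread balance would instead flag two adjacent rows.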

5. Handling Low-Quality Inputs

Not all documents arrive in ideal condition. Zera OCR applies fallback strategies for degraded inputs rather than failing silently.

Below 150 DPI

Document is processed with a warning flag. Accuracy degrades to roughly 82–87% (benchmarked at 120 DPI). Fields with a confidence score below 70 are highlighted for manual review in the export.
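The review rule can be expressed as a simple filter. This sketch assumes confidence is reported on a 0–100 scale, which the threshold of 70 suggests but the source does not state; the `Field` type and function name are hypothetical.

```python
from dataclasses import dataclass

REVIEW_THRESHOLD = 70  # threshold from the text; the 0-100 scale is an assumption


@dataclass
class Field:
    """Hypothetical extracted field with its OCR confidence."""
    name: str
    value: str
    confidence: float


def fields_needing_review(fields):
    """Names of fields whose confidence falls below the review threshold."""
    return [f.name for f in fields if f.confidence < REVIEW_THRESHOLD]


page = [
    Field("date", "2025-01-15", 98.0),
    Field("amount", "1204.50", 64.0),
    Field("payee", "ACME LLC", 70.0),
]
flagged = fields_needing_review(page)
print(flagged)
```

A field scoring exactly 70 is not flagged here, matching the "below 70" wording.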

High Skew / Rotation

Auto-deskew corrects up to ±15°. Pages beyond this range are flagged. Users can manually rotate pages before reprocessing via the document preview tool.

Low Contrast / Fax Quality

Contrast normalization applied. For extreme cases (thermally faded, water-damaged), a "manual review required" flag is applied to affected pages.

Arithmetic Validation Failures

If an extracted total does not match the sum of its line items, the discrepancy is reported as an extraction warning. The raw OCR values are still exported so users can identify the error.
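A minimal sketch of this behavior, assuming an invoice with a list of line items and one extracted total: the check returns a warning that carries the raw values rather than discarding or correcting them. The function name and the warning's keys are hypothetical, not Zera's export schema.

```python
from decimal import Decimal
from typing import Optional


def total_warning(line_items, extracted_total) -> Optional[dict]:
    """Return a warning when the total disagrees with the line items, else None."""
    computed = sum(line_items, Decimal("0"))
    if computed == extracted_total:
        return None
    return {
        "warning": "total_mismatch",
        # Raw OCR values are kept so a reviewer can locate the misread.
        "line_items": list(line_items),
        "extracted_total": extracted_total,
        "computed_total": computed,
        "difference": extracted_total - computed,
    }


# Hypothetical invoice: the total was read as 175.94 instead of 175.49.
items = [Decimal("120.00"), Decimal("45.50"), Decimal("9.99")]
warning = total_warning(items, Decimal("175.94"))
print(warning["difference"])
```

Reporting the difference alongside the raw values helps a reviewer spot transposed digits without re-adding the invoice by hand.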


Frequently Asked Questions

What is the minimum DPI required for Zera OCR?

Zera OCR requires a minimum of 150 DPI for acceptable accuracy. 200–300 DPI is recommended for optimal results. Below 150 DPI, character recognition degrades significantly, particularly for fine print like routing numbers on checks.

Can Zera OCR process handwritten documents?

Zera OCR is optimized for printed financial documents. Handwritten amounts on checks are supported with 88% accuracy. Fully handwritten invoices or statements are not supported — the document must be printed or typed.

Does Zera OCR work on password-protected PDFs?

Yes. Zera Books can process password-protected PDFs. Users provide the password at upload time and it is used only for that session — it is never stored or logged.

What happens when OCR confidence is low on a field?

Low-confidence OCR fields are flagged in the output with a yellow highlight. The extracted value is included but marked as uncertain. Users can review and correct these fields before exporting to QuickBooks or Xero.

Process Scanned Documents Without Manual Re-Entry

Zera OCR extracts data from scanned PDFs, images, and fax-quality documents with 95%+ accuracy — no templates, no configuration.
