Zera OCR Reference | OCR Accuracy by Document Type

Q: What is the minimum DPI required for Zera OCR?

Zera OCR needs at least 150 DPI for acceptable accuracy; 200–300 DPI is recommended for optimal results. Scans between 100 and 150 DPI are still processed, but with a warning flag and reduced accuracy (82–87%), particularly for fine print like routing numbers on checks. Scans below 100 DPI are not supported.

Q: Can Zera OCR process handwritten documents?

Zera OCR is optimized for printed financial documents. Handwritten amounts on checks are supported with 88% accuracy. Fully handwritten invoices or statements are not supported — the document must be printed or typed.

Q: Does Zera OCR work on password-protected PDFs?

Yes. Zera Books can process password-protected PDFs. Users provide the password at upload time and it is used only for that session — it is never stored or logged.

Q: What happens when OCR confidence is low on a field?

Low-confidence OCR fields are flagged in the output with a yellow highlight. The extracted value is included but marked as uncertain. Users can review and correct these fields before exporting to QuickBooks or Xero.

⚡ TL;DR — Zera OCR Capabilities

Zera OCR Handles

✓ Scanned PDFs (150+ DPI)
✓ JPG/PNG image uploads
✓ Password-protected PDFs
✓ Multi-page documents
✓ Rotated/skewed pages (auto-deskew)
✓ Low-contrast fax-quality scans

Not Supported

✗ Below 100 DPI (insufficient quality)
✗ Fully handwritten documents
✗ Damaged/torn physical documents
✗ Documents in unsupported languages

1. OCR Accuracy by Document Type

Accuracy is measured as field-level extraction match against ground-truth human review on 10,000 documents per type. "Accuracy" means the extracted value exactly matches the correct value — no partial credit.

Document Type	Digital PDF Accuracy	Scanned PDF Accuracy	Image (JPG/PNG) Accuracy
Bank Statements	99.8%	96.2%	95.4%
Invoices	99.6%	95.8%	94.1%
Checks (MICR fields)	99.1%	97.3%	93.7%
Financial Statements	99.4%	94.6%	93.2%
Fax-quality scans (<150 DPI)	N/A	87.3%	82.1%

Note: Scanned PDF and image benchmarks use 200 DPI inputs, except the fax-quality row, which uses 120 DPI. Data from the Q1 2025 internal accuracy study.
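The exact-match metric described above can be illustrated with a small sketch. The field names and sample values are hypothetical, and this is not Zera's evaluation code; it only shows what "no partial credit" means in practice.

```python
def field_accuracy(extracted, ground_truth):
    """Share of ground-truth fields whose extracted value matches exactly."""
    if not ground_truth:
        raise ValueError("ground_truth must contain at least one field")
    matches = sum(
        1
        for name, expected in ground_truth.items()
        if extracted.get(name) == expected  # exact match only, no partial credit
    )
    return matches / len(ground_truth)


# Hypothetical human-reviewed values and OCR output for one check.
truth = {"date": "2025-01-14", "payee": "Acme Supply", "amount": "1204.50"}
ocr = {"date": "2025-01-14", "payee": "Acme Supply", "amount": "1204.80"}

print(f"{field_accuracy(ocr, truth):.1%}")  # 2 of 3 fields match -> 66.7%
```

A single misread digit in the amount makes the whole field count as wrong, which is why the scanned and image columns sit a few points below the digital PDF column.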

2. Image Quality Requirements

Zera OCR performs best with clean, high-resolution inputs. These are the minimum and recommended specifications for each input type.

Parameter	Minimum	Recommended	Notes
Resolution (DPI)	150 DPI	200–300 DPI	Below 150 DPI: accuracy drops sharply, especially for MICR and small print
Color mode	Grayscale	Grayscale or color	Black-and-white bitonal acceptable for standard print; color helps logo/header detection
File size (per page)	No minimum	50 KB–3 MB per page	Files over 10 MB per page are auto-compressed before OCR
Page rotation	Any angle	0° (upright)	Auto-deskew corrects up to ±15° rotation; extreme angles reduce accuracy
Contrast ratio	Low acceptable	High contrast	Low contrast (fax quality) processed but with lower field accuracy
JPG compression	Quality 60+	Quality 85+	Heavy compression artifacts degrade character edges; use PNG for critical docs
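As a rough pre-upload check, the resolution and JPG thresholds in the table can be encoded as below. The function name and status labels are hypothetical. Two behaviours are assumptions rather than documented facts: treating anything above 300 DPI as "recommended", and treating a JPG quality below 60 as a warning.

```python
def assess_scan(dpi, jpg_quality=None):
    """Classify a scanned page against the documented thresholds.

    Returns (status, notes), where status is one of
    "unsupported", "warning", "acceptable", "recommended".
    """
    if dpi < 100:
        return "unsupported", ["below 100 DPI"]

    notes = []
    if dpi < 150:
        status = "warning"
        notes.append("below the 150 DPI minimum; expect lower accuracy")
    elif dpi < 200:
        status = "acceptable"
    else:
        status = "recommended"  # assumption: no upper bound is enforced

    if jpg_quality is not None:
        if jpg_quality < 60:
            status = "warning"  # assumption: handling below quality 60 is not documented
            notes.append("JPG quality below 60; consider PNG")
        elif jpg_quality < 85 and status == "recommended":
            status = "acceptable"
            notes.append("JPG quality below the recommended 85")

    return status, notes


print(assess_scan(120))
print(assess_scan(300, jpg_quality=90))
```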

3. Input Format Handling

Zera OCR accepts all common document formats used in accounting workflows. No pre-processing or format conversion is required by the user.

Digital PDFs (text-based)

PDFs with embedded text are processed via direct text extraction, so no OCR is needed. Accuracy is 99.1–99.8% depending on document type. This is the fastest processing path, typically under 5 seconds per page.

Scanned PDFs (image-based)

PDFs containing scanned images trigger Zera OCR. Each page is rasterized, deskewed, and processed field-by-field. Accuracy 94–97% at 200 DPI.

JPG / JPEG Images

Direct image upload supported. Zera OCR reads EXIF orientation metadata and auto-rotates before processing. Recommended quality 85+ to prevent artifact-related errors.

PNG Images

Lossless PNG format ideal for screenshots and digital-native documents. Supports transparency (flattened to white before OCR). No quality degradation from compression.
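Flattening transparency to white is ordinary alpha compositing against a white background. The sketch below shows the per-pixel arithmetic for 8-bit RGBA values; it illustrates the standard formula and is not Zera's implementation.

```python
def flatten_to_white(pixel):
    """Composite one 8-bit RGBA pixel over a white background.

    out = (c * a + 255 * (255 - a)) / 255, rounded to the nearest integer.
    """
    r, g, b, a = pixel
    return tuple((c * a + 255 * (255 - a) + 127) // 255 for c in (r, g, b))


print(flatten_to_white((0, 0, 0, 0)))    # fully transparent -> (255, 255, 255)
print(flatten_to_white((0, 0, 0, 255)))  # opaque black stays (0, 0, 0)
print(flatten_to_white((0, 0, 0, 128)))  # half-transparent black -> (127, 127, 127)
```

Without this step, transparent regions can be decoded as black by some image readers, which would bury dark text.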

Password-Protected PDFs

Password provided at upload time is used for session-only decryption. Password is never stored, logged, or transmitted beyond the decryption step.

Multi-Page Documents

Multi-page PDFs processed page by page. Pages are stitched into a single transaction dataset after extraction. No page count limit on upload.
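Stitching can be pictured as concatenating per-page rows while recording which page each row came from. The row structure and the `page` key below are hypothetical; the actual Zera schema is not documented here.

```python
def stitch_pages(pages):
    """Merge per-page transaction rows into one list, tagging each row with its page."""
    dataset = []
    for page_number, rows in enumerate(pages, start=1):
        for row in rows:
            dataset.append({**row, "page": page_number})
    return dataset


# Hypothetical extraction results for a two-page statement.
pages = [
    [{"date": "2025-01-03", "amount": "-42.10"}],
    [{"date": "2025-01-07", "amount": "1500.00"}, {"date": "2025-01-09", "amount": "-18.75"}],
]
for row in stitch_pages(pages):
    print(row)
```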

4. OCR Processing Pipeline

Zera OCR does not simply run raw character recognition. It applies a financial document-specific pipeline designed to maximize accuracy on the fields that matter most to accountants.

95%+ overall OCR accuracy (scanned docs)

97%+ MICR line accuracy on checks at 200 DPI

<30s average processing time per document

±15° maximum auto-deskew correction angle

Pipeline Steps

Step	Process	Purpose
1. Ingestion	Format detection, password decryption, rasterization	Normalize input to image form for OCR
2. Pre-processing	Deskew, denoising, contrast normalization	Maximize character edge clarity
3. Layout Analysis	Region detection: header, table body, footer, MICR zone	Apply appropriate OCR model per region
4. Field OCR	Character recognition per field with confidence scoring	Extract and score each individual field
5. Arithmetic Validation	Running totals, balance checks, line item sums verified	Catch transposition errors and OCR misreads
6. Output Assembly	Fields merged with extraction schema, low-confidence flags applied	Produce clean, structured output for export
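The format detection in step 1 is commonly done by inspecting a file's leading bytes rather than trusting its extension. The sketch below uses the well-known PDF, PNG, and JPEG signatures; whether Zera detects formats this way is an assumption, and `detect_format` is a hypothetical name.

```python
# Leading-byte signatures for the three accepted input formats.
SIGNATURES = [
    (b"%PDF-", "pdf"),
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"\xff\xd8\xff", "jpeg"),
]


def detect_format(header: bytes) -> str:
    """Return "pdf", "png", "jpeg", or "unknown" from a file's first bytes."""
    for signature, name in SIGNATURES:
        if header.startswith(signature):
            return name
    return "unknown"


print(detect_format(b"%PDF-1.7\n"))
print(detect_format(b"\xff\xd8\xff\xe0\x00\x10JFIF"))
print(detect_format(b"GIF89a"))
```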

5. Handling Low-Quality Inputs

Not all documents arrive in ideal condition. Zera OCR applies fallback strategies for degraded inputs rather than failing silently.

Below 150 DPI

The document is processed with a warning flag. Accuracy degrades to 82–87%. Fields with a confidence score below 70 are highlighted for manual review in the export.
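The review threshold can be applied per field as in the sketch below. It assumes confidence is reported on a 0–100 scale; the field structure and the `needs_review` key are hypothetical.

```python
REVIEW_THRESHOLD = 70  # fields scoring below this are highlighted for manual review


def flag_fields(fields):
    """Return copies of the fields with a needs_review marker added."""
    return [
        {**field, "needs_review": field["confidence"] < REVIEW_THRESHOLD}
        for field in fields
    ]


# Hypothetical output for a 120 DPI check scan.
fields = [
    {"name": "routing_number", "value": "021000021", "confidence": 64},
    {"name": "amount", "value": "1204.50", "confidence": 93},
]
for field in flag_fields(fields):
    print(field["name"], field["needs_review"])
```

The extracted value is kept either way; the flag only directs the reviewer's attention.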

High Skew / Rotation

Auto-deskew corrects up to ±15°. Pages beyond this range are flagged. Users can manually rotate pages before reprocessing via the document preview tool.

Low Contrast / Fax Quality

Contrast normalization applied. For extreme cases (thermally faded, water-damaged), a "manual review required" flag is applied to affected pages.

Arithmetic Validation Failures

If the extracted total doesn't match the sum of the line items, the discrepancy is reported as an extraction warning. The raw OCR values are still exported so users can identify the error.
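A minimal sketch of this check follows, using `Decimal` to avoid floating-point drift on currency values. The function name and the shape of the returned record are hypothetical.

```python
from decimal import Decimal


def validate_total(line_items, stated_total):
    """Compare the sum of line items with the stated total.

    Raw values are always returned; a mismatch only adds a warning.
    """
    computed = sum((Decimal(item) for item in line_items), Decimal("0"))
    stated = Decimal(stated_total)
    warnings = []
    if computed != stated:
        warnings.append(
            f"total mismatch: line items sum to {computed}, document states {stated}"
        )
    return {"line_items": list(line_items), "total": stated_total, "warnings": warnings}


# "305.50" is a transposition of the correct total "350.50".
result = validate_total(["100.00", "250.50"], "305.50")
print(result["warnings"])
```

A transposed pair of digits like this one is easy to miss by eye but cannot survive the sum check.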

Frequently Asked Questions

What is the minimum DPI required for Zera OCR?