1. Fields Extracted from Bank Statements

Zera Books extracts every meaningful data point from a bank statement — not just the transaction rows. This includes metadata about the statement period, account identifiers, and opening/closing balances that are essential for proper reconciliation.

| Field | Description | Notes |
|---|---|---|
| Transaction Date | Date the transaction posted to the account | Normalized to YYYY-MM-DD regardless of source format (MM/DD/YYYY, DD-MMM-YYYY, etc.) |
| Description | Full merchant name or transaction reference | Multi-line descriptions merged; bank reference codes preserved |
| Debit Amount | Money leaving the account | Standardized to positive values; parenthetical negatives handled |
| Credit Amount | Money entering the account | Separate column from debits in all output formats |
| Running Balance | Account balance after each transaction | Extracted where present; calculated where not shown |
| Account Number | Full or masked account number | Extracted from header; matched to correct transactions in multi-account PDFs |
| Statement Period | Start and end dates of the statement | Extracted from header; used for duplicate detection across overlapping periods |
| Opening Balance | Balance at period start | Used to validate extraction completeness |
| Closing Balance | Balance at period end | Cross-checked against sum of transactions + opening balance |
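
The normalization and cross-check rules in the table can be illustrated with a minimal sketch. This is not Zera Books' implementation: the function names, the `DATE_FORMATS` list, and the `debit`/`credit` dictionary keys are all hypothetical, and a slash-separated date is assumed to be US-style MM/DD/YYYY.

```python
from datetime import datetime
from decimal import Decimal

# Hypothetical list of source formats to try; a real extractor would need more.
# Note: %b (month abbreviation) depends on the locale; English is assumed here.
DATE_FORMATS = ("%m/%d/%Y", "%d-%b-%Y", "%Y-%m-%d")


def normalize_date(raw: str) -> str:
    """Return the date as YYYY-MM-DD, trying each known source format in turn."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date format: {raw!r}")


def parse_amount(raw: str) -> Decimal:
    """Parse '1,234.50', '$1,234.50', '-1,234.50' or '(1,234.50)' to a positive Decimal."""
    cleaned = raw.strip().replace("$", "").replace(",", "")
    if cleaned.startswith("(") and cleaned.endswith(")"):
        cleaned = cleaned[1:-1]  # parenthetical negative: drop the parentheses
    return abs(Decimal(cleaned))


def closing_balance_matches(opening: Decimal, closing: Decimal, transactions) -> bool:
    """Check that opening + credits - debits equals the stated closing balance."""
    credits = sum((t["credit"] for t in transactions), Decimal("0"))
    debits = sum((t["debit"] for t in transactions), Decimal("0"))
    return opening + credits - debits == closing
```

Using `Decimal` rather than `float` keeps the balance comparison exact to the cent. If the check fails, at least one amount was probably missed or misread during extraction.
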
2. Format and Input Handling

Bank statement PDFs come in two fundamentally different forms: digital (text-layer) and scanned (image-only). Each requires a different processing approach. Zera Books handles both automatically — it detects the input type and routes accordingly without requiring manual flags.
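
One common way to tell the two forms apart is to check how much text the PDF's text layer yields per page. The sketch below is a generic heuristic, not a description of how Zera Books detects input type. It assumes the per-page text has already been pulled out by some PDF library, and the function name and both thresholds are hypothetical.

```python
def classify_pdf(page_texts: list[str], min_chars: int = 20, min_ratio: float = 0.5) -> str:
    """Return 'digital' if enough pages carry a usable text layer, else 'scanned'.

    page_texts holds the text extracted from each page's text layer;
    image-only pages typically yield an empty or near-empty string.
    """
    if not page_texts:
        return "scanned"
    pages_with_text = sum(1 for text in page_texts if len(text.strip()) >= min_chars)
    return "digital" if pages_with_text / len(page_texts) >= min_ratio else "scanned"
```

A result of `"scanned"` would send the file through OCR, while `"digital"` would allow the text layer to be parsed directly.
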

Digital PDFs (Text Layer)

PDFs with an embedded text layer are processed directly by Zera AI. No OCR step required. 99.6% field-level accuracy. Handles any column layout, any page structure.

Scanned PDFs and Images

Scanned statements are processed by Zera OCR at 95%+ accuracy. Handles blurry scans, low resolution, rotated pages, and skewed images that generic OCR engines fail on.

Password-Protected PDFs

Enter the password once during upload. The platform decrypts the file, processes it, and stores only the extracted data, not the raw PDF, for security compliance.

Multi-Page Statements

Statements spanning 50+ pages are processed as a single unit. Page breaks in the middle of transactions are handled correctly — no split transactions.
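
To illustrate what avoiding split transactions involves, here is a minimal sketch that joins rows across page boundaries. It rests on an assumption that is not from the source: a row with no date is treated as the continuation of the previous transaction's description. The function name and row keys are hypothetical.

```python
def stitch_pages(pages):
    """Flatten per-page rows into one list, merging continuation rows.

    pages is a list of pages, each a list of dicts with 'date',
    'description' and 'amount' keys. A row with an empty date is assumed
    to continue the description of the transaction before it, even when
    that transaction started on the previous page.
    """
    merged = []
    for page in pages:
        for row in page:
            if not row.get("date") and merged:
                merged[-1]["description"] += " " + row["description"].strip()
            else:
                # Copy the row so the caller's input is not mutated.
                merged.append({**row, "description": row["description"].strip()})
    return merged
```

Because the rows are processed as one continuous sequence, a description that starts at the bottom of one page and ends at the top of the next comes out as a single transaction.
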

Multi-account PDFs: A single PDF containing checking, savings, and credit card data is detected and split into separate output files automatically. See multi-account detection for how the boundary detection works.
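
Once each transaction has been matched to an account, the split itself is a simple grouping step. The sketch below covers only that final step and assumes every row already carries a hypothetical `account_number` key. It does not attempt the boundary detection, which is not described here.

```python
from collections import defaultdict


def split_by_account(transactions):
    """Group transaction dicts by their 'account_number' value.

    Returns a dict mapping each account number to its transactions,
    preserving the original order within each account.
    """
    groups = defaultdict(list)
    for txn in transactions:
        groups[txn["account_number"]].append(txn)
    return dict(groups)
```

Each value in the returned dictionary could then be written to its own output file.
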

Any bank. Any format. No templates.

Upload a PDF from any institution and get clean, categorized transactions ready for QuickBooks or Xero in under 60 seconds.

3. Output Formats and Accounting Software Compatibility

The output format determines how much work you do after conversion. A generic CSV requires column mapping in your accounting software. A QBO file imports directly with no configuration. Zera Books supports both, plus pre-formatted exports for every major platform.

| Format | Best For | What's Included |
|---|---|---|
| Excel (XLSX) | Firms that do their own import or review before uploading | All extracted fields, AI category column, confidence scores, standardized dates |
| CSV | Custom imports, data analysis, internal accounting systems | UTF-8 encoded, standardized field names, configurable delimiters |
| QBO | QuickBooks Online direct import | OFX-based format, bank ID mapped, transactions dated correctly |
| IIF | QuickBooks Desktop | Account type, split transactions, vendor names pre-mapped |
| Xero-formatted CSV | Xero bank feed import | Reference numbers, descriptions, amounts in Xero column order |
| Sage / Wave / Zoho / NetSuite | Those specific platforms | Each export matches the exact import spec of the target platform |
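
As an illustration of the generic CSV row, the sketch below writes transactions with fixed field names and a configurable delimiter using Python's standard `csv` module. The field names are hypothetical and are not Zera Books' actual column headers. The platform-specific formats (QBO, IIF, Xero) each follow their own import specification and are not covered here.

```python
import csv
import io

# Hypothetical standardized field names, chosen for illustration only.
FIELDNAMES = ["date", "description", "debit", "credit", "balance"]


def to_csv(transactions, delimiter: str = ",") -> str:
    """Serialize transaction dicts to CSV text with a header row.

    Keys not listed in FIELDNAMES are ignored; missing keys become empty cells.
    """
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=FIELDNAMES,
        delimiter=delimiter,
        extrasaction="ignore",
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(transactions)
    return buffer.getvalue()
```

To produce a UTF-8 file, the returned text can be written with `open(path, "w", encoding="utf-8", newline="")`. Descriptions that contain the delimiter are quoted automatically by the `csv` module.
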
4. Accuracy Benchmarks

Accuracy claims without methodology are meaningless. Here's what the 99.6% figure means and where it comes from:

Field-Level Accuracy — How It's Measured

Field-level accuracy (digital PDFs): 99.6%
OCR accuracy (scanned PDFs): 95%+
Training statements: 2.8M
Transactions processed in training: 847M

What 99.6% means: Field-level accuracy compares each individual extracted value (date, description, amount, and so on) against the source document. At 99.6%, a statement with 100 transactions and 400 field values (four fields per transaction) contains 1.6 errors on average. In practice, amount fields are more accurate than description fields because amounts are numerically validated against the opening and closing balances.
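
The arithmetic behind the 1.6 figure is the number of field values multiplied by the error rate: 400 × (1 − 0.996) = 1.6. The short sketch below reproduces it; the function name is hypothetical, and the accuracy is passed as a string so that `Decimal` avoids floating-point rounding.

```python
from decimal import Decimal


def expected_errors(transactions: int, fields_per_transaction: int, accuracy: str) -> Decimal:
    """Average number of wrong field values at a given field-level accuracy.

    accuracy is a decimal string such as "0.996".
    """
    total_fields = transactions * fields_per_transaction
    return total_fields * (Decimal("1") - Decimal(accuracy))
```

Applied to the same 100-transaction statement at the 95% floor quoted for scanned PDFs, the same formula gives 400 × 0.05 = 20 field errors, which is why reviewing OCR output matters more than reviewing digital extractions.
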

5. Frequently Asked Questions

What fields does Zera Books extract from bank statements?

Date, description, debit, credit, running balance, account number, account holder name, opening balance, closing balance, and statement period. Multi-line descriptions are merged correctly. All date formats are normalized.

Does Zera Books handle scanned bank statement PDFs?

Yes. Zera OCR handles scanned PDFs at 95%+ accuracy. It processes blurry scans, rotated pages, and low-resolution images. Scanned statements are automatically detected and routed through the OCR pipeline.

Can Zera Books process statements from any bank?

Yes. Zera AI processes any bank format without template setup. It was trained on 2.8 million statements from hundreds of institutions and adapts to format changes automatically. See Zera AI reference for model details.

What output formats are available?

Excel, CSV, QBO, IIF, and pre-formatted exports for Xero, Sage, Wave, Zoho, NetSuite, FreshBooks, MYOB, and Oracle. All exports include AI-categorized transactions. See zerabooks.com/products/bank-statements for full details.