Which model should I start with?

Start with standard-v2 for most invoices, receipts, forms, and IDs. Move up to pro-v2 for harder pages, stricter review tolerance, or queues where the cost of a miss is higher.
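The rule of thumb above can be written down as a small routing helper. This is a minimal sketch: the model names come from this page, while `QueueProfile` and `chooseModel` are hypothetical names introduced only for illustration and are not part of the LeapOCR SDK.

```typescript
// Model names are taken from the page; everything else here is an assumption.
type Model = "standard-v2" | "pro-v2";

// Hypothetical description of a document queue.
interface QueueProfile {
  hardPages: boolean;      // pages are harder than typical invoices or receipts
  strictReview: boolean;   // reviewers tolerate very few errors
  highCostOfMiss: boolean; // a wrong field is expensive downstream
}

// Default to standard-v2 and move up to pro-v2 when any risk flag is set.
function chooseModel(profile: QueueProfile): Model {
  const needsPro =
    profile.hardPages || profile.strictReview || profile.highCostOfMiss;
  return needsPro ? "pro-v2" : "standard-v2";
}

// Usage: a routine invoice queue stays on the default model.
const routineQueue = chooseModel({
  hardPages: false,
  strictReview: false,
  highCostOfMiss: false,
}); // "standard-v2"
```

Keeping the decision in one function makes it easy to tighten later, for example once real review data shows which queues need pro-v2.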

Do I need a schema before I start?

No. Many teams begin with markdown for review or exploration, then move to structured JSON once the target schema is clear.

Can I use markdown and structured JSON in the same workflow?

Yes. Teams often start with markdown for review, QA, or exception handling, then return structured JSON for the records that need to land in another system.
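One way to picture a mixed workflow is two sets of request options built from the same inputs. This is a sketch under assumptions: `format: "structured"` appears in the SDK snippets on this page, but the `"markdown"` value, the `RequestOptions` shape, and `buildRequests` are assumed for illustration and should be checked against the docs.

```typescript
// "structured" is shown on the page; "markdown" as a format value is an assumption.
type Format = "markdown" | "structured";

// Hypothetical options shape, modeled on the SDK snippets on this page.
interface RequestOptions {
  format: Format;
  model: "standard-v2" | "pro-v2";
  schema?: Record<string, unknown>;
}

// Build one reviewer-facing request and one system-facing request.
function buildRequests(schema: Record<string, unknown>): {
  review: RequestOptions;
  record: RequestOptions;
} {
  return {
    review: { format: "markdown", model: "standard-v2" },
    record: { format: "structured", model: "standard-v2", schema },
  };
}

// Usage: the schema only travels with the structured request.
const requests = buildRequests({ type: "object" });
```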

What happens on harder or lower-confidence pages?

Hard pages do not need to pretend they are clean. You can escalate them to pro-v2, attach bbox for review, or route them into an explicit review path instead of quietly trusting a weak record.
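The escalation path described above can be sketched as a small routing function. Note the assumptions: this page does not document a confidence score, so the `confidence` field, the `0.9` threshold, and the names `PageOutcome` and `routePage` are all hypothetical.

```typescript
type Route = "accept" | "escalate-to-pro-v2" | "manual-review";

// Hypothetical per-page outcome; the confidence field is an assumption.
interface PageOutcome {
  model: "standard-v2" | "pro-v2";
  confidence: number; // assumed score between 0 and 1
}

// Accept strong pages, retry weak standard-v2 pages on pro-v2,
// and send pages that are still weak on pro-v2 to a person.
function routePage(page: PageOutcome, threshold = 0.9): Route {
  if (page.confidence >= threshold) return "accept";
  return page.model === "standard-v2" ? "escalate-to-pro-v2" : "manual-review";
}

// Usage: a weak standard-v2 page is escalated rather than trusted.
const weakPageRoute = routePage({ model: "standard-v2", confidence: 0.5 });
// "escalate-to-pro-v2"
```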

Do you support multilingual documents and mixed file types?

Yes. LeapOCR supports 100+ file formats plus multilingual paperwork, so PDFs, scans, photos, office docs, and mixed queues can come through the same intake path.

What is the fastest way to talk through a workflow?

Use the scheduler for a technical walkthrough, or send a sample use case to hello@leapocr.com if you prefer to start asynchronously.

V2 released

Schema-first OCR for messy, real-world documents

LeapOCR turns PDFs, scans, photos, and multilingual paperwork into markdown for review or JSON that already fits your downstream schema.

Start free with 100 credits

Document types teams already handle in one OCR flow

Invoices

Receipts

Driver IDs

Forms

Bills of lading

Archive scans

Multilingual docs

Markdown + JSON

Workflow

From intake to handoff that fits the next system.

LeapOCR keeps document intake simple, then shapes and validates the result before it lands in finance, logistics, compliance, or internal ops tools.

Send the page once. Get back a usable record.

The goal is not raw text. It is output a person can review or a system can accept without another cleanup layer.

See digitization

sdk.ts

TypeScript SDK

const job = await leapocr.ocr.processFile("./invoice.pdf", {
  format: "structured",
  schema: invoiceSchema,
  instructions: "Multiply all monetary fields by 100",
});
const result = await leapocr.ocr.getJobResult(job.jobId);
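The instructions string in the snippet above asks for monetary fields to be multiplied by 100, which reads like a request for integer minor units (cents). A minimal sketch of the same conversion done on the client, assuming a two-decimal currency such as USD; `toMinorUnits` is a hypothetical helper and not part of the SDK.

```typescript
// Convert a decimal amount to integer minor units (for example, cents).
// Math.round absorbs small floating-point errors such as 0.29 * 100.
function toMinorUnits(amount: number): number {
  return Math.round(amount * 100);
}

// Usage with the amount due from the sample invoice on this page.
const amountDueCents = toMinorUnits(610.0); // 61000
```

Integer minor units avoid floating-point drift when the downstream system sums or compares amounts.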

Ingest mixed documents through one lane

Upload PDFs, scans, photos, and batches without sorting the queue by template before OCR even starts.

invoice.md

# Invoice INV-100

> **Vendor** · MICROSOFT CORPORATION
> **Bill to** · CONTOSO LTD · `CID-12345`

| | |
| --- | --- |
| **Issued** | 2019-11-15 |
| **Due** | 2019-12-15 |
| **Terms** | Net 30 |
| **Currency** | USD |

## Line items

| Description | Qty | Unit price | Line total |
| --- | --: | --: | --: |
| Professional services — Sprint 4 | 12 | $95.00 | $1,140.00 |
| Platform support (tier 2) | 1 | $250.00 | $250.00 |

## Totals

- Subtotal · **$1,390.00**
- Sales tax (10%) · **$139.00**
- Credits applied · **($919.00)**

### Amount due

**$610.00 USD** — *$1,529.00 − $919.00 credits* — payable by due date.
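The totals in the sample invoice above are internally consistent, and that kind of check is easy to automate before a record moves on. A minimal sketch, using the figures from the sample; the `InvoiceTotals` shape and `checkTotals` are hypothetical and do not reflect LeapOCR's output format.

```typescript
// Hypothetical shape for the totals block of an extracted invoice.
interface InvoiceTotals {
  lineTotals: number[]; // line totals in currency units
  taxRate: number;      // e.g. 0.1 for 10%
  credits: number;      // credits applied, as a positive amount
  amountDue: number;
}

// Check that subtotal + tax - credits equals the amount due.
// Arithmetic is done in integer cents to avoid floating-point drift.
function checkTotals(t: InvoiceTotals): boolean {
  const cents = (n: number): number => Math.round(n * 100);
  const subtotal = t.lineTotals.reduce((sum, n) => sum + cents(n), 0);
  const tax = Math.round(subtotal * t.taxRate);
  return subtotal + tax - cents(t.credits) === cents(t.amountDue);
}

// Usage with the sample invoice: 1,140.00 + 250.00, 10% tax, 919.00 credits.
const sampleIsConsistent = checkTotals({
  lineTotals: [1140.0, 250.0],
  taxRate: 0.1,
  credits: 919.0,
  amountDue: 610.0,
}); // true
```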

Shape the output before it breaks downstream

Send markdown to reviewers or schema-fit JSON to systems, with exceptions surfaced explicitly instead of hidden in the payload.

Fit

Built for real document ops. Not demo OCR.

LeapOCR is built for receipts, invoices, IDs, forms, and multilingual paperwork that still need to pass downstream checks after extraction.

For teams that cannot afford brittle OCR

When capture quality, layout, and language vary, the output still has to be dependable enough for the next workflow step.

Read docs

Stable enough for production document programs

Use one OCR stack across invoices, forms, IDs, and archive imports while keeping the output contract stable where it matters.

🇺🇸EN

🇲🇽ES

🇧🇷PT

🇫🇷FR

🇩🇪DE

🇷🇺RU

🇦🇪AR

🇮🇳HI

🇨🇳ZH

🇯🇵JA

Useful across multilingual and cross-border paperwork

Keep labels, totals, and document structure coherent even when the page language or layout shifts across vendors and regions.

SDK

Pick the client. See the implementation.

The same OCR flow, shown through four real integration surfaces. Switch languages at the top and inspect the exact request shape you would wire into production.

For teams that ship on day one.

One API surface, four client shapes. Pick the one that fits your stack and hand off a completed record.

Job model

Async by default

Submit work once, poll or wait, then hand off a completed record.

Output shape

Schema-first

Return structured data without inventing another translation layer.

Surface area

SDK + raw HTTP

Use the official client where it helps, or stay close to the API.
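For teams that stay close to the API, the "poll or wait" half of the async job model can be sketched as a generic loop. This is an illustration under assumptions: the status names, `backoffDelays`, and `pollUntilDone` are hypothetical, and the status lookup and sleep function are passed in so the sketch does not depend on any particular client or runtime.

```typescript
// Hypothetical job states; the real API may use different names.
type JobStatus = "pending" | "processing" | "completed" | "failed";

// Delay before each retry: doubles from baseMs and is capped at maxMs.
function backoffDelays(attempts: number, baseMs: number, maxMs: number): number[] {
  const delays: number[] = [];
  for (let i = 0; i < attempts; i++) {
    delays.push(Math.min(baseMs * 2 ** i, maxMs));
  }
  return delays;
}

// Poll until the job reaches a terminal state or the delay budget runs out.
async function pollUntilDone(
  getStatus: () => Promise<JobStatus>,
  sleep: (ms: number) => Promise<void>,
  delays: number[],
): Promise<JobStatus> {
  for (const delay of delays) {
    const status = await getStatus();
    if (status === "completed" || status === "failed") return status;
    await sleep(delay);
  }
  throw new Error("Job did not finish within the polling budget");
}

// Usage: four attempts starting at 500 ms, capped at 2 s.
const schedule = backoffDelays(4, 500, 2000); // [500, 1000, 2000, 2000]
```

The SDK snippet on this page uses `waitUntilDone` for the same purpose, so a hand-rolled loop like this is mainly relevant for raw HTTP integrations.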

Read docs

Active file

invoice.ts

import { LeapOCR } from "leapocr";
import { z } from "zod";
import { zodToJsonSchema } from "zod-to-json-schema";

// Keep auth in one place: read the API key from the environment
const client = new LeapOCR({
  apiKey: process.env.LEAPOCR_API_KEY,
});

// Define your schema with Zod
const invoiceSchema = z.object({
  invoice_number: z.string(),
  invoice_total: z.number(),
});

// Submit the document URL with model, format, and schema choices
const job = await client.ocr.processURL("https://example.com/document.pdf", {
  format: "structured",
  model: "standard-v2",
  schema: zodToJsonSchema(invoiceSchema),
});

// Wait for completion, then hand the result to the next system
const result = await client.ocr.waitUntilDone(job.jobId);
console.log(result.data);

Step 01

Initialize the client and keep auth in one obvious place.

Step 02

Submit the page URL with model, format, and schema choices.

Step 03

Wait for completion and hand the result to the next system.
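Before the handoff in Step 03, it can help to confirm that the returned record really has the expected fields. A minimal sketch with no dependencies, assuming the result carries the two fields from the Zod schema in the snippet above; `InvoiceRecord` and `isInvoiceRecord` are hypothetical names. In a project that already uses Zod, the schema's own `safeParse` would do the same job.

```typescript
// Mirrors the two fields of the Zod schema shown on this page.
interface InvoiceRecord {
  invoice_number: string;
  invoice_total: number;
}

// Type guard: true only when both fields exist with the expected types.
function isInvoiceRecord(data: unknown): data is InvoiceRecord {
  if (typeof data !== "object" || data === null) return false;
  const rec = data as Record<string, unknown>;
  const invoiceNumber = rec["invoice_number"];
  const invoiceTotal = rec["invoice_total"];
  return (
    typeof invoiceNumber === "string" &&
    typeof invoiceTotal === "number" &&
    isFinite(invoiceTotal)
  );
}

// Usage: only well-formed records continue to the next system.
const okToHandOff = isInvoiceRecord({
  invoice_number: "INV-100",
  invoice_total: 610,
}); // true
```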

Bring in the queue you already have.

LeapOCR handles 100+ file formats, multilingual documents, webhook delivery, and rollout support for production teams.

Compare models

Coverage

Formats and language support

100+ file types

File intake

PDFs, images, spreadsheets, office docs, scans, and mixed uploads in one queue.

Language range

Multilingual invoices, forms, IDs, and logistics documents without separate routing.

Take the formats and languages already in the queue

Process PDFs, scans, photos, office docs, and multilingual paperwork without pre-cleaning everything upstream.

Deliver results into your stack with real support

Return results by API or webhook, then work with the LeapOCR team on edge cases, rollout details, and production tuning.

Setup

Set the input. Set the output. Stay in control.

Set up a document workflow without turning OCR into another cleanup project.

Upload

Send a file or URL, choose output settings, and run OCR

OCR ready

Source: Direct upload

Direct Upload

URL Upload

Drop invoice.pdf or choose a file

PDF, PNG, JPG, TIFF up to the upload limit

Configuration (required)

Choose the source first

Then set the model and result format for this request.

Next: Model, format

Request

Choose a file or URL to start

Source: URL upload

Document URL

https://storage.example.com/invoices/march/invoice.pdf

LeapOCR fetches the remote file and processes it the same way as a direct upload.

Configuration (required)

Model

standard-v2

Result format

structured

Processing instructions (optional)

Custom schema (optional)

invoice-schema.json

{
  "invoice_number": "string",
  "invoice_date": "date",
  "total": "number"
}

Request

URL upload with model, format, instructions, and schema ready
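The custom schema in the upload form above uses a field-to-type shorthand, while the SDK example on this page passes a JSON Schema produced by `zodToJsonSchema`. A minimal sketch of converting one into the other, assuming only the three type names shown in the form; `toJsonSchema` is a hypothetical helper, and whether the API accepts the shorthand directly is not stated here.

```typescript
// Type names that appear in invoice-schema.json on this page.
type ShorthandType = "string" | "number" | "date";

interface JsonSchemaObject {
  type: "object";
  properties: Record<string, { type: string; format?: string }>;
  required: string[];
}

// Expand the shorthand into a JSON Schema object.
// "date" becomes a string with the standard "date" format annotation.
function toJsonSchema(fields: Record<string, ShorthandType>): JsonSchemaObject {
  const properties: Record<string, { type: string; format?: string }> = {};
  const names = Object.keys(fields);
  for (const name of names) {
    const kind = fields[name];
    properties[name] =
      kind === "date"
        ? { type: "string", format: "date" }
        : { type: String(kind) };
  }
  return { type: "object", properties, required: names };
}

// Usage with the schema from the form above.
const invoiceJsonSchema = toJsonSchema({
  invoice_number: "string",
  invoice_date: "date",
  total: "number",
});
```

Marking every field as required is a design choice for this sketch; relax it if some fields are legitimately absent on real documents.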

Step 1

Choose the document source

Start with file upload, a URL, or an API request from the queue you actually care about.

Step 2

Choose the output contract

Return markdown or structured JSON, then add instructions or bbox only where the downstream workflow benefits from them.

Step 3

Route harder pages deliberately

Escalate difficult pages to pro-v2 or explicit review instead of quietly posting a record your team cannot trust.

Operator feedback

Accuracy should show up before the workflow gets complicated.

The fastest way to build trust is to perform well on the first pass, before templates, exceptions, and custom handling enter the picture.

Customer testimonial

Strong results, even before template setup.

That matters for teams evaluating a new OCR workflow. If the base extraction is already reliable, rollout gets easier and confidence builds faster.

Accurate on first pass · No templates required · Better early confidence

Try a real document

What stood out in early product testing

“The tool seems very accurate even without using templates, which is quite interesting.”

Samuel

Progi, progi.com

Pricing

Subscription plans for teams with ongoing document workflows

Eligible paid plans include a 3-day trial with 100 credits after you add a payment method. Then move into the subscription that fits your ongoing document volume. For one-time volume or enterprise pricing, see the full pricing page.

Introductory trial

3-day trial, 100 credits

Available after you add a payment method on eligible plans. Visit the full pricing page for trial details, top-up packs, and enterprise pricing.

View full pricing

Lite

Yearly-only entry plan for smaller production workloads

Yearly only

A low-commitment annual tier for teams that have moved past evaluation and need a clean paid starting point.

$9 / month

Billed annually at $99 / year

15,000 credits / year

1 workspace with 1 team
Built for 1 workspace member
Best for small production automations

Choose Lite

Starter

For teams launching real document workflows

The first recurring tier for production pipelines that need a stable monthly credit budget and room to grow.

$45 / month

Billed annually at $539 / year

90,000 credits / year

Annual billing includes one month at no additional cost

1 workspace with 1 team
Built for 1 workspace member
Works well with one-time top-ups for bursts

Choose Starter

Growth

For steady OCR throughput in production

Most used

A balanced recurring tier for document-heavy apps and operations teams that need more room without jumping to enterprise-style volume.

$137 / month

Billed annually at $1,639 / year

270,000 credits / year

Annual billing includes one month at no additional cost

Up to 15 workspaces
Up to 15 teams in each workspace
Up to 5 workspace members

Choose Growth

Scale

For larger recurring document programs

Built for heavier throughput, cleaner annual budgeting, and teams that want a recurring pool sized for serious document automation.

$366 / month

Billed annually at $4,389 / year

720,000 credits / year

Annual billing includes one month at no additional cost

Up to 50 workspaces
Up to 50 teams in each workspace
Up to 50 workspace members

Choose Scale
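To compare the four plans on equal terms, the listed figures can be reduced to a monthly equivalent and a cost per 1,000 credits. The sketch below uses only the annual prices and credit counts listed above. The monthly figures shown on the cards match the annual price divided by 12 and rounded up to the next dollar; that rounding rule is inferred from the numbers, not stated by LeapOCR, and the helper names are hypothetical.

```typescript
interface Plan {
  name: string;
  annualUsd: number;
  creditsPerYear: number;
}

// Figures copied from the plan cards above.
const plans: Plan[] = [
  { name: "Lite", annualUsd: 99, creditsPerYear: 15000 },
  { name: "Starter", annualUsd: 539, creditsPerYear: 90000 },
  { name: "Growth", annualUsd: 1639, creditsPerYear: 270000 },
  { name: "Scale", annualUsd: 4389, creditsPerYear: 720000 },
];

// Annual price spread over 12 months, rounded up to a whole dollar.
function monthlyEquivalent(plan: Plan): number {
  return Math.ceil(plan.annualUsd / 12);
}

// Cost in USD per 1,000 credits, rounded to the nearest cent.
function usdPerThousandCredits(plan: Plan): number {
  return Math.round((plan.annualUsd / plan.creditsPerYear) * 1000 * 100) / 100;
}

// Monthly equivalents for Lite, Starter, Growth, Scale.
const monthly = plans.map(monthlyEquivalent); // [9, 45, 137, 366]
```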

Need more than recurring credits?

Top-up packs and enterprise pricing live on the dedicated pricing page

Keep the homepage focused on subscriptions. Use the full pricing page for burst packs, surcharge details, and sales-led enterprise conversations.

Open pricing

Frequently asked questions

Short answers to the questions teams ask before they put OCR into a production workflow.

Run a real document through LeapOCR with a 3-day trial and 100 credits.

Available after you add a payment method on an eligible plan. Start with a page that matters and see whether the right handoff is markdown, schema-fit JSON, bbox, or custom instructions before you change the rest of the workflow.

Try a real document