Developer-first document extraction
OCR API

An OCR API that returns usable output, not raw text you still have to clean up.

LeapOCR starts from one upload surface, then lets you choose the result shape that fits the next step. Use markdown when a person or an LLM needs a clean document view. Use structured output when a system needs fields it can trust immediately.

Why teams use this

Start from file upload or remote URL ingestion in the same OCR workflow.
Choose markdown for readable handoff or structured output for schema-fit extraction.
Add instructions, schema, templates, or bounding boxes only when the document actually needs them.

Upload experience

The parsing workflow stays simple on purpose: pick a source, choose the output shape, add optional guidance, and run.

Live request surface

Upload

Send a file or URL, choose output settings, and run OCR

Source: URL upload
Upload options: Direct Upload, URL Upload
Document URL: https://storage.example.com/invoices/march/invoice.pdf

LeapOCR fetches the remote file and processes it the same way as a direct upload.

Configuration (required)
Model: standard-v2
Result format: structured
Processing instructions (optional): left empty
Custom schema (optional): invoice-schema.json
{
  "invoice_number": "string",
  "invoice_date": "date",
  "total": "number"
}

Request

URL upload with model, format, instructions, and schema ready
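A minimal sketch of how the configuration shown in this panel could be assembled into a request body. The field names (`url`, `format`, `model`, `instructions`, `schema`) follow the request examples on this page. The helper name `build_url_upload_request` is hypothetical, the shorthand schema is copied from the panel as-is, and nothing is sent over the network because the base URL and authentication are not described here.

```python
import json
from typing import Any, Optional


def build_url_upload_request(
    url: str,
    result_format: str,
    model: str,
    instructions: Optional[str] = None,
    schema: Optional[dict[str, Any]] = None,
) -> dict[str, Any]:
    """Assemble a body for POST /ocr/uploads/url (hypothetical helper)."""
    body: dict[str, Any] = {"url": url, "format": result_format, "model": model}
    # Optional guidance is only included when the caller supplies it.
    if instructions:
        body["instructions"] = instructions
    if schema is not None:
        body["schema"] = schema
    return body


request_body = build_url_upload_request(
    url="https://storage.example.com/invoices/march/invoice.pdf",
    result_format="structured",
    model="standard-v2",
    # Shorthand schema as shown in the panel, not checked against the API.
    schema={"invoice_number": "string", "invoice_date": "date", "total": "number"},
)
print(json.dumps(request_body, indent=2))
```

Leaving `instructions` out of the body when it is empty keeps the request limited to the guidance the document actually needs.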

Why it works

What you actually get

The key decision is practical rather than technical: does the next consumer need a readable document, a structured object, or either one with review context attached?

Readable mode

Markdown that still feels like the source document

Markdown keeps headings, sections, tables, and line flow intact, which makes it useful for QA, handoff, and LLM context building.

System-facing mode

Structured output shaped for downstream software

Structured mode returns a stable object instead of page prose, which means less post-processing, fewer brittle parsers, and cleaner integrations.

Layout data

Add bbox only when geometry matters

Bounding boxes are exposed as a boolean enhancement. Keep them off for pure parsing and on for review tools, overlays, and human-in-the-loop queues.

What you control

Choose the mode, then add control

These are the decisions teams actually make when they turn OCR into a production workflow instead of a raw text dump.

markdown
Mode · readable output

Return page content for humans and LLMs

Use markdown when the consumer needs a coherent page representation with headings, tables, and sections that still read naturally after OCR.

structured
Mode · schema-fit JSON

Return a stable object instead of page prose

Use structured mode when another system wants fields, arrays, and nested objects it can validate and write directly without another parsing pass.

instructions
Optional · string

Steer normalization and extraction choices

Use inline instructions for light behavior changes. Move heavier rules into templates once the workflow needs more room or more reuse.

schema
Structured only · object

Define the shape the downstream system can trust

Inline structured jobs need a schema when you are not using a template. That is what keeps the result useful beyond a demo.
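The rule above can be checked on the client before a job is submitted. This is a sketch under stated assumptions: `check_job_options` is a hypothetical helper, the slug value in the usage lines is made up, and the service may enforce these rules differently or report them with other messages.

```python
from typing import Any


def check_job_options(options: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the options look consistent."""
    problems: list[str] = []
    fmt = options.get("format")
    has_template = bool(options.get("template_slug"))
    has_schema = isinstance(options.get("schema"), dict)

    # Structured output needs a shape from somewhere: inline schema or template.
    if fmt == "structured" and not has_template and not has_schema:
        problems.append("structured jobs need a schema or a template_slug")
    # The schema option is described as structured-only.
    if fmt == "markdown" and has_schema:
        problems.append("schema only applies to structured output")
    return problems


print(check_job_options({"format": "structured"}))
print(check_job_options({"template_slug": "invoices"}))  # hypothetical slug
```

Returning a list rather than raising on the first problem lets a caller show every configuration issue at once.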

template_slug
Optional selector · string

Reuse a saved workflow setup

This is the cleanest path for production. It moves prompt, schema, and model choices out of every request body and into one reusable config.

extract_bounding_boxes
Optional · boolean

Attach geometry without changing the output mode

Bbox is not its own parsing mode. It is a toggle that adds page coordinates when the consuming workflow needs visual grounding.
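As a small illustration of the toggle, the sketch below switches `extract_bounding_boxes` on for an existing request body and leaves `format` alone. The helper name `with_bounding_boxes` is hypothetical, and the shape of the returned coordinates is not shown on this page, so the example stops at the request.

```python
from typing import Any


def with_bounding_boxes(body: dict[str, Any], enabled: bool = True) -> dict[str, Any]:
    """Return a copy of a request body with the bbox toggle set."""
    updated = dict(body)  # shallow copy so the original request is untouched
    updated["extract_bounding_boxes"] = enabled
    return updated


markdown_job = {"url": "https://example.com/claim-form.pdf", "format": "markdown"}
review_job = with_bounding_boxes(markdown_job)
print(review_job["format"], review_job["extract_bounding_boxes"])  # prints: markdown True
```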

Examples

Two common output patterns

Most teams settle into one of two patterns: readable markdown for human or LLM handoff, or structured extraction for systems that need direct writes.

Readable mode

Markdown output for review, handoff, and LLM context

This is the cleanest route when the next consumer still needs to read the page, quote it, or pass it into another model with the layout preserved in text form.

No schema required for markdown mode.
Instructions stay optional and focused on cleanup or normalization.
Result shape is page-oriented rather than object-oriented.

POST /ocr/uploads/url
json
{
  "url": "https://example.com/claim-form.pdf",
  "file_name": "claim-form.pdf",
  "format": "markdown",
  "model": "standard-v1",
  "instructions": "Keep section headings and normalize dates.",
  "extract_bounding_boxes": false
}
Result excerpt
json
{
  "job_id": "job_01",
  "status": "completed",
  "result_format": "markdown",
  "pages": [
    {
      "page_number": 1,
      "result": "# Claim form\n\n## Policy holder\n- Name: ..."
    }
  ]
}
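Because the markdown result is page-oriented, a consumer usually has to stitch the pages back into one document before review or LLM use. A minimal sketch, assuming the response has already been decoded into a Python dict with the shape of the excerpt above. The helper name `pages_to_markdown` is hypothetical, and any status value other than `completed` is treated as not ready.

```python
from typing import Any


def pages_to_markdown(response: dict[str, Any]) -> str:
    """Join per-page markdown into one document, ordered by page number."""
    if response.get("status") != "completed":
        raise ValueError(f"job not finished: {response.get('status')!r}")
    if response.get("result_format") != "markdown":
        raise ValueError("expected a markdown result")
    pages = sorted(response["pages"], key=lambda page: page["page_number"])
    return "\n\n".join(page["result"] for page in pages)


# Sample copied from the result excerpt; the elided name stays elided.
sample = {
    "job_id": "job_01",
    "status": "completed",
    "result_format": "markdown",
    "pages": [
        {"page_number": 1, "result": "# Claim form\n\n## Policy holder\n- Name: ..."}
    ],
}
print(pages_to_markdown(sample))
```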
System-facing mode

Structured extraction with inline schema

This is the route for teams that need the API to return a stable object immediately instead of another layer of page text to parse.

If you do not pass `template_slug`, you need `format: structured` plus a schema.
The schema is what turns OCR into a stable object instead of a best-effort blob.
Bounding boxes can still be attached when review tooling needs field coordinates.

POST /ocr/uploads/direct
json
{
  "file_name": "purchase-order.pdf",
  "content_type": "application/pdf",
  "file_size": 918443,
  "format": "structured",
  "model": "standard-v1",
  "instructions": "Extract the order header and SKU table.",
  "schema": {
    "type": "object",
    "properties": {
      "po_number": { "type": "string" },
      "order_date": { "type": "string" },
      "items": {
        "type": "array",
        "items": {
          "type": "object",
          "properties": {
            "sku": { "type": "string" },
            "quantity": { "type": "number" }
          }
        }
      }
    }
  },
  "extract_bounding_boxes": true
}
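The three file metadata fields in the request above (`file_name`, `content_type`, `file_size`) can be derived from a local file instead of being typed by hand. A minimal sketch using only the Python standard library. The helper name `describe_local_file` is hypothetical, and since this page does not show how the file bytes themselves are sent, the sketch stops at the metadata.

```python
import mimetypes
import os
from typing import Any


def describe_local_file(path: str) -> dict[str, Any]:
    """Build the file metadata fields for a direct upload request."""
    # guess_type works from the file extension; it does not inspect contents.
    content_type, _encoding = mimetypes.guess_type(path)
    return {
        "file_name": os.path.basename(path),
        "content_type": content_type or "application/octet-stream",
        "file_size": os.path.getsize(path),  # size in bytes
    }
```

The result can be merged with the job options, for example `{**describe_local_file(path), "format": "structured", ...}`.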
Result excerpt
json
{
  "job_id": "job_02",
  "status": "completed",
  "result_format": "structured",
  "pages": [
    {
      "page_number": 1,
      "result": {
        "po_number": "PO-10441",
        "order_date": "2026-03-10",
        "items": [
          { "sku": "AX-44", "quantity": 12 }
        ]
      }
    }
  ]
}
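On the consuming side, a structured result like the excerpt above can be loaded into typed objects before it is written anywhere. This is a sketch, not part of the API: the `PurchaseOrder` and `LineItem` classes and `parse_purchase_order` are hypothetical names that mirror the example schema, and the response is assumed to be already decoded into a Python dict.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class LineItem:
    sku: str
    quantity: float


@dataclass
class PurchaseOrder:
    po_number: str
    order_date: str
    items: list[LineItem]


def parse_purchase_order(page_result: dict[str, Any]) -> PurchaseOrder:
    """Map one page's structured result onto typed objects."""
    items = [
        LineItem(sku=item["sku"], quantity=item["quantity"])
        for item in page_result.get("items", [])
    ]
    # A missing header field raises KeyError, which surfaces schema drift early.
    return PurchaseOrder(
        po_number=page_result["po_number"],
        order_date=page_result["order_date"],
        items=items,
    )


response = {
    "job_id": "job_02",
    "status": "completed",
    "result_format": "structured",
    "pages": [
        {
            "page_number": 1,
            "result": {
                "po_number": "PO-10441",
                "order_date": "2026-03-10",
                "items": [{"sku": "AX-44", "quantity": 12}],
            },
        }
    ],
}
order = parse_purchase_order(response["pages"][0]["result"])
print(order.po_number, order.items[0].sku, order.items[0].quantity)
```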

FAQ

Questions teams ask before wiring this up

Straight answers for teams evaluating how this workflow fits into production.

How do I choose between markdown and structured output?

Choose markdown when the result needs to stay readable. Choose structured when the next consumer is code, a queue, or a business system expecting named fields.

When should I bring templates into the parsing flow?

Use templates once a workflow repeats and the setup is stable. They let you keep model, instructions, schema, and review behavior in one reusable place.

Is bounding-box extraction a separate output mode?

No. It is an optional enhancement you can layer onto markdown or structured workflows when page geometry matters downstream.

Ready to test

Pick the output your workflow actually wants

Start with the output shape that matches the next step. When the workflow repeats, store the winning configuration as a template and stop rebuilding it.