How to Integrate LeapOCR in Your App: A Step-by-Step API + SDK Guide

A practical, copy-pasteable walkthrough for adding LeapOCR to your app using the JavaScript/TypeScript SDK – from installation to your first production-ready workflow.

You don’t need to be a machine learning expert to add AI-powered document processing to your product.

In this guide, we’ll walk through how to integrate LeapOCR into a Node.js or TypeScript app using the official JavaScript/TypeScript SDK, and get from “API key” to “structured data” as quickly (and safely) as possible.

This walkthrough focuses on the TypeScript SDK. If you want to use another language or call the HTTP API directly, check the LeapOCR docs for language-specific guides and reference material.

What We’ll Build (And Who This Is For)

By the end of this guide, you’ll have a small but production-minded integration that can:

  • Take a PDF or image (via URL or upload)
  • Send it to LeapOCR for processing
  • Wait for the job to finish
  • Return structured JSON that you can store, search, or push into other systems

This guide is aimed at product engineers and SaaS builders who are:

  • Comfortable with basic Node.js/TypeScript
  • Familiar with environment variables and API keys

We’ll stay out of ML theory and focus on the pieces you actually need to ship a feature.

Prerequisites: Keys, Runtime, and Docs

You’ll get the most out of this guide if you have:

  • Node.js 18 or higher installed
  • A LeapOCR account and API key
  • A package manager: npm, yarn, or pnpm
  • A basic TypeScript / JavaScript project (Express, Next.js, Nest, a plain Node script—anything is fine)

You can generate an API key from your LeapOCR dashboard. Once you have it:

  • Store it as an environment variable (for example, LEAPOCR_API_KEY)
  • Do not commit it to Git
  • Do not expose it in client-side code

For anything beyond this guide—other SDKs, raw HTTP requests, model details—keep the LeapOCR docs open in a tab.

SDK vs Raw HTTP: Which Path to Choose?

Under the hood, LeapOCR is a set of HTTP APIs. You could talk to it with fetch or axios and some JSON payloads, and you’ll be fine.

So why use the SDK?

  • TypeScript-first: you get request and response types out of the box
  • Less boilerplate: no need to hand-roll multipart uploads or polling logic
  • Built-in retry logic: transient network issues are handled for you
  • Universal runtime: works in Node.js, Deno, or Bun

When does raw HTTP make sense?

  • If you’re in an environment where you can’t install the SDK
  • If you’re building for a language LeapOCR doesn’t have an SDK for yet
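
If you do take the raw HTTP route, the retry logic becomes your responsibility. As a rough idea of what that involves, here is a minimal sketch of retrying with exponential backoff. `retryWithBackoff` is a hypothetical helper written for this guide, not part of the SDK, and it retries on every error; a real implementation would retry only on transient failures such as timeouts or 5xx responses.

```typescript
// Hypothetical helper: retries an async operation with exponential backoff.
async function retryWithBackoff<T>(
  operation: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 500,
): Promise<T> {
  if (maxAttempts < 1) {
    throw new RangeError("maxAttempts must be at least 1");
  }

  let lastError: unknown;

  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      if (attempt === maxAttempts) break;

      // Delay doubles on each retry: base, 2x base, 4x base, ...
      const delayMs = baseDelayMs * 2 ** (attempt - 1);
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }

  // Every attempt failed: surface the most recent error to the caller.
  throw lastError;
}
```

You would wrap each HTTP call in it, for example `retryWithBackoff(() => fetch(someUrl))`, where `someUrl` stands in for whichever endpoint the LeapOCR docs specify.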

For most JavaScript/TypeScript apps, starting with the SDK is faster, safer, and easier to maintain.

Installing and Initializing the LeapOCR Client

First, install the SDK:

npm install leapocr
# or
yarn add leapocr
# or
pnpm add leapocr

Then create a small helper module—for example src/lib/leapocrClient.ts:

import { LeapOCR } from "leapocr";

if (!process.env.LEAPOCR_API_KEY) {
  throw new Error("Missing LEAPOCR_API_KEY environment variable");
}

export const leapocr = new LeapOCR({
  apiKey: process.env.LEAPOCR_API_KEY,
});

Now anywhere in your backend code, you can import this client:

import { leapocr } from "../lib/leapocrClient";

// later: await leapocr.ocr.processURL(...)

Keeping initialization in one place makes it easier to:

  • Swap configuration (e.g., timeouts) later
  • Mock or stub the client in tests
  • Avoid accidentally creating many clients with different keys
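
To make the mocking point concrete, here is a minimal sketch. `OcrApi`, `extractPages`, and `createFakeOcr` are hypothetical names introduced for illustration. The method shapes mirror how this guide's examples call the client; they are not the SDK's published types.

```typescript
// Hypothetical narrow interface: only the methods this guide's examples call.
interface OcrApi {
  processURL(
    url: string,
    options: { format: string; model?: string },
  ): Promise<{ jobId: string }>;
  waitUntilDone(jobId: string): Promise<{ status: string }>;
  getJobResult(jobId: string): Promise<{ credits_used: number; pages: unknown[] }>;
}

// Application code depends on the interface rather than the concrete client.
async function extractPages(ocr: OcrApi, url: string): Promise<unknown[]> {
  const job = await ocr.processURL(url, { format: "structured" });
  const status = await ocr.waitUntilDone(job.jobId);

  if (status.status !== "completed") {
    throw new Error(`OCR job failed with status: ${status.status}`);
  }

  const result = await ocr.getJobResult(job.jobId);
  return result.pages;
}

// In-memory stand-in for tests: no network calls, deterministic output.
function createFakeOcr(finalStatus: string, pages: unknown[]): OcrApi {
  return {
    processURL: async () => ({ jobId: "job_test_1" }),
    waitUntilDone: async () => ({ status: finalStatus }),
    getJobResult: async () => ({ credits_used: 1, pages }),
  };
}
```

In production you would pass `leapocr.ocr` where the fake goes, assuming the real client is structurally compatible with this interface.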

Step 1: Processing Your First Document from a URL

Let’s start with the simplest possible flow: process a document that’s already accessible via URL.

The basic lifecycle looks like this:

  1. Submit a document for processing → get back a jobId
  2. Wait for the job to finish
  3. Fetch the final result and inspect the extracted data

Here’s a minimal example:

import { leapocr } from "../lib/leapocrClient";

async function processInvoiceFromUrl() {
  const job = await leapocr.ocr.processURL("https://example.com/invoice.pdf", {
    format: "structured",
    model: "standard-v1",
    instructions: "Extract invoice number, date, and total amount",
  });

  // Simple helper that polls until the job is done
  const status = await leapocr.ocr.waitUntilDone(job.jobId);

  if (status.status !== "completed") {
    throw new Error(`OCR job failed with status: ${status.status}`);
  }

  const result = await leapocr.ocr.getJobResult(job.jobId);

  console.log("Credits used:", result.credits_used);
  console.dir(result.pages, { depth: null });
}

processInvoiceFromUrl().catch(console.error);

At this point you’ve already connected all the pieces:

  • LeapOCR client
  • URL-based processing
  • A basic asynchronous job flow

In a real app, you’d call this from an API route or background worker instead of a one-off script, but the core logic is the same.

Step 2: Processing Local Files with processFile

Most real workflows involve user uploads or files stored locally (or in object storage) rather than public URLs.

For Node.js, you can combine fs with processFile:

import { readFile } from "fs/promises";
import { leapocr } from "../lib/leapocrClient";

async function processLocalInvoice(path: string) {
  // Non-blocking read, so a server can keep handling other requests
  const fileBuffer = await readFile(path);

  const job = await leapocr.ocr.processFile(fileBuffer, {
    format: "structured",
    model: "pro-v1",
    instructions:
      "Extract invoice number, vendor, date, currency, and total amount",
  });

  const status = await leapocr.ocr.waitUntilDone(job.jobId);

  if (status.status !== "completed") {
    throw new Error(`OCR job failed with status: ${status.status}`);
  }

  const result = await leapocr.ocr.getJobResult(job.jobId);
  return result.pages;
}

In a web app, this function could:

  • Live in your backend
  • Be called from an API endpoint that receives an uploaded file
  • Return the extracted JSON to your frontend or persist it in a database

Choosing Models and Output Formats

LeapOCR gives you a few levers to tune behavior:

  • Models (examples):
    • standard-v1: fast, general-purpose OCR + extraction
    • pro-v1: higher accuracy for complex documents, at higher cost
  • Formats:
    • "structured": a single JSON object for the document (great for form-like data)
    • "markdown": readable text per page (good for full-text conversion and search)
    • "per-page-structured": JSON per page (useful for multi-section or mixed documents)

Rough guidance:

  • Start with standard-v1 and "structured" for most business documents
  • Use "markdown" when you care more about content than specific fields
  • Reach for "per-page-structured" when each page is logically independent
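
If several call sites need to make this choice, the rough guidance above can be encoded once. This is a sketch under stated assumptions: `DocumentKind`, `chooseOcrOptions`, and the `complex` flag are hypothetical names for this guide, and only the model and format strings come from the list above.

```typescript
// Hypothetical categories describing what you want out of a document.
type DocumentKind = "business-form" | "full-text" | "independent-pages";
type OcrFormat = "structured" | "markdown" | "per-page-structured";
type OcrModel = "standard-v1" | "pro-v1";

interface OcrOptions {
  format: OcrFormat;
  model: OcrModel;
}

// Maps a document kind to a format, and picks the pricier model
// only when the caller flags the document as complex.
function chooseOcrOptions(kind: DocumentKind, complex = false): OcrOptions {
  const formatByKind: Record<DocumentKind, OcrFormat> = {
    "business-form": "structured",
    "full-text": "markdown",
    "independent-pages": "per-page-structured",
  };

  return {
    format: formatByKind[kind],
    model: complex ? "pro-v1" : "standard-v1",
  };
}
```

Because the mapping is a `Record` keyed by `DocumentKind`, the compiler flags any kind you add later without assigning it a format.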

Adding Custom Schemas for Structured Extraction

You’ll usually know what fields you care about before you start writing code.

Instead of scraping values out of generic OCR output, you can tell LeapOCR what you want using a JSON schema.

Option 1: Plain JSON schema

For a simple invoice, a hand-written schema might look like this:

const invoiceSchema = {
  type: "object",
  properties: {
    invoice_number: { type: "string" },
    total_amount: { type: "number" },
    invoice_date: { type: "string" },
    vendor_name: { type: "string" },
  },
};

Then pass it when you process a file:

const job = await leapocr.ocr.processFile("./invoice.pdf", {
  format: "structured",
  model: "pro-v1",
  schema: invoiceSchema,
  instructions: "Extract values and multiply all monetary fields by 100",
});

const status = await leapocr.ocr.waitUntilDone(job.jobId);

if (status.status === "completed") {
  const result = await leapocr.ocr.getJobResult(job.jobId);
  console.log("Extracted invoice:", result.pages);
}

Option 2: Reusing existing Zod schemas

If you already use Zod to validate your data, you don’t have to define everything twice. You can:

  1. Define your invoice shape as a Zod schema
  2. Convert it to JSON Schema
  3. Send that JSON Schema to LeapOCR

Using zod-to-json-schema as an example:

import { z } from "zod";
import { zodToJsonSchema } from "zod-to-json-schema";

const InvoiceZodSchema = z.object({
  invoice_number: z.string(),
  total_amount: z.number(),
  invoice_date: z.string(),
  vendor_name: z.string(),
});

const invoiceJsonSchema = zodToJsonSchema(InvoiceZodSchema, "Invoice");

const job = await leapocr.ocr.processFile("./invoice.pdf", {
  format: "structured",
  schema: invoiceJsonSchema,
  instructions: "Translate all text fields to French where reasonable",
});

Now:

  • LeapOCR uses the JSON Schema to structure its extraction
  • Your app uses the same Zod schema to validate and type your own models
  • Optional instructions let you nudge behavior further (for example, “translate to French” or “return monetary values in cents instead of dollars”)

The benefit in both approaches is the same: your downstream code can assume a stable shape for the extracted data, while instructions let you fine-tune how that data should look.
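
A schema guides the extraction, but it is still worth checking the data at runtime before you store it. If you already use Zod, `InvoiceZodSchema.safeParse(value)` does this for you. Without Zod, a hand-written type guard is a reasonable substitute. The sketch below assumes the four invoice fields used in this section; `Invoice` and `isInvoice` are hypothetical names, and because this guide does not show where the extracted object sits inside `result.pages`, the guard accepts any unknown value.

```typescript
// Mirrors the invoice fields used in the schemas above.
interface Invoice {
  invoice_number: string;
  total_amount: number;
  invoice_date: string;
  vendor_name: string;
}

// Runtime check that an unknown value has the expected invoice shape.
function isInvoice(value: unknown): value is Invoice {
  if (typeof value !== "object" || value === null) return false;

  const record = value as Record<string, unknown>;

  return (
    typeof record.invoice_number === "string" &&
    typeof record.total_amount === "number" &&
    Number.isFinite(record.total_amount) && // rejects NaN and Infinity
    typeof record.invoice_date === "string" &&
    typeof record.vendor_name === "string"
  );
}
```

Rejecting a malformed result at this boundary keeps a bad extraction from quietly landing in your database.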

Waiting for Jobs and Monitoring Progress

For small or medium documents, waitUntilDone is usually all you need. It handles polling behind the scenes and returns when the job is complete (or failed).

If you need more control—for example, to show progress in a UI—you can manually poll:

const pollIntervalMs = 2000;
const maxAttempts = 150; // ~5 minutes
let attempts = 0;
let result;

while (attempts < maxAttempts) {
  const status = await leapocr.ocr.getJobStatus(job.jobId);

  // progress may be undefined early on, so fall back to 0
  console.log(
    `Status: ${status.status} (${(status.progress ?? 0).toFixed(1)}% complete)`,
  );

  if (status.status === "completed") {
    result = await leapocr.ocr.getJobResult(job.jobId);
    console.log("Processing complete!");
    break;
  }

  if (status.status === "failed") {
    throw new Error("OCR job failed");
  }

  await new Promise((resolve) => setTimeout(resolve, pollIntervalMs));
  attempts++;
}

// The loop can also end because we ran out of attempts
if (!result) {
  throw new Error(`OCR job did not finish after ${maxAttempts} attempts`);
}

As a rule of thumb:

  • Use waitUntilDone for server-side flows where the client doesn’t need progress
  • Use manual polling (or webhooks / queues) for longer-running jobs and richer UIs
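
If you poll in more than one place, it may be worth pulling the loop into a helper with a progress callback. The sketch below is one way to do that. `pollUntilSettled`, `JobStatus`, and `PollOptions` are hypothetical names; the status values `"completed"` and `"failed"` and the optional `progress` field are taken from the examples in this guide.

```typescript
// Shape of a status response as used in this guide's examples.
interface JobStatus {
  status: string;
  progress?: number;
}

interface PollOptions {
  intervalMs: number;
  maxAttempts: number;
  onProgress?: (status: JobStatus) => void;
}

// Polls until the job completes or fails, reporting each status along the way.
// Throws if the job is still running after maxAttempts checks.
async function pollUntilSettled(
  getStatus: () => Promise<JobStatus>,
  { intervalMs, maxAttempts, onProgress }: PollOptions,
): Promise<JobStatus> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const status = await getStatus();
    onProgress?.(status);

    if (status.status === "completed" || status.status === "failed") {
      return status;
    }

    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }

  throw new Error(`Job still running after ${maxAttempts} attempts`);
}
```

With the SDK you would call it as `pollUntilSettled(() => leapocr.ocr.getJobStatus(job.jobId), { intervalMs: 2000, maxAttempts: 150 })` and pass an `onProgress` callback that forwards updates to your UI.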

Handling Errors, Timeouts, and Retries

Things will go wrong occasionally. Plan for:

  • Authentication issues: invalid or missing API key
  • Bad inputs: unsupported file types, corrupted PDFs, unreachable URLs
  • Network hiccups: timeouts, transient errors
  • Job failures: documents the model simply can’t process

The SDK includes built-in retry logic for transient errors, but you should still:

  • Wrap calls in try/catch
  • Log the jobId, input metadata, and error messages
  • Apply your own timeouts around long-running jobs

For example:

let jobId: string | undefined;

try {
  const job = await leapocr.ocr.processURL(docUrl, {
    format: "structured",
    model: "standard-v1",
  });
  jobId = job.jobId;

  const status = await leapocr.ocr.waitUntilDone(jobId);

  if (status.status !== "completed") {
    throw new Error(`Job did not complete successfully: ${status.status}`);
  }
} catch (error) {
  // jobId is undefined if the failure happened before the job was created
  console.error("LeapOCR error", { error, docUrl, jobId });
  // Optionally notify an error tracking service or mark this document as failed
}
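
For the "apply your own timeouts" point, one common approach is to race the long-running call against a timer. The sketch below uses only standard JavaScript. `withTimeout` is a hypothetical helper for this guide, not something the SDK provides.

```typescript
// Rejects if the given promise has not settled within timeoutMs.
async function withTimeout<T>(
  promise: Promise<T>,
  timeoutMs: number,
  label: string,
): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;

  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`${label} timed out after ${timeoutMs} ms`)),
      timeoutMs,
    );
  });

  try {
    // Whichever settles first wins the race.
    return await Promise.race([promise, timeout]);
  } finally {
    // Always clear the timer so it cannot keep the process alive.
    clearTimeout(timer);
  }
}
```

Usage would look like `await withTimeout(leapocr.ocr.waitUntilDone(job.jobId), 5 * 60_000, "waitUntilDone")`. Note that timing out only stops your code from waiting. The job itself may keep running on the server, so log the jobId in case you want to check on it later.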

Using Templates (templateSlug) for Reusable Configurations

If you’re processing the same kind of document over and over—like invoices from many vendors or a standard onboarding form—templates can save you a lot of time.

Templates let you define:

  • Which fields you care about
  • Any special instructions for the model
  • The model and formats you want to use

Once a template is defined in LeapOCR, you can reference it by templateSlug:

const job = await leapocr.ocr.processFile("./invoice.pdf", {
  templateSlug: "my-invoice-template",
  model: "pro-v1",
});

const status = await leapocr.ocr.waitUntilDone(job.jobId);

if (status.status === "completed") {
  const result = await leapocr.ocr.getJobResult(job.jobId);
  console.log("Extracted data:", result.pages);
}

This is especially helpful when:

  • Multiple services need to share the same extraction behavior
  • You want to tweak extraction logic centrally without redeploying code

Securing Your API Key in Different Environments

Your API key is effectively a password to your LeapOCR account. Treat it like one.

General rules:

  • Never hardcode it in source files
  • Never expose it directly to the browser or mobile clients
  • Use environment variables or your hosting provider’s secrets manager

Typical patterns:

  • Local development: store LEAPOCR_API_KEY in .env (but don’t commit that file)
  • Serverless / PaaS: configure the key via your platform’s environment variable UI
  • Frontend apps: send files/URLs to your own backend; your backend talks to LeapOCR

If a key is ever exposed, rotate it from the LeapOCR dashboard and redeploy with the new value.

Taking It to Production: Logging, Monitoring, and Cleanup

Getting something working locally is step one. Running it reliably in production is where the real work starts.

A few best practices:

  • Log jobIds and document identifiers so you can trace failures
  • Track credits or usage over time to understand cost
  • Monitor error rates and latency for OCR calls
  • Alert when failure spikes happen or processing falls behind
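
As a starting point for tracking usage and failure rates, you could record one entry per job and summarize them periodically. This is a minimal sketch: `JobRecord`, `UsageSummary`, and `summarizeUsage` are hypothetical names, and `creditsUsed` is assumed to be copied from the `credits_used` field shown in Step 1.

```typescript
// One entry per processed document, written by your own code.
interface JobRecord {
  jobId: string;
  documentId: string;
  status: "completed" | "failed";
  creditsUsed: number;
  durationMs: number;
}

interface UsageSummary {
  totalJobs: number;
  failureRate: number; // between 0 and 1
  totalCredits: number;
  averageDurationMs: number;
}

// Aggregates job records into the numbers you would chart or alert on.
function summarizeUsage(records: JobRecord[]): UsageSummary {
  if (records.length === 0) {
    return { totalJobs: 0, failureRate: 0, totalCredits: 0, averageDurationMs: 0 };
  }

  const failed = records.filter((r) => r.status === "failed").length;
  const totalCredits = records.reduce((sum, r) => sum + r.creditsUsed, 0);
  const totalDurationMs = records.reduce((sum, r) => sum + r.durationMs, 0);

  return {
    totalJobs: records.length,
    failureRate: failed / records.length,
    totalCredits,
    averageDurationMs: totalDurationMs / records.length,
  };
}
```

In practice these numbers would usually come from your metrics system rather than an in-memory array, but the fields worth capturing are the same.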

You can also delete jobs when you’re done with them:

await leapocr.ocr.deleteJob(job.jobId);
console.log("Job deleted successfully");

Whether you do that depends on how long you want to keep results around and how you handle auditing. The key is to be intentional rather than letting jobs accumulate silently.
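
One way to be intentional about it is to delete a job only after its result has been safely persisted. The sketch below assumes that ordering. `JobStore`, `fetchResultThenDelete`, and the `persist` callback are hypothetical names, and the method shapes mirror this guide's examples rather than the SDK's published types.

```typescript
// Hypothetical narrow interface covering the two calls this helper needs.
interface JobStore {
  getJobResult(jobId: string): Promise<{ pages: unknown[] }>;
  deleteJob(jobId: string): Promise<unknown>;
}

// Fetches the result, hands it to your own storage, then deletes the job.
async function fetchResultThenDelete(
  ocr: JobStore,
  jobId: string,
  persist: (pages: unknown[]) => Promise<void>,
): Promise<void> {
  const result = await ocr.getJobResult(jobId);

  // Persist first: if this throws, the job is left in place
  // so the result can be fetched again on a retry.
  await persist(result.pages);

  await ocr.deleteJob(jobId);
}
```

If your auditing requirements call for keeping jobs around, the same structure works with the delete step swapped for a scheduled cleanup.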

Where to Go Next: Deeper Docs and Example Workflows

Once you’ve wired up the basics, the fun part is applying this to real workflows in your product.

Good next steps:

  • Browse the LeapOCR docs for more options and examples
  • Explore different models and formats to see how they affect results
  • Add schemas or templates for your specific document types
  • Read the companion guide on building an automated invoice processing system to see how this fits into a larger workflow

Most teams see the best results by starting with a single, narrow use case—nailing it end-to-end—and then expanding to more document types once the pattern feels solid.
