Structured Data Output

Extract document data as structured JSON, using either a custom JSON schema or free-form AI instructions, ready for integration with downstream systems and analysis.

Two powerful JSON formats with schema validation and free-form AI extraction

Formats

Two JSON Output Formats

Choose between two formats: structured JSON, which returns one result for the whole document, and per-page structured, which returns a result for each page. Both support AI extraction.

Structured JSON

Define custom JSON schemas or use free-form AI extraction. Perfect for API integration, automated workflows, and structured data analysis.

Per-Page Structured

Extract data with page-by-page structure and positioning metadata. Perfect for detailed document analysis and page-specific workflows.
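
As a rough illustration of how the two formats differ for a consumer, the sketch below pulls the extracted data out of a response based on its result_format field. It assumes the response shapes shown in the examples on this page: a top-level "result" for "structured", and a "result" inside each entry of "pages" for "per_page_structured". The function name extracted_payloads is hypothetical and not part of any SDK.

```python
from typing import Any


def extracted_payloads(response: dict[str, Any]) -> list[dict[str, Any]]:
    """Return the extracted data as a list, one entry per result object."""
    fmt = response.get("result_format")
    if fmt == "structured":
        # Assumed shape: one document-level object under "result".
        return [response["result"]]
    if fmt == "per_page_structured":
        # Assumed shape: each page entry carries its own "result".
        return [p["result"] for p in response.get("pages", []) if "result" in p]
    raise ValueError(f"unsupported result_format: {fmt!r}")


# Tiny stand-in responses, reduced to the fields this sketch reads.
doc = {"result_format": "structured", "result": {"invoice_number": "INV-2024-001"}}
paged = {
    "result_format": "per_page_structured",
    "pages": [
        {"page_number": 1, "result": {"section": "Parties"}},
        {"page_number": 2, "result": {"section": "Scope of Services"}},
    ],
}

print(extracted_payloads(doc))
print([r["section"] for r in extracted_payloads(paged)])
```

Returning a list in both cases lets calling code treat the two formats uniformly, at the cost of hiding the document-level "summary" object that the per-page example also includes.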

Examples

Real Output Examples

See how your data looks in structured JSON formats with free-form AI extraction and custom schemas.

Structured JSON - Free-Form Extraction

Invoice Processing - Document Output

Complete API response showing structured financial data extracted from an invoice PDF using free-form AI instructions.

AI Instructions Used:
"Extract all financial information including invoice numbers, dates, vendor details, line items with quantities and prices, subtotals, tax rates, and total amounts. Format dates as YYYY-MM-DD and identify currency symbols. Group related items and calculate totals automatically."
Complete API Response:
{
  "job_id": "8a3d7c9b-5e4f-2a1b-8c7d-9e5f4a2b1c8d",
  "file_name": "invoice_acme_corp_2024-001.pdf",
  "status": "completed",
  "result_format": "structured",
  "model": "gemini-1.5-pro",
  "average_confidence": 0.96,
  "credits_used": 12,
  "processing_time_seconds": 8.3,
  "completed_at": "2024-01-15T10:30:45Z",
  "total_pages": 1,
  "processed_pages": 1,
  "pagination": {
    "page": 1,
    "limit": 50,
    "total": 1,
    "total_pages": 1
  },
  "pages": [
    {
      "id": "page_001",
      "page_number": 1,
      "confidence": 0.96,
      "processed_at": "2024-01-15T10:30:37Z",
      "text": "INVOICE\n\nINV-2024-001\n\nDate: January 15, 2024\nDue: February 14, 2024\n\nBill To:\nGlobal Enterprises Ltd.\n456 Business Ave\nCommerce City, CC 67890\n\nFrom:\nAcme Corporation\n123 Innovation Drive\nTech City, TC 12345\n\nDescription            Qty    Unit Price    Total\nWeb Development Services  40     $75.00         $3,000.00\nUI/UX Design             20     $60.00         $1,200.00\n\nSubtotal: $4,200.00\nTax (8%): $336.00\nTotal: $4,536.00\n\nPayment Terms: Net 30",
      "metadata": {
        "dimensions": {
          "width": 792,
          "height": 1122,
          "dpi": 300
        },
        "processing_ms": 2847,
        "retry_count": 0,
        "extra": {
          "format": "structured",
          "ai_model": "gemini-1.5-pro",
          "extraction_type": "free_form"
        }
      },
      "images": []
    }
  ],
  "result": {
    "invoice_number": "INV-2024-001",
    "date": "2024-01-15",
    "due_date": "2024-02-14",
    "vendor": {
      "name": "Acme Corporation",
      "address": "123 Innovation Drive, Tech City, TC 12345"
    },
    "client": {
      "name": "Global Enterprises Ltd.",
      "address": "456 Business Ave, Commerce City, CC 67890"
    },
    "line_items": [
      {
        "description": "Web Development Services",
        "quantity": 40,
        "unit_price": 75.00,
        "total": 3000.00,
        "currency": "USD"
      },
      {
        "description": "UI/UX Design",
        "quantity": 20,
        "unit_price": 60.00,
        "total": 1200.00,
        "currency": "USD"
      }
    ],
    "subtotal": 4200.00,
    "tax_rate": 0.08,
    "tax_amount": 336.00,
    "total": 4536.00,
    "currency": "USD",
    "payment_terms": "Net 30",
    "extraction_metadata": {
      "ai_confidence": 0.96,
      "processing_time_ms": 2847,
      "extraction_method": "free_form",
      "schema_applied": false
    }
  }
}
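
Because free-form extraction applies no schema, it can be worth checking the extracted numbers against each other before using them. The sketch below is one possible consistency check, not part of the API: it re-derives the line totals, subtotal, tax, and total from the "result" object above. The helper names money and check_invoice are hypothetical.

```python
from decimal import Decimal

# Values copied from the "result" object in the invoice response above.
result = {
    "line_items": [
        {"quantity": 40, "unit_price": 75.00, "total": 3000.00},
        {"quantity": 20, "unit_price": 60.00, "total": 1200.00},
    ],
    "subtotal": 4200.00,
    "tax_rate": 0.08,
    "tax_amount": 336.00,
    "total": 4536.00,
}


def money(value) -> Decimal:
    """Convert a JSON number (or Decimal) to a Decimal rounded to cents."""
    return Decimal(str(value)).quantize(Decimal("0.01"))


def check_invoice(result: dict) -> list[str]:
    """Return a list of arithmetic mismatches (empty when consistent)."""
    problems = []
    for item in result["line_items"]:
        expected = money(Decimal(str(item["quantity"])) * Decimal(str(item["unit_price"])))
        if expected != money(item["total"]):
            problems.append(f"line total mismatch: {item}")
    subtotal = sum((money(i["total"]) for i in result["line_items"]), Decimal("0"))
    if subtotal != money(result["subtotal"]):
        problems.append("subtotal does not match line items")
    tax = money(money(result["subtotal"]) * Decimal(str(result["tax_rate"])))
    if tax != money(result["tax_amount"]):
        problems.append("tax_amount does not match subtotal * tax_rate")
    if money(result["subtotal"]) + money(result["tax_amount"]) != money(result["total"]):
        problems.append("total does not match subtotal + tax_amount")
    return problems


print(check_invoice(result))  # prints [] for the sample above
```

Decimal is used instead of float so that comparisons are made on exact cent values rather than on binary floating-point approximations.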
Per-Page Structured - Legal Document

Contract Analysis - Document Output

Complete API response showing page-by-page structured extraction from a multi-page legal contract with positioning metadata.

AI Instructions Used:
"Extract contract terms, parties, dates, obligations, and key clauses from each page. Identify page numbers and section headers. Include document structure and relationships between parties."
Complete API Response:
{
  "job_id": "f4a7b8c9-1d2e-3f4a-5b6c-7d8e9f0a1b2c",
  "file_name": "service_agreement_techcorp_global.pdf",
  "status": "completed",
  "result_format": "per_page_structured",
  "model": "gemini-1.5-pro",
  "average_confidence": 0.95,
  "credits_used": 45,
  "processing_time_seconds": 24.7,
  "completed_at": "2024-01-15T10:35:22Z",
  "total_pages": 3,
  "processed_pages": 3,
  "pagination": {
    "page": 1,
    "limit": 50,
    "total": 3,
    "total_pages": 1
  },
  "pages": [
    {
      "id": "page_001",
      "page_number": 1,
      "confidence": 0.98,
      "processed_at": "2024-01-15T10:35:08Z",
      "text": "SERVICE AGREEMENT\n\nThis Service Agreement (\"Agreement\") is made and entered into as of February 1, 2024 (\"Effective Date\"), by and between:\n\nTechCorp Inc. (\"Provider\")\n123 Innovation Drive\nTech City, TC 12345\n\nand\n\nGlobal Enterprises Ltd. (\"Client\")\n456 Business Avenue\nCommerce City, CC 67890\n\n1. SCOPE OF SERVICES",
      "metadata": {
        "dimensions": {
          "width": 792,
          "height": 1122,
          "dpi": 300
        },
        "processing_ms": 8923,
        "retry_count": 0,
        "extra": {
          "format": "per_page_structured",
          "ai_model": "gemini-1.5-pro",
          "extraction_type": "detailed"
        }
      },
      "images": [],
      "result": {
        "page_number": 1,
        "section": "Parties",
        "elements": [
          {
            "type": "heading",
            "content": "SERVICE AGREEMENT",
            "confidence": 0.98,
            "position": {"x": 200, "y": 100, "width": 300, "height": 30}
          },
          {
            "type": "party",
            "role": "provider",
            "name": "TechCorp Inc.",
            "address": "123 Innovation Drive, Tech City, TC 12345",
            "confidence": 0.96
          },
          {
            "type": "party",
            "role": "client",
            "name": "Global Enterprises Ltd.",
            "address": "456 Business Avenue, Commerce City, CC 67890",
            "confidence": 0.97
          }
        ],
        "metadata": {
          "processing_time_ms": 8923,
          "ai_confidence": 0.98,
          "extraction_method": "page_structured"
        }
      }
    },
    {
      "id": "page_002",
      "page_number": 2,
      "confidence": 0.94,
      "processed_at": "2024-01-15T10:35:17Z",
      "text": "2. SCOPE OF SERVICES\n\nProvider shall provide the following services to Client:\n\n2.1 Software Development and Maintenance\nProvider will develop and maintain custom software solutions as specified in Exhibit A.\n\n2.2 Technical Support and Consultation\nProvider will provide technical support and consultation services during business hours.\n\n3. TERM AND TERMINATION\n\n3.1 Term\nThis Agreement shall commence on the Effective Date and continue for a period of twenty-four (24) months.",
      "metadata": {
        "dimensions": {
          "width": 792,
          "height": 1122,
          "dpi": 300
        },
        "processing_ms": 7845,
        "retry_count": 0,
        "extra": {
          "format": "per_page_structured",
          "ai_model": "gemini-1.5-pro",
          "extraction_type": "detailed"
        }
      },
      "images": [],
      "result": {
        "page_number": 2,
        "section": "Scope of Services",
        "elements": [
          {
            "type": "heading",
            "content": "2. SCOPE OF SERVICES",
            "confidence": 0.99,
            "position": {"x": 150, "y": 80, "width": 250, "height": 25}
          },
          {
            "type": "service_item",
            "description": "Software Development and Maintenance",
            "duration": "24 months",
            "confidence": 0.94
          },
          {
            "type": "service_item",
            "description": "Technical Support and Consultation",
            "duration": "24 months",
            "confidence": 0.95
          },
          {
            "type": "term_clause",
            "content": "This Agreement shall commence on the Effective Date and continue for a period of twenty-four (24) months.",
            "confidence": 0.96
          }
        ],
        "metadata": {
          "processing_time_ms": 7845,
          "ai_confidence": 0.94,
          "extraction_method": "page_structured"
        }
      }
    }
  ],
  "summary": {
    "document_type": "Service Agreement",
    "contract_value": "$1,200,000",
    "term": "24 months",
    "effective_date": "2024-02-01",
    "parties": ["TechCorp Inc.", "Global Enterprises Ltd."],
    "total_sections_extracted": 6,
    "renewal_clause": "Automatic renewal for 12-month terms",
    "extraction_metadata": {
      "ai_confidence": 0.95,
      "processing_time_ms": 16768,
      "extraction_method": "page_structured",
      "schema_applied": false
    }
  }
}
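
One way to work with the per-page output is to walk the "elements" list of each page and filter by "type". The sketch below assumes the response shape shown above and uses a trimmed copy of it; elements_of_type is a hypothetical helper. Note that in the sample only heading elements carry a "position", so the code treats that key as optional.

```python
# Trimmed copy of the per-page response above: only the fields this sketch reads.
response = {
    "pages": [
        {"page_number": 1, "result": {"elements": [
            {"type": "heading", "content": "SERVICE AGREEMENT", "confidence": 0.98,
             "position": {"x": 200, "y": 100, "width": 300, "height": 30}},
            {"type": "party", "role": "provider", "name": "TechCorp Inc.",
             "confidence": 0.96},
            {"type": "party", "role": "client", "name": "Global Enterprises Ltd.",
             "confidence": 0.97},
        ]}},
        {"page_number": 2, "result": {"elements": [
            {"type": "heading", "content": "2. SCOPE OF SERVICES", "confidence": 0.99,
             "position": {"x": 150, "y": 80, "width": 250, "height": 25}},
            {"type": "service_item", "confidence": 0.94,
             "description": "Software Development and Maintenance"},
            {"type": "service_item", "confidence": 0.95,
             "description": "Technical Support and Consultation"},
        ]}},
    ]
}


def elements_of_type(response, element_type, min_confidence=0.0):
    """Yield (page_number, element) for matching elements across all pages."""
    for page in response.get("pages", []):
        for element in page.get("result", {}).get("elements", []):
            if (element.get("type") == element_type
                    and element.get("confidence", 0.0) >= min_confidence):
                yield page["page_number"], element


# Map each party's role to its name.
parties = {el["role"]: el["name"] for _, el in elements_of_type(response, "party")}

# Headings with their page number and (optional) bounding box.
headings = [(page_no, el["content"], el.get("position"))
            for page_no, el in elements_of_type(response, "heading")]

print(parties)
print(headings)
```

The min_confidence argument is an illustrative way to drop low-confidence elements; a suitable threshold depends on the documents being processed.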
Schema-Based - Business Card

Business Card Processing - Schema Example

Small example showing structured extraction with a custom JSON schema for consistent output validation.

Input Schema:
{
  "type": "object",
  "properties": {
    "name": {"type": "string"},
    "title": {"type": "string"},
    "company": {"type": "string"},
    "email": {"type": "string", "format": "email"},
    "phone": {"type": "string"},
    "website": {"type": "string", "format": "uri"}
  },
  "required": ["name", "company"]
}
API Result:
{
  "job_id": "a1b2c3d4-e5f6-7g8h-9i0j-k1l2m3n4o5p",
  "file_name": "business_card.pdf",
  "status": "completed",
  "result_format": "structured",
  "average_confidence": 0.92,
  "credits_used": 3,
  "result": {
    "name": "Sarah Johnson",
    "title": "Product Manager",
    "company": "TechCorp Inc.",
    "email": "sarah.johnson@techcorp.com",
    "phone": "+1 (555) 123-4567",
    "website": "https://techcorp.com",
    "extraction_metadata": {
      "schema_applied": true,
      "validation_passed": true,
      "ai_confidence": 0.92
    }
  }
}
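
A client may want to re-check the returned "result" against the same schema before storing it. The sketch below is a deliberately minimal, hand-rolled check covering only the "required" list and "type": "string" rules used in the schema above; it is not a full JSON Schema validator and ignores keywords such as "format". The function name check_against_schema is hypothetical. For real use, an established validator such as the jsonschema package would be the more usual choice.

```python
# Schema and result copied from the business card example above.
schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "title": {"type": "string"},
        "company": {"type": "string"},
        "email": {"type": "string", "format": "email"},
        "phone": {"type": "string"},
        "website": {"type": "string", "format": "uri"},
    },
    "required": ["name", "company"],
}

result = {
    "name": "Sarah Johnson",
    "title": "Product Manager",
    "company": "TechCorp Inc.",
    "email": "sarah.johnson@techcorp.com",
    "phone": "+1 (555) 123-4567",
    "website": "https://techcorp.com",
    "extraction_metadata": {"schema_applied": True, "validation_passed": True},
}


def check_against_schema(data: dict, schema: dict) -> list[str]:
    """Return a list of violations for required fields and string types only."""
    errors = []
    for key in schema.get("required", []):
        if key not in data:
            errors.append(f"missing required field: {key}")
    for key, rule in schema.get("properties", {}).items():
        if key in data and rule.get("type") == "string" and not isinstance(data[key], str):
            errors.append(f"{key} should be a string")
    return errors


print(check_against_schema(result, schema))  # prints [] for the sample above
```

The extra "extraction_metadata" key does not count as a violation here, which matches JSON Schema's default behaviour: properties not listed in the schema are allowed unless "additionalProperties" is set to false.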
Schema-Based - Per-Page

Form Processing - Per-Page Schema Example

Structured form data extraction with page-level validation using a custom schema.

Page Schema:
{
  "type": "object",
  "properties": {
    "page_number": {"type": "integer"},
    "form_fields": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "field_name": {"type": "string"},
          "field_value": {"type": "string"},
          "field_type": {"enum": ["text", "checkbox", "signature"]},
          "confidence": {"type": "number"}
        }
      }
    }
  }
}
Page 1 Result:
{
  "page_number": 1,
  "result": {
    "form_fields": [
      {
        "field_name": "full_name",
        "field_value": "John Doe",
        "field_type": "text",
        "confidence": 0.98
      },
      {
        "field_name": "agreement_accepted",
        "field_value": "true",
        "field_type": "checkbox",
        "confidence": 0.95
      }
    ]
  }
}
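
Since the page schema declares field_value as a string, a checkbox arrives as the text "true" rather than a boolean. The sketch below shows one way to flatten a page result into a name-to-value dictionary and convert checkbox values; it assumes checkboxes are always reported as "true" or "false", as in the sample, and fields_to_dict is a hypothetical helper.

```python
# Page result copied from the form example above.
page = {
    "page_number": 1,
    "result": {
        "form_fields": [
            {"field_name": "full_name", "field_value": "John Doe",
             "field_type": "text", "confidence": 0.98},
            {"field_name": "agreement_accepted", "field_value": "true",
             "field_type": "checkbox", "confidence": 0.95},
        ]
    },
}


def fields_to_dict(page: dict) -> dict:
    """Map field_name to field_value, converting checkbox strings to booleans."""
    values = {}
    for field in page["result"]["form_fields"]:
        value = field["field_value"]
        if field["field_type"] == "checkbox":
            # Assumption: checkbox values are the strings "true" / "false".
            value = value.strip().lower() == "true"
        values[field["field_name"]] = value
    return values


print(fields_to_dict(page))
# prints {'full_name': 'John Doe', 'agreement_accepted': True}
```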
Comparison - Choose Your Approach

Schema-Based vs Free-Form Extraction

Choose between strict schema validation for compliance and flexible AI extraction for rapid development.

Schema-Based

Strict data validation
Consistent output structure
Compliance ready
API integration friendly

Free-Form AI

No setup required
Adapts to document structure
Handles unstructured data
Rapid prototyping

Start Extracting Structured Data

Transform your documents into structured JSON formats with powerful AI extraction capabilities.
