The Ultimate Guide to Automated Supplier Due Diligence for ESG

Focus on the 'S' in ESG—social compliance. Automating the review of supplier codes of conduct and certifications.

Published
January 18, 2025

Your procurement team just received 200 supplier questionnaires. Each one contains a code of conduct compliance checklist, various certifications (ISO 14001, SA8000, EcoVadis), health and safety policies running 50+ pages, and human rights disclosures in different formats.

Reviewing a single supplier takes about 45 minutes. For 200 suppliers, that adds up to 150 person-hours, or nearly 19 working days of one reviewer's time. Few teams have that kind of time to spare.

This guide shows you how to automate social compliance due diligence using AI-powered document processing.

The Regulatory Landscape

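The EU Corporate Sustainability Due Diligence Directive (CS3D), adopted in May 2024, requires companies with more than 1,000 employees and more than €450 million in net worldwide turnover to conduct human rights and environmental due diligence on their own operations and supply chains. Under the text as adopted, obligations phase in by company size starting in 2027. Similar regulations are appearing worldwide, which makes automation increasingly important for scaling supplier assessments.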

By 2026-2027, over 50,000 companies will need to perform supplier due diligence. The KPMG Global ESG Due Diligence Study 2024 found that ESG considerations are becoming a priority in transactions, with leading investors integrating ESG factors into their investment decisions.

The “S” in ESG: Why Social Compliance Matters

Social compliance covers three main areas.

Labor practices include child labor prohibitions, forced labor prevention, working hours compliance, and fair wages and benefits.

Human rights encompass non-discrimination policies, freedom of association, collective bargaining rights, and workplace safety.

Ethics cover anti-bribery and corruption measures, whistleblower protections, data privacy, and supply chain transparency.

Several regulations drive these requirements:

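  • EU CS3D (Corporate Sustainability Due Diligence Directive): Applies to companies with more than 1,000 employees and more than €450M in net worldwide turnover. Obligations phase in by company size starting in 2027.
  • German Supply Chain Due Diligence Act (LkSG): Mandatory for companies in Germany with 1,000+ employees (3,000+ when the law first took effect in 2023)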
  • UK Modern Slavery Act: Requires annual statements
  • California Transparency in Supply Chains Act: Mandates specific disclosures

What Documents Need Review?

Supplier Codes of Conduct

You’ll need to extract the policy existence and date, coverage scope (employees, contractors), specific provisions (child labor, forced labor, discrimination), enforcement mechanisms, and signature details. The challenge here is that these documents come in varying formats—PDFs, Word documents, web pages.

Certifications and Audits

Common certifications include ISO 14001 (environmental management), ISO 45001 (occupational health and safety), SA8000 (social accountability), EcoVadis (sustainability rating), and SEDEX (supplier ethical data exchange).

For each certification, extract the certificate number, issue and expiry dates, certification body, scope (which facilities are covered), and any scores or ratings.

Research from 2025 shows that 79% of companies struggle with supplier data availability. AI-powered extraction can improve response rates from 40% (manual processes) to over 65%, while cutting review time by 90%.

Policy Documents

Key policies to review include health and safety, human rights, anti-bribery and corruption, whistleblower protection, and data privacy. Extract whether each policy exists, when it was last updated, the key commitments it makes, and who approved it.
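
One way to turn the policy review into data is to compare what a supplier submitted against a required list. The sketch below assumes each reviewed policy arrives as a dict with hypothetical `policy_type` and `exists` keys; those key names and the identifiers in `REQUIRED_POLICIES` are illustrative, not part of any specific tool.

```python
# Required policy types, taken from the list in this section.
# The identifier spellings are illustrative assumptions.
REQUIRED_POLICIES = [
    "health_and_safety",
    "human_rights",
    "anti_bribery_corruption",
    "whistleblower_protection",
    "data_privacy",
]


def find_missing_policies(policy_docs: list[dict]) -> list[str]:
    """Return required policy types with no existing policy document."""
    present = {
        doc.get("policy_type")
        for doc in policy_docs
        if doc.get("exists", False)
    }
    return [p for p in REQUIRED_POLICIES if p not in present]


submitted = [
    {"policy_type": "health_and_safety", "exists": True},
    {"policy_type": "human_rights", "exists": True},
    {"policy_type": "data_privacy", "exists": False},
]
print(find_missing_policies(submitted))
```

A list like this could later feed a risk score, where each missing policy adds a small penalty.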

Questionnaire Responses

Suppliers may complete questionnaires through frameworks like EcoVadis, CDP supply chain, or custom assessments. Extract each response, whether supporting evidence was provided, self-assessed scores, and any improvement targets they’ve set.

How to Automate the Process

Step 1: Classify Each Document

First, automatically identify what type of document you’re working with.

# Schema for supplier compliance documents
classification_schema = {
  "type": "object",
  "properties": {
    "document_type": {
      "enum": [
        "code_of_conduct",
        "certification",
        "policy_document",
        "questionnaire_response",
        "audit_report"
      ]
    },
    "supplier_id": {"type": "string"},
    "supplier_name": {"type": "string"},
    "document_date": {"type": "string", "format": "date"}
  }
}

classification_template = {
  "name": "supplier-doc-classifier",
  "schema": classification_schema,
  "instructions": """
  Classify the supplier compliance document:
  - Code of conduct: Look for "code of conduct," "supplier code," "ethical standards"
  - Certification: Look for "certificate," "ISO," "certified by"
  - Policy: Look for "policy," "procedure," "standard"
  - Questionnaire: Look for questions, responses, ratings
  - Audit report: Look for "audit," "assessment," "findings"
  """
}
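
Before routing a document to an extractor, it may be worth checking that the classifier's output actually matches the schema. The following is a minimal sketch using only the standard library; `validate_classification` is a hypothetical helper, and it assumes the result arrives as a plain dict with dates in ISO 8601 form (YYYY-MM-DD).

```python
from datetime import date

# Allowed values mirror the enum in classification_schema above.
ALLOWED_DOCUMENT_TYPES = {
    "code_of_conduct",
    "certification",
    "policy_document",
    "questionnaire_response",
    "audit_report",
}


def validate_classification(result: dict) -> list[str]:
    """Return a list of problems; an empty list means the result is usable."""
    problems = []
    if result.get("document_type") not in ALLOWED_DOCUMENT_TYPES:
        problems.append(f"unknown document_type: {result.get('document_type')!r}")
    if not result.get("supplier_id"):
        problems.append("missing supplier_id")
    raw_date = result.get("document_date")
    if raw_date is not None:
        try:
            date.fromisoformat(raw_date)
        except (TypeError, ValueError):
            problems.append(f"document_date is not an ISO date: {raw_date!r}")
    return problems


good = {
    "document_type": "certification",
    "supplier_id": "S-001",
    "document_date": "2024-11-05",
}
bad = {"document_type": "invoice", "document_date": "05/11/2024"}
print(validate_classification(good))
print(validate_classification(bad))
```

Documents that come back with problems can be sent to a reviewer instead of being extracted with the wrong template.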

Step 2: Extract Data Based on Document Type

Once you know the document type, extract the relevant fields using schemas specific to each document category.

For codes of conduct, you’ll capture whether a policy exists, when it was dated, who it covers (employees and contractors), what it prohibits (child labor, forced labor, discrimination), how it’s enforced, and whether it’s signed.

coc_schema = {
  "type": "object",
  "properties": {
    "policy_exists": {"type": "boolean"},
    "policy_date": {"type": "string", "format": "date"},
    "covers_employees": {"type": "boolean"},
    "covers_contractors": {"type": "boolean"},
    # Nested groups need their own "type"/"properties" wrapper;
    # without it the inner fields are not treated as schema properties.
    "prohibitions": {
      "type": "object",
      "properties": {
        "child_labor": {"type": "boolean"},
        "forced_labor": {"type": "boolean"},
        "discrimination": {"type": "boolean"}
      }
    },
    "enforcement_mechanism": {
      "type": "object",
      "properties": {
        "exists": {"type": "boolean"},
        "description": {"type": "string"}
      }
    },
    "signature": {
      "type": "object",
      "properties": {
        "signed": {"type": "boolean"},
        "signatory": {"type": "string"},
        "date": {"type": "string", "format": "date"}
      }
    }
  }
}
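
Once a code of conduct has been extracted into this shape, the gaps can be listed mechanically. The sketch below is one possible approach; `find_coc_gaps` and the wording of its messages are illustrative assumptions, and it expects a plain dict with the field names used in `coc_schema`.

```python
REQUIRED_PROHIBITIONS = ("child_labor", "forced_labor", "discrimination")


def find_coc_gaps(coc: dict) -> list[str]:
    """List the gaps in an extracted code of conduct record."""
    if not coc.get("policy_exists"):
        return ["no code of conduct on file"]

    gaps = []
    prohibitions = coc.get("prohibitions") or {}
    for name in REQUIRED_PROHIBITIONS:
        if not prohibitions.get(name):
            gaps.append(f"no {name} prohibition")
    if not coc.get("covers_contractors"):
        gaps.append("contractors not covered")
    if not (coc.get("enforcement_mechanism") or {}).get("exists"):
        gaps.append("no enforcement mechanism")
    if not (coc.get("signature") or {}).get("signed"):
        gaps.append("unsigned")
    return gaps


example_coc = {
    "policy_exists": True,
    "covers_employees": True,
    "covers_contractors": False,
    "prohibitions": {
        "child_labor": True,
        "forced_labor": True,
        "discrimination": False,
    },
    "enforcement_mechanism": {"exists": True, "description": "Annual audits"},
    "signature": {"signed": True},
}
print(find_coc_gaps(example_coc))
```

Missing fields are treated as gaps here, which errs on the side of flagging a document for follow-up rather than passing it silently.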

For certifications, extract the certificate type, number, dates, certification body, scope, facilities covered, and scores.

certification_schema = {
  "type": "object",
  "properties": {
    "certificate_type": {
      "enum": ["ISO_14001", "ISO_45001", "SA8000", "EcoVadis", "SEDEX", "Other"]
    },
    "certificate_number": {"type": "string"},
    "issued_date": {"type": "string", "format": "date"},
    "expiry_date": {"type": "string", "format": "date"},
    "certification_body": {"type": "string"},
    "scope": {"type": "string"},
    "facilities_covered": {"type": "array", "items": {"type": "string"}},
    "score": {
      "type": "object",
      "properties": {
        "overall": {"type": "number"},
        "environmental": {"type": "number"},
        "social": {"type": "number"}
      }
    }
  }
}
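
Extracted expiry dates are only useful if something checks them. As a rough sketch, the helper below classifies a certificate record by its `expiry_date`; the function name, the status labels, and the 90-day warning window are all assumptions chosen for illustration, and dates are assumed to be ISO 8601 strings.

```python
from datetime import date, timedelta


def certificate_status(cert: dict, today: date, warn_days: int = 90) -> str:
    """Classify a certificate as valid, expiring_soon, expired, or unknown."""
    raw_expiry = cert.get("expiry_date")
    if not raw_expiry:
        return "unknown"
    try:
        expiry = date.fromisoformat(raw_expiry)
    except (TypeError, ValueError):
        # Unparseable dates are surfaced for review rather than guessed at.
        return "unknown"
    if expiry < today:
        return "expired"
    if expiry <= today + timedelta(days=warn_days):
        return "expiring_soon"
    return "valid"


today = date(2025, 1, 18)
for expiry in ("2024-12-31", "2025-03-01", "2026-06-30"):
    cert = {"certificate_type": "ISO_45001", "expiry_date": expiry}
    print(expiry, certificate_status(cert, today))
```

Passing `today` in explicitly keeps the check reproducible, which helps when an assessment has to be re-run for an audit.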

For questionnaires, capture the questionnaire type, each question and response, whether evidence was provided, self-assigned scores, and overall completeness.

questionnaire_schema = {
  "type": "object",
  "properties": {
    "questionnaire_type": {
      "enum": ["EcoVadis", "CDP", "Custom"]
    },
    "responses": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "question_id": {"type": "string"},
          "question": {"type": "string"},
          "response": {"type": "string"},
          "evidence_provided": {"type": "boolean"},
          "self_score": {"type": "number"}
        }
      }
    },
    "overall_score": {"type": "number"},
    "completeness_percentage": {"type": "number"}
  }
}
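
If the extractor does not return `completeness_percentage` reliably, it can be recomputed from the responses themselves. The sketch below assumes that "complete" means a non-empty `response` string, which is one reasonable definition among several; `summarize_questionnaire` and the `evidence_percentage` field are hypothetical names.

```python
def summarize_questionnaire(responses: list[dict]) -> dict:
    """Compute completeness and evidence coverage for a list of responses."""
    total = len(responses)
    if total == 0:
        return {"completeness_percentage": 0.0, "evidence_percentage": 0.0}

    # A question counts as answered when its response is a non-blank string.
    answered = [r for r in responses if (r.get("response") or "").strip()]
    with_evidence = [r for r in answered if r.get("evidence_provided")]
    return {
        "completeness_percentage": round(100 * len(answered) / total, 1),
        "evidence_percentage": round(100 * len(with_evidence) / total, 1),
    }


sample_responses = [
    {"question_id": "Q1", "response": "Yes", "evidence_provided": True},
    {"question_id": "Q2", "response": "Annual audits", "evidence_provided": True},
    {"question_id": "Q3", "response": "No", "evidence_provided": False},
    {"question_id": "Q4", "response": "", "evidence_provided": False},
]
print(summarize_questionnaire(sample_responses))
```

Tracking evidence coverage separately from completeness helps distinguish suppliers who answered every question from those who also backed their answers up.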

Step 3: Calculate Risk Scores

With extracted data in hand, calculate a social compliance risk score for each supplier. This example uses a 0-100 scale where higher scores indicate greater risk.

FIG 2.0: Algorithm for calculating supplier risk scores. Risk scoring matrix showing how policy gaps and missing certifications increase the risk score.

def calculate_social_risk_score(extracted_data: dict) -> dict:
  """Calculate social compliance risk score (0-100, higher = riskier)."""

  # Base of 25: with up to 75 penalty points below, scores span 25-100,
  # so every risk band (including "low") is reachable.
  risk_score = 25

  # Code of conduct (20 points)
  if not extracted_data.get("code_of_conduct", {}).get("policy_exists"):
    risk_score += 20

  # Certifications (15 points)
  if not extracted_data.get("certifications", []):
    risk_score += 15

  # Questionnaire responses (15 points)
  completeness = extracted_data.get("questionnaire", {}).get("completeness_percentage", 100)
  if completeness < 80:
    risk_score += 15

  # Policy gaps (2 points each, capped at 10)
  missing_policies = extracted_data.get("missing_policies", [])
  risk_score += min(10, len(missing_policies) * 2)

  # Critical audit findings (5 points each, capped at 15)
  critical_findings = extracted_data.get("audit_findings", {}).get("critical", 0)
  risk_score += min(15, critical_findings * 5)

  risk_score = min(100, risk_score)

  # Determine risk level
  if risk_score >= 80:
    risk_level = "critical"
    recommendation = "Immediate action required, consider replacement"
  elif risk_score >= 60:
    risk_level = "high"
    recommendation = "Require improvement plan, monitor closely"
  elif risk_score >= 40:
    risk_level = "medium"
    recommendation = "Monitor and encourage improvement"
  else:
    risk_level = "low"
    recommendation = "Standard monitoring"

  return {
    "risk_score": risk_score,
    "risk_level": risk_level,
    "recommendation": recommendation
  }

Building the Pipeline

Here’s how the complete workflow fits together:

FIG 1.0: End-to-end automated due diligence pipeline. Diagram showing the flow from supplier portal to classification to risk scoring.

Complete Implementation

This Python function shows the complete workflow for processing supplier documents.

import os

from leapocr import LeapOCR

client = LeapOCR(api_key=os.getenv("LEAPOCR_API_KEY"))

def process_supplier_documents(supplier_id: str, documents: list[str]):
  """Process all compliance documents for a supplier."""

  extracted_data = {
    "supplier_id": supplier_id,
    "documents": []
  }

  for doc_path in documents:
    # Classify document
    classification = client.ocr.process_file(
      file_path=doc_path,
      format="structured",
      template_slug="supplier-doc-classifier"
    )

    class_result = client.ocr.wait_until_done(classification["job_id"])
    doc_type = class_result["pages"][0]["result"]["document_type"]

    # Extract based on type
    extraction_template = f"supplier-{doc_type}-extractor"
    extraction = client.ocr.process_file(
      file_path=doc_path,
      format="structured",
      template_slug=extraction_template
    )

    extract_result = client.ocr.wait_until_done(extraction["job_id"])
    data = extract_result["pages"][0]["result"]

    extracted_data["documents"].append({
      "type": doc_type,
      "data": data,
      "confidence": extract_result["pages"][0].get("confidence_score", 0)
    })

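  # Reshape per-document results into the keys calculate_social_risk_score reads.
  # "missing_policies" and "audit_findings" stay unset here because this guide
  # defines no extraction schema for them; add them once those extractors exist.
  scoring_input = {"certifications": []}
  for doc in extracted_data["documents"]:
    if doc["type"] == "code_of_conduct":
      scoring_input["code_of_conduct"] = doc["data"]
    elif doc["type"] == "certification":
      scoring_input["certifications"].append(doc["data"])
    elif doc["type"] == "questionnaire_response":
      scoring_input["questionnaire"] = doc["data"]

  # Calculate risk score
  risk_score = calculate_social_risk_score(scoring_input)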

  # Save to database
  save_supplier_assessment(extracted_data, risk_score)

  return {
    "supplier_id": supplier_id,
    "risk_score": risk_score["risk_score"],
    "risk_level": risk_score["risk_level"]
  }
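
The pipeline stores a confidence value for every document, which can be used to decide what a human still needs to look at. The sketch below assumes confidence is reported on a 0 to 1 scale and uses an arbitrary 0.8 cutoff; both are assumptions to verify against your own extraction results, and `split_by_confidence` is a hypothetical helper.

```python
def split_by_confidence(
    documents: list[dict], threshold: float = 0.8
) -> tuple[list[dict], list[dict]]:
    """Separate documents into auto-accepted and manual-review lists."""
    accepted, needs_review = [], []
    for doc in documents:
        # A missing confidence value is treated as 0, so it goes to review.
        if doc.get("confidence", 0) >= threshold:
            accepted.append(doc)
        else:
            needs_review.append(doc)
    return accepted, needs_review


processed = [
    {"type": "certification", "data": {}, "confidence": 0.97},
    {"type": "code_of_conduct", "data": {}, "confidence": 0.62},
    {"type": "questionnaire_response", "data": {}},
]
accepted, needs_review = split_by_confidence(processed)
print(len(accepted), "accepted,", len(needs_review), "for manual review")
```

Comparing the manual-review pile against reviewer corrections during a pilot is one way to tune the threshold.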

Real-World Example

An automotive company needed to assess 150 suppliers for German LkSG compliance. Each supplier submitted 5-8 documents including codes of conduct, certifications, policies, and questionnaires.

Manual processing would have required 45 minutes per supplier, totaling 112.5 person-days. At €1,000 per day, that’s €112,500 and a four-month timeline.

Using automated processing, they spent 5 minutes per supplier (upload plus review), totaling 12.5 person-days. The cost came to €12,500 for labor plus €750 for API usage—a €13,250 total. The entire project took three weeks.

The results: an 88% cost reduction (€112,500 down to €13,250, before tool costs), 75% time savings, 12 high-risk suppliers identified for engagement, and 37 suppliers flagged as missing critical certifications.

Costs and Benefits

For a project involving 150 suppliers, here’s how the numbers compare:

FIG 3.0: Cost comparison, manual vs. automated due diligence. Chart showing 86% savings with automation.

| Cost Component | Manual | Automated | Savings |
|---|---|---|---|
| Labor | €112,500 | €12,500 | €100,000 |
| API costs | €0 | €750 | -€750 |
| Tools | €0 | €2,500 | -€2,500 |
| Total | €112,500 | €15,750 | €96,750 (86% savings) |
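
The totals in the table can be reproduced with a few lines of arithmetic, which is handy when substituting your own day rates or API pricing. The figures below are copied from the table; the variable names are illustrative.

```python
# Figures from the cost table above, in euros.
manual = {"labor": 112_500, "api": 0, "tools": 0}
automated = {"labor": 12_500, "api": 750, "tools": 2_500}

manual_total = sum(manual.values())
automated_total = sum(automated.values())
savings = manual_total - automated_total
savings_pct = 100 * savings / manual_total

print(f"Manual total:    €{manual_total:,}")
print(f"Automated total: €{automated_total:,}")
print(f"Savings:         €{savings:,} ({savings_pct:.0f}%)")
```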

Beyond cost savings, automation provides several strategic advantages:

Risk mitigation: Identify high-risk suppliers before problems occur, address compliance gaps proactively, and protect your brand reputation.

Regulatory readiness: Generate LkSG compliance documentation, maintain an audit trail for all assessments, and apply a standardized scoring methodology.

Supplier development: Engage with suppliers based on their risk scores, track improvements over time, and recognize leaders.

Scalability: Assess new suppliers in hours rather than days, re-assess existing suppliers annually, and expand the program to Tier 2 and Tier 3 suppliers.
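
To act on the scores, the assessments can be grouped into an engagement queue. The sketch below assumes each assessment is a dict shaped like the return value of `process_supplier_documents` (`supplier_id`, `risk_score`, `risk_level`); `build_engagement_queue` is a hypothetical helper, not part of any library.

```python
from collections import defaultdict


def build_engagement_queue(assessments: list[dict]) -> dict[str, list[str]]:
    """Group supplier IDs by risk level, riskiest first within each level."""
    ranked = sorted(assessments, key=lambda a: a["risk_score"], reverse=True)
    queue = defaultdict(list)
    for assessment in ranked:
        queue[assessment["risk_level"]].append(assessment["supplier_id"])
    return dict(queue)


assessments = [
    {"supplier_id": "S-001", "risk_score": 85, "risk_level": "critical"},
    {"supplier_id": "S-002", "risk_score": 45, "risk_level": "medium"},
    {"supplier_id": "S-003", "risk_score": 65, "risk_level": "high"},
    {"supplier_id": "S-004", "risk_score": 70, "risk_level": "high"},
]
for level, supplier_ids in build_engagement_queue(assessments).items():
    print(level, supplier_ids)
```

Storing one such queue per assessment cycle also gives a simple basis for tracking which suppliers move between risk levels over time.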

Getting Started

30-Day Implementation Plan

Week 1: Template Development

  • Days 1-3: Build classification template
  • Days 4-5: Build extraction templates (code of conduct, certifications, questionnaires)
  • Days 6-7: Test on 20 sample documents

Week 2: Pipeline Integration

  • Days 8-10: Build risk scoring logic
  • Days 11-12: Set up database schema and API
  • Days 13-14: Develop dashboard

Week 3: Pilot Testing

  • Days 15-17: Process 50 suppliers as a pilot
  • Days 18-19: Validate results against manual review
  • Days 20-21: Refine templates and scoring based on findings

Week 4: Rollout

  • Days 22-24: Process remaining suppliers
  • Days 25-26: Generate reports for procurement team
  • Days 27-28: Plan engagement with high-risk suppliers
  • Days 29-30: Present to stakeholders and plan Phase 2

Wrapping Up

Social compliance due diligence is both a regulatory requirement and a business necessity. Manual review doesn’t scale, but automation makes it feasible.

For 150 suppliers, automation delivers an 86% cost reduction (from €112K to €16K) and 75% faster assessments (from four months to three weeks). You get consistent scoring without human bias and audit-ready documentation with full traceability.

Companies that automate supplier due diligence now gain a competitive advantage: lower risk, faster supplier onboarding, and better supplier relationships.


