PDF Vector

PDF Extraction API with JSON Schema

Extract structured data from PDF documents using AI and JSON Schema. Perfect for invoices, forms, research papers, and more.

  • Schema-Driven Extraction – Define your data structure with JSON Schema and get results that conform to it
  • AI-Powered Understanding – AI analyzes each document to extract the fields you ask for, handling layout variations gracefully
  • Type-Safe Integration – JSON Schema validation ensures consistent, predictable data structure for your applications

PDF Extract API

import { readFile } from "fs/promises";
import { PDFVector } from "pdfvector";

const client = new PDFVector({
  apiKey: "pdfvector_xxxxxxx"
});

// Extract invoice data with schema
const invoiceResult = await client.extract({
  url: "https://example.com/invoice.pdf",
  prompt: "Extract all invoice details from this document",
  schema: {
    type: "object",
    properties: {
      invoiceNumber: { type: "string" },
      totalAmount: { type: "number" },
    },
    required: ["invoiceNumber", "totalAmount"],
    additionalProperties: false
  }
});

// Extract a title and summary from a local file, passed as raw bytes
const paperResult = await client.extract({
  data: await readFile("research.pdf"),
  contentType: "application/pdf",
  prompt: "Summarize this file",
  schema: {
    type: "object",
    properties: {
      title: { type: "string" },
      summary: { type: "string" },
    },
    required: ["title", "summary"],
    additionalProperties: false
  }
});

Structured Data Extraction

Extract exactly the data you need from PDFs using JSON Schema. Perfect for automating invoice processing, form data extraction, contract analysis, and converting unstructured documents into structured, actionable data.

Schema-Driven Extraction

Define exactly what data you need using JSON Schema. Get perfectly structured, validated data from invoices, forms, contracts, research papers, and any PDF document.
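
As a rough illustration of defining a richer structure, the sketch below extends the invoice schema with an optional field and a list of line items. The field names (`dueDate`, `lineItems`, `description`, `quantity`, `unitPrice`) are hypothetical, and it assumes the API accepts nested objects and arrays, which this page does not confirm.

```typescript
// Hypothetical schema: every field name beyond invoiceNumber and totalAmount is illustrative.
const invoiceWithItemsSchema = {
  type: "object",
  properties: {
    invoiceNumber: { type: "string" },
    totalAmount: { type: "number" },
    // Not listed in `required`, so a document without a due date can still validate.
    dueDate: { type: "string" },
    lineItems: {
      type: "array",
      items: {
        type: "object",
        properties: {
          description: { type: "string" },
          quantity: { type: "number" },
          unitPrice: { type: "number" },
        },
        required: ["description", "unitPrice"],
        additionalProperties: false,
      },
    },
  },
  required: ["invoiceNumber", "totalAmount", "lineItems"],
  additionalProperties: false,
} as const;

// List the top-level fields that a result is allowed to omit.
const optionalFields = Object.keys(invoiceWithItemsSchema.properties).filter(
  (name) => !(invoiceWithItemsSchema.required as readonly string[]).includes(name),
);
```

Leaving a field out of `required` is the standard JSON Schema way to say it may be absent, which suits documents that do not always carry the same information.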

AI-Powered Intelligence

Advanced AI understands document context and structure. Handles variations, missing fields, and complex layouts gracefully while maintaining data accuracy.

Type-Safe Integration

JSON Schema validation ensures consistent data structure. Generate TypeScript types from schemas for compile-time safety in your applications.
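
One way to get compile-time types from a schema is sketched below. It is a minimal, hand-rolled mapping that only covers flat objects with `string`, `number`, and `boolean` fields; `FromFlatSchema` and `matchesFlatSchema` are hypothetical helpers written for this example, not part of the `pdfvector` package.

```typescript
// Declared `as const` so TypeScript keeps the literal field names and types.
const invoiceSchema = {
  type: "object",
  properties: {
    invoiceNumber: { type: "string" },
    totalAmount: { type: "number" },
  },
  required: ["invoiceNumber", "totalAmount"],
  additionalProperties: false,
} as const;

// Shape of the flat schemas this sketch supports.
type FlatSchema = {
  readonly type: "object";
  readonly properties: { readonly [key: string]: { readonly type: string } };
  readonly required: readonly string[];
};

// Map a JSON Schema primitive type name to the matching TypeScript type.
type Primitive<T> = T extends "string" ? string
  : T extends "number" ? number
  : T extends "boolean" ? boolean
  : unknown;

// Required keys become mandatory properties; all other keys become optional.
type FromFlatSchema<S extends FlatSchema> = {
  [K in keyof S["properties"] as K extends S["required"][number] ? K : never]:
    Primitive<S["properties"][K]["type"]>;
} & {
  [K in keyof S["properties"] as K extends S["required"][number] ? never : K]?:
    Primitive<S["properties"][K]["type"]>;
};

// Resolves to { invoiceNumber: string; totalAmount: number }.
type Invoice = FromFlatSchema<typeof invoiceSchema>;

// Runtime guard that mirrors the type: required keys present, no unknown keys,
// and each value's `typeof` equal to the declared primitive type.
function matchesFlatSchema<S extends FlatSchema>(
  schema: S,
  value: unknown,
): value is FromFlatSchema<S> {
  if (typeof value !== "object" || value === null || Array.isArray(value)) return false;
  const record = value as Record<string, unknown>;
  for (const key of schema.required) {
    if (!(key in record)) return false;
  }
  for (const [key, fieldValue] of Object.entries(record)) {
    const field = schema.properties[key];
    if (field === undefined) return false; // treated as additionalProperties: false
    if (typeof fieldValue !== field.type) return false;
  }
  return true;
}
```

For real schemas with nesting, arrays, or enums, an existing library such as `json-schema-to-ts` (its `FromSchema` type) is likely a better fit than this hand-written mapping.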

Enterprise Ready

Production-grade API with high availability, detailed error handling, and comprehensive documentation. Your documents are processed securely and never stored.
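
On the calling side, error handling might look like the sketch below: a generic wrapper that retries a failed async call with exponential backoff. `withRetry` is a hypothetical helper, and because this page does not describe the errors the API returns, it retries on any thrown error; a real integration would probably retry only errors known to be transient.

```typescript
// Hypothetical helper: runs `operation`, retrying with exponential backoff on any error.
async function withRetry<T>(
  operation: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 500,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      if (attempt < maxAttempts) {
        // Wait baseDelayMs, then 2x, then 4x, and so on between attempts.
        const delayMs = baseDelayMs * 2 ** (attempt - 1);
        await new Promise((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  // Every attempt failed: surface the last error to the caller.
  throw lastError;
}
```

With the client from the sample above, a call could be wrapped as `withRetry(() => client.extract({ url, prompt, schema }))`.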

Example Extractions

Real examples of structured data extraction from various PDF documents

Original Document

Invoice

Output

AI-extracted structured data

Extraction Prompt

Extract all invoice details

JSON Schema

{
  "type": "object",
  "properties": {
    "invoiceNumber": {
      "type": "string"
    },
    "totalAmount": {
      "type": "number"
    },
    "Basic Fee wmView": {
      "type": "string"
    }
  },
  "required": [
    "invoiceNumber",
    "totalAmount"
  ],
  "additionalProperties": false
}

Extracted Data

{
  "data": {
    "invoiceNumber": "123100401",
    "totalAmount": 453.53,
    "Basic Fee wmView": "130,00 €"
  },
  "pageCount": 3,
  "creditCount": 9
}
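
The response above can be consumed with a small typed wrapper, sketched here. The `ExtractResponse` and `InvoiceData` interfaces are assumptions inferred from this single example, and `parseEuroAmount` is a hypothetical helper for turning a string such as "130,00 €" into a number, since that field was requested as a string.

```typescript
// Envelope shape inferred from the example response; not an official type.
interface ExtractResponse<T> {
  data: T;
  pageCount: number;
  creditCount: number;
}

interface InvoiceData {
  invoiceNumber: string;
  totalAmount: number;
  "Basic Fee wmView"?: string; // not in `required`, so it may be absent
}

// The example response from this page, typed with the interfaces above.
const response: ExtractResponse<InvoiceData> = {
  data: {
    invoiceNumber: "123100401",
    totalAmount: 453.53,
    "Basic Fee wmView": "130,00 €",
  },
  pageCount: 3,
  creditCount: 9,
};

// Parse an amount written with a decimal comma and optional dot thousands separators.
function parseEuroAmount(text: string): number {
  const digits = text.replace(/[^\d.,-]/g, ""); // drop currency symbol and spaces
  const normalized = digits.replace(/\./g, "").replace(",", ".");
  const amount = Number(normalized);
  if (digits === "" || Number.isNaN(amount)) {
    throw new Error(`Not an amount: ${text}`);
  }
  return amount;
}

const basicFee = parseEuroAmount(response.data["Basic Fee wmView"] ?? "0");

// Credits used per page in this one example; the page does not state a general rate.
const creditsPerPage = response.creditCount / response.pageCount;
```

Requesting the fee as `{ "type": "number" }` in the schema, as is done for `totalAmount`, would likely avoid the parsing step altogether.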

One subscription, all APIs

Start for free, then scale as you grow. No hidden fees.

Save one month with annual billing

Free

$0

Credit Card Required

Perfect for testing and small projects

  • Access to all APIs
  • 100 credits

Basic

$15/month

$176 billed annually

Great for personal projects and small businesses

  • Access to all APIs
  • 3,000 credits

Pro

$72/month

$869 billed annually

Most popular plan for growing businesses

  • Access to all APIs
  • 100,000 credits

Enterprise

$305/month

$3,663 billed annually

For large-scale applications and enterprises

  • Access to all APIs
  • 500,000 credits

Ready to Extract Structured Data from PDFs?

Join developers who use our Extract API to automatically convert PDF documents into structured JSON data. Define your schema once and get consistent, validated data from every document.

No setup fees • Integrate in minutes • Cancel anytime

Frequently asked questions