
Step Guide: PDF

Data & Files

Generate, extract, merge, or fill PDFs

Overview

The PDF step handles all PDF operations in your workflow. It supports five actions: Extract pulls data from an uploaded PDF in automatic mode, Extract Structured parses a PDF into user-defined fields, Merge combines multiple PDFs into one, Form Fill populates a PDF template with data, and Create PDF creates a styled PDF from HTML content.

When to Use

  • Generate polished PDF reports from workflow data
  • Extract text or form field data from uploaded PDFs for downstream processing
  • Merge multiple generated or uploaded PDFs into a single document
  • Fill out PDF form templates with dynamic data from previous steps

How It Works

Each action has its own input and output field sets and configuration options. Extract reads a pdf_file upload UUID and outputs the extracted content. Extract Structured reads the same input and parses it into fields you define. Merge takes a pdf_files array and combines the files in order. Form Fill takes a pre-uploaded PDF template and fills it with input field values. Create PDF converts HTML content to a styled PDF with layout options. Actions that produce a file output a pdf_url for the result along with metadata (file_name, file_size, page_count, success, error).

Actions

PDF Extract
Extract data from an uploaded PDF
How it works
Takes a pdf_file as input, which must be a valid upload UUID. In automatic mode, it outputs ocr_text (the full text content) and form_fields (the detected form field values). In structured mode, the output field set is unlocked: define custom fields and the content is parsed into them.
  • Best for general-purpose text extraction when you don't need specific fields
  • Connect to an AI Text Generate step to further process extracted text
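Because pdf_file must be a valid upload UUID, it can help to check the value before it reaches the step. The following is a minimal sketch using Python's standard uuid module. The function name is hypothetical, and the assumption that upload IDs use the canonical hyphenated UUID form is not confirmed by this guide.

```python
import uuid


def is_valid_upload_uuid(value: str) -> bool:
    """Return True if value is a UUID in canonical hyphenated form.

    Assumption: upload IDs are canonical UUID strings such as
    123e4567-e89b-12d3-a456-426614174000.
    """
    try:
        parsed = uuid.UUID(value)
    except (ValueError, AttributeError, TypeError):
        return False
    # uuid.UUID also accepts brace-wrapped and un-hyphenated input,
    # so compare against the canonical string to reject those variants.
    return str(parsed) == value.lower()
```

A check like this catches a common mapping mistake, such as passing a file name or URL where the upload UUID is expected.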
PDF Extract Structured
Extract structured data from a PDF into user-defined fields
How it works
Takes a pdf_file upload UUID as input. The output field set is unlocked: define custom fields and the content is parsed into them. This action reads the PDF directly, so no separate Extract step is needed.
  • Use when you need specific fields like invoice_number, total, or line_items rather than raw text
  • Name your output fields clearly so the parser understands what to extract
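As an illustration of clearly named output fields, the sketch below models the invoice example as a typed record and converts a step output into it. It assumes the step output arrives as a dictionary keyed by your field names and that values may arrive as strings. The class and function names are hypothetical and not part of the product.

```python
from dataclasses import dataclass, field


@dataclass
class InvoiceFields:
    """Target shape for the custom output fields (illustrative only)."""
    invoice_number: str
    total: float
    line_items: list = field(default_factory=list)


def from_step_output(output: dict) -> InvoiceFields:
    """Convert a structured-extract output dict into a typed record."""
    missing = [name for name in ("invoice_number", "total") if name not in output]
    if missing:
        raise KeyError(f"missing fields: {missing}")
    return InvoiceFields(
        invoice_number=str(output["invoice_number"]),
        # Assumption: numeric values may be returned as strings.
        total=float(output["total"]),
        line_items=list(output.get("line_items") or []),
    )
```

Specific names such as invoice_number or total describe the value you want far better than generic names such as field1.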
Merge PDF
Combine multiple PDFs into one document
How it works
Takes a pdf_files array of upload UUIDs as input. Downloads all PDFs, merges them in order, and uploads the result. Both input and output field sets are locked. Outputs {pdf_url, file_name, file_size, page_count, success, error}.
  • Use array_append write mode in field mappings to build the pdf_files array from multiple sources
  • Pages appear in the order of the pdf_files array
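The effect of the array_append write mode can be pictured with a small simulation. This sketch only models the behavior described above (each mapping appends one value to pdf_files, and order is preserved). The helper function and the placeholder IDs are hypothetical, not the platform's implementation.

```python
def array_append(inputs: dict, field: str, value) -> dict:
    """Append value to the list at inputs[field], creating the list if absent."""
    inputs.setdefault(field, []).append(value)
    return inputs


# Placeholder IDs; real values would be upload UUIDs from upstream steps.
step_inputs: dict = {}
for upload_id in ("cover-upload-id", "report-upload-id", "appendix-upload-id"):
    array_append(step_inputs, "pdf_files", upload_id)

# The merged document follows this order: cover, report, appendix.
print(step_inputs["pdf_files"])
```

Since pages follow the array order, the order in which the mappings append their values determines the layout of the merged document.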
PDF Form Fill
Fill a PDF form template with data
How it works
Requires a templateUploadId in the config pointing to a pre-uploaded PDF form template. The input field set is unlocked. Define fields matching the PDF form fields and map data from upstream steps. The flatten option removes form interactivity in the output. Outputs {pdf_url, file_name, file_size, page_count, success, error}.
  • Use the PDF form field discovery feature to see what fields are available in your template
  • Set flatten: true to produce a non-editable PDF
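One way to picture the configuration and field matching is sketched below. The templateUploadId and flatten keys come from the description above; the placeholder ID, the example field names, and the helper function are hypothetical. The helper compares the fields you plan to send against the names reported by field discovery.

```python
# Step config sketch; the ID is a placeholder, not a real upload.
form_fill_config = {
    "templateUploadId": "<template-upload-id>",
    "flatten": True,  # produce a non-editable PDF
}


def check_form_fields(template_fields: list, input_values: dict) -> dict:
    """Report input fields the template lacks and template fields left empty."""
    template = set(template_fields)
    provided = set(input_values)
    return {
        "unknown": sorted(provided - template),
        "unfilled": sorted(template - provided),
    }


# Hypothetical field names, as field discovery might list them.
report = check_form_fields(
    ["full_name", "address", "signature_date"],
    {"full_name": "Ada Lovelace", "adress": "12 Example St"},
)
print(report)
```

In this example the misspelled adress is reported as unknown, while address and signature_date are reported as unfilled.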
Create PDF
Create a styled PDF from HTML
How it works
Takes an HTML body as input and converts it to a PDF using wkhtmltopdf. Both input and output field sets are locked. Configuration options include orientation (portrait or landscape) and paper size. Outputs {pdf_url, file_name, file_size, page_count, success, error}.
  • Use AI Text Generate to build HTML content, then pipe it into this step
  • Configure paper size and orientation in the step config
  • The body field must be non-empty
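A sketch of preparing the body input is shown below: it builds a small styled HTML report and checks the two constraints mentioned above (a non-empty body and a portrait or landscape orientation). The function names are hypothetical, and treating a whitespace-only body as empty is an assumption that is stricter than the guide states.

```python
from html import escape

ALLOWED_ORIENTATIONS = {"portrait", "landscape"}


def build_report_body(title: str, rows: list) -> str:
    """Build an HTML document with inline CSS from (label, value) pairs."""
    cells = "".join(
        f"<tr><td>{escape(label)}</td><td>{escape(value)}</td></tr>"
        for label, value in rows
    )
    return (
        "<html><head><style>"
        "body { font-family: sans-serif; } "
        "td { padding: 4px 8px; border-bottom: 1px solid #ccc; }"
        "</style></head><body>"
        f"<h1>{escape(title)}</h1><table>{cells}</table>"
        "</body></html>"
    )


def check_create_pdf_input(body: str, orientation: str = "portrait") -> None:
    """Raise ValueError if the inputs would not satisfy the step."""
    # Assumption: a whitespace-only body counts as empty.
    if not body.strip():
        raise ValueError("body must be non-empty")
    if orientation not in ALLOWED_ORIENTATIONS:
        raise ValueError(f"unsupported orientation: {orientation!r}")
```

Escaping values with html.escape keeps characters such as & and < in workflow data from breaking the markup before conversion.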
Tips
  • Choose the right action when adding the step. Each has different field sets and behavior
  • Every action that produces a PDF outputs a pdf_url that can be mapped to email attachments or other file inputs
  • For report generation, the HTML content supports full CSS styling
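When a downstream consumer reads the output listed for Merge, Form Fill, and Create PDF ({pdf_url, file_name, file_size, page_count, success, error}), it is worth checking success before using pdf_url. The sketch below assumes the output is available as a dictionary with those keys. The function name and the example values are hypothetical.

```python
def require_pdf_url(output: dict) -> str:
    """Return pdf_url from a PDF step output, or raise if the step failed."""
    if not output.get("success"):
        reason = output.get("error") or "unknown error"
        raise RuntimeError(f"PDF step failed: {reason}")
    pdf_url = output.get("pdf_url")
    if not pdf_url:
        raise RuntimeError("PDF step reported success but returned no pdf_url")
    return pdf_url


# Example output with made-up values.
example_output = {
    "pdf_url": "https://example.com/files/report.pdf",
    "file_name": "report.pdf",
    "file_size": 20480,
    "page_count": 3,
    "success": True,
    "error": None,
}
print(require_pdf_url(example_output))
```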