Guide

Export an Email to PDF from the CLI

Legal hold, expense receipts, and compliance audits all need email frozen as a fixed document. A PDF is that document. Pull a message's HTML body as JSON, extract it with jq, and hand the HTML to a headless renderer (wkhtmltopdf, or Chromium with --print-to-pdf) to produce an archival PDF. This guide builds a one-command email-to-PDF pipeline that runs unattended and names files by date and message ID.

Written by Qasim Muhammad, Staff SRE

Updated June 9, 2026

Verified — CLI 3.1.17 · Gmail, Outlook · last tested June 9, 2026

Command references used in this guide: nylas email read, nylas email search, and nylas email attachments.

Why export an email to PDF instead of saving the raw message?

A PDF is a fixed, self-contained snapshot that renders identically in 10 years, which a raw .eml file or a live HTML body cannot guarantee. Legal hold, expense reimbursement, and SOC 2 audits all ask for email as a frozen document. PDF/A, the ISO 19005 archival profile, requires embedded fonts and forbids dependence on external content, so the file stays readable offline. Neither renderer used in this guide emits PDF/A by default, so strict conformance needs a separate conversion step.

The raw message format has the opposite goal. An .eml file can reference remote images by URL and depends on a mail client to render its MIME parts, so it can look different depending on what opens it. Courts in the United States routinely accept PDF exhibits as documentary evidence, and many expense tools ask for receipts as PDF. Freezing the message removes that ambiguity. The pipeline below pulls one message and produces one archival file in a single pass, typically in under two seconds for a 50 KB email.

How do I pull an email's HTML body from the CLI?

Read the message with nylas email read <message-id> --json and the .body field holds the full HTML. The command returns one JSON object per message; jq -r '.body' strips the JSON quoting and unescapes the HTML so it lands on disk as a valid document. Authentication uses OAuth tokens the tool refreshes automatically, so no SMTP or IMAP setup is needed.

You need a message ID first. Run nylas email search with a query and the --json flag, then read the .id of the result you want. The search command auto-paginates past 200 results and accepts --from, --subject, --after, and --before filters, so you can scope an export to a single sender or date range. The two commands below find an invoice and write its HTML body to a file.

# Find the message and grab its ID
MSG_ID=$(nylas email search "invoice" --from "billing@vendor.com" --json --limit 1 | jq -r '.[0].id')

# Pull the HTML body to a file
nylas email read "$MSG_ID" --json | jq -r '.body' > email.html
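
Before rendering, it can help to confirm that the extraction produced something usable. jq -r prints the bare word null when a field is absent, and an empty file gives the renderer nothing to draw. The following is a minimal sketch, assuming the body was written to a file as above; body_ok is a hypothetical helper name, not part of the CLI.

```shell
#!/bin/sh
# body_ok FILE: succeed only if FILE holds something that looks like a body.
# Hypothetical helper for illustration; not part of the nylas CLI.
body_ok() {
  # -s is true only when the file exists and is non-empty
  [ -s "$1" ] || return 1
  # jq -r prints the bare word "null" when the field is missing
  [ "$(cat "$1")" != "null" ] || return 1
  return 0
}

# Usage: body_ok email.html || echo "no HTML body for $MSG_ID" >&2
```

Failing early here keeps a missing body from turning into a blank or one-word PDF in the archive.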

How do I render the HTML to PDF with a headless browser?

Hand the HTML file to a headless renderer. wkhtmltopdf is a standalone binary built on Qt WebKit that needs no browser install, which suits servers and CI. Headless Chromium uses Blink, the engine behind Chrome, so layouts tend to match what webmail users see in the browser more closely. Both turn a local HTML file into a PDF in one command and run with no windowing system attached.

Pick by environment. On a bare Linux server or a Docker image, install wkhtmltopdf from your package manager because it carries no Chrome dependency. On a laptop or any CI runner with Chrome present, use the --headless --print-to-pdf flags that Google documents as part of Chromium's headless mode. Chrome shipped headless mode, including --print-to-pdf, in version 59 in 2017, so any modern install supports it.

# Option A: wkhtmltopdf (standalone, no browser needed)
wkhtmltopdf email.html invoice.pdf

# Option B: headless Chromium (matches inbox rendering)
chromium --headless --disable-gpu --print-to-pdf=invoice.pdf email.html

# On macOS the Chrome binary lives inside the app bundle:
"/Applications/Google Chrome.app/Contents/MacOS/Google Chrome" \
  --headless --print-to-pdf=invoice.pdf email.html
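
For scripts that run on more than one kind of machine, the choice between the two renderers can be made at run time. The sketch below is one way to do it, assuming the binaries are on PATH under the names listed; first_available and render_pdf are hypothetical helper names, and the Chromium binary names are common package names rather than an exhaustive list.

```shell
#!/bin/sh
# first_available NAME...: print the first NAME found on PATH, fail if none.
first_available() {
  for bin in "$@"; do
    if command -v "$bin" >/dev/null 2>&1; then
      printf '%s\n' "$bin"
      return 0
    fi
  done
  return 1
}

# render_pdf IN.html OUT.pdf: hypothetical wrapper around whichever renderer exists.
# Binary names are assumptions; adjust them to match your system.
render_pdf() {
  renderer=$(first_available wkhtmltopdf chromium chromium-browser google-chrome) || {
    echo "no PDF renderer found on PATH" >&2
    return 127
  }
  case $renderer in
    wkhtmltopdf) wkhtmltopdf "$1" "$2" ;;
    *) "$renderer" --headless --disable-gpu --print-to-pdf="$2" "$1" ;;
  esac
}

# Usage: render_pdf email.html invoice.pdf
```

Listing wkhtmltopdf first mirrors the server-first advice above; reorder the names if inbox-faithful rendering matters more than a small footprint.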

How do I batch-export many emails to PDF at once?

Loop over a search result and render one PDF per message. Pull the matching messages once with nylas email search --json, iterate the array with jq -c, and name each file by date and message ID so output sorts chronologically and never collides. A 50-message export finishes in well under a minute because the search runs a single time and each message then costs only one read and one render.

Scope the run by date to stay incremental. The script below filters with --after and --before, writes each HTML body to a temp file, and renders it with wkhtmltopdf. Bounding a quarterly archive run this way keeps it from re-processing mail you already filed. Because every filename embeds the message ID, re-running the job overwrites cleanly instead of duplicating.

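mkdir -p pdf-archive
# Private scratch directory; the .html extension lets wkhtmltopdf detect the type
tmpdir=$(mktemp -d)
nylas email search "*" --from "billing@vendor.com" \
  --after 2026-01-01 --before 2026-04-01 --json --limit 100 |
  jq -c '.[]' | while IFS= read -r msg; do
    # printf, unlike echo in some shells, never interprets backslashes in the JSON
    id=$(printf '%s\n' "$msg"   | jq -r '.id')
    date=$(printf '%s\n' "$msg" | jq -r '.date // "undated"')
    name="pdf-archive/${date}-${id}.pdf"
    nylas email read "$id" --json | jq -r '.body' > "$tmpdir/msg.html"
    wkhtmltopdf "$tmpdir/msg.html" "$name"
  done
rm -rf "$tmpdir"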
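
Depending on the shape of the .date and .id values, the filename built in the loop may contain spaces, colons, or slashes. A small sanitizer keeps the date-and-ID naming scheme while guaranteeing a portable path. This is a sketch under that assumption; safe_part and pdf_path are hypothetical helper names.

```shell
#!/bin/sh
# safe_part STRING: replace every character outside [A-Za-z0-9._-] with "_".
safe_part() {
  # printf without a trailing newline, so no stray "_" is appended
  printf '%s' "$1" | tr -c 'A-Za-z0-9._-' '_'
}

# pdf_path DIR DATE ID: build "DIR/DATE-ID.pdf" from sanitized parts.
pdf_path() {
  printf '%s/%s-%s.pdf\n' "$1" "$(safe_part "$2")" "$(safe_part "$3")"
}

# Usage inside the loop: name=$(pdf_path pdf-archive "$date" "$id")
```

Because the mapping is deterministic, a re-run still produces the same filename for the same message and overwrites rather than duplicates.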

Why do inline images break, and how do I fix them?

Inline images break because the HTML body references them by Content-ID, not by URL. An email that shows a logo carries a tag like <img src="cid:logo123">, and that cid: scheme — defined in RFC 2392 — resolves only against the message's own MIME parts. A headless renderer has no message context, so it draws a broken-image icon for every cid: reference.

Fix it by downloading the attachments and rewriting each cid: reference to a local file or an inlined data URI before you render. The CLI lists a message's attachments with nylas email attachments list and pulls each one with nylas email attachments download. The two-step sequence below downloads the parts into a working directory, saving each under its Content-ID so the filename matches the reference in the HTML, then a small sed rewrite points the cid: sources at the downloaded files so the renderer can find them. wkhtmltopdf needs --enable-local-file-access to read those files. About 30% of marketing and receipt emails use Content-ID images, so this step is worth wiring in for any archive that must look like the original.

# 1. List and download the inline parts for a message.
#    Assumption: each attachment object exposes a content_id field; check your
#    CLI's JSON output. Files are named by Content-ID so step 2 can find them.
mkdir -p parts
nylas email attachments list "$MSG_ID" --json |
  jq -r '.[] | [.id, (.content_id // .id | gsub("[<>]"; ""))] | @tsv' |
  while IFS="$(printf '\t')" read -r att cid; do
    nylas email attachments download "$att" "$MSG_ID" -o "parts/$cid"
  done

# 2. Rewrite cid: references in src attributes to the downloaded files, then render
sed -E "s#(src=[\"'])cid:#\1parts/#g" email.html > email-fixed.html
wkhtmltopdf --enable-local-file-access email-fixed.html invoice.pdf
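
The data URI alternative mentioned above embeds each image in the HTML itself, so the renderer needs no local file access at all. The following is a rough sketch, assuming the part has already been downloaded and its MIME type is known; inline_cid is a hypothetical helper, and the Content-ID is used as a sed pattern, so an ID containing regex metacharacters would need escaping first.

```shell
#!/bin/sh
# inline_cid HTML_FILE CONTENT_ID IMAGE_FILE MIME_TYPE
# Prints HTML_FILE with every cid:CONTENT_ID replaced by a base64 data URI.
# Hypothetical helper for illustration; not part of the nylas CLI.
inline_cid() {
  html=$1 cid=$2 img=$3 mime=$4
  script=$(mktemp) || return 1
  # base64 output may be line-wrapped; strip newlines so the URI is one token
  b64=$(base64 < "$img" | tr -d '\n')
  # base64 never contains '#', '&' or '\', so it is safe in a sed replacement.
  # The command goes in a script file to avoid argument-length limits.
  printf 's#cid:%s#data:%s;base64,%s#g\n' "$cid" "$mime" "$b64" > "$script"
  sed -f "$script" "$html"
  rm -f "$script"
}

# Usage: inline_cid email.html logo123 parts/logo123 image/png > email-fixed.html
```

Inlining makes the intermediate HTML larger, but it yields a single self-contained file that either renderer can open without extra flags.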

Next steps

Back up emails to JSON — keep the structured source alongside the PDFs
Extract email data with jq — the JSON-shaping toolkit behind the body extraction
Sync email to S3 — push the rendered PDFs to object storage for retention
Export email to CSV — a tabular index to sit beside the PDF archive
Email to Postgres — store metadata and a PDF path per message
Full command reference — every flag and subcommand documented

Provider notes: Gmail exposes the HTML body through the Gmail API, and Outlook exposes its equivalent through the Microsoft Graph mail API. The CLI normalizes both into the same .body field, so the render step is identical regardless of mailbox.