Source: https://cli.nylas.com/guides/email-to-pdf-cli

# Export an Email to PDF from the CLI

Legal hold, expense receipts, and compliance audits all need email frozen as a fixed document. A PDF is that document. Pull a message's HTML body as JSON, extract it with jq, and hand the HTML to a headless renderer (wkhtmltopdf or Chromium's `--print-to-pdf`) to produce an archival PDF. This guide builds a one-command email-to-PDF pipeline that runs unattended and names files by date and message ID.

Written by [Qasim Muhammad](https://cli.nylas.com/authors/qasim-muhammad), Staff SRE

Updated June 9, 2026

> **TL;DR:** Read a message with `nylas email read <message-id> --json`, pull the `.body` field with `jq -r`, and pipe the HTML into a headless renderer. Two renderers cover almost every machine: `wkhtmltopdf in.html out.pdf` on servers, or `chromium --headless --print-to-pdf=out.pdf in.html` where Chrome is installed. The catch most people hit — inline images render as broken icons — is solved at the end.

Command references used in this guide: [`nylas email read`](https://cli.nylas.com/docs/commands/email-read), [`nylas email search`](https://cli.nylas.com/docs/commands/email-search), and [`nylas email attachments`](https://cli.nylas.com/docs/commands/email-attachments-list).

## Why export an email to PDF instead of saving the raw message?

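A PDF is a fixed, self-contained snapshot that renders identically in 10 years, which a raw `.eml` file or a live HTML body cannot guarantee. Legal hold, expense reimbursement, and SOC 2 audits all ask for email as a frozen document. PDF/A, the ISO 19005 archival profile, goes one step further: it requires embedded fonts and forbids dependencies on external content, so the file stays readable offline. Neither renderer used in this guide emits PDF/A by default, so treat their output as a standard PDF unless you add a separate conversion step.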

The raw message format has the opposite goal. An `.eml` file can reference remote images by URL and depends on a mail client to render its MIME parts, so it can look different depending on what opens it. Courts in the United States accept PDF exhibits as documentary evidence, and many expense tools expect receipts as PDF files. Freezing the message removes that ambiguity. The pipeline below pulls one message and produces one archival file in a single pass, under two seconds for a typical 50 KB email.

## How do I pull an email's HTML body from the CLI?

Read the message with `nylas email read <message-id> --json`; the `.body` field of the output holds the full HTML. The command returns one JSON object per message, and `jq -r '.body'` strips the JSON quoting and unescapes the HTML so it lands on disk as a valid document. Authentication uses OAuth tokens the tool refreshes automatically, so no SMTP or IMAP setup is needed.

You need a message ID first. Run `nylas email search` with a query and the `--json` flag, then read the `.id` of the result you want. The search command auto-paginates past 200 results and accepts `--from`, `--subject`, `--after`, and `--before` filters, so you can scope an export to a single sender or date range. The two commands below find an invoice and write its HTML body to a file.

```bash
# Find the message and grab its ID
MSG_ID=$(nylas email search "invoice" --from "billing@vendor.com" --json --limit 1 | jq -r '.[0].id')

# Pull the HTML body to a file
nylas email read "$MSG_ID" --json | jq -r '.body' > email.html
```
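The two commands above assume the search matched something and that the saved body is a complete HTML document. The sketch below is a slightly more defensive variant. `wrap_html` is a hypothetical helper name, and the idea that some bodies arrive as bare fragments without a charset declaration is an assumption, not something the CLI documents. Wrapping a fragment in a minimal UTF-8 shell may help the renderer choose the right encoding.

```bash
# Hypothetical helper: make sure the saved body is a full HTML document.
# Usage: wrap_html SRC.html DST.html
wrap_html() {
  local src="$1" dst="$2"
  if grep -qi '<html' "$src"; then
    cp "$src" "$dst"        # already a full document: copy it unchanged
  else
    {
      printf '<!doctype html>\n<html><head><meta charset="utf-8"></head><body>\n'
      cat "$src"
      printf '\n</body></html>\n'
    } > "$dst"
  fi
}

# Usage with the commands above, guarding against an empty search result
# (jq -r prints the string "null" when .[0].id does not exist):
#   [ -n "$MSG_ID" ] && [ "$MSG_ID" != "null" ] || { echo "no match" >&2; exit 1; }
#   wrap_html email.html email-wrapped.html
```

Render `email-wrapped.html` instead of `email.html` if you adopt this step.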

## How do I render the HTML to PDF with a headless browser?

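Hand the HTML file to a headless renderer. `wkhtmltopdf` is a standalone binary built on Qt WebKit that needs no browser install, which suits servers and CI. Headless Chromium uses Blink, the engine behind Chrome and other Chromium-based browsers, so layouts tend to match what webmail users see in those browsers. Both turn a local HTML file into a PDF in one command. Headless Chromium needs no windowing system; `wkhtmltopdf` builds linked against an unpatched Qt may still require a virtual X server such as `xvfb`, so test on the target host.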

Pick by environment. On a bare Linux server or a Docker image, install `wkhtmltopdf` from your package manager because it carries no Chrome dependency. On a laptop or any CI runner with Chrome present, use the `--headless --print-to-pdf` flags that Google documents as part of [Chromium's headless mode](https://developer.chrome.com/docs/chromium/new-headless). Chromium added native PDF printing in version 59, released in 2017, so any modern install supports it.

```bash
# Option A: wkhtmltopdf (standalone, no browser needed)
wkhtmltopdf email.html invoice.pdf

# Option B: headless Chromium (matches inbox rendering)
chromium --headless --disable-gpu --print-to-pdf=invoice.pdf email.html

# On macOS the Chrome binary lives inside the app bundle:
"/Applications/Google Chrome.app/Contents/MacOS/Google Chrome" \
  --headless --print-to-pdf=invoice.pdf email.html
```
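For a script that has to run on more than one kind of machine, the choice between the options above can be made at runtime. The sketch below is one way to do that. `pick_renderer` and `render_pdf` are hypothetical helper names, and the list of Chromium binary names is a set of common defaults, not an exhaustive one.

```bash
# Print the first renderer found on PATH; return 1 if none is installed.
# The candidate binary names are common defaults and are an assumption.
pick_renderer() {
  local bin
  for bin in wkhtmltopdf chromium chromium-browser google-chrome; do
    if command -v "$bin" >/dev/null 2>&1; then
      printf '%s\n' "$bin"
      return 0
    fi
  done
  return 1
}

# Hypothetical wrapper. Usage: render_pdf IN.html OUT.pdf
render_pdf() {
  local renderer
  renderer=$(pick_renderer) || { echo "no PDF renderer found" >&2; return 1; }
  case "$renderer" in
    wkhtmltopdf) wkhtmltopdf "$1" "$2" ;;
    *)           "$renderer" --headless --disable-gpu --print-to-pdf="$2" "$1" ;;
  esac
}
```

With both functions defined, `render_pdf email.html invoice.pdf` replaces whichever option you would otherwise hard-code. The order of the candidate list sets the preference, so move the Chromium names first if inbox fidelity matters more than a small footprint.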

## How do I batch-export many emails to PDF at once?

Loop over a search result and render one PDF per message. Pull the matching messages once with `nylas email search --json`, iterate the array with `jq -c`, and name each file by date and message ID so output sorts chronologically and never collides. A 50-message export finishes in well under a minute because the search runs once and each message then costs only one read and one render.

Scope the run by date to stay incremental. The script below filters with `--after` and `--before`, writes each HTML body to a temp file, and renders it with `wkhtmltopdf`. Bounding a quarterly archive run this way keeps it from re-processing mail you already filed. Because every filename embeds the message ID, re-running the job overwrites cleanly instead of duplicating.

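```bash
mkdir -p pdf-archive
tmpdir=$(mktemp -d)                 # private scratch directory for HTML bodies
trap 'rm -rf "$tmpdir"' EXIT        # clean up even if the run is interrupted
nylas email search "*" --from "billing@vendor.com" \
  --after 2026-01-01 --before 2026-04-01 --json --limit 100 |
  jq -c '.[]' | while read -r msg; do
    id=$(printf '%s' "$msg"   | jq -r '.id')
    date=$(printf '%s' "$msg" | jq -r '.date // "undated"')
    name="pdf-archive/${date}-${id}.pdf"
    # </dev/null stops the inner commands from consuming the loop's input
    nylas email read "$id" --json </dev/null | jq -r '.body' > "$tmpdir/msg.html"
    wkhtmltopdf "$tmpdir/msg.html" "$name" </dev/null
  done
```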
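The loop above builds each filename directly from the `.date` and `.id` values, which is fine as long as neither contains characters that are awkward in a path. The sketch below adds two small guards. `safe_name` and `needs_render` are hypothetical helper names, and the set of characters treated as safe is a conservative assumption.

```bash
# Hypothetical helper: build a filesystem-safe PDF name from a date and an ID.
# Every character outside A-Z, a-z, 0-9, dot, underscore, and hyphen
# is replaced with "_".
safe_name() {
  local stem
  stem=$(printf '%s-%s' "$1" "$2" | tr -c 'A-Za-z0-9._-' '_')
  printf '%s.pdf\n' "$stem"
}

# Succeeds (exit 0) when the target PDF is missing or empty,
# meaning the message still needs to be rendered.
needs_render() {
  [ ! -s "$1" ]
}
```

Inside the loop, `name="pdf-archive/$(safe_name "$date" "$id")"` followed by `needs_render "$name" || continue` would skip messages that already have a PDF, trading the overwrite-on-rerun behavior for a faster incremental run.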

## Why do inline images break, and how do I fix them?

Inline images break because the HTML body references them by Content-ID, not by URL. An email that shows a logo carries a tag like `<img src="cid:logo123">`, and that `cid:` scheme — defined in [RFC 2392](https://datatracker.ietf.org/doc/html/rfc2392) — resolves only against the message's own MIME parts. A headless renderer has no message context, so it draws a broken-image icon for every `cid:` reference.

Fix it by downloading the attachments and rewriting each `cid:` reference to a local file or an inlined data URI before you render. The CLI lists a message's attachments with `nylas email attachments list` and pulls each one with `nylas email attachments download`. The two-step sequence below downloads the parts into a working directory, saving each one under its Content-ID so the filename matches what the HTML asks for. A small `sed` rewrite then points the `cid:` sources at those files so the renderer can find them. The `--enable-local-file-access` flag is needed because recent `wkhtmltopdf` releases block local file reads by default. About 30% of marketing and receipt emails use Content-ID images, so this step is worth wiring in for any archive that must look like the original.

```bash
# 1. Download each attachment, naming the file after its Content-ID.
#    Assumption: the JSON exposes a .content_id field; falls back to .id.
#    Angle brackets around the Content-ID are stripped to match the cid: URL.
mkdir -p parts
nylas email attachments list "$MSG_ID" --json |
  jq -r '.[] | [.id, (.content_id // .id | gsub("[<>]"; ""))] | @tsv' |
  while IFS=$'\t' read -r att cid; do
    nylas email attachments download "$att" "$MSG_ID" -o "parts/$cid" </dev/null
  done

# 2. Rewrite cid: references to the downloaded files, then render
sed 's#cid:#parts/#g' email.html > email-fixed.html
wkhtmltopdf --enable-local-file-access email-fixed.html invoice.pdf
```
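The data URI route mentioned above avoids local file access altogether, because the image bytes travel inside the HTML. A minimal sketch follows. `data_uri` is a hypothetical helper name, and the Content-ID, file path, and MIME type in the usage comment are illustrative values.

```bash
# Hypothetical helper. Usage: data_uri FILE MIME-TYPE
# Prints a data: URI with the file's bytes base64-encoded on a single line.
data_uri() {
  printf 'data:%s;base64,%s\n' "$2" "$(base64 < "$1" | tr -d '\n')"
}

# Example: inline one image whose Content-ID is "logo123" (illustrative).
# "#" is a safe sed delimiter because base64 output never contains it.
#   sed "s#cid:logo123#$(data_uri parts/logo.png image/png)#g" \
#     email.html > email-inline.html
```

Substituting through `sed` puts the whole encoded image on the command line, so this form suits logos and small icons. Larger images can exceed the shell's argument-length limit and are better served by the local-file rewrite.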

## Next steps

- [Back up emails to JSON](https://cli.nylas.com/guides/backup-emails-to-json) — keep the structured source alongside the PDFs
- [Extract email data with jq](https://cli.nylas.com/guides/extract-email-data-jq) — the JSON-shaping toolkit behind the body extraction
- [Sync email to S3](https://cli.nylas.com/guides/sync-email-to-s3) — push the rendered PDFs to object storage for retention
- [Export email to CSV](https://cli.nylas.com/guides/email-to-csv-export) — a tabular index to sit beside the PDF archive
- [Email to Postgres](https://cli.nylas.com/guides/email-to-postgres) — store metadata and a PDF path per message
- [Full command reference](https://cli.nylas.com/docs/commands) — every flag and subcommand documented

Provider notes: Gmail serves the HTML body documented in the [Gmail API guides](https://developers.google.com/workspace/gmail/api/guides), and Outlook serves its equivalent through the [Microsoft Graph mail API](https://learn.microsoft.com/en-us/graph/api/resources/mail-api-overview). The CLI normalizes both into the same `.body` field, so the render step is identical regardless of mailbox.