Source: https://cli.nylas.com/guides/email-to-postgres

# Load Email into Postgres from the Terminal

When you need to query, join, or analyze email — not just read it — a database beats a mailbox. PostgreSQL handles email well, especially with JSONB for the raw message and typed columns for the fields you filter on. The Nylas CLI gives you each message as JSON; psql loads it. This guide builds an email-to-Postgres loader with an idempotent upsert keyed on the message ID, so re-runs never duplicate rows, across six providers.

Written by [Caleb Geene](https://cli.nylas.com/authors/caleb-geene) Director, Site Reliability Engineering

Updated June 8, 2026

> **TL;DR:** Pull messages with `nylas email search --json`, then load them into Postgres with `psql` using a JSONB column for the raw message plus typed columns for the fields you query. Key the table on the message ID and use `INSERT... ON CONFLICT DO UPDATE` so re-runs upsert instead of duplicating. The CLI handles the inbox across six providers; Postgres handles storage and queries.

Command references used in this guide: [`nylas email search`](https://cli.nylas.com/docs/commands/email-search), [`nylas email list`](https://cli.nylas.com/docs/commands/email-list), and [`nylas email read`](https://cli.nylas.com/docs/commands/email-read).

## What schema should you use for email?

Use a hybrid schema: a `JSONB` column for the full raw message plus typed columns for the few fields you filter and sort on. JSONB keeps every field without forcing a rigid schema, and Postgres can index inside it with a GIN index, per the [PostgreSQL JSON documentation](https://www.postgresql.org/docs/current/datatype-json.html). Promote subject, sender, and date to real columns so common queries use a btree index instead of scanning JSON.

Make the message ID the primary key. The CLI's message IDs are stable per message, so a primary key on that column is what makes loading idempotent — a re-run upserts the same row rather than inserting a copy. This one decision is the difference between a loader you can run safely on a schedule and one that doubles your row count every night.

```sql
CREATE TABLE IF NOT EXISTS email (
  id        text PRIMARY KEY,
  thread_id text,
  sender    text,
  subject   text,
  received  timestamptz,
  raw       jsonb NOT NULL
);
CREATE INDEX IF NOT EXISTS email_received_idx ON email (received DESC);
CREATE INDEX IF NOT EXISTS email_raw_gin ON email USING gin (raw);
```

## How do you load messages with psql?

Load by piping the CLI's JSON into a transform that emits one row per message, then insert with `psql`. The cleanest path stages the JSON into a temporary table as JSONB, then inserts into the typed table by extracting fields with `->>` operators. That keeps the raw message intact while populating the columns you query, in one SQL statement.

Wrap the insert in `ON CONFLICT (id) DO UPDATE` so a message already present is refreshed rather than rejected. Run it inside a transaction so a malformed batch rolls back cleanly instead of leaving a half-loaded table. The whole loader is a pull, a stage, and an upsert — three steps you can put behind a cron job.

```bash
# Pull, stage as JSONB, and upsert into the typed table
nylas email search "newer_than:1d" --json --limit 200 \
  | jq -c '.[]' \
  | psql "$DATABASE_URL" -v ON_ERROR_STOP=1 -c "
      CREATE TEMP TABLE stage (doc jsonb);
      COPY stage (doc) FROM STDIN;
      INSERT INTO email (id, thread_id, sender, subject, received, raw)
      SELECT doc->>'id', doc->>'thread_id',
             doc#>>'{from,0,email}', doc->>'subject',
             to_timestamp((doc->>'date')::bigint), doc
      FROM stage
      ON CONFLICT (id) DO UPDATE
        SET subject = EXCLUDED.subject, raw = EXCLUDED.raw;
  "
```

## How do you query the loaded email?

Query it like any table: filter on the typed columns for speed, and reach into `raw` with JSONB operators for fields you didn't promote. Counting messages per sender, finding everything in a date range, or joining email to your application's users becomes plain SQL — the thing a mailbox can't do. A GIN index on `raw` keeps containment queries fast even on large tables.

This is the payoff over a flat export: a database answers analytical questions a JSON file can't. Top senders this week, average response time, threads with no reply — all are one query away once the data is in Postgres. Keep the loader idempotent and scheduled, and the table stays a current, queryable mirror of the inbox.

```sql
-- Top 10 senders in the last 7 days
SELECT sender, count(*) AS n
FROM email
WHERE received > now() - interval '7 days'
GROUP BY sender ORDER BY n DESC LIMIT 10;
```

## Next steps

- [Back up emails to JSON](https://cli.nylas.com/guides/backup-emails-to-json) — a file export instead of a database
- [Import email to a graph database](https://cli.nylas.com/guides/import-email-graph-database) — model threads as a graph
- [Sync email to S3](https://cli.nylas.com/guides/sync-email-to-s3) — object storage for archives
- [Extract email data with jq](https://cli.nylas.com/guides/extract-email-data-jq) — the JSON-shaping toolkit
- [Export email to CSV](https://cli.nylas.com/guides/email-to-csv-export) — a flat file instead of a database
- [Full command reference](https://cli.nylas.com/docs/commands) — every flag and subcommand documented
