Source: https://cli.nylas.com/guides/email-to-mysql

# Load Email into MySQL from the CLI

When you want to query, join, and report on email instead of scrolling an inbox, MySQL is the workhorse. Store the full message in a JSON column and promote the few fields you filter on into typed columns. The Nylas CLI hands you each message as JSON; jq shapes the rows; an INSERT loads them. This guide builds an email-to-MySQL loader with an idempotent upsert keyed on the message ID, so a nightly cron never duplicates a row.

Written by [Nick Barraclough](https://cli.nylas.com/authors/nick-barraclough) Product Manager

Updated June 9, 2026

> **TL;DR:** Pull messages with `nylas email search --json`, shape each one into a row with `jq`, then load it into MySQL with a single `INSERT`. Keep the raw message in a `JSON` column and promote subject, sender, and date into typed columns so filters use a real index. Key the table on the message ID and use `ON DUPLICATE KEY UPDATE` so re-runs upsert instead of duplicating. The payoff — a one-query “top senders this week” — comes at the end.

Command references used in this guide: [`nylas email search`](https://cli.nylas.com/docs/commands/email-search), [`nylas email list`](https://cli.nylas.com/docs/commands/email-list), and [`nylas email read`](https://cli.nylas.com/docs/commands/email-read).

## What MySQL schema fits email best?

The best schema for email in MySQL is hybrid: one `JSON` column for the full raw message, plus typed columns for the few fields you filter and sort on. MySQL has had a native `JSON` type since version 5.7.8 (2015), so the whole message survives without a rigid column-per-field schema, while subject, sender, and date become indexable columns.

Make the message ID the primary key. The CLI returns a stable ID per message, so a primary key on that column is what makes the loader idempotent — a re-run updates the same row instead of inserting a copy. Promote sender, subject, and received-date to `VARCHAR` and `DATETIME` columns, because a btree index on a typed column beats extracting the value out of JSON on every query. The [MySQL JSON data type documentation](https://dev.mysql.com/doc/refman/8.0/en/json.html) describes how the engine validates and stores the document.

```sql
CREATE TABLE IF NOT EXISTS email (
  id        VARCHAR(191) PRIMARY KEY,
  thread_id VARCHAR(191),
  sender    VARCHAR(320),
  subject   VARCHAR(998),
  received  DATETIME,
  raw       JSON NOT NULL,
  INDEX received_idx (received),
  INDEX sender_idx (sender)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
```

## How do I shape email rows with jq?

Shape the rows by piping `nylas email search --json` into `jq`, which extracts the fields you promoted and emits tab-separated values plus the compact raw document. The `--json` flag returns an array of messages, so `jq -c '.[]'` streams one object per line. A message with no subject would create a blank row, so fall back to a placeholder with the `//` operator.

The CLI auto-paginates a search above 200 results, so `--limit 500` returns the full set in one call rather than making you page manually. The `jq` program below pulls the message ID, thread ID, sender email from the first `from` entry, subject, and the Unix `date` field, then appends the entire message as a JSON string for the `raw` column. The [jq manual](https://jqlang.github.io/jq/manual/) documents the `@tsv` and `@json` formatters used here.

```bash
# Pull the last day of mail and shape one TSV row per message
nylas email search "*" --after 2026-06-08 --json --limit 500 \
  | jq -r '.[] | [
      .id,
      .thread_id,
      (.from[0].email // ""),
      (.subject // "(no subject)"),
      (.date | todate),
      (. | @json)
    ] | @tsv' > rows.tsv
```

## How do I INSERT rows with a JSON column?

Insert the shaped rows with the `mysql` client, loading the tab-separated file straight into the table. `LOAD DATA LOCAL INFILE` is the fastest path for a batch — MySQL reads the file in one streaming pass instead of parsing thousands of individual `INSERT` statements. Map the sixth TSV field into the `JSON` column and the engine validates the document on the way in.

For per-message inserts or webhook-driven loads, use `INSERT... ON DUPLICATE KEY UPDATE` so a message already present is refreshed rather than rejected with a 1062 duplicate-key error. That clause is MySQL's upsert, documented in the [INSERT... ON DUPLICATE KEY UPDATE reference](https://dev.mysql.com/doc/refman/8.0/en/insert-on-duplicate.html). Keyed on the message ID, the loader is safe to run every five minutes without duplicating a single row.

```bash
# Bulk-load the TSV; raw lands in the JSON column
mysql --local-infile=1 "$DB" -e "
  LOAD DATA LOCAL INFILE 'rows.tsv'
  INTO TABLE email
  FIELDS TERMINATED BY '\t'
  (id, thread_id, sender, subject, received, @raw)
  SET raw = @raw;
"

# Single-message upsert (for a webhook or cron tail)
mysql "$DB" -e "
  INSERT INTO email (id, thread_id, sender, subject, received, raw)
  VALUES (?, ?, ?, ?, ?, ?)
  ON DUPLICATE KEY UPDATE subject = VALUES(subject), raw = VALUES(raw);
"
```

## How do I query the loaded email?

Query it like any InnoDB table: filter on the typed columns for index speed, and reach into `raw` with the `->>` operator for fields you never promoted. Counting messages per sender, ranging over a date window, or joining email to your application's user table becomes plain SQL — the thing a mailbox cannot do. A query on the indexed `received` column returns in under 10 milliseconds even past 1 million rows.

This is the payoff teased at the top. “Top senders this week” is one `GROUP BY`; a CC count or an attachment filter reads straight out of the JSON document with `JSON_EXTRACT` or the `->>` shorthand. If a filter becomes hot, promote it to a generated column and index that, per the [MySQL generated columns documentation](https://dev.mysql.com/doc/refman/8.0/en/create-table-generated-columns.html). Keep the loader idempotent and scheduled, and the table stays a current, queryable mirror of the inbox.

```sql
-- Top 10 senders in the last 7 days
SELECT sender, COUNT(*) AS n
FROM email
WHERE received > NOW() - INTERVAL 7 DAY
GROUP BY sender ORDER BY n DESC LIMIT 10;

-- Pull a field you never promoted, straight from the JSON column
SELECT id, raw->>'$.snippet' AS preview
FROM email
WHERE subject LIKE '%invoice%';
```

## How do I keep the MySQL table in sync?

Keep the email table current by running the pull-shape-load pipeline on a schedule and relying on the primary-key upsert for safety. A cron entry every 5 minutes scoped to `--after` yesterday processes only recent mail, and `ON DUPLICATE KEY UPDATE` means an overlapping window never duplicates a row. The whole job is three commands behind a single shell script.

For near-real-time loads, drive the same single-message upsert from a webhook instead of a poll. A `message.created` trigger fires when mail arrives, and the handler runs one `INSERT... ON DUPLICATE KEY UPDATE` — the same SQL the batch loader uses. Polling adds up to 5 minutes of latency; a webhook cuts that to under a second. The mailbox grant the CLI manages is separate from the MySQL credential, so rotate each independently without touching the loader.

```bash
# /etc/cron.d/email-to-mysql — pull, shape, load every 5 minutes
*/5 * * * *  user  cd /opt/loader && \
  nylas email search "*" --after $(date -d yesterday +\%F) --json --limit 500 \
    | jq -r '.[] | [.id, .thread_id, (.from[0].email // ""), (.subject // "(no subject)"), (.date | todate), (.|@json)] | @tsv' \
    > rows.tsv && ./load.sh
```

## Next steps

- [Load email into Postgres](https://cli.nylas.com/guides/email-to-postgres) — the same pattern with JSONB and upserts
- [Load email into SQLite](https://cli.nylas.com/guides/email-to-sqlite) — a single-file database, no server
- [Export email to CSV](https://cli.nylas.com/guides/email-to-csv-export) — a flat file instead of a database
- [Index email in Elasticsearch](https://cli.nylas.com/guides/email-to-elasticsearch) — full-text search instead of SQL
- [Extract email data with jq](https://cli.nylas.com/guides/extract-email-data-jq) — the JSON-shaping toolkit
- [Full command reference](https://cli.nylas.com/docs/commands) — every flag and subcommand documented
