Guide

Load Email into MySQL from the CLI

When you want to query, join, and report on email instead of scrolling an inbox, MySQL is the workhorse. Store the full message in a JSON column and promote the few fields you filter on into typed columns. The Nylas CLI hands you each message as JSON; jq shapes the rows; an INSERT loads them. This guide builds an email-to-MySQL loader with an idempotent upsert keyed on the message ID, so a nightly cron never duplicates a row.

Written by Nick Barraclough Product Manager

Updated June 9, 2026

Verified — CLI 3.1.17 · Gmail, Outlook · last tested June 9, 2026

Command references used in this guide: nylas email search, nylas email list, and nylas email read.

What MySQL schema fits email best?

The best schema for email in MySQL is hybrid: one JSON column for the full raw message, plus typed columns for the few fields you filter and sort on. MySQL has had a native JSON type since version 5.7.8 (2015), so the whole message survives without a rigid column-per-field schema, while subject, sender, and date become indexable columns.

Make the message ID the primary key. The CLI returns a stable ID per message, so a primary key on that column is what makes the loader idempotent — a re-run updates the same row instead of inserting a copy. Promote sender, subject, and received-date to VARCHAR and DATETIME columns, because a btree index on a typed column beats extracting the value out of JSON on every query. The MySQL JSON data type documentation describes how the engine validates and stores the document.

CREATE TABLE IF NOT EXISTS email (
  id        VARCHAR(191) PRIMARY KEY,
  thread_id VARCHAR(191),
  sender    VARCHAR(320),
  subject   VARCHAR(998),
  received  DATETIME,
  raw       JSON NOT NULL,
  INDEX received_idx (received),
  INDEX sender_idx (sender)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

How do I shape email rows with jq?

Shape the rows by piping nylas email search --json into jq, which extracts the fields you promoted and emits tab-separated values plus the compact raw document. The --json flag returns an array of messages, so jq -c '.[]' streams one object per line. A message with no subject would create a blank row, so fall back to a placeholder with the // operator.

The CLI auto-paginates a search above 200 results, so --limit 500 returns the full set in one call rather than making you page manually. The jq program below pulls the message ID, thread ID, sender email from the first from entry, subject, and the Unix date field, then appends the entire message as a JSON string for the raw column. The jq manual documents the @tsv and @json formatters used here.

# Pull the last day of mail and shape one TSV row per message
nylas email search "*" --after 2026-06-08 --json --limit 500 \
  | jq -r '.[] | [
      .id,
      .thread_id,
      (.from[0].email // ""),
      (.subject // "(no subject)"),
      (.date | todate),
      (. | @json)
    ] | @tsv' > rows.tsv

How do I INSERT rows with a JSON column?

Insert the shaped rows with the mysql client, loading the tab-separated file straight into the table. LOAD DATA LOCAL INFILE is the fastest path for a batch — MySQL reads the file in one streaming pass instead of parsing thousands of individual INSERT statements. Map the sixth TSV field into the JSON column and the engine validates the document on the way in.

For per-message inserts or webhook-driven loads, use INSERT ... ON DUPLICATE KEY UPDATE so a message already present is refreshed rather than rejected with a 1062 duplicate-key error. That clause is MySQL's upsert, documented in the INSERT ... ON DUPLICATE KEY UPDATE reference. Keyed on the message ID, the loader is safe to run every five minutes without duplicating a single row.

# Bulk-load the TSV; raw lands in the JSON column
mysql --local-infile=1 "$DB" -e "
  LOAD DATA LOCAL INFILE 'rows.tsv'
  INTO TABLE email
  FIELDS TERMINATED BY '\t'
  (id, thread_id, sender, subject, received, @raw)
  SET raw = @raw;
"

# Single-message upsert (for a webhook or cron tail)
mysql "$DB" -e "
  INSERT INTO email (id, thread_id, sender, subject, received, raw)
  VALUES (?, ?, ?, ?, ?, ?)
  ON DUPLICATE KEY UPDATE subject = VALUES(subject), raw = VALUES(raw);
"

How do I query the loaded email?

Query it like any InnoDB table: filter on the typed columns for index speed, and reach into raw with the ->> operator for fields you never promoted. Counting messages per sender, ranging over a date window, or joining email to your application's user table becomes plain SQL — the thing a mailbox cannot do. A query on the indexed received column returns in under 10 milliseconds even past 1 million rows.

This is the payoff teased at the top. “Top senders this week” is one GROUP BY; a CC count or an attachment filter reads straight out of the JSON document with JSON_EXTRACT or the ->> shorthand. If a filter becomes hot, promote it to a generated column and index that, per the MySQL generated columns documentation. Keep the loader idempotent and scheduled, and the table stays a current, queryable mirror of the inbox.

-- Top 10 senders in the last 7 days
SELECT sender, COUNT(*) AS n
FROM email
WHERE received > NOW() - INTERVAL 7 DAY
GROUP BY sender ORDER BY n DESC LIMIT 10;

-- Pull a field you never promoted, straight from the JSON column
SELECT id, raw->>'$.snippet' AS preview
FROM email
WHERE subject LIKE '%invoice%';

How do I keep the MySQL table in sync?

Keep the email table current by running the pull-shape-load pipeline on a schedule and relying on the primary-key upsert for safety. A cron entry every 5 minutes scoped to --after yesterday processes only recent mail, and ON DUPLICATE KEY UPDATE means an overlapping window never duplicates a row. The whole job is three commands behind a single shell script.

For near-real-time loads, drive the same single-message upsert from a webhook instead of a poll. A message.created trigger fires when mail arrives, and the handler runs one INSERT ... ON DUPLICATE KEY UPDATE — the same SQL the batch loader uses. Polling adds up to 5 minutes of latency; a webhook cuts that to under a second. The mailbox grant the CLI manages is separate from the MySQL credential, so rotate each independently without touching the loader.

# /etc/cron.d/email-to-mysql — pull, shape, load every 5 minutes
*/5 * * * *  user  cd /opt/loader && \
  nylas email search "*" --after $(date -d yesterday +\%F) --json --limit 500 \
    | jq -r '.[] | [.id, .thread_id, (.from[0].email // ""), (.subject // "(no subject)"), (.date | todate), (.|@json)] | @tsv' \
    > rows.tsv && ./load.sh

Next steps

Load email into Postgres — the same pattern with JSONB and upserts
Load email into SQLite — a single-file database, no server
Export email to CSV — a flat file instead of a database
Index email in Elasticsearch — full-text search instead of SQL
Extract email data with jq — the JSON-shaping toolkit
Full command reference — every flag and subcommand documented