Build an AI Email Triage Agent

You get 100+ emails a day. Most are noise. An AI agent can scan your inbox, classify each message by urgency, draft replies for the important ones, and archive the rest. This guide walks through building one in Python using Nylas CLI as the email backend and any LLM (OpenAI, Anthropic, or local Ollama) as the brain. Works with Gmail, Outlook, Exchange, Yahoo, iCloud, and IMAP.

By Qasim Muhammad

The problem

The average professional receives 121 emails per day, according to the Radicati Group's 2024 Email Statistics Report. About 49% of that volume is noise: newsletters, automated notifications, marketing, and spam that made it past the filter. Manually sorting through it eats 2.5 hours per day.

An AI triage agent fixes this. It reads your unread messages, classifies each one into a priority bucket, drafts replies for anything that needs a response, and archives the rest. You review the drafts, hit send on the ones that look right, and move on.

The architecture is simple: Nylas CLI fetches the email, Python orchestrates the logic, and an LLM handles classification and drafting.

Architecture

┌──────────────┐     ┌──────────────┐     ┌─────────────┐
│  Nylas CLI   │────▶│ Python Agent │────▶│    LLM      │
│ (email I/O)  │◀────│ (orchestrator│◀────│ (classifier │
│              │     │  + actions)  │     │  + drafter) │
└──────────────┘     └──────────────┘     └─────────────┘

Flow:
1. nylas email list --unread --json  →  fetch unread messages
2. Python sends each email to LLM   →  classify as URGENT/ACTION/FYI/NOISE
3. LLM drafts replies for URGENT/ACTION emails
4. nylas email send --yes            →  send auto-replies (optional)
5. nylas email archive               →  archive NOISE
6. Repeat on cron every 15 minutes

Prerequisites

  • Nylas CLI installed and authenticated (nylas auth whoami should show your account)
  • Python 3.10+ with openai or anthropic package installed
  • An LLM API key from OpenAI, Anthropic, or a local Ollama instance running Llama 3.1
  • A connected email account (any provider works)

# Install Nylas CLI
brew install nylas/nylas-cli/nylas

# Authenticate
nylas auth login

# Install Python dependency (pick one)
pip install openai       # for OpenAI
pip install anthropic    # for Anthropic

Step 1: Fetch unread emails

The CLI's nylas email list --unread --json returns structured JSON for every unread message. In Python, call it with subprocess.run:

import subprocess
import json

def fetch_unread_emails(limit=20):
    """Fetch unread emails via Nylas CLI."""
    result = subprocess.run(
        ["nylas", "email", "list", "--unread", "--limit", str(limit), "--json"],
        capture_output=True,
        text=True
    )
    if result.returncode != 0:
        print(f"Error fetching emails: {result.stderr}")
        return []
    try:
        return json.loads(result.stdout)
    except json.JSONDecodeError:
        print(f"Failed to parse CLI output: {result.stdout[:200]}")
        return []

Each email in the returned array includes id, subject, from, snippet, date, and folders. The snippet is the first ~200 characters of the body, which is usually enough for classification without fetching the full message.
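
The exact payload varies by provider and CLI version, so treat the shape below as an illustrative sketch (the values are invented) rather than a schema:

```python
# Illustrative shape of one message from `nylas email list --unread --json`.
# Only the fields this guide reads are shown; values here are made up.
sample_email = {
    "id": "msg_123",
    "subject": "Deploy failed on prod",
    "from": [{"name": "Alice", "email": "alice@example.com"}],
    "snippet": "The 14:02 deploy to prod failed health checks after...",
    "date": 1718000000,
    "folders": ["INBOX"],
}

# The sender lookup pattern used throughout the guide; an empty or
# missing "from" list falls through to "unknown".
sender = sample_email["from"][0]["email"] if sample_email.get("from") else "unknown"
```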

Step 2: Classify emails with an LLM

Send each email's subject and snippet to the LLM with a classification prompt. Four categories work well in practice:

  • URGENT — needs a response within 1 hour (e.g., production incidents, time-sensitive requests from your manager)
  • ACTION — needs a response today (e.g., code review requests, meeting follow-ups)
  • FYI — read later, no response needed (e.g., team updates, shared documents)
  • NOISE — archive immediately (e.g., newsletters, automated alerts, marketing)

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from env

CLASSIFY_PROMPT = """Classify this email into exactly one category.
Return ONLY the category name, nothing else.

Categories:
- URGENT: needs response within 1 hour (production issues, exec requests, time-sensitive)
- ACTION: needs response today (code reviews, meeting follow-ups, direct questions)
- FYI: read later, no response needed (team updates, shared docs, status reports)
- NOISE: archive immediately (newsletters, marketing, automated notifications, noreply@)

Email:
From: {sender}
Subject: {subject}
Preview: {snippet}
"""

def classify_email(email):
    """Classify a single email using the LLM."""
    sender = email["from"][0]["email"] if email.get("from") else "unknown"
    prompt = CLASSIFY_PROMPT.format(
        sender=sender,
        subject=email.get("subject", "(no subject)"),
        snippet=email.get("snippet", ""),
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
        max_tokens=10,
        temperature=0,
    )
    category = response.choices[0].message.content.strip().upper()
    # Guard against unexpected LLM output
    if category not in ("URGENT", "ACTION", "FYI", "NOISE"):
        print(f"Unexpected classification '{category}' for: {email.get('subject')}")
        category = "FYI"  # safe default
    return category

Setting temperature=0 and max_tokens=10 keeps classification deterministic and fast. GPT-4o-mini handles this at ~$0.15 per 1M input tokens, so classifying 100 emails costs about $0.002.
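
The arithmetic behind that estimate, with the per-email token count as a rough assumption:

```python
# Back-of-envelope classification cost. ~150 input tokens per email is an
# assumption covering the prompt template plus a typical subject and snippet.
emails_per_day = 100
tokens_per_email = 150
price_per_million_tokens = 0.15   # gpt-4o-mini input pricing cited above

daily_cost = emails_per_day * tokens_per_email * price_per_million_tokens / 1_000_000
print(f"${daily_cost:.4f}")  # ≈ $0.0023 per day for classification
```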

For Anthropic, swap the API call:

import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from env

def classify_email_anthropic(email):
    """Classify using Claude."""
    sender = email["from"][0]["email"] if email.get("from") else "unknown"
    prompt = CLASSIFY_PROMPT.format(
        sender=sender,
        subject=email.get("subject", "(no subject)"),
        snippet=email.get("snippet", ""),
    )
    response = client.messages.create(
        model="claude-3-5-haiku-latest",
        max_tokens=10,
        messages=[{"role": "user", "content": prompt}],
    )
    category = response.content[0].text.strip().upper()
    if category not in ("URGENT", "ACTION", "FYI", "NOISE"):
        category = "FYI"
    return category

Step 3: Draft replies for urgent emails

For emails classified as URGENT or ACTION, generate a draft reply. The prompt includes the original email so the LLM has context:

DRAFT_PROMPT = """Write a short, professional reply to this email.
Keep it under 3 sentences. Be direct. Don't start with "I hope this email finds you well."

Original email:
From: {sender}
Subject: {subject}
Body preview: {snippet}

Reply:"""

def draft_reply(email):
    """Generate a draft reply for an email."""
    sender = email["from"][0]["email"] if email.get("from") else "unknown"
    prompt = DRAFT_PROMPT.format(
        sender=sender,
        subject=email.get("subject", "(no subject)"),
        snippet=email.get("snippet", ""),
    )
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
        max_tokens=200,
        temperature=0.7,
    )
    return response.choices[0].message.content.strip()

Using temperature=0.7 for drafts (versus 0 for classification) gives the replies a more natural tone. You review every draft before it gets sent.

Step 4: Take action

Now wire the classification results to Nylas CLI commands:

def archive_email(email_id):
    """Archive an email (move out of inbox)."""
    result = subprocess.run(
        ["nylas", "email", "archive", email_id],
        capture_output=True,
        text=True
    )
    if result.returncode != 0:
        print(f"Failed to archive {email_id}: {result.stderr}")
    return result.returncode == 0

def create_draft(to, subject, body):
    """Create a draft reply via Nylas CLI."""
    result = subprocess.run(
        ["nylas", "email", "draft", "--to", to, "--subject", subject,
         "--body", body, "--json"],
        capture_output=True,
        text=True
    )
    if result.returncode != 0:
        print(f"Failed to create draft: {result.stderr}")
    return result.returncode == 0

def process_email(email):
    """Classify and act on a single email."""
    category = classify_email(email)
    subject = email.get("subject", "(no subject)")
    sender = email["from"][0]["email"] if email.get("from") else "unknown"
    email_id = email["id"]

    print(f"  [{category}] {subject} (from {sender})")

    if category == "NOISE":
        archive_email(email_id)
        print(f"    -> archived")

    elif category in ("URGENT", "ACTION"):
        reply = draft_reply(email)
        reply_subject = f"Re: {subject}" if not subject.startswith("Re:") else subject
        create_draft(sender, reply_subject, reply)
        print(f"    -> draft created")

    # FYI emails: leave in inbox, no action needed
    return category
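
To dry-run the triage flow without the CLI or an API key, the loop can be written with its side effects injected. This is a minimal sketch (the parameter names are illustrative; classify_email, archive_email, and draft_reply from above slot straight in):

```python
def run_triage(emails, classify_fn, archive_fn, draft_fn):
    """Drive the triage loop; dependencies are injected so it can run offline."""
    counts = {"URGENT": 0, "ACTION": 0, "FYI": 0, "NOISE": 0}
    for email in emails:
        category = classify_fn(email)
        counts[category] += 1
        if category == "NOISE":
            archive_fn(email["id"])
        elif category in ("URGENT", "ACTION"):
            draft_fn(email)
        # FYI: counted, otherwise left alone
    return counts

# Dry run with stand-ins -- no network calls:
fake_inbox = [{"id": "1", "subject": "Weekly digest"},
              {"id": "2", "subject": "prod down"}]
result = run_triage(
    fake_inbox,
    classify_fn=lambda e: "NOISE" if "digest" in e["subject"] else "URGENT",
    archive_fn=lambda email_id: None,
    draft_fn=lambda e: None,
)
print(result)  # {'URGENT': 1, 'ACTION': 0, 'FYI': 0, 'NOISE': 1}
```

draft_fn in this sketch only receives the email; wiring it through to create_draft (which also needs the recipient and subject) is a two-line change.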

Step 5: Run on a schedule

Add a cron entry to run the triage agent every 15 minutes:

# Edit crontab
crontab -e

# Add this line (runs every 15 minutes)
*/15 * * * * /usr/bin/python3 /path/to/triage.py >> /var/log/email-triage.log 2>&1

For a simpler setup during development, use a bash loop:

# Run every 10 minutes in a terminal
while true; do
  python3 triage.py
  echo "--- sleeping 10 minutes ---"
  sleep 600
done

The script only processes unread emails, so archived NOISE never reappears in the next batch. URGENT, ACTION, and FYI messages stay unread, though, so until you open them (or track which IDs you've already handled) a re-run will classify them again and can create duplicate drafts.
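
One way to make re-runs fully idempotent, assuming a local JSON file as the record (the path and helper names here are illustrative): persist the IDs of messages you've already handled and skip them on later runs.

```python
import json
from pathlib import Path

def load_seen(path):
    """Return the set of email IDs handled in earlier runs (empty on first run)."""
    p = Path(path)
    return set(json.loads(p.read_text())) if p.exists() else set()

def mark_seen(path, seen, email_id):
    """Record an email as handled so later runs skip it."""
    seen.add(email_id)
    Path(path).write_text(json.dumps(sorted(seen)))

# In the main loop:
#   seen = load_seen(SEEN_FILE)
#   for email in emails:
#       if email["id"] in seen:
#           continue
#       ...classify and act...
#       mark_seen(SEEN_FILE, seen, email["id"])
```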

Full script

Here's the complete triage agent. Save it as triage.py:

#!/usr/bin/env python3
"""AI email triage agent. Classifies unread emails and takes action."""

import subprocess
import json
import sys
from datetime import datetime

from openai import OpenAI

client = OpenAI()  # set OPENAI_API_KEY in your environment

CLASSIFY_PROMPT = """Classify this email into exactly one category.
Return ONLY the category name, nothing else.

Categories:
- URGENT: needs response within 1 hour (production issues, exec requests, time-sensitive)
- ACTION: needs response today (code reviews, meeting follow-ups, direct questions)
- FYI: read later, no response needed (team updates, shared docs, status reports)
- NOISE: archive immediately (newsletters, marketing, automated notifications, noreply@)

Email:
From: {sender}
Subject: {subject}
Preview: {snippet}
"""

DRAFT_PROMPT = """Write a short, professional reply to this email.
Keep it under 3 sentences. Be direct.

Original email:
From: {sender}
Subject: {subject}
Body preview: {snippet}

Reply:"""

VALID_CATEGORIES = {"URGENT", "ACTION", "FYI", "NOISE"}


def fetch_unread(limit=20):
    result = subprocess.run(
        ["nylas", "email", "list", "--unread", "--limit", str(limit), "--json"],
        capture_output=True, text=True
    )
    if result.returncode != 0:
        print(f"Error: {result.stderr}", file=sys.stderr)
        return []
    try:
        return json.loads(result.stdout)
    except json.JSONDecodeError:
        print(f"Bad JSON from CLI: {result.stdout[:200]}", file=sys.stderr)
        return []


def classify(email):
    sender = email["from"][0]["email"] if email.get("from") else "unknown"
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": CLASSIFY_PROMPT.format(
            sender=sender,
            subject=email.get("subject", ""),
            snippet=email.get("snippet", ""),
        )}],
        max_tokens=10, temperature=0,
    )
    cat = resp.choices[0].message.content.strip().upper()
    return cat if cat in VALID_CATEGORIES else "FYI"


def draft_reply(email):
    sender = email["from"][0]["email"] if email.get("from") else "unknown"
    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": DRAFT_PROMPT.format(
            sender=sender,
            subject=email.get("subject", ""),
            snippet=email.get("snippet", ""),
        )}],
        max_tokens=200, temperature=0.7,
    )
    return resp.choices[0].message.content.strip()


def archive(email_id):
    r = subprocess.run(["nylas", "email", "archive", email_id],
                       capture_output=True, text=True)
    return r.returncode == 0


def create_draft(to, subject, body):
    r = subprocess.run(
        ["nylas", "email", "draft", "--to", to, "--subject", subject,
         "--body", body, "--json"],
        capture_output=True, text=True
    )
    return r.returncode == 0


def main():
    print(f"\n[{datetime.now().strftime('%Y-%m-%d %H:%M')}] Fetching unread emails...")
    emails = fetch_unread()
    if not emails:
        print("No unread emails.")
        return

    counts = {"URGENT": 0, "ACTION": 0, "FYI": 0, "NOISE": 0}

    for email in emails:
        subject = email.get("subject", "(no subject)")
        sender = email["from"][0]["email"] if email.get("from") else "unknown"
        email_id = email["id"]

        category = classify(email)
        counts[category] += 1
        print(f"  [{category}] {subject}")

        if category == "NOISE":
            if archive(email_id):
                print(f"    -> archived")
        elif category in ("URGENT", "ACTION"):
            reply = draft_reply(email)
            reply_subj = f"Re: {subject}" if not subject.startswith("Re:") else subject
            if create_draft(sender, reply_subj, reply):
                print(f"    -> draft created for {sender}")

    print(f"\nSummary: {counts['URGENT']} urgent, {counts['ACTION']} action, "
          f"{counts['FYI']} fyi, {counts['NOISE']} noise "
          f"(total: {len(emails)})")


if __name__ == "__main__":
    main()

Alternative: use MCP instead

If you don't want to write a Python script, you can let Claude Code or Cursor be your triage agent interactively. Install the Nylas MCP server:

# One command gives your assistant email access
nylas mcp install --assistant claude-code
# or: cursor, windsurf, vscode

Then prompt your assistant directly:

# In Claude Code or Cursor, just ask:
"Read my unread emails. Classify each one as urgent, action, FYI,
or noise. Draft replies for the urgent ones. Archive the noise."

The MCP path is better for ad-hoc triage (when you want a human in the loop for every decision). The Python script is better for automated, scheduled triage that runs while you sleep. See Give AI Agents Email Access via MCP for full MCP setup.

Running with a local LLM

If email privacy matters (and it should), point the agent at a local Ollama instance instead of a cloud API. Email content never leaves your machine:

# Start Ollama with Llama 3.1
ollama run llama3.1

# In your script, point OpenAI client at Ollama
client = OpenAI(
    base_url="http://localhost:11434/v1",
    api_key="ollama",  # Ollama doesn't need a real key
)

Llama 3.1 8B handles classification well. For higher-quality draft replies, use the 70B model if your hardware supports it (48GB+ VRAM).
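
A simple way to get both is to route each task to a different model. The tags below follow standard Ollama naming (pull them first with ollama pull llama3.1 and ollama pull llama3.1:70b); the helper is a sketch, not part of the guide's script:

```python
# Assumes `client` is already pointed at Ollama as shown above.
CLASSIFY_MODEL = "llama3.1"      # 8B default tag: fast, fine for 4-way labels
DRAFT_MODEL = "llama3.1:70b"     # larger model for better reply prose

def model_for(task):
    """Light model for classification, heavy model for drafting."""
    return DRAFT_MODEL if task == "draft" else CLASSIFY_MODEL
```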

FAQ

How accurate is AI email classification?

With GPT-4o or Claude 3.5 Sonnet, classification accuracy on the four categories typically exceeds 90% after prompt tuning. The main failure mode is misclassifying marketing emails as FYI instead of NOISE. Adding sender-based rules (e.g., always classify noreply@ addresses as NOISE before hitting the LLM) pushes accuracy above 95%.
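
That pre-LLM rule is a few lines of Python; this sketch (the pattern and function name are illustrative) short-circuits obvious noise before any tokens are spent:

```python
import re

# Senders that are almost always noise; extend the pattern to taste.
NOISE_SENDER = re.compile(r"(no-?reply|newsletter|notifications?|marketing)@",
                          re.IGNORECASE)

def prefilter(sender):
    """Return 'NOISE' when the sender alone decides it, else None (ask the LLM)."""
    return "NOISE" if NOISE_SENDER.search(sender) else None

print(prefilter("noreply@github.com"))  # NOISE
print(prefilter("alice@example.com"))   # None
```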

Can I run this with a local LLM for privacy?

Yes. Replace the OpenAI client with a local Ollama endpoint at http://localhost:11434/v1. Llama 3.1 8B handles classification well. Draft quality improves with larger models. Email content never leaves your machine when using a local model.

How do I automate this on a schedule?

Add a cron entry: */15 * * * * /usr/bin/python3 /path/to/triage.py. The script processes only unread emails, so archived messages never reappear. Emails that only received a draft stay unread, however, so keep a record of processed message IDs (or mark them read) to avoid drafting duplicates on the next run.

Does this work with multiple email accounts?

Yes. Nylas CLI supports multiple grants. Set NYLAS_GRANT_ID before running the script, or loop over grants from nylas auth list --json. Each grant can be a different provider.
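
A wrapper that loops over every grant might look like this. The CLI commands are the ones named above, but the JSON shape of nylas auth list output (an "id" field per grant) is an assumption to verify against your CLI version:

```python
import json
import os
import subprocess

def list_grants():
    """All connected accounts, or [] if the CLI is missing or unauthenticated."""
    try:
        result = subprocess.run(["nylas", "auth", "list", "--json"],
                                capture_output=True, text=True)
    except FileNotFoundError:  # CLI not installed
        return []
    if result.returncode != 0:
        return []
    try:
        return json.loads(result.stdout)
    except json.JSONDecodeError:
        return []

def triage_all_accounts():
    """Run the single-account triage.py once per grant via NYLAS_GRANT_ID."""
    for grant in list_grants():
        env = {**os.environ, "NYLAS_GRANT_ID": grant["id"]}
        subprocess.run(["python3", "triage.py"], env=env)
```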

Next steps