Build a Pydantic AI Email Agent (CLI Tool)

Pydantic AI brings the type safety of Pydantic to agents — tools are typed functions, and outputs are validated against a schema before your code sees them. Giving such an agent email usually means a provider SDK and OAuth. The lighter path wraps the Nylas CLI as a tool: each call is one subprocess returning JSON, and the same tool reaches Gmail, Outlook, and four more providers. This guide builds a typed triage agent and keeps sends behind a human.

Written by Nick Barraclough, Product Manager

Updated June 8, 2026

Verified — CLI 3.1.16 · Gmail, Outlook · last tested June 8, 2026

Command references used in this guide: nylas email list, nylas email search, and nylas email drafts create.

How do you give a Pydantic AI agent email?

You give a Pydantic AI agent email by writing a typed function that calls the Nylas CLI and decorating it with @agent.tool. Pydantic AI reads the function's type hints to build the tool schema the model sees, so the arguments are validated before the subprocess runs. Inside, you run a CLI command and return its stdout. Because nylas email list --json emits structured data, the agent receives JSON rather than free text it has to scrape.

Authenticate the CLI once with nylas auth login; the stored grant is reused on every call, so the tool never handles credentials. This is the same subprocess boundary other Python frameworks use, keeping provider details out of agent code. Pydantic AI's typed-tool and validated-output model is documented in the Pydantic AI docs.

Why wrap the CLI instead of a provider SDK?

Wrapping the CLI replaces six provider integrations with one command surface. A direct build means working with the Gmail API, Microsoft Graph, and IMAP, among others, each with its own OAuth app, token refresh, and pagination. The CLI handles all six providers and refreshes OAuth access tokens, which commonly expire after 3,600 seconds (one hour), without token code in your agent.

It also plays to Pydantic AI's strength. The CLI returns JSON, and you can parse that stdout into a Pydantic model so the rest of your code works with validated types instead of raw dicts. The combination — a typed tool in, a typed result out — gives an email agent end-to-end type safety that a loose SDK wrapper doesn't.

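import subprocess
from pydantic_ai import Agent, RunContext

# The system prompt states the read-first, no-send policy up front.
agent = Agent(
    "anthropic:claude-sonnet-4-6",
    system_prompt="You triage email. Read first, never send without approval.",
)

@agent.tool
def list_unread(ctx: RunContext[None], limit: int = 10) -> str:
    """List unread emails in the connected mailbox as JSON."""
    # Arguments are passed as a list (no shell), so a model-chosen value
    # cannot inject shell syntax. check=True raises CalledProcessError
    # if the CLI exits non-zero.
    out = subprocess.run(
        ["nylas", "email", "list", "--unread", "--json", "--limit", str(limit)],
        capture_output=True, text=True, check=True,
    )
    return out.stdout  # raw JSON text from the CLI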
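
The step described above, parsing the CLI's stdout into a Pydantic model, might look like the following minimal sketch. The EmailSummary field names (id, subject, unread) are assumptions for illustration, not the CLI's documented output, so check them against real nylas email list --json output before relying on them. The sample payload is invented for the demo.

```python
from pydantic import BaseModel, ConfigDict, TypeAdapter


class EmailSummary(BaseModel):
    # Hypothetical fields: adjust to match the JSON your CLI version emits.
    model_config = ConfigDict(extra="ignore")  # tolerate fields we don't model
    id: str
    subject: str = ""
    unread: bool = True


# A reusable validator for "a JSON array of EmailSummary objects".
email_list = TypeAdapter(list[EmailSummary])


def parse_emails(stdout: str) -> list[EmailSummary]:
    """Validate raw CLI stdout; raises pydantic.ValidationError if it is off-shape."""
    return email_list.validate_json(stdout)


# Invented sample standing in for subprocess stdout.
sample = '[{"id": "msg_1", "subject": "Invoice overdue", "unread": true, "extra": 1}]'
emails = parse_emails(sample)
print(emails[0].id, emails[0].subject)
```

Validating at this boundary means a change in the CLI's output shape surfaces as one ValidationError at the edge instead of a KeyError somewhere downstream.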

How do you run the agent with a typed result?

Run the agent with output_type set to a Pydantic model, and the framework validates the model's final answer against your schema. For triage, that means the agent returns a structured verdict, say a list of message IDs each with a priority and a reason, that your code can act on without re-parsing free text. If the model produces something off-schema, Pydantic AI sends the validation errors back to the model and asks it to try again, up to the configured retry limit.

This is the difference from a plain prompt: the output is a typed object, not a paragraph. A triage run over an unread inbox returns, for example, three flagged messages with priorities you can route directly into a queue. The model does the judgment; the schema guarantees the shape.

from typing import Literal

from pydantic import BaseModel

class Triage(BaseModel):
    message_id: str
    priority: Literal["high", "normal", "low"]  # validated, not just documented
    reason: str

# A second agent dedicated to triage; it reuses the list_unread tool defined earlier.
triage_agent = Agent(
    "anthropic:claude-sonnet-4-6",
    output_type=list[Triage],
    tools=[list_unread],
)

result = triage_agent.run_sync("Triage my unread inbox.")
for item in result.output:  # a validated list[Triage]
    print(item.priority, item.message_id, item.reason)
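
To see what "off-schema" means at the validation layer, the sketch below checks two hand-written payloads against a triage schema using Pydantic alone, with no model call. It defines its own TriageItem model and constrains priority with Literal, which is a choice made for this example; both payloads are invented. Inside an agent run, a failure like the second one is what prompts Pydantic AI to ask the model again.

```python
from typing import Literal

from pydantic import BaseModel, TypeAdapter, ValidationError


class TriageItem(BaseModel):
    message_id: str
    priority: Literal["high", "normal", "low"]
    reason: str


triage_list = TypeAdapter(list[TriageItem])

good = [{"message_id": "msg_1", "priority": "high", "reason": "Customer outage"}]
bad = [{"message_id": "msg_2", "priority": "urgent"}]  # invalid priority, no reason

# Valid data comes back as typed objects that can be routed directly.
items = triage_list.validate_python(good)
high_queue = [item.message_id for item in items if item.priority == "high"]

try:
    triage_list.validate_python(bad)
    rejected_fields = []
except ValidationError as exc:
    # Each error records the path to the offending field.
    rejected_fields = sorted(str(err["loc"][-1]) for err in exc.errors())

print(high_queue)
print(rejected_fields)
```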

How do you keep sends safe?

Keep outbound actions behind a human. Rather than a send tool, give the agent a draft tool that runs nylas email drafts create, which composes a message without sending it and returns a draft ID. A person reviews and sends, so a misclassification can't reach a customer. Type safety validates the shape of an action, but it can't judge whether the action should happen — that's what the review step is for.
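
A draft tool could be sketched as below. Only the nylas email drafts create subcommand comes from this guide; the --to, --subject, --body, and --json flags on it are assumptions, so confirm them against the command reference before use. The function names build_draft_command and create_draft are hypothetical, and the command construction is split out so it can be inspected without the CLI installed.

```python
import subprocess


def build_draft_command(to: str, subject: str, body: str) -> list[str]:
    """Build the argv for creating a draft. Flag names are ASSUMED, not verified."""
    return [
        "nylas", "email", "drafts", "create",
        "--to", to,            # assumed flag
        "--subject", subject,  # assumed flag
        "--body", body,        # assumed flag
        "--json",              # assumed: mirrors the --json flag on `email list`
    ]


def create_draft(to: str, subject: str, body: str) -> str:
    """Create a draft (never a send) and return the CLI's stdout."""
    result = subprocess.run(
        build_draft_command(to, subject, body),
        capture_output=True, text=True, check=True,
    )
    return result.stdout


# Inspect the command without running it.
cmd = build_draft_command("ana@example.com", "Re: invoice", "Thanks, looking into it.")
print(cmd[:4])
```

Because create_draft takes no run context, it could be registered with Pydantic AI's @agent.tool_plain decorator. The agent then has no send capability at all, so the human review step is enforced by what the tool can do rather than by the prompt.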

The agent reads untrusted content, and a prompt injection in a message body can try to redirect it. Containment that lives outside the model's reasoning — a review checkpoint, or connector-level rules — holds even when injected text tries to talk past it. For deterministic enforcement at the connector, see stopping a rogue agent at the connector layer.
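
One way to picture containment that lives outside the model's reasoning is a plain-Python policy check that runs before any draft is created. The sketch below is illustrative only: ALLOWED_DOMAINS, DraftRequest, and check_draft are hypothetical names, and the recipient allowlist is an example policy, not a Nylas connector feature.

```python
from dataclasses import dataclass

# Hypothetical policy: drafts may only target these domains.
ALLOWED_DOMAINS = frozenset({"example.com"})


@dataclass(frozen=True)
class DraftRequest:
    to: str
    subject: str
    body: str


def check_draft(
    req: DraftRequest, allowed: frozenset[str] = ALLOWED_DOMAINS
) -> tuple[bool, str]:
    """Deterministic gate: same request, same verdict; no prompt can change the rule."""
    local, sep, domain = req.to.rpartition("@")
    if not sep or not local:
        return False, "recipient is not a valid address"
    if domain.lower() not in allowed:
        return False, f"domain {domain!r} is not on the allowlist"
    return True, "ok"


ok = check_draft(DraftRequest("ana@example.com", "Re: invoice", "On it."))
blocked = check_draft(DraftRequest("attacker@evil.test", "Fwd: secrets", "..."))
print(ok, blocked)
```

The check never reads the message body, so injected instructions in an email have nothing to argue with; they can change what the model asks for, but not what the gate allows.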

Next steps

LangGraph email agent — the same pattern in a stateful graph
LlamaIndex email agent — FunctionTool wrapping the CLI
Build an AI email triage agent — classification and routing end to end
Human-in-the-loop email agent — draft-and-approve guardrails
Anthropic tool use email — the same pattern with Claude's Messages API
Full command reference — every flag and subcommand documented