
Build a LangGraph Email Agent (CLI Tool)

LangGraph models an agent as a stateful graph — nodes, edges, and a shared state object that survives across steps. Giving one of those nodes email usually means a provider SDK and an OAuth flow. There's a lighter path: wrap the Nylas CLI as a LangGraph tool. Each call is one subprocess that returns JSON, and the same tool reaches Gmail, Outlook, and four more providers. This guide builds a triage graph around that tool, with sends kept behind a human.

Written by Caleb Geene, Director, Site Reliability Engineering

Verified: CLI 3.1.16 · Gmail, Outlook · last tested June 8, 2026

Command references used in this guide: nylas email list, nylas email search, and nylas email drafts create.

How do you give a LangGraph agent email?

You give a LangGraph agent email by defining a tool that calls the Nylas CLI as a subprocess and binding it to the model. A LangGraph tool is a plain Python function decorated with @tool; inside, you run a CLI command, capture stdout, and return it. Because nylas email list --json emits structured data, the node receives clean JSON it can reason over — no HTML parsing and no SDK objects to serialize into state.

The CLI must be installed and authenticated once with nylas auth login; the stored grant is reused on every call, so the subprocess never handles credentials. This is the same subprocess boundary used by other Python frameworks, which keeps provider details out of your graph. LangGraph's tool and state model is documented in the LangGraph docs.
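
Because every tool call depends on the CLI being on PATH, a startup check can fail fast before the graph runs. The following is a minimal sketch, assuming the executable is named nylas as in the commands above. The helper name require_cli is hypothetical, and the check covers installation only; it cannot tell whether nylas auth login has been run.

```python
import shutil


def require_cli(name: str = "nylas") -> str:
    """Return the full path of the CLI executable, or raise if it is not on PATH."""
    path = shutil.which(name)
    if path is None:
        # Hypothetical error message; adjust to your own setup instructions.
        raise RuntimeError(
            f"{name!r} not found on PATH; install the CLI and run 'nylas auth login' first"
        )
    return path
```

Calling require_cli() once at import time, before the tools are defined, turns a confusing mid-run subprocess failure into a clear startup error.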

Why wrap the CLI instead of a provider SDK?

Wrapping the CLI collapses six integrations into one. A direct build would pull in the Gmail API, Microsoft Graph, and separate IMAP handling, each with its own OAuth app, token refresh, and pagination. The CLI presents one command surface across all six providers and refreshes OAuth tokens — which expire every 3,600 seconds on most providers — without any code in your graph.

The subprocess boundary also makes the tool auditable. Each action is one command with explicit arguments, so you can log exactly what the agent ran and read back JSON you can validate before it enters state. A thin wrapper around nylas email list --json is easier to reason about than a clever SDK abstraction, and it survives provider API changes because the CLI absorbs them.

import subprocess
from langchain_core.tools import tool

@tool
def list_unread(limit: int = 10) -> str:
    """List unread emails as JSON across the connected mailbox."""
    out = subprocess.run(
        ["nylas", "email", "list", "--unread", "--json", "--limit", str(limit)],
        capture_output=True, text=True, check=True,
    )
    return out.stdout

@tool
def search_email(query: str) -> str:
    """Search the mailbox with a provider-agnostic query, return JSON."""
    out = subprocess.run(
        ["nylas", "email", "search", query, "--json", "--limit", "20"],
        capture_output=True, text=True, check=True,
    )
    return out.stdout
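
The auditing idea described above, logging the exact command and validating the JSON before it enters state, might look like the sketch below. The helper name run_cli_json, the logger name, and the 30-second timeout are assumptions rather than part of the CLI or LangGraph. The runner parameter exists only so the function can be exercised without the CLI installed.

```python
import json
import logging
import subprocess
from typing import Any, Callable, Sequence

log = logging.getLogger("email_agent.audit")  # hypothetical logger name


def run_cli_json(
    args: Sequence[str],
    runner: Callable[..., Any] = subprocess.run,
    timeout: float = 30.0,
) -> Any:
    """Run one CLI command, log the exact argv, and return the parsed JSON."""
    argv = ["nylas", *args]
    # Record exactly what the agent ran, before running it.
    log.info("agent command: %s", argv)
    proc = runner(argv, capture_output=True, text=True, check=True, timeout=timeout)
    try:
        # Parse here so malformed output never reaches graph state.
        return json.loads(proc.stdout)
    except json.JSONDecodeError as exc:
        raise ValueError(f"CLI output was not valid JSON for {argv}") from exc
```

A tool such as list_unread could call run_cli_json(["email", "list", "--unread", "--json"]) and return json.dumps(result), so the model only ever sees output that parsed cleanly.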

How do you build the triage graph?

Build the graph with StateGraph, a model bound to the tools, and a ToolNode that executes any tool the model calls. The conditional edge tools_condition sends the model's output to the tool node whenever it contains tool calls and ends the run otherwise; a plain edge from the tool node then returns control to the model. The agent can therefore read, then search, then decide across multiple steps while LangGraph carries the message history in state. This loop is the entire control flow for a triage agent.

Two nodes do the work: the model node decides what to call, and the tool node runs it. The code below binds the two read tools, list and search; with a draft tool added as a third, the agent can classify an inbox and propose replies in a handful of steps. Keep the graph small; a triage flow rarely needs more than the read-decide-act loop plus a human checkpoint before any send.

from langgraph.graph import StateGraph, MessagesState, START
from langgraph.prebuilt import ToolNode, tools_condition
from langchain_anthropic import ChatAnthropic

tools = [list_unread, search_email]
model = ChatAnthropic(model="claude-sonnet-4-6").bind_tools(tools)

def call_model(state: MessagesState):
    return {"messages": [model.invoke(state["messages"])]}

g = StateGraph(MessagesState)
g.add_node("model", call_model)
g.add_node("tools", ToolNode(tools))
g.add_edge(START, "model")
g.add_conditional_edges("model", tools_condition)
g.add_edge("tools", "model")
agent = g.compile()
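
Running the compiled graph could look like the sketch below. The helper name run_triage and the default of 12 steps are assumptions. The agent argument is expected to be the compiled graph from the code above, and recursion_limit is the LangGraph config key that caps how many steps a run may take.

```python
from typing import Any


def run_triage(agent: Any, request: str, max_steps: int = 12) -> Any:
    """Send one user request through the compiled graph and return the final message content."""
    result = agent.invoke(
        {"messages": [("user", request)]},
        # Cap the read-decide-act loop so a confused model cannot cycle indefinitely.
        config={"recursion_limit": max_steps},
    )
    # The last message in state is the model's final reply (usually plain text).
    return result["messages"][-1].content
```

A call such as run_triage(agent, "Summarize my unread mail and flag anything urgent") drives the whole loop; the intermediate tool calls and results remain available in the returned state if you invoke the graph directly.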

How do you keep sends safe?

Keep every outbound action behind a human. Instead of a send tool, give the agent a draft tool that runs nylas email drafts create, which composes a message without sending it and returns a draft ID. A person reviews the draft and sends it, so a misclassification can't put mail in a customer's inbox. LangGraph's checkpointer makes this natural — interrupt before the send node, surface the draft, and resume on approval.
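
The review step can be sketched independently of any framework. Everything in the block below is hypothetical: the Draft fields, the review_gate name, and the approve and send callables, since this guide does not name a send command. In a LangGraph build the same check would sit at the interrupt before the send node.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Draft:
    """Illustrative draft record; field names are assumptions, not CLI output."""
    draft_id: str
    to: str
    subject: str
    body: str


def review_gate(
    draft: Draft,
    approve: Callable[[Draft], bool],
    send: Callable[[str], None],
) -> bool:
    """Send the draft only when a human reviewer approves; return whether it was sent."""
    if approve(draft):
        send(draft.draft_id)
        return True
    # Rejected drafts are never passed to send, whatever the model proposed.
    return False
```

Because approve is supplied by the application rather than chosen by the model, text inside an email has no way to flip the decision.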

This human-in-the-loop pattern is the single most important guardrail for an email agent, because the model reads untrusted content and a prompt injection can try to redirect it. Containment that lives outside the model's decision loop — a review step, or connector-level rules — can't be argued away by injected text. For deterministic enforcement, see stopping a rogue agent at the connector layer.

Which providers does this cover?

The same two tools reach all six providers the CLI supports: Gmail, Outlook, Microsoft Exchange, Yahoo, iCloud, and IMAP. Because the agent never names a provider — it calls nylas email list --json, not a Gmail endpoint — switching the connected account changes nothing in the graph. One codebase triages a Gmail inbox today and an Outlook inbox tomorrow.

That provider neutrality is the payoff of the subprocess design. Your LangGraph state schema, nodes, and edges stay identical while the CLI absorbs each backend's quirks. For the MCP-based alternative — exposing the same actions as Model Context Protocol tools instead of subprocess calls — see the MCP email server setup guide.

Next steps