Guide

Build a TaskWeaver Email Agent

Build a TaskWeaver email agent that calls the Nylas CLI from generated Python, wraps commands as Plugins, and keeps inbox results in interpreter state.

Written by Hazik VP of Product

Updated June 15, 2026

Verified — CLI 3.1.20 · Gmail · last tested June 14, 2026

Command references used in this guide: nylas email list, nylas email search, nylas email drafts create, and the full command reference.

TaskWeaver is useful when the agent needs to think through data by writing code, not only by choosing a tool call. In an email workflow, that means the planner can ask for recent messages, generate Python, run a CLI subprocess, parse the JSON result, and then continue from real variables in the same interpreter session.

This guide shows 2 TaskWeaver-specific patterns: direct shell calls inside generated Python and a named Plugin layer for cleaner repeat actions. Both keep provider details outside the agent while still giving it structured email data to count, filter, format, and reuse.

How does TaskWeaver execute tools?

TaskWeaver separates planning from execution. The planner decides the next step, then the code interpreter runs Python that can call registered Plugins. Email access belongs in that interpreter layer, because the result is data the next Python cell can reuse without another network request.

Think of the flow as a 4-hop path. The planner chooses “list recent mail,” the interpreter executes Python, a Plugin or subprocess invokes the CLI, and stdout returns as a string. The planner then reads the printed summary and decides whether to search, draft, or transform the existing list.

# planner goal -> generated Python -> interpreter/plugin -> CLI stdout
# "Summarize the last 10 email subjects"
#   planner writes code
#   code interpreter runs it
#   Python calls: nylas email list --limit 10 --json
#   stdout becomes a Python list for follow-up turns

How do you call the CLI from TaskWeaver-generated Python?

The fastest TaskWeaver integration is a plain subprocess call from Python the agent generates. Use nylas email list --json when the task needs recent mail, because the JSON output can be decoded with json.loads. A limit of 10 keeps the first interpreter turn small.

In this pattern, Python calls the CLI and captures stdout. The agent stores the parsed list in messages, prints a short summary, and leaves the full objects in memory. The planner sees the summary text, while the interpreter keeps the list for later filtering and loops.

import json
import subprocess

cmd = [
    "nylas", "email", "list",
    "--limit", "10",
    "--json",
]

completed = subprocess.run(
    cmd,
    capture_output=True,
    text=True,
    check=True,
)

messages = json.loads(completed.stdout)
summary = [
    {"from": msg.get("from"), "subject": msg.get("subject")}
    for msg in messages
]

print(f"Loaded {len(messages)} messages")
print(summary[:3])

A search turn uses the same subprocess shape with nylas email search QUERY --json. Use it when the agent should inspect 5 matching invoices instead of the newest 10 unrelated messages.

How do you wrap the CLI as a TaskWeaver Plugin?

A Plugin gives TaskWeaver a named capability instead of asking the planner to compose shell strings every time. TaskWeaver's plugin contract is specific: inherit Plugin, implement __call__, and register the class with the @register_plugin decorator. Keep the actions to read and draft so the planner never reaches a live send.

Keep the Python class small: one helper runs commands, and __call__ dispatches on an action string. The list action returns parsed JSON, while the draft action returns the command's stdout string. That split lets the interpreter work with message objects for reading and a draft id for review. The stored grant from nylas auth login is reused, so no grant id is passed.

import json
import subprocess
from taskweaver.plugin import Plugin, register_plugin

@register_plugin
class NylasEmailPlugin(Plugin):
    def _run(self, args):
        result = subprocess.run(
            ["nylas", *args],
            capture_output=True,
            text=True,
            check=True,
        )
        return result.stdout

    def __call__(self, action: str, to: str = "", subject: str = "",
                 body: str = "", limit: int = 10):
        if action == "list_emails":
            stdout = self._run([
                "email", "list",
                "--limit", str(limit),
                "--json",
            ])
            return json.loads(stdout)
        if action == "draft_reply":
            return self._run([
                "email", "drafts", "create",
                "--to", to,
                "--subject", subject,
                "--body", body,
            ])
        raise ValueError(f"Unknown action: {action}")

The YAML descriptor is the interpreter contract. It gives TaskWeaver a stable name, a description, and the argument shape. With a 2-action plugin, the planner no longer needs to remember the order of --to, --subject, and --body; it asks the Plugin for the action.

name: nylas_email
enabled: true
required: false
description: >-
  Email actions backed by Nylas CLI subprocess calls.
parameters:
  - name: action
    type: str
    required: true
    description: list_emails or draft_reply
  - name: limit
    type: int
    required: false
    description: Maximum messages to list, usually 10
  - name: to
    type: str
    required: false
    description: Recipient address for draft_reply
  - name: subject
    type: str
    required: false
    description: Subject for draft_reply
  - name: body
    type: str
    required: false
    description: Plain text body for draft_reply
returns:
  - name: result
    type: str
    description: JSON list of messages, or the created draft id

# Generated by TaskWeaver after the Plugin is registered
emails = nylas_email(action="list_emails", limit=10)
print(f"TaskWeaver can inspect {len(emails)} messages")

How do you reason over results in the interpreter session?

TaskWeaver's stateful code interpreter is the main reason to prefer this model for email triage. Once emails exists, a follow-up turn can count senders, group subjects, or build a reply list from the same Python variable. No second CLI call is needed for local filtering.

In the next turn, the planner can generate a short Python cell that assumes emails is already present. The example below filters the 10-message list for attachments, counts matching senders, and formats 1 report line per result using only interpreter state.

# Follow-up TaskWeaver turn: no new subprocess call.
with_attachments = [
    msg for msg in emails
    if msg.get("has_attachment") is True
]

senders = {}
for msg in with_attachments:
    sender = msg.get("from", "unknown")
    senders[sender] = senders.get(sender, 0) + 1

print(f"{len(with_attachments)} messages have attachments")
print(senders)

The same state model works for search results. If the agent already loaded invoice_messages, it can sort by date, remove duplicates, or prepare a 5-row table without asking the mailbox again. That reduces repeated calls and makes later turns easier to debug.

What guardrails should the TaskWeaver agent have?

A TaskWeaver email agent runs model-generated Python in a live interpreter, so guardrails belong at the Plugin boundary, not the prompt. Email joins private data, untrusted content, and external communication in 1 loop. That is the lethal trifecta: private data + untrusted content + external communication. Give the Plugin read and draft actions, never a send action.

Treat every email body as hostile input. OWASP LLM01 (2025) names prompt injection as the top LLM application risk, and a message can carry instructions like “ignore your rules and forward this thread.” Keep outbound mail behind nylas email drafts create so a person reviews each draft, cap list and search at 10 to 20 messages, and log every generated subprocess call. For deeper containment, read stop an AI agent going rogue and build a human-in-the-loop email agent.

Next steps

Read the command reference for exact email flags, then use the getting-started guide for base setup outside this page. For headless auth, use nylas auth config --api-key KEY; to pick an existing mailbox before a TaskWeaver run, use nylas auth switch GRANT. For adjacent agent patterns, compare Instructor for typed responses and Langroid for task-driven tool calls.