
Build a BabyAGI Email Agent

Give BabyAGI email access through a draft-first execution tool: task-loop wrappers call the Nylas CLI, return JSON, and leave sending to a human reviewer.

Written by Prem Keshari, Senior SRE

Verified · CLI 3.1.20 · Gmail · last tested June 14, 2026

Command references used in this guide: nylas email list, nylas email search, and nylas email drafts create.

How do you give a BabyAGI task loop email?

You give a BabyAGI task loop email by adding one execution function that shells out to the CLI and returns JSON. Keep the function narrow: read messages, search messages, or create exactly 1 draft. The loop can then reason over mailbox data without owning provider auth or send authority.

The official BabyAGI GitHub repository by Yohei Nakajima now centers on a self-building “functionz” framework. This guide uses the classic task-loop pattern: task creation, prioritization, and execution. Email belongs in the execution step, where the next queue item becomes one subprocess call.

That separation matters because BabyAGI can create new work from old results. If the task is “send weekly digest,” the execution agent should create a draft and return the draft ID. It should not decide delivery on its own.

What is the classic BabyAGI pattern?

The classic BabyAGI pattern is an autonomous 3-part loop: a task creation agent proposes follow-up work, a prioritization agent orders the queue, and an execution agent completes the next task. The objective stays fixed, but the queue changes after every result.

That design is different from a chat agent that waits for each user turn. BabyAGI keeps moving. A mailbox task can therefore become part of an extended plan: list recent messages, summarize 20 unread items, draft the digest, then create another task from the result. The newer “functionz” project is useful context, but the classic loop is the architecture to keep in mind when email enters the system.

How does email fit into the task queue?

Email fits into BabyAGI as a normal executable task, not a special channel. The queue item names the work, the execution agent picks a function, and the function returns output. A task like “send weekly digest” maps to a draft action, while “find unread invoices” maps to search.

This keeps the task queue honest. The planner can still decide that a digest is needed, but the only outbound primitive it can reach is draft creation. A human sends later. For a 5-step loop, that means reading, summarizing, and drafting can be automated while delivery remains a separate approval step outside BabyAGI.
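
The mapping from task names to actions can be sketched as a small keyword table. This is a minimal illustration, not BabyAGI's own routing: the `ACTION_KEYWORDS` table, the keyword choices, and the `pick_action` helper are hypothetical names introduced here, and the action labels are plain strings rather than CLI commands.

```python
from typing import Optional

# Hypothetical keyword table (an assumption for illustration).
# Draft keywords come first so "send weekly digest" resolves to a draft,
# never to a delivery action.
ACTION_KEYWORDS = [
    ("draft", ("send", "draft", "digest", "reply")),
    ("search", ("find", "search")),
    ("list", ("read", "list", "inbox")),
]

def pick_action(task_name: str) -> Optional[str]:
    """Return the allowed action label for a task name, or None if nothing matches."""
    lowered = task_name.lower()
    for action, keywords in ACTION_KEYWORDS:
        if any(word in lowered for word in keywords):
            return action
    return None
```

Order matters in a table like this: substring checks are loose ("unread" contains "read"), so the most restrictive outbound mapping is evaluated before the read mappings.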

What prerequisites does the execution agent need?

The execution agent needs an already authenticated CLI profile, Python access to subprocess, and a task objective tight enough to finish in a bounded number of iterations. Use the getting-started flow for setup; this guide starts after auth already works.

Keep two practical limits in the objective. First, cap reads with a number such as 20 messages so the loop does not fill context with a whole mailbox. Second, define the approval boundary before the run: BabyAGI may create drafts, but a person sends them. That boundary is more important than any prompt wording because it removes direct delivery from the agent's tool set.
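
One way to enforce the read cap in code rather than in the prompt is to clamp whatever number the loop asks for. The sketch below assumes the 20-message ceiling from this guide; `MAX_MESSAGES`, `bounded_limit`, and `list_args` are hypothetical helper names.

```python
MAX_MESSAGES = 20  # ceiling taken from this guide; adjust for your context budget

def bounded_limit(requested: int, ceiling: int = MAX_MESSAGES) -> str:
    """Clamp a requested message count to 1..ceiling and return it as a CLI argument."""
    return str(max(1, min(requested, ceiling)))

def list_args(requested: int) -> list[str]:
    """Build the argv for a capped `nylas email list` call."""
    return ["nylas", "email", "list", "--json", "--limit", bounded_limit(requested)]
```

With this in place, a task that asks for 500 messages still produces `--limit 20`, so a badly worded objective cannot flood the context window.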

How do you wire Nylas into the execution agent?

Wire Nylas into BabyAGI by giving the execution agent small Python functions that call email commands and return stdout. The wrapper below exposes 3 actions: list, search, and draft. Each one passes --json and returns the CLI's stdout unchanged, so the loop can feed structured results back into prioritization.

The first command is nylas email list, which reads recent mailbox metadata for the next queue decision. The second is nylas email search, used when a task names a sender or subject. The draft function calls nylas email drafts create; one command can create a reviewable draft across 6 providers.

import subprocess

def read_inbox(limit: int = 10) -> str:
    """Return recent message metadata as JSON for the BabyAGI loop."""
    result = subprocess.run(
        ["nylas", "email", "list", "--json", "--limit", str(limit)],
        capture_output=True, text=True, check=True,
    )
    return result.stdout

def search_inbox(query: str) -> str:
    """Return matching messages as JSON for a queued search task."""
    result = subprocess.run(
        ["nylas", "email", "search", query, "--json", "--limit", "20"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout

def create_review_draft(to: str, subject: str, body: str) -> str:
    """Create a draft only. A human sends it after review."""
    result = subprocess.run(
        [
            "nylas", "email", "drafts", "create",
            "--to", to,
            "--subject", subject,
            "--body", body,
            "--json",
        ],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
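
The wrappers above use check=True, so a failed command raises subprocess.CalledProcessError and would stop the loop. A possible variation, sketched here under assumptions, converts failures into a JSON error string so the loop can keep reasoning. The `run_tool` name, the error field names, the injectable `runner` parameter, and the 30-second timeout are all choices made for this example, not part of the CLI.

```python
import json
import subprocess
from typing import Callable, Sequence

def run_tool(
    args: Sequence[str],
    runner: Callable = subprocess.run,  # injectable so tests need no real CLI
    timeout: float = 30.0,              # assumed budget per call
) -> str:
    """Run a CLI command and always return a string the loop can parse as JSON."""
    try:
        result = runner(
            list(args), capture_output=True, text=True, check=True, timeout=timeout
        )
        return result.stdout
    except subprocess.CalledProcessError as exc:
        # Non-zero exit: surface the code and stderr instead of raising.
        return json.dumps({
            "error": "command_failed",
            "returncode": exc.returncode,
            "stderr": (exc.stderr or "").strip(),
        })
    except (FileNotFoundError, subprocess.TimeoutExpired) as exc:
        # CLI not installed, or the call exceeded the timeout.
        return json.dumps({"error": type(exc).__name__})
```

Returning an error object keeps failures visible to the task creation step, which can then queue a retry or stop, rather than crashing mid-run.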

Why must BabyAGI draft before sending?

BabyAGI must draft before sending because autonomous loops repeat actions without waiting for a person. A bad instruction in 1 email can become a task, then a prioritized action, then an outbound message. Draft creation breaks that chain before delivery.

The lethal trifecta is private data + untrusted content + external communication. Email agents touch all 3 unless you remove sending from the tool list. Prompt injection is OWASP LLM01 in the 2025 LLM Top 10, and mailbox bodies are untrusted input. The fix is structural: register read, search, and draft functions only.
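
The structural fix can also be enforced at the subprocess boundary. The sketch below checks every argv against the three command prefixes used in this guide before anything runs; `ALLOWED_COMMANDS`, `is_allowed`, and `guard` are hypothetical names for illustration.

```python
# Command prefixes this guide registers. Anything else, including
# `nylas email send`, is rejected before a subprocess starts.
ALLOWED_COMMANDS = {
    ("nylas", "email", "list"),
    ("nylas", "email", "search"),
    ("nylas", "email", "drafts", "create"),
}

def is_allowed(args: list[str]) -> bool:
    """True only if argv starts with one of the registered command prefixes."""
    return any(tuple(args[:len(prefix)]) == prefix for prefix in ALLOWED_COMMANDS)

def guard(args: list[str]) -> list[str]:
    """Return args unchanged if allowed; raise PermissionError otherwise."""
    if not is_allowed(args):
        raise PermissionError(f"command not registered: {' '.join(args[:4])}")
    return args
```

A guard like this is a second layer, not a replacement for leaving send out of the tool list: it protects against a later refactor that routes model-chosen arguments into a generic runner.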

In practice, the review cost is small. A 30-second check catches wrong recipients, injected text, and fabricated claims before they reach a real inbox. BabyAGI can still finish the objective by returning the draft JSON, reprioritizing the queue, and stopping when the digest draft exists.
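
A stop condition for "the digest draft exists" could look like the sketch below. It assumes the draft command's JSON output is an object with a non-empty identifier field; the field name `id` is an assumption, not something this guide confirms, so check it against your CLI's real output before relying on it. `draft_created` is a hypothetical helper.

```python
import json

def draft_created(cli_output: str, id_key: str = "id") -> bool:
    """Return True if cli_output is a JSON object with a non-empty id_key.

    The default key name "id" is an assumption about the CLI's output shape.
    """
    try:
        data = json.loads(cli_output)
    except (json.JSONDecodeError, TypeError):
        return False
    return isinstance(data, dict) and bool(data.get(id_key))
```

The loop can call this on the draft task's result and clear the queue when it returns True, which keeps the run from creating a second draft.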

How do you run a full task-queue example?

A full BabyAGI-style example keeps the classic queue visible: create tasks, prioritize them, and execute the first item until the objective is done. This version starts with 1 seed task and permits only reading and drafting. The weekly digest becomes a draft, not a sent message.

The example calls nylas email list for 20 recent messages, then calls nylas email drafts create when the execution agent reaches the digest task. That mirrors the production pattern: the task queue can evolve for many iterations, but only 1 draft is created for review.

from dataclasses import dataclass, field
import subprocess

OBJECTIVE = "Prepare a weekly inbox digest draft for a manager."

@dataclass(order=True)
class Task:
    priority: int
    name: str = field(compare=False)

def run_cli(args: list[str]) -> str:
    result = subprocess.run(args, capture_output=True, text=True, check=True)
    return result.stdout

def execution_agent(task: Task) -> str:
    task_name = task.name.lower()

    # Check draft tasks first: "Draft weekly digest from the inbox summary"
    # also contains "inbox" and would otherwise be routed to the read branch,
    # so the loop would finish without ever creating the draft.
    if "draft" in task_name or "digest" in task_name:
        digest_body = (
            "Draft summary from the BabyAGI task loop. "
            "Review mailbox facts before sending."
        )
        return run_cli([
            "nylas", "email", "drafts", "create",
            "--to", "manager@example.com",
            "--subject", "Weekly inbox digest draft",
            "--body", digest_body,
            "--json",
        ])

    if "read" in task_name or "inbox" in task_name:
        return run_cli([
            "nylas", "email", "list",
            "--json",
            "--limit", "20",
            "--unread",
        ])

    return f"No email action allowed for task: {task.name}"

def task_creation_agent(completed_task: Task, last_result: str) -> list[Task]:
    if "draft" in completed_task.name.lower():
        return []
    if not last_result.strip():
        return [Task(2, "Read 20 unread inbox messages")]
    return [Task(2, "Draft weekly digest from the inbox summary")]

def prioritization_agent(tasks: list[Task]) -> list[Task]:
    return sorted(tasks)

def babyagi_loop(seed_tasks: list[Task], max_iterations: int = 3) -> None:
    tasks = prioritization_agent(seed_tasks)
    for _ in range(max_iterations):
        if not tasks:
            break
        current = tasks.pop(0)
        result = execution_agent(current)
        print(f"completed={current.name}")
        tasks.extend(task_creation_agent(current, result))
        tasks = prioritization_agent(tasks)

initial_tasks = [Task(1, "Read 20 unread inbox messages")]

babyagi_loop(initial_tasks)

How do you verify the loop without sending email?

Verify the BabyAGI loop by proving 2 facts: the read task returns JSON and the digest task creates a draft. Do not add a send function during testing. The absence of a send path is the safety property, not a missing feature.

Run the script after authentication and inspect the printed task names. Then check the mailbox Drafts folder in the provider UI. You should see 1 draft addressed to the review recipient. If the loop reads zero unread messages, remove --unread for a local test run or seed the mailbox with a test message.

For command-level checks, run nylas email list --json --limit 1 first and confirm the output is JSON. Then run the Python file with a test recipient you control. The guide never calls nylas email send, so verification cannot dispatch mail by accident.
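
To check the no-send property without touching a mailbox, one option is to replace subprocess.run with a stub and record what the loop tried to execute. This is a sketch using the standard library's unittest.mock; `record_cli_calls` and `uses_send` are hypothetical helpers, and the stub's `"[]"` output is placeholder data, not real CLI output.

```python
import subprocess
from unittest import mock

def record_cli_calls(fn) -> list[list[str]]:
    """Run fn() with subprocess.run stubbed out; return every argv it was given."""
    calls: list[list[str]] = []

    def fake_run(args, **kwargs):
        calls.append(list(args))
        # Placeholder output so callers that read .stdout keep working.
        return subprocess.CompletedProcess(args, 0, stdout="[]", stderr="")

    with mock.patch("subprocess.run", side_effect=fake_run):
        fn()
    return calls

def uses_send(calls: list[list[str]]) -> bool:
    """True if any recorded argv is a `nylas email send` call."""
    return any(call[:3] == ["nylas", "email", "send"] for call in calls)
```

Passing a zero-argument function that runs the loop, for example `lambda: babyagi_loop(initial_tasks)`, yields the full list of commands the run would have issued, and `uses_send` on that list should be False.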

Next steps