Guide

Build a BabyAGI Email Agent

BabyAGI runs a task-driven loop: it creates tasks, executes one, then reprioritizes the rest based on the result. To let that loop touch email, register the Nylas CLI as the tool its execution step calls. Each call is one subprocess that returns JSON, and the same tool reaches Gmail, Outlook, and four more providers. Here's how to wire the CLI into BabyAGI's execution step and keep sends behind a person.

Written by Prem Keshari Senior SRE

VerifiedCLI 3.1.20 · Gmail · last tested June 14, 2026

Command references used in this guide: nylas email list, nylas email search, and nylas email drafts create.

How do you give a BabyAGI agent email?

You give a BabyAGI agent email by registering a function the execution step can call and pointing it at the Nylas CLI. BabyAGI's loop is three steps: create tasks, pick the next one, execute it. The execution step is where a tool runs, and a tool here is just Python that shells out to a CLI command and returns stdout.

Because nylas email list --json emits structured data, the loop gets clean JSON it can fold back into the next task — no HTML parsing. Each call is one CLI invocation, so the agent reads structured output in a single round trip. The subprocess boundary keeps provider details out of the agent loop entirely. The CLI is installed and authenticated once with nylas auth login, and the stored grant is reused on every call. BabyAGI's function-registration pattern is documented in the project's newer function-framework README, and the original task-driven loop is described in Yohei Nakajima's “Birth of BabyAGI” write-up.

How do you register the email tool?

Register one function per action so the execution step has a narrow, named capability. The reader runs nylas email list --json and returns the messages; a search function runs nylas email search for a server-side query. Keep each function thin and let the JSON pass through — the model reads structured output well, and a small wrapper is easier to audit than a clever one.

import subprocess
import babyagi  # registers functions on BabyAGI's function framework

@babyagi.register_function()
def read_inbox(limit: int = 10) -> str:
    """List recent emails as JSON for the task loop to reason over."""
    out = subprocess.run(
        ["nylas", "email", "list", "--json", "--limit", str(limit)],
        capture_output=True, text=True, check=True,
    )
    return out.stdout  # already JSON — hand it straight to the agent

@babyagi.register_function()
def search_inbox(query: str) -> str:
    """Search the mailbox server-side and return matching messages as JSON."""
    out = subprocess.run(
        ["nylas", "email", "search", query, "--json", "--limit", "20"],
        capture_output=True, text=True, check=True,
    )
    return out.stdout

How do you drive the task loop?

The objective seeds the first task, and BabyAGI expands it. Give the loop a tightly scoped objective so the agent reads and classifies rather than wandering. The execution step calls your registered functions, the result reprioritizes the queue, and the loop continues until the objective is met. A 20-message triage objective usually resolves in a handful of iterations.

# Pseudocode for the BabyAGI loop calling the registered tools
OBJECTIVE = "Triage the 20 most recent emails into urgent, routine, and ignore."

def execute_task(task: str) -> str:
    if "read" in task or "list" in task:
        return read_inbox(20)
    if "search" in task:
        return search_inbox(extract_query(task))
    return reason_over(task)  # LLM step; no email side effects

# create_tasks -> prioritize -> execute_task -> repeat
run_babyagi(OBJECTIVE, executor=execute_task)

What guardrails should the loop have?

BabyAGI runs an autonomous loop that spawns its own follow-up tasks from what it reads, so an injected email can seed a task like “email every contact”. Bound the loop's reach: register read and draft functions only — no send. The draft function runs nylas email drafts create, which composes a message without sending it and returns a draft ID, so a runaway loop can stage drafts but never deliver them. A person reviews each draft before it leaves.

Treat email bodies as untrusted input. Prompt injection ranks #1 in the OWASP LLM Top 10 (2025) as LLM01, and a task-driven loop is especially exposed because a message can mint a new objective the planner then chases. Giving the same agent read and send capability over private mailbox data forms a “lethal trifecta” — private data, untrusted content, and external communication — so withholding send is what defuses it. Log each execution and verify before acting. See stop an AI agent going rogue and build a human-in-the-loop email agent for the full pattern.

Next steps