Source: https://cli.nylas.com/guides/monitor-agent-account-health

# Monitor Agent Account Health

An agent account that silently goes invalid takes its agent down with it — the next send fails, the next inbox read returns nothing. Catching that before the agent does means probing health on a schedule, not by hand. This guide builds a health monitor from the CLI: read connector and account status as JSON, flag any account whose grant_status isn't valid, alert a human when one breaks, and track the healthy count over time. The whole probe is a cron job and a few lines of jq.

Written by [Caleb Geene](https://cli.nylas.com/authors/caleb-geene), Director, Site Reliability Engineering

Updated June 7, 2026

> **TL;DR:** `nylas agent status --json` reports connector readiness and every account's `grant_status`. Run it on a cron, flag any account that isn't `valid` with jq, and `nylas email send` an alert before the agent fails its next task.

## What does monitoring agent account health cover?

Monitoring agent account health means continuously answering one question: can every agent still send and receive? Two things can break that — the managed connector behind all accounts, or an individual account's grant going invalid. The CLI surfaces both as structured data, so a health monitor is a scheduled probe that reads status, compares it to the expected `valid` state, and raises an alert on any drift.

The reason to automate this is that a broken grant fails silently. An agent doesn't announce that its account went invalid; it just starts failing sends, and you find out when a customer doesn't get a reply. Probing health on a schedule turns a silent failure into a page — the difference between catching a broken account in minutes and discovering it from a complaint.

Figure: Health loop. Probe (cron, `agent status --json`) → Evaluate (`grant_status == valid`?) → valid: stay quiet; invalid: alert.

## How do I check connector and account health?

The `nylas agent status --json` command is the single health endpoint. It returns `connector_configured` (the managed connector behind every account), `account_count`, and an `accounts` array where each entry carries a `grant_status`. A healthy account reads `valid`:

```bash
nylas agent status --json \
  | jq '{connector: .connector_configured, accounts: .account_count}'
```

Check the connector first, because if `connector_configured` is false, every account is affected and per-account checks are noise. Treat that as the top-level alarm. When the connector is healthy, drill into individual accounts. This pairs with the broader [email integration health monitoring](https://cli.nylas.com/guides/monitor-email-integration-health-cli) patterns for the rest of your Nylas setup.
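To make those fields concrete, here is a minimal sketch that runs a jq summary against a hand-written sample payload instead of live CLI output. The payload shape is an assumption pieced together from the field names this guide mentions; the `email` key inside `accounts`, the example addresses, and the `invalid` status value are illustrative, and real output may carry more fields.

```bash
#!/usr/bin/env bash
# Sample payload: an ASSUMED shape built from the field names in this guide,
# not captured CLI output.
sample_status='{
  "connector_configured": true,
  "account_count": 2,
  "accounts": [
    {"email": "agent-a@example.com", "grant_status": "valid"},
    {"email": "agent-b@example.com", "grant_status": "invalid"}
  ]
}'

# summarize_status: read status JSON on stdin, print one connector line,
# then one tab-separated line per account.
summarize_status() {
  jq -r '
    "connector_configured=\(.connector_configured) account_count=\(.account_count)",
    (.accounts[] | "\(.email)\t\(.grant_status)")
  '
}

# Demo against the sample; swap in `nylas agent status --json` for real use.
printf '%s\n' "$sample_status" | summarize_status
```

Against a live setup the same function would sit at the end of the pipe: `nylas agent status --json | summarize_status`.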

## How do I find unhealthy accounts across the fleet?

With a fleet of accounts, you want the exceptions, not the full list. Both `nylas agent status --json` and `nylas agent account list --json` expose `grant_status` per account. Filter with [jq](https://jqlang.github.io/jq/manual/) for anything that isn't `valid` and you get just the broken ones:

```bash
# Print each account that isn't valid, with its grant_status
nylas agent account list --json \
  | jq -r '.[] | select(.grant_status != "valid") | "\(.email): \(.grant_status)"'
```

An empty result means the fleet is healthy — the quiet success a monitor should produce most of the time. A non-empty result is your work queue: each line is an account email and its bad `grant_status`. Feed that list into an alert or a remediation step. The per-account fix — rotating a credential or recreating the grant — is covered in the [agent account lifecycle guide](https://cli.nylas.com/guides/agent-account-lifecycle).
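One way to feed that work queue into a next step is a small read loop, sketched below. `handle_unhealthy` is a hypothetical placeholder for whatever remediation or ticketing hook you use; here it only prints. The loop assumes the `email: grant_status` line format produced by the jq filter above.

```bash
#!/usr/bin/env bash
# handle_unhealthy: HYPOTHETICAL placeholder for your own remediation or
# ticketing step. Here it only prints what it would act on.
handle_unhealthy() {
  local email=$1 status=$2
  printf 'needs attention: %s (grant_status=%s)\n' "$email" "$status"
}

# process_queue: read "email: grant_status" lines on stdin and call
# handle_unhealthy once per account.
process_queue() {
  local line email status
  while IFS= read -r line; do
    [ -n "$line" ] || continue   # skip blank lines (healthy fleet)
    email=${line%%: *}           # text before the first ": "
    status=${line#*: }           # text after the first ": "
    handle_unhealthy "$email" "$status"
  done
}

# Demo with hand-written input in place of live CLI output.
printf '%s\n' 'agent-b@example.com: invalid' | process_queue
```

In practice the jq output would be piped straight in, for example `nylas agent account list --json | jq -r '...' | process_queue` with the filter shown earlier.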

## How do I run the health check on a schedule?

A health check that runs by hand isn't monitoring. Wrap the probe in a script and schedule it with cron — every 5 minutes is a reasonable cadence for catching a broken grant fast without hammering the API. The script also runs `nylas doctor` to confirm the CLI's own credentials are valid before it trusts any status output:

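```bash
#!/usr/bin/env bash
set -euo pipefail

# Confirm the CLI itself is authenticated before trusting status.
# Exit 3 keeps a probe failure distinct from the health results below.
if ! nylas doctor >/dev/null; then
  echo "ALERT: nylas doctor failed; cannot trust status output" >&2
  exit 3
fi

# Top-level: is the managed connector up?
if [ "$(nylas agent status --json | jq -r '.connector_configured')" != "true" ]; then
  echo "ALERT: agent connector is not configured" >&2
  exit 1
fi

# Per-account: anything not valid? A failed listing is also a probe failure.
if ! bad=$(nylas agent account list --json \
  | jq -r '.[] | select(.grant_status != "valid") | .email'); then
  echo "ALERT: could not list agent accounts" >&2
  exit 3
fi

if [ -n "$bad" ]; then
  echo "ALERT: unhealthy agent accounts:" >&2
  echo "$bad" >&2
  exit 2
fi

echo "ok: all agent accounts valid"
```

The script returns distinct exit codes: 0 healthy, 1 connector down, 2 account broken, and 3 when the probe itself fails (`nylas doctor` or the account listing errors out). Without the explicit `exit 3`, `set -e` would pass through whatever status the failing command returned, which could collide with 1 or 2. Distinct codes let cron, a CI job, or a wrapper treat a connector outage differently from a single bad grant. On a 5-minute schedule, a broken account is detected within roughly 5 minutes of failing, not whenever someone next looks.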

```bash
# crontab: run the health check every 5 minutes
*/5 * * * * /usr/local/bin/agent-health-check.sh || /usr/local/bin/notify.sh
```
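The crontab line above calls the notifier on any failure without saying which one. A wrapper that branches on the exit code could look like the sketch below. The severity labels are illustrative assumptions; codes 0, 1, and 2 are the ones the health-check script defines, and any other code is treated as the probe itself failing.

```bash
#!/usr/bin/env bash
# describe_exit: map the health check's exit code to a severity label.
# 0, 1, and 2 follow the script in this guide; anything else is treated as
# the probe failing rather than as a health result.
describe_exit() {
  case "$1" in
    0) echo "ok" ;;
    1) echo "critical: connector down" ;;
    2) echo "warning: account(s) invalid" ;;
    *) echo "unknown: probe failed with exit code $1" ;;
  esac
}

# run_check: run the given command, send its output to stderr, capture its
# exit code without aborting the wrapper, and print the matching label.
run_check() {
  local code=0
  "$@" >&2 || code=$?
  describe_exit "$code"
}

# Demo with a stand-in command instead of the real health check.
run_check sh -c 'exit 2'
```

With the real script this would be `run_check /usr/local/bin/agent-health-check.sh`, and the printed label can decide whether to page immediately or just open a ticket.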

## How do I alert when an account goes invalid?

Detection is only useful if it reaches a human. When the probe finds a bad account, send the alert from a healthy agent account with `nylas email send` — the same CLI doing the monitoring delivers the page. Good alerts are actionable and rare: name the broken account, say what changed, and don't fire when everything is fine.

```bash
bad=$(nylas agent account list --json \
  | jq -r '.[] | select(.grant_status != "valid") | "\(.email): \(.grant_status)"')

if [ -n "$bad" ]; then
  nylas email send \
    --to oncall@yourcompany.com \
    --subject "Agent account health: $(echo "$bad" | wc -l | tr -d ' ') account(s) invalid" \
    --body "$bad"
fi
```

Sending only when `$bad` is non-empty is what keeps the alert meaningful — a monitor that emails "all healthy" every 5 minutes trains people to ignore it. The Google SRE practice of alerting on symptoms, not noise, is documented in [Monitoring Distributed Systems](https://sre.google/sre-book/monitoring-distributed-systems/): page on what's broken and actionable, stay quiet otherwise.
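The same reasoning applies while an account stays broken: on a 5-minute schedule, the snippet above would resend the identical alert on every run until someone fixes the grant. One possible way to cut the repeats, sketched here as an assumption and not as part of the CLI, is to remember the last list of bad accounts in a state file and alert only when it changes. The state file path is whatever you choose.

```bash
#!/usr/bin/env bash
# should_alert STATE_FILE BAD_LIST
# Returns 0 (alert) when BAD_LIST is non-empty and differs from the list
# recorded in STATE_FILE on the previous run; returns 1 otherwise.
# It always records the current list so the next run can compare against it.
should_alert() {
  local state_file=$1 bad=$2 previous=""
  if [ -f "$state_file" ]; then
    previous=$(cat "$state_file")
  fi
  printf '%s' "$bad" > "$state_file"
  [ -n "$bad" ] && [ "$bad" != "$previous" ]
}

# Demo: the same bad list twice in a row alerts once, then stays quiet.
state=$(mktemp)
should_alert "$state" "agent-b@example.com: invalid" && echo "first run: alert"
should_alert "$state" "agent-b@example.com: invalid" || echo "second run: quiet"
rm -f "$state"
```

In the alert snippet, `if [ -n "$bad" ]` would become `if should_alert "$state_file" "$bad"`. Because a healthy run records an empty list, an account that recovers and later breaks again triggers a fresh alert.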

## How do I track health over time?

A point-in-time check tells you the fleet is healthy now; a trend tells you it's degrading. Append a timestamped count of valid versus total accounts to a log on every run, and you have the raw data for a dashboard or a weekly review. The valid ratio is the one number that summarizes fleet health:

```bash
status=$(nylas agent status --json)
total=$(echo "$status" | jq '.account_count')
valid=$(echo "$status" | jq '[.accounts[] | select(.grant_status == "valid")] | length')
echo "$(date -u +%FT%TZ) valid=$valid total=$total" >> agent-health.log
```

Over weeks, this log shows whether the valid ratio holds at 100% or drifts down as grants expire, the kind of slow degradation a single check never reveals. Plot it, or scan it with `awk` for any line where `valid` dropped below `total` (plain `grep` cannot compare the two numbers). For agent-action error rates alongside account health, see [auditing AI agent activity](https://cli.nylas.com/guides/audit-ai-agent-activity).
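Finding the runs where `valid` fell below `total` needs a numeric comparison. A minimal sketch with awk, assuming the `valid=N total=M` log format written by the snippet above (the sample log lines in the demo are made up):

```bash
#!/usr/bin/env bash
# degraded_runs: read the health log on stdin and print every line where
# the valid count is lower than the total.
degraded_runs() {
  awk '{
    v = ""; t = ""
    # Fields after the timestamp look like key=value.
    for (i = 2; i <= NF; i++) {
      split($i, kv, "=")
      if (kv[1] == "valid") v = kv[2]
      if (kv[1] == "total") t = kv[2]
    }
    # Force numeric comparison with +0; skip lines missing either field.
    if (v != "" && t != "" && v + 0 < t + 0) print
  }'
}

# Demo with two made-up log lines; only the second is degraded.
printf '%s\n' \
  '2026-06-07T00:00:00Z valid=5 total=5' \
  '2026-06-07T00:05:00Z valid=4 total=5' | degraded_runs
```

Against the real log this would be `degraded_runs < agent-health.log`.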

## Which check runs on which schedule?

Not every check belongs on the same cadence. The connector probe is cheap and high-value, so it runs often; the trend log is a once-a-run append; a deeper reconciliation against your tenant list can run nightly. Match frequency to how fast each signal changes:

| Check | Command | Cadence |
| --- | --- | --- |
| Connector up | `agent status --json` | **Every 5 minutes** |
| Account grant_status | `agent account list --json` | **Every 5 minutes** |
| Valid-ratio trend | `agent status --json` → log | **Every run (append)** |
| Tenant reconciliation | `agent account list --json` | **Nightly** |

Keep the 5-minute checks lightweight so they never become the thing that breaks. The nightly reconciliation — diffing provisioned accounts against your tenant source of truth — belongs with the [provisioning-at-scale](https://cli.nylas.com/guides/provision-agent-accounts-at-scale) workflow, which already lists the fleet as JSON.
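The nightly diff itself can be as small as two sorted lists and `comm`. The sketch below assumes two plain-text files with one account email per line: a hypothetical export from your tenant source of truth, and the provisioned accounts. The file names are placeholders.

```bash
#!/usr/bin/env bash
# reconcile EXPECTED_FILE PROVISIONED_FILE
# Each file holds one account email per line. Prints accounts that are
# expected but missing, then accounts that exist but are not expected.
reconcile() {
  local expected=$1 provisioned=$2 tmp
  tmp=$(mktemp -d)
  # comm needs sorted input; sort -u also drops duplicates.
  sort -u "$expected" > "$tmp/expected"
  sort -u "$provisioned" > "$tmp/provisioned"
  comm -23 "$tmp/expected" "$tmp/provisioned" | sed 's/^/missing: /'
  comm -13 "$tmp/expected" "$tmp/provisioned" | sed 's/^/unexpected: /'
  rm -rf "$tmp"
}

# Demo with made-up lists in place of real exports.
work=$(mktemp -d)
printf '%s\n' a@example.com b@example.com c@example.com > "$work/tenants.txt"
printf '%s\n' a@example.com b@example.com x@example.com > "$work/provisioned.txt"
reconcile "$work/tenants.txt" "$work/provisioned.txt"
rm -rf "$work"
```

The provisioned list could come from `nylas agent account list --json | jq -r '.[].email'`, which uses the same `email` field as the filters earlier in this guide. An empty result means the two lists agree.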

## Next steps

- [Manage the Agent Account Lifecycle](https://cli.nylas.com/guides/agent-account-lifecycle) — rotate a credential or recreate a grant once the monitor flags it
- [Provision Agent Accounts at Scale](https://cli.nylas.com/guides/provision-agent-accounts-at-scale) — the nightly tenant reconciliation that complements these health checks
- [Monitor Email Integration Health](https://cli.nylas.com/guides/monitor-email-integration-health-cli) — the broader doctor, audit-log, and webhook health patterns
- [Trigger Agents on Inbound Email with Webhooks](https://cli.nylas.com/guides/agent-account-webhooks) — a grant.expired webhook pages you the moment an account breaks, between polls
- [Audit AI Agent Activity](https://cli.nylas.com/guides/audit-ai-agent-activity) — agent-action error rates alongside account health
- [Full command reference](https://cli.nylas.com/docs/commands) — every `nylas agent` flag and subcommand
- [Nylas v3 API documentation](https://developer.nylas.com/) — the grants API behind agent account status
