Source: https://cli.nylas.com/guides/email-api-429-rate-limit-fix

# Fix 429 Rate-Limit Errors on Email APIs

A 429 isn't a bug — it's the provider telling you to slow down. The mistake is retrying immediately, which extends the throttling window and can escalate to longer blocks. The fix is the same everywhere: honor the Retry-After header, retry with exponential backoff and jitter, and cut the request volume that triggered it. This guide explains the 429, the correct backoff, and how Gmail, Microsoft Graph, and the Nylas CLI differ in how they signal and absorb limits.

Written by [Aaron de Mello](https://cli.nylas.com/authors/aaron-de-mello), Senior Engineering Manager

Updated June 8, 2026

> **TL;DR:** A `429 Too Many Requests` means you're throttled. If a `Retry-After` header is present, wait exactly that long; otherwise retry with exponential backoff plus random jitter. Then reduce volume — batch reads, request fewer fields, use delta sync. Never retry a 429 immediately. The Nylas CLI normalizes provider limits, so you reason about one backoff instead of three.

Command references used in this guide: [`nylas email list`](https://cli.nylas.com/docs/commands/email-list), [`nylas email search`](https://cli.nylas.com/docs/commands/email-search), and [`nylas doctor`](https://cli.nylas.com/docs/commands/doctor).

## What does a 429 mean?

A `429 Too Many Requests` means you exceeded the API's rate limit and the provider is throttling you. It's a deliberate signal, standardized in [RFC 6585](https://datatracker.ietf.org/doc/html/rfc6585#section-4), and it often carries a `Retry-After` header telling you exactly how long to wait. The error is recoverable — the request will succeed once you respect the limit — but only if you stop hammering the endpoint.

The single biggest mistake is retrying immediately. A tight retry loop on a `429` keeps you over the limit, extends the throttling window, and on some providers escalates to a longer block. Treat `429` as “wait, then retry,” never “try again now.” The wait is the fix.

## How do you back off correctly?

Back off in one of two ways. If the response includes `Retry-After`, wait as long as it says: the value is either a number of seconds or an HTTP date to wait until, and it is the provider's own statement of the required wait. If it doesn't, use exponential backoff with jitter: double the delay each attempt (1s, 2s, 4s, 8s) and add a random fraction so that many clients don't retry in lockstep and create a thundering herd. Cap the total attempts so a persistent limit surfaces as an error instead of an infinite loop.

Jitter matters more than it looks. Without it, every throttled client retries at the same doubling intervals and slams the endpoint together, re-triggering the limit; AWS's widely cited analysis showed jittered backoff markedly reduces contention. The pattern below honors `Retry-After` when present and falls back to jittered exponential backoff otherwise.

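```python
import random
import time
from email.utils import parsedate_to_datetime


def retry_after_seconds(value):
    """Parse Retry-After: either delay-seconds or an HTTP-date."""
    try:
        return max(0.0, float(value))
    except ValueError:
        # Not a number, so treat it as a date and wait until then.
        return max(0.0, parsedate_to_datetime(value).timestamp() - time.time())


def with_backoff(call, max_attempts=6):
    for attempt in range(max_attempts):
        resp = call()
        if resp.status_code != 429:
            return resp
        if attempt == max_attempts - 1:
            break                                         # out of attempts: don't sleep again
        ra = resp.headers.get("Retry-After")
        if ra:
            time.sleep(retry_after_seconds(ra))           # obey the provider
        else:
            time.sleep((2 ** attempt) + random.random())  # exp backoff + jitter
    raise RuntimeError("rate limited after retries")
```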
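To make the fallback schedule concrete, here is a minimal sketch that lists the waits the exponential branch would produce. The `base`, `cap`, and `rng` parameters are illustrative additions, not part of the wrapper above; the cap in particular is an assumption, added so late attempts don't wait unboundedly.

```python
import random


def backoff_delays(max_attempts=6, base=1.0, cap=30.0, rng=random.random):
    """Return the waits (in seconds) before each retry.

    Delay i is base * 2**i plus a random fraction of a second, limited to `cap`.
    `base`, `cap`, and `rng` are illustrative parameters for this sketch.
    """
    delays = []
    for attempt in range(max_attempts - 1):   # no wait after the final attempt
        delay = base * (2 ** attempt) + rng()
        delays.append(min(delay, cap))
    return delays


# With the jitter pinned to 0.5 the schedule is easy to read:
print(backoff_delays(rng=lambda: 0.5))  # [1.5, 2.5, 4.5, 8.5, 16.5]
```

With the default random source, each value lands somewhere in the second after its power of two, which is what spreads concurrent clients apart.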

## How do you reduce the volume that triggers it?

Backoff handles a `429` after it happens; reducing volume stops it from happening. The three highest-impact moves are batching multiple operations into one request, requesting only the fields you need rather than full message bodies, and using delta or incremental sync instead of repeatedly re-reading the whole mailbox. Each cuts the calls, or quota units, spent per unit of work, which is what the limit actually counts.
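As a rough illustration of the batching idea, the sketch below groups message IDs into fixed-size chunks so that each chunk can go out as one request. The IDs and the batch size of 100 are placeholders; whether an endpoint accepts batches, and how large they may be, depends on the provider.

```python
from itertools import islice


def chunked(items, size):
    """Yield successive lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    it = iter(items)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


message_ids = [f"msg-{n}" for n in range(250)]   # placeholder IDs
batches = list(chunked(message_ids, 100))        # 100 is an assumed batch size
print(len(message_ids), "items ->", len(batches), "requests")  # 250 items -> 3 requests
```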

Limits differ by provider, so the same code can be fine on one and throttled on another: Gmail budgets in quota units with a per-user ceiling of 250 units per second, while Microsoft Graph throttles per app and per mailbox with workload-specific limits. Designing for the tighter limit — fewer, fatter requests — keeps you under both. The provider docs spell out the exact numbers.
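A quick back-of-the-envelope calculation shows how a unit budget translates into a call rate. The sketch below uses the 250 units per second figure from the paragraph above; the per-call cost of 5 units is a hypothetical value for illustration, so look up the real cost of each method in the provider's documentation.

```python
def max_calls_per_second(budget_units, cost_per_call):
    """How many calls of a given cost fit in a per-second unit budget."""
    if cost_per_call <= 0:
        raise ValueError("cost_per_call must be positive")
    return budget_units // cost_per_call


def min_interval(budget_units, cost_per_call):
    """Smallest spacing (seconds) between calls that stays within the budget."""
    return cost_per_call / budget_units


BUDGET = 250          # per-user units per second, from the text above
ASSUMED_COST = 5      # hypothetical cost of one read call; check the real value

print(max_calls_per_second(BUDGET, ASSUMED_COST))  # 50
print(min_interval(BUDGET, ASSUMED_COST))          # 0.02
```

A more expensive operation shrinks the ceiling proportionally, which is why trimming fields and batching pay off faster than shaving a few calls.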

## How does the CLI help with rate limits?

The CLI gives you one rate-limit model instead of three. Because it normalizes Gmail, Graph, and IMAP behind one interface, you reason about a single backoff strategy rather than learning each provider's throttling quirks. A loop over `nylas email list --json` with the backoff wrapper above behaves the same whether the connected account is Gmail or Outlook.

For genuinely high-volume work, pace your own loops — add a small delay between calls and prefer search-scoped pulls over full-mailbox scans — and use `nylas doctor` to confirm the integration is healthy rather than throttled. The same discipline that keeps you under a native API's limit applies here; the difference is you write it once for all providers.

```bash
# Pace a bulk pull: scope the search, add a small inter-call delay
nylas email search "newer_than:1d" --json --limit 100 \
  | jq -r '.[].id' \
  | while read -r id; do
      nylas email read "$id" --json >/dev/null
      sleep 0.2     # stay comfortably under provider limits
    done
```
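If the loop lives in Python rather than the shell, the same pacing can be sketched as a small generator. This is an illustration under assumptions: `paced` is a hypothetical helper, the IDs are placeholders, and the `print` stands in for whatever per-message call you make. Unlike a fixed `sleep`, it waits only for the remainder of the interval, so slow calls aren't delayed further.

```python
import time


def paced(items, min_interval=0.2, clock=time.monotonic, sleep=time.sleep):
    """Yield items no faster than one per `min_interval` seconds."""
    last = None
    for item in items:
        if last is not None:
            wait = min_interval - (clock() - last)
            if wait > 0:
                sleep(wait)          # wait out the rest of the interval
        last = clock()
        yield item


for message_id in paced(["id-1", "id-2", "id-3"], min_interval=0.2):
    print("fetch", message_id)       # replace with the real per-message call
```

The `clock` and `sleep` parameters exist only so the pacing can be tested without real waiting.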

## Next steps

- [Gmail API error codes](https://cli.nylas.com/guides/gmail-api-error-codes) — 403 and 429 in context
- [Microsoft Graph error codes](https://cli.nylas.com/guides/graph-api-error-codes) — Graph throttling specifics
- [Email API rate limits compared](https://cli.nylas.com/guides/email-api-rate-limits-compared) — the numbers by provider
- [Handle email API outages](https://cli.nylas.com/guides/handle-email-api-outages) — backoff for 5xx and downtime
- [Debug email delivery](https://cli.nylas.com/guides/debug-email-delivery-cli) — when a sent message never arrives
- [Full command reference](https://cli.nylas.com/docs/commands) — every flag and subcommand documented
