
Fix 429 Rate-Limit Errors on Email APIs

A 429 isn't a bug — it's the provider telling you to slow down. The mistake is retrying immediately, which extends the throttling window and can escalate to longer blocks. The fix is the same everywhere: honor the Retry-After header, retry with exponential backoff and jitter, and cut the request volume that triggered it. This guide explains the 429, the correct backoff, and how Gmail, Microsoft Graph, and the Nylas CLI differ in how they signal and absorb limits.

Written by Aaron de Mello, Senior Engineering Manager

Verified · CLI 3.1.16 · Gmail, Outlook · last tested June 8, 2026

Command references used in this guide: nylas email list, nylas email search, and nylas doctor.

What does a 429 mean?

A 429 Too Many Requests means you exceeded the API's rate limit and the provider is throttling you. It's a deliberate signal, standardized in RFC 6585, and it often carries a Retry-After header telling you exactly how long to wait. The error is recoverable — the request will succeed once you respect the limit — but only if you stop hammering the endpoint.

The single biggest mistake is retrying immediately. A tight retry loop on a 429 keeps you over the limit, extends the throttling window, and on some providers escalates to a longer block. Treat 429 as “wait, then retry,” never “try again now.” The wait is the fix.

How do you back off correctly?

Backoff has two cases. If the response includes Retry-After, wait at least as long as it says: the value is either a number of seconds or an HTTP-date, and it is the provider's own estimate of when to retry. If it doesn't, use exponential backoff with jitter: double the delay each attempt (1s, 2s, 4s, 8s) and add a random fraction so that many clients don't retry in lockstep and create a thundering herd. Cap the total attempts so a persistent limit surfaces as an error instead of an infinite loop.

Jitter matters more than it looks. Without it, every throttled client retries at the same doubling intervals and slams the endpoint together, re-triggering the limit; AWS's widely cited analysis showed jittered backoff markedly reduces contention. The pattern below honors Retry-After when present and falls back to jittered exponential backoff otherwise.

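import time, random
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

def retry_after_seconds(value):
    # Retry-After is either delay-seconds or an HTTP-date (always GMT)
    try:
        return max(0.0, float(value))
    except ValueError:
        when = parsedate_to_datetime(value)
        return max(0.0, (when - datetime.now(timezone.utc)).total_seconds())

def with_backoff(call, max_attempts=6):
    for attempt in range(max_attempts):
        resp = call()
        if resp.status_code != 429:
            return resp
        if attempt == max_attempts - 1:
            break                                      # out of attempts: fail without sleeping again
        ra = resp.headers.get("Retry-After")
        if ra:
            time.sleep(retry_after_seconds(ra))        # obey the provider
        else:
            time.sleep((2 ** attempt) + random.random())  # exp backoff + jitter
    raise RuntimeError("rate limited after retries")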
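
The fallback delay in the wrapper above grows without bound and adds at most one second of jitter. A common variant is the "full jitter" scheme, which draws the whole delay at random between zero and a capped exponential ceiling. The sketch below is illustrative only: the function name, `base`, `cap`, and the injectable `rng` are assumptions, not values any provider prescribes.

```python
import random

def full_jitter_delay(attempt, base=1.0, cap=60.0, rng=random.random):
    """Return a delay in seconds for a zero-based attempt number.

    The ceiling doubles each attempt (base, 2*base, 4*base, ...) up to cap,
    and the actual delay is drawn uniformly from [0, ceiling).
    """
    ceiling = min(cap, base * (2 ** attempt))
    return rng() * ceiling

# Ceilings for the first eight attempts with the assumed defaults
ceilings = [min(60.0, 1.0 * 2 ** n) for n in range(8)]
```

Swapping this in for the fallback branch would mean sleeping for `full_jitter_delay(attempt)` instead of `(2 ** attempt) + random.random()`. The cap keeps a long retry chain from sleeping for minutes at a time.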

How do you reduce the volume that triggers it?

Backoff handles a 429 after it happens; reducing volume stops it from happening in the first place. The three highest-impact moves are batching multiple operations into one request, requesting only the fields you need rather than full message bodies, and using delta or incremental sync instead of repeatedly re-reading the whole mailbox. Each cuts the number of calls per unit of work, which is what the limit actually counts.
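
To make the batching arithmetic concrete, the sketch below groups message IDs into fixed-size batches and compares request counts. It is a minimal illustration: `chunked` is a hypothetical helper, the IDs are placeholders, and the batch size of 50 is an assumption, since the real maximum depends on the provider's batch endpoint.

```python
from itertools import islice

def chunked(items, size):
    """Yield successive lists of at most `size` items."""
    it = iter(items)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch

message_ids = [f"msg-{n}" for n in range(230)]   # placeholder IDs
batches = list(chunked(message_ids, 50))          # 50 is an assumed batch size
requests_unbatched = len(message_ids)             # one call per message
requests_batched = len(batches)                   # one call per batch
```

Each batch would then go out as a single request, so the same 230 messages cost far fewer calls against the limit.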

Limits differ by provider, so the same code can be fine on one and throttled on another: Gmail budgets in quota units with a per-user ceiling of 250 units per second, while Microsoft Graph throttles per app and per mailbox with workload-specific limits. Designing for the tighter limit — fewer, fatter requests — keeps you under both. The provider docs spell out the exact numbers.
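
One way to design for a per-second budget like the Gmail figure above is a client-side token bucket that spends units before each call. This is a sketch under stated assumptions: the `UnitBudget` class, the per-call `cost`, and the injectable `clock` and `sleep` are hypothetical names for illustration, and the real per-method costs come from the provider docs.

```python
import time

class UnitBudget:
    """Client-side token bucket: refills `rate` units per second up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic, sleep=time.sleep):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.clock = clock
        self.sleep = sleep
        self.last = clock()

    def _refill(self):
        # Credit the units earned since the last check, never above capacity
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now

    def spend(self, cost):
        """Block until `cost` units are available, then deduct them."""
        if cost > self.capacity:
            raise ValueError("cost exceeds bucket capacity")
        self._refill()
        while self.tokens < cost:
            self.sleep((cost - self.tokens) / self.rate)
            self._refill()
        self.tokens -= cost

budget = UnitBudget(rate=250, capacity=250)   # 250 units/s, the per-user figure quoted above
budget.spend(5)                               # assumed cost of one call, in quota units
```

Calling `budget.spend(cost)` before every request keeps the client under the budget proactively, so the backoff path only handles limits you could not predict.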

How does the CLI help with rate limits?

The CLI gives you one rate-limit model instead of three. Because it normalizes Gmail, Graph, and IMAP behind one interface, you reason about a single backoff strategy rather than learning each provider's throttling quirks. A loop over nylas email list --json that applies the same backoff pattern as the wrapper above behaves the same whether the connected account is Gmail or Outlook.

For genuinely high-volume work, pace your own loops — add a small delay between calls and prefer search-scoped pulls over full-mailbox scans — and use nylas doctor to confirm the integration is healthy rather than throttled. The same discipline that keeps you under a native API's limit applies here; the difference is you write it once for all providers.

# Pace a bulk pull: scope the search, add a small inter-call delay
nylas email search "newer_than:1d" --json --limit 100 \
  | jq -r '.[].id' \
  | while read -r id; do
      nylas email read "$id" --json >/dev/null
      sleep 0.2     # stay comfortably under provider limits
    done

Next steps