Guide

Email API Rate Limits: Every Provider

Gmail, Microsoft Graph, Yahoo, Exchange EWS, and iCloud each enforce different rate limits, quota systems, and throttling rules. This guide compares them in one table, covers the SMTP error codes you'll hit when you exceed them, and shows how to handle retries without writing backoff logic yourself.

Written by Aaron de Mello, Senior Engineering Manager

Updated May 21, 2026

Verified — CLI 3.1.11 · Gmail, Outlook · last tested May 21, 2026

Command references: nylas email list, nylas email send, nylas email search.

What are Gmail API rate limits?

Gmail API rate limits are measured in quota units, not raw request counts. Each Gmail API method costs a different number of units, and every user gets 250 units per second by default. A messages.list call costs 5 units, a messages.get costs 5 units, and a messages.send costs 100 units. That means you can send at most 2 messages per second per user, but list 50 pages per second.

According to Google's Gmail API quota documentation, the per-user rate limit was updated on May 1, 2026 to 250 units/user/second (previously it tracked daily quotas only). The per-project daily limit is 1 billion quota units. Exceeding either triggers HTTP 429 with a Retry-After header.

Gmail API method	Quota cost	Max calls/sec (per user)
`messages.list`	5 units	50
`messages.get`	5 units	50
`messages.send`	100 units	2
`messages.modify`	5 units	50
`messages.trash`	10 units	25
`history.list`	2 units	125
`messages.batchGet`	5 units each	50 (per message in batch)
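The "max calls/sec" column is just the per-user budget divided by each method's unit cost. A minimal sketch of that arithmetic, assuming the 250 units/user/second figure and the costs listed in the table (the names `QUOTA_COSTS` and `max_calls_per_second` are illustrative, not part of any Google or Nylas library):

```python
# Per-method quota costs copied from the table above (units per call).
QUOTA_COSTS = {
    "messages.list": 5,
    "messages.get": 5,
    "messages.send": 100,
    "messages.modify": 5,
    "messages.trash": 10,
    "history.list": 2,
}

# Default per-user budget quoted in this guide (units per second).
PER_USER_UNITS_PER_SECOND = 250


def max_calls_per_second(method, budget=PER_USER_UNITS_PER_SECOND):
    """Whole calls of `method` that fit into one second of per-user quota."""
    return budget // QUOTA_COSTS[method]


for method in QUOTA_COSTS:
    print(f"{method}: {max_calls_per_second(method)} calls/sec")
```

Integer division is deliberate: 250 / 100 is 2.5, but only 2 complete messages.send calls fit in a one-second budget.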

The quota unit system means a naive sync loop that calls messages.list and then messages.get on each result spends 5 units per list page plus 5 units per message. Syncing 1,000 messages at 100 results per page costs about 5,050 units, which takes at least 20 seconds at 250 units per second. Batch endpoints reduce the round-trip overhead but don't change the per-message unit cost. The CLI batches these calls internally and respects Retry-After headers, so a nylas email list --limit 1000 call handles pagination and throttling without any extra code.

# List up to 200 Gmail messages — the CLI handles pagination and rate limits
nylas email list --limit 200 --json

What are Microsoft Graph email rate limits?

Microsoft Graph enforces a per-app-per-mailbox throttle of 10,000 requests per 10 minutes. That works out to roughly 16 requests per second per mailbox. When you exceed the limit, Graph returns HTTP 429 with a Retry-After header specifying the number of seconds to wait. Unlike Gmail, Graph doesn't use a unit-cost system; every request counts as 1 regardless of endpoint complexity.

According to Microsoft's Graph API throttling documentation, the 10,000/10-min limit applies to Outlook mail, calendar, and contacts endpoints. Tenant-wide limits are higher (undisclosed, service-dependent). Batch requests via the $batch endpoint count as 1 request for throttling but are limited to 20 operations per batch. The practical ceiling for mail sync is about 3,200 messages per 10 minutes if each message needs a separate GET.
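If you call Graph directly, one way to stay under the per-mailbox throttle is to count your own requests before sending them. The sketch below is a client-side sliding-window guard, assuming the 10,000 requests per 10 minutes figure quoted above. The class name and its methods are hypothetical, and Graph's real accounting may differ, so treat a 429 from the server as the final authority.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Client-side guard: allow at most `limit` requests in any `window` seconds."""

    def __init__(self, limit=10_000, window=600.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock          # injectable so the limiter can be tested
        self._stamps = deque()      # timestamps of requests still inside the window

    def _evict(self, now):
        # Drop timestamps that have aged out of the window.
        while self._stamps and now - self._stamps[0] >= self.window:
            self._stamps.popleft()

    def try_acquire(self):
        """Record a request and return True, or return False if the window is full."""
        now = self.clock()
        self._evict(now)
        if len(self._stamps) >= self.limit:
            return False
        self._stamps.append(now)
        return True

    def seconds_until_free(self):
        """Seconds until the oldest recorded request leaves the window (0 if not full)."""
        now = self.clock()
        self._evict(now)
        if len(self._stamps) < self.limit:
            return 0.0
        return self.window - (now - self._stamps[0])
```

Keep one limiter per mailbox, since the throttle described above is per app per mailbox. Call try_acquire() before each request and sleep for seconds_until_free() when it returns False.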

Graph also imposes a send limit of 10,000 recipients per day for Microsoft 365 business accounts, and 300 messages per day for Outlook.com consumer accounts. The recipient count, not the message count, is what matters. A single message to 50 recipients counts as 50 against the daily limit.
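Because the cap is on recipients, it helps to total recipients rather than messages before queuing a send job. A small sketch, assuming that To, Cc, and Bcc addresses all count toward the limit (an assumption worth confirming for your tenant). The `Message` class and function names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    """Hypothetical outgoing message holding only its recipient lists."""
    to: list = field(default_factory=list)
    cc: list = field(default_factory=list)
    bcc: list = field(default_factory=list)


def recipient_count(msg):
    """Recipients this message adds to the daily total (assumes Cc and Bcc count)."""
    return len(msg.to) + len(msg.cc) + len(msg.bcc)


def fits_daily_limit(messages, already_sent=0, daily_limit=10_000):
    """True if sending every message keeps the day's recipient total within the cap."""
    total = already_sent + sum(recipient_count(m) for m in messages)
    return total <= daily_limit


# One message to 50 recipients counts as 50, so 200 of them reach the 10,000 cap.
campaign = [Message(to=[f"user{i}@example.com" for i in range(50)]) for _ in range(200)]
print(fits_daily_limit(campaign))
```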

The CLI abstracts these limits behind a single command. Running nylas email send against an Outlook mailbox handles 429 responses automatically and retries with the delay specified in the Retry-After header.

# Send an email through Outlook — retries on 429 are handled by the CLI
nylas email send --to "recipient@example.com" --subject "Quarterly report" --body "Attached."

How do rate limits compare across email providers?

Rate limit enforcement varies widely across the 5 major email providers. Gmail uses quota units with per-method costs. Microsoft Graph uses flat request counts. Yahoo and iCloud enforce undocumented IMAP-level connection limits. Exchange EWS uses concurrent request throttling. The table below shows the numbers that matter most for automated email workflows: requests per second, messages per day, and attachment size caps.

Provider	Rate limit model	Effective reads/sec	Send limit/day	Max attachment
Gmail API	Quota units (250/user/sec)	~50	2,000 messages	25 MB
Microsoft Graph	Flat requests (10,000/10 min)	~16	10,000 recipients	150 MB
Yahoo IMAP	Connection-based (undocumented)	~5-10	500 messages	25 MB
Exchange EWS	Concurrent requests (27 max)	~27	10,000 recipients	35 MB (default)
iCloud IMAP	Connection-based (undocumented)	~5	1,000 messages	20 MB

The "effective reads/sec" column reflects realistic throughput for a list + get sync pattern, not the theoretical maximum from the provider's documentation. Gmail's unit system makes it the most permissive for read-heavy workloads. Graph's flat-count model is simpler but more restrictive for bulk sync. Yahoo and iCloud don't publish exact numbers, so the estimates above come from empirical testing with the CLI across accounts of varying ages.

What SMTP error codes indicate rate limiting?

Rate limit errors fall into two categories: HTTP status codes from REST APIs and SMTP enhanced status codes from mail servers. The HTTP 429 "Too Many Requests" response is the standard signal from Gmail API and Microsoft Graph. SMTP servers use the 4.7.x family of enhanced status codes for temporary throttling. Knowing which code you're looking at determines whether you should retry immediately or wait hours.

Code	Provider	Meaning	Retry strategy
`HTTP 429`	Gmail, Graph	Quota or request limit exceeded	Wait for `Retry-After` header value
`4.7.28`	Gmail SMTP	Too many messages sent in a rolling 24-hour window	Wait until the 24h window resets
`4.7.0`	Yahoo SMTP	Temporary rate limit on connection or send	Exponential backoff, 30-second base
`5.7.3`	Exchange / Microsoft 365	Daily recipient limit exceeded	No retry possible until next calendar day
`421`	iCloud SMTP	Too many concurrent connections	Reduce connections, retry after 60 seconds
`HTTP 503`	Graph	Service temporarily unavailable (often throttle-related)	Exponential backoff, 5-second base

The distinction between 4.x.x and 5.x.x enhanced status codes matters. A 4.7.28 from Gmail is temporary and will resolve on its own. A 5.7.3 from Exchange means you've hit a hard daily cap. According to Google's Workspace sending limits documentation, free Gmail accounts are capped at 500 messages per day, while Google Workspace accounts get 2,000. The CLI parses these error codes and adjusts its retry behavior accordingly.
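If you handle SMTP responses yourself, the first digit of the enhanced status code (or of the three-digit reply code when no enhanced code is present) is enough to decide whether a retry makes sense. The sketch below follows the class digits defined in RFC 3463: 2 for success, 4 for a transient failure, 5 for a permanent one. The function name is hypothetical, and the regex is deliberately simple, so it is not a full SMTP parser.

```python
import re

# Enhanced status code such as 4.7.28 or 5.7.3 (class.subject.detail).
_ENHANCED = re.compile(r"\b([245])\.(\d{1,3})\.(\d{1,3})\b")
# Plain three-digit SMTP reply code at the start of the line, such as 421.
_REPLY = re.compile(r"\s*([245])\d\d\b")

_CLASSES = {"2": "success", "4": "temporary", "5": "permanent"}


def classify_smtp_status(line):
    """Return 'success', 'temporary', 'permanent', or 'unknown' for a response line."""
    match = _ENHANCED.search(line) or _REPLY.match(line)
    if match is None:
        return "unknown"
    return _CLASSES[match.group(1)]


print(classify_smtp_status("421 4.7.28 Rate limit exceeded"))
print(classify_smtp_status("550 5.7.3 Recipient limit exceeded"))
```

A "temporary" result is safe to retry with backoff. A "permanent" result should stop the send loop and surface the error instead.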

How should you retry after hitting rate limits?

Exponential backoff with jitter is the standard retry strategy for email API rate limits. The pattern doubles the wait time after each failed attempt (1s, 2s, 4s, 8s) and adds a random jitter of 0 to 1 second to prevent thundering-herd problems when multiple clients hit the same limit simultaneously. Without jitter, synchronized retries from parallel workers pile up and extend the throttle window.

According to Google's API error handling documentation, the recommended maximum retry count is 5 attempts with a 32-second cap on the backoff interval. Microsoft's Graph documentation recommends honoring the Retry-After header value exactly rather than computing your own delay. In practice, the Retry-After value from Gmail is typically 1-60 seconds, while Graph returns 5-300 seconds depending on the severity of the throttle.
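To make the schedule concrete, here is a short sketch that generates the delays for a capped exponential backoff with jitter, using the 5-attempt and 32-second figures above. The function name and its parameters are illustrative only.

```python
import random


def backoff_delays(max_retries=5, base=1.0, cap=32.0, jitter=1.0, rng=random.random):
    """Return the wait before each retry: doubled each time, capped, plus random jitter."""
    delays = []
    for attempt in range(max_retries):
        delay = min(base * (2 ** attempt), cap)   # 1, 2, 4, 8, ... up to the cap
        delays.append(delay + rng() * jitter)     # add 0 to `jitter` seconds
    return delays


print([round(d, 2) for d in backoff_delays()])
```

Passing rng=lambda: 0.0 removes the jitter, which makes the underlying 1, 2, 4, 8, 16 progression easy to check in a test.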

If you're calling the Gmail or Graph API directly, you need to implement this loop yourself. The Python snippet below shows a minimal implementation with jitter. Each retry doubles the base delay and adds up to 1 second of random jitter. After 5 failures the function raises instead of looping forever.

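import random
import time

import requests


def call_with_backoff(url, headers, max_retries=5):
    """GET url, retrying on HTTP 429 with capped exponential backoff and jitter."""
    delay = 1  # fallback delay in seconds when Retry-After is missing or unparseable
    for attempt in range(max_retries):
        resp = requests.get(url, headers=headers, timeout=30)
        if resp.status_code != 429:
            return resp
        if attempt == max_retries - 1:
            break  # out of retries, so don't sleep before raising
        try:
            # Retry-After is usually whole seconds, but it can also be an HTTP-date
            retry_after = int(resp.headers.get("Retry-After", delay))
        except ValueError:
            retry_after = delay
        wait = retry_after + random.uniform(0, 1)  # jitter spreads out parallel workers
        print(f"Rate limited. Retry {attempt + 1}/{max_retries - 1} in {wait:.1f}s")
        time.sleep(wait)
        delay = min(delay * 2, 32)  # double the fallback delay, capped at 32 seconds
    raise RuntimeError("Max retries exceeded")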
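One detail worth handling when you read the header yourself: the HTTP specification allows Retry-After to carry either a number of seconds or an HTTP-date. A hedged sketch of a parser that accepts both forms, using only the standard library (the name `retry_after_seconds` is hypothetical):

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_after_seconds(value, default=1.0, now=None):
    """Convert a Retry-After header value to seconds, falling back to `default`."""
    if value is None:
        return default
    value = value.strip()
    if value.isdigit():                      # delay-seconds form, e.g. "30"
        return float(value)
    try:
        when = parsedate_to_datetime(value)  # HTTP-date form
    except (TypeError, ValueError):
        return default                       # unparseable header
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())


print(retry_after_seconds("30"))
```

A date in the past yields 0.0 rather than a negative wait, so the result can be passed straight to time.sleep().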

How does the CLI handle rate limits internally?

Nylas CLI handles rate limits at the platform layer so you don't write retry logic yourself. When the underlying Nylas API receives a 429 from Gmail, Graph, or any connected provider, it retries with exponential backoff automatically. The CLI inherits this behavior. A single nylas email list --limit 500 command may trigger dozens of paginated API calls behind the scenes, and every one respects the provider's throttle signals without surfacing errors to you.

The platform processes over 1.2 billion API calls per month across Gmail, Outlook, Exchange, Yahoo, iCloud, and IMAP providers. That volume means the retry logic has been tested against every rate limit pattern in the table above. Provider-side behavior described here is based on documented provider behavior and our testing with Gmail and Outlook; verify locally before deploying provider-specific assumptions for Yahoo, iCloud, or EWS.

To see the raw API response including rate limit headers, add --json to any command. The JSON output includes metadata fields that surface the provider's response headers. This is useful for debugging scripts that chain multiple CLI commands together, because you can inspect whether a 429 was encountered and how long the retry waited.

# Fetch 500 messages with full JSON output including response metadata
nylas email list --limit 500 --json | jq '.[0:3]'

For bulk operations like syncing an entire mailbox or sending a campaign of 2,000 messages, the CLI queues requests internally and stays within provider limits. A full Gmail inbox sync of 10,000 messages takes about 3 minutes at the sustained rate the quota allows, compared to roughly 40 seconds if you could ignore limits entirely.
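The 3-minute figure follows from the quota arithmetic. The sketch below reproduces it, assuming 5 units per messages.list page, 5 units per messages.get, 100 results per page, and a 250 units/second budget. It models only the quota floor, not network latency or the CLI's actual scheduling, and the function name is illustrative.

```python
import math


def sync_estimate(message_count, page_size=100, list_cost=5, get_cost=5,
                  units_per_second=250):
    """Return (quota units, minimum seconds) for a list + get sync of a mailbox."""
    pages = math.ceil(message_count / page_size)           # messages.list calls needed
    units = pages * list_cost + message_count * get_cost   # total quota units
    seconds = units / units_per_second                     # floor set by the per-user rate
    return units, seconds


units, seconds = sync_estimate(10_000)
print(f"{units} units, at least {seconds / 60:.1f} minutes")
```

For 10,000 messages this gives 50,500 units and a floor of 202 seconds, a little over 3 minutes.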

What are the batch operation limits per provider?

Batch operations let you group multiple API calls into a single HTTP request, reducing round-trip overhead and sometimes sidestepping per-request throttles. Gmail supports batch requests of up to 100 calls per batch, while Microsoft Graph caps batches at 20 operations. Exchange EWS doesn't have a formal batch endpoint but supports grouped operations through FindItem with paging.

According to Google's batch request documentation, each individual operation inside a Gmail batch still costs its full quota units. A batch of 100 messages.get calls costs 500 units, the same as 100 individual calls. The benefit is latency, not quota savings. Microsoft's $batch endpoint is different: the batch itself counts as a single request against the 10,000/10-min throttle, but each operation inside still runs independently and can fail individually.

Provider	Max ops per batch	Quota savings?	Throttle counting
Gmail API	100	No (full unit cost per op)	Each op counted individually
Microsoft Graph	20	Yes (1 request for throttle)	Batch counts as 1 request
Exchange EWS	No formal batch	N/A	Each request counts individually
Yahoo IMAP	N/A (IMAP protocol)	N/A	Connection-level throttle
iCloud IMAP	N/A (IMAP protocol)	N/A	Connection-level throttle
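If you build batches yourself, the main job is splitting a list of operations into chunks no larger than the provider's cap. A minimal sketch using the per-batch limits from the table (the `BATCH_LIMITS` mapping and function names are assumptions for illustration):

```python
from math import ceil

# Maximum operations per batch, taken from the table above.
BATCH_LIMITS = {"gmail": 100, "graph": 20}


def chunk(items, size):
    """Split `items` into consecutive lists of at most `size` elements."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def http_requests_needed(op_count, provider):
    """Number of batch HTTP requests required for `op_count` operations."""
    return ceil(op_count / BATCH_LIMITS[provider])


message_ids = [f"msg-{i}" for i in range(250)]
print([len(batch) for batch in chunk(message_ids, BATCH_LIMITS["gmail"])])
print(http_requests_needed(1_000, "graph"))
```

Fetching 1,000 messages therefore takes 10 batch requests on Gmail and 50 on Graph. Per the table, only the Graph batches reduce what counts against the throttle.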

The CLI uses batching where the provider supports it. For Gmail, it groups messages.get calls into batches of 100 to reduce latency during pagination. For Graph, it packs up to 20 operations per $batch request to maximize the throttle advantage. You don't need to configure this; it happens automatically behind nylas email list.

How do you monitor your rate limit usage?

Gmail exposes quota usage through the Google Cloud Console under APIs & Services > Gmail API > Quotas, where you can see real-time unit consumption broken down by method. Microsoft Graph provides usage data in the Azure portal under App Registrations > your app > Performance. Both dashboards refresh every 5 minutes and show 30 days of history, which is enough to spot patterns in throttle events.

For CLI-based workflows, add --json to any command and pipe the output through jq to count results and estimate API call volume. The command below lists 100 messages, counts them, and calculates the approximate Gmail quota units consumed. At 5 units per messages.list page (100 results per page) plus 5 units per messages.get for each message, 100 messages costs roughly 505 quota units.

# Count messages returned and estimate Gmail quota cost
COUNT=$(nylas email list --limit 100 --json | jq 'length')
echo "$COUNT messages fetched"
echo "Estimated quota cost: $((5 + COUNT * 5)) units"

If you're running periodic sync scripts, log the count and timestamp after each run. A cron job that syncs every 5 minutes and pulls 200 messages per run consumes about 1,005 units per cycle, or roughly 289,440 units per day. That's well within the 1-billion-unit daily project limit but worth tracking if you scale to thousands of users.
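The daily figure is easy to recompute for your own schedule. The sketch below assumes one messages.list call per run, as the estimate above does. Set list_calls_per_run higher if your page size means a run spans several pages. The function and parameter names are illustrative.

```python
def daily_units(messages_per_run, interval_minutes, list_calls_per_run=1,
                list_cost=5, get_cost=5):
    """Return (units per cycle, cycles per day, units per day) for a periodic sync."""
    per_cycle = list_calls_per_run * list_cost + messages_per_run * get_cost
    cycles = (24 * 60) // interval_minutes   # complete runs in one day
    return per_cycle, cycles, per_cycle * cycles


per_cycle, cycles, per_day = daily_units(messages_per_run=200, interval_minutes=5)
print(per_cycle, cycles, per_day)

# Users running this schedule that fit inside the 1-billion-unit project limit.
print(1_000_000_000 // per_day)
```

With these inputs, 288 cycles at 1,005 units each give 289,440 units per user per day, so the project-level limit is reached at roughly 3,450 users on this schedule.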

Next steps

Browse the full command reference to see every email, calendar, and contacts command
Gmail API Pagination and Sync Explained for a deep dive on nextPageToken and historyId
Send email from the terminal for a quick start on nylas email send
Best email infrastructure for AI agents to compare API platforms for automated email workflows