Email API Rate Limits: Every Provider

Gmail, Microsoft Graph, Yahoo, Exchange EWS, and iCloud each enforce different rate limits, quota systems, and throttling rules. This guide compares them in one table, covers the SMTP error codes you'll hit when you exceed them, and shows how to handle retries without writing backoff logic yourself.

Written by Aaron de Mello, Senior Engineering Manager

Verified: CLI 3.1.11 · Gmail, Outlook · last tested May 21, 2026

Command references: nylas email list, nylas email send, nylas email search.

What are Gmail API rate limits?

Gmail API rate limits are measured in quota units, not raw request counts. Each Gmail API method costs a different number of units, and every user gets 250 units per second by default. A messages.list call costs 5 units, a messages.get costs 5 units, and a messages.send costs 100 units. That means you can send at most 2 messages per second per user, but list 50 pages per second.

According to Google's Gmail API quota documentation, the per-user rate limit was updated on May 1, 2026 to 250 units/user/second (previously it tracked daily quotas only). The per-project daily limit is 1 billion quota units. Exceeding either triggers HTTP 429 with a Retry-After header.

| Gmail API method | Quota cost | Max calls/sec (per user) |
|---|---|---|
| messages.list | 5 units | 50 |
| messages.get | 5 units | 50 |
| messages.send | 100 units | 2 |
| messages.modify | 5 units | 50 |
| messages.trash | 10 units | 25 |
| history.list | 2 units | 125 |
| messages.batchGet | 5 units each | 50 (per message in batch) |
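The "max calls/sec" column is just the per-user budget divided by the method cost, rounded down. A minimal sketch of that arithmetic, using the costs from the table above (the dictionary and function names are illustrative, not part of any Google client library):

```python
# Per-user budget and per-method costs, copied from the table above.
UNITS_PER_USER_PER_SECOND = 250

QUOTA_COST = {
    "messages.list": 5,
    "messages.get": 5,
    "messages.send": 100,
    "messages.modify": 5,
    "messages.trash": 10,
    "history.list": 2,
}


def max_calls_per_second(method: str, budget: int = UNITS_PER_USER_PER_SECOND) -> int:
    """Whole calls of `method` that fit into one second of a user's quota."""
    return budget // QUOTA_COST[method]


if __name__ == "__main__":
    for method in QUOTA_COST:
        print(f"{method}: {max_calls_per_second(method)} calls/sec")
```

Integer division is deliberate: 250 / 100 is 2.5, but only 2 whole messages.send calls fit in a second.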

The quota unit system means a naive sync loop that calls messages.list and then messages.get on each result spends 5 units per list page plus 5 units per message. Syncing 1,000 messages at 100 results per page costs about 5,050 units, roughly 20 seconds of a single user's 250-unit-per-second budget. Batch endpoints reduce the round-trip overhead but don't change the per-message unit cost. The CLI batches these calls internally and respects Retry-After headers, so a nylas email list --limit 1000 call handles pagination and throttling without any extra code.

# List up to 200 Gmail messages — the CLI handles pagination and rate limits
nylas email list --limit 200 --json
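The arithmetic behind a list + get sync can be sketched as follows. The function names are hypothetical; the 5-unit costs and the 250 units per second come from the table above, and the page size is whatever your list calls actually return. Calling messages.list once per message (page size 1) gives 10 units per message, while 100 results per page brings the cost close to 5.

```python
import math


def sync_cost_units(messages: int, page_size: int,
                    list_cost: int = 5, get_cost: int = 5) -> int:
    """Quota units for listing `messages` IDs in pages, then fetching each one."""
    pages = math.ceil(messages / page_size)
    return pages * list_cost + messages * get_cost


def seconds_of_quota(units: int, units_per_second: int = 250) -> float:
    """Minimum time to spend `units` at the per-user rate limit."""
    return units / units_per_second


if __name__ == "__main__":
    for page_size in (1, 100):
        units = sync_cost_units(1_000, page_size)
        print(f"page_size={page_size}: {units} units, "
              f"{seconds_of_quota(units):.1f}s of quota")
```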

What are Microsoft Graph email rate limits?

Microsoft Graph enforces a per-app-per-mailbox throttle of 10,000 requests per 10 minutes. That works out to roughly 16 requests per second per mailbox. When you exceed the limit, Graph returns HTTP 429 with a Retry-After header specifying the number of seconds to wait. Unlike Gmail, Graph doesn't use a unit-cost system; every request counts as 1 regardless of endpoint complexity.

According to Microsoft's Graph API throttling documentation, the 10,000/10-min limit applies to Outlook mail, calendar, and contacts endpoints. Tenant-wide limits are higher (undisclosed, service-dependent). Batch requests via the $batch endpoint count as 1 request for throttling but are limited to 20 operations per batch. The practical ceiling for mail sync is about 3,200 messages per 10 minutes if each message needs a separate GET.
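If you call Graph directly, a client-side guard can keep you under the 10,000-per-10-minutes budget before the server has to say no. The class below is a hypothetical sketch of a sliding-window counter, not something Graph or the CLI provides, and the server's 429 and Retry-After remain the authority.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests in any `window`-second span (client side)."""

    def __init__(self, limit: int = 10_000, window: float = 600.0,
                 clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock
        self._sent = deque()  # timestamps of requests still inside the window

    def wait_time(self) -> float:
        """Seconds to wait before the next request is allowed (0.0 if allowed now)."""
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self._sent and now - self._sent[0] >= self.window:
            self._sent.popleft()
        if len(self._sent) < self.limit:
            return 0.0
        return self.window - (now - self._sent[0])

    def record(self) -> None:
        """Call once for every request actually sent."""
        self._sent.append(self.clock())
```

Before each request, sleep for `wait_time()` seconds and then call `record()`. The injectable `clock` exists only to make the class testable without real waiting.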

Graph also imposes a send limit of 10,000 recipients per day for Microsoft 365 business accounts, and 300 messages per day for Outlook.com consumer accounts. The recipient count, not the message count, is what matters. A single message to 50 recipients counts as 50 against the daily limit.
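Because the daily cap counts recipients rather than messages, a pre-flight check has to sum addresses across the whole send queue. The sketch below uses a hypothetical `OutgoingMessage` type and assumes that To, Cc, and Bcc addresses all count toward the limit; verify that assumption against your tenant before relying on it.

```python
from dataclasses import dataclass, field


@dataclass
class OutgoingMessage:
    """Hypothetical container for the address fields of one queued message."""
    to: list[str] = field(default_factory=list)
    cc: list[str] = field(default_factory=list)
    bcc: list[str] = field(default_factory=list)


def recipient_count(msg: OutgoingMessage) -> int:
    # Assumption: every address in To, Cc, and Bcc counts as one recipient.
    return len(msg.to) + len(msg.cc) + len(msg.bcc)


def fits_daily_limit(queue: list[OutgoingMessage], already_sent: int = 0,
                     daily_limit: int = 10_000) -> bool:
    """True if sending the whole queue stays within the daily recipient cap."""
    total = already_sent + sum(recipient_count(m) for m in queue)
    return total <= daily_limit
```

With 50 recipients per message, 200 messages reach the 10,000-recipient cap exactly, and a 201st message would exceed it.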

The CLI abstracts these limits behind a single command. Running nylas email send against an Outlook mailbox handles 429 responses automatically and retries with the delay specified in the Retry-After header.

# Send an email through Outlook — retries on 429 are handled by the CLI
nylas email send --to "recipient@example.com" --subject "Quarterly report" --body "Attached."

How do rate limits compare across email providers?

Rate limit enforcement varies widely across the 5 major email providers. Gmail uses quota units with per-method costs. Microsoft Graph uses flat request counts. Yahoo and iCloud enforce undocumented IMAP-level connection limits. Exchange EWS uses concurrent request throttling. The table below shows the numbers that matter most for automated email workflows: requests per second, messages per day, and attachment size caps.

| Provider | Rate limit model | Effective reads/sec | Send limit/day | Max attachment |
|---|---|---|---|---|
| Gmail API | Quota units (250/user/sec) | ~50 | 2,000 messages | 25 MB |
| Microsoft Graph | Flat requests (10,000/10 min) | ~16 | 10,000 recipients | 150 MB |
| Yahoo IMAP | Connection-based (undocumented) | ~5-10 | 500 messages | 25 MB |
| Exchange EWS | Concurrent requests (27 max) | ~27 | 10,000 recipients | 35 MB (default) |
| iCloud IMAP | Connection-based (undocumented) | ~5 | 1,000 messages | 20 MB |

The "effective reads/sec" column reflects realistic throughput for a list + get sync pattern, not the theoretical maximum from the provider's documentation. Gmail's unit system makes it the most permissive for read-heavy workloads. Graph's flat-count model is simpler but more restrictive for bulk sync. Yahoo and iCloud don't publish exact numbers, so the estimates above come from empirical testing with the CLI across accounts of varying ages.

What SMTP error codes indicate rate limiting?

Rate limit errors fall into two categories: HTTP status codes from REST APIs and SMTP enhanced status codes from mail servers. The HTTP 429 "Too Many Requests" response is the standard signal from Gmail API and Microsoft Graph. SMTP servers use the 4.7.x family of enhanced status codes. Knowing which code you're looking at determines whether you should retry immediately or wait hours.

| Code | Provider | Meaning | Retry strategy |
|---|---|---|---|
| HTTP 429 | Gmail, Graph | Quota or request limit exceeded | Wait for Retry-After header value |
| 4.7.28 | Gmail SMTP | Too many messages sent in a rolling 24-hour window | Wait until the 24h window resets |
| 4.7.0 | Yahoo SMTP | Temporary rate limit on connection or send | Exponential backoff, 30-second base |
| 5.7.3 | Exchange / Microsoft 365 | Daily recipient limit exceeded | No retry possible until next calendar day |
| 421 | iCloud SMTP | Too many concurrent connections | Reduce connections, retry after 60 seconds |
| HTTP 503 | Graph | Service temporarily unavailable (often throttle-related) | Exponential backoff, 5-second base |

The distinction between 4.x.x and 5.x.x enhanced status codes matters. A 4.7.28 from Gmail is temporary and will resolve on its own. A 5.7.3 from Exchange means you've hit a hard daily cap. According to Google's Workspace sending limits documentation, free Gmail accounts are capped at 500 messages per day, while Google Workspace accounts get 2,000. The CLI parses these error codes and adjusts its retry behavior accordingly.
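If you handle SMTP replies yourself, the first digit of the enhanced status code is enough to separate "retry later" from "stop". The function below is a rough sketch, not how the CLI does it: the name is hypothetical, and the sample replies in the usage block are illustrative strings rather than verbatim provider messages.

```python
import re

# Enhanced status code such as 4.7.28 or 5.7.3 (class.subject.detail).
ENHANCED_CODE = re.compile(r"\b([245])\.(\d{1,3})\.(\d{1,3})\b")
# Basic three-digit reply code at the start of the line, such as 421 or 550.
BASIC_CODE = re.compile(r"\s*([245])\d\d\b")


def classify_smtp_reply(reply: str) -> str:
    """Return 'success', 'transient', 'permanent', or 'unknown' for a reply line."""
    match = ENHANCED_CODE.search(reply) or BASIC_CODE.match(reply)
    if not match:
        return "unknown"
    return {"2": "success", "4": "transient", "5": "permanent"}[match.group(1)]


if __name__ == "__main__":
    for line in ("421 4.7.28 rate limited", "550 5.7.3 limit exceeded", "421 busy"):
        print(line, "->", classify_smtp_reply(line))
```

A transient result is a candidate for backoff and retry; a permanent one should be surfaced to the caller instead of looping.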

How should you retry after hitting rate limits?

Exponential backoff with jitter is the standard retry strategy for email API rate limits. The pattern doubles the wait time after each failed attempt (1s, 2s, 4s, 8s) and adds a random jitter of 0 to 1 second to prevent thundering-herd problems when multiple clients hit the same limit simultaneously. Without jitter, synchronized retries from parallel workers pile up and extend the throttle window.

According to Google's API error handling documentation, the recommended maximum retry count is 5 attempts with a 32-second cap on the backoff interval. Microsoft's Graph documentation recommends honoring the Retry-After header value exactly rather than computing your own delay. In practice, the Retry-After value from Gmail is typically 1-60 seconds, while Graph returns 5-300 seconds depending on the severity of the throttle.
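To make the schedule concrete, the sketch below computes the delay before each retry: doubling from a 1-second base, capped at 32 seconds, plus up to 1 second of jitter. The function name and parameters are illustrative, and `rng` is injectable only so the jitter can be switched off when inspecting the base schedule.

```python
import random


def backoff_delays(max_retries: int = 5, base: float = 1.0, cap: float = 32.0,
                   jitter: float = 1.0, rng=random.random) -> list[float]:
    """Delay before each retry: base * 2**n capped at `cap`, plus 0..jitter seconds."""
    return [min(base * 2 ** n, cap) + rng() * jitter for n in range(max_retries)]


if __name__ == "__main__":
    print("without jitter:", backoff_delays(rng=lambda: 0.0))
    print("with jitter:   ", [round(d, 2) for d in backoff_delays()])
```

Two workers calling `backoff_delays()` at the same moment get different schedules, which is what spreads their retries apart.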

If you're calling the Gmail or Graph API directly, you need to implement this loop yourself. The Python snippet below shows a minimal implementation with jitter. Each retry doubles the base delay and adds up to 1 second of random jitter. After 5 failures the function raises instead of looping forever.

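import random
import time

import requests


def call_with_backoff(url, headers, max_retries=5):
    """GET `url`, retrying on HTTP 429 with exponential backoff plus jitter."""
    delay = 1  # fallback delay in seconds, doubled after each 429
    for attempt in range(max_retries):
        resp = requests.get(url, headers=headers, timeout=30)
        if resp.status_code != 429:
            return resp
        if attempt == max_retries - 1:
            break  # out of attempts: don't sleep before raising
        # Prefer the server's Retry-After (seconds); fall back to our own delay
        # when the header is missing or is not a plain integer.
        try:
            retry_after = int(resp.headers.get("Retry-After", delay))
        except ValueError:
            retry_after = delay
        wait = retry_after + random.uniform(0, 1)  # up to 1 second of jitter
        print(f"Rate limited. Retry {attempt + 1}/{max_retries - 1} in {wait:.1f}s")
        time.sleep(wait)
        delay = min(delay * 2, 32)  # cap the backoff interval at 32 seconds
    raise RuntimeError("Max retries exceeded")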
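One detail worth handling if you honor Retry-After exactly: the HTTP specification allows the header to carry either a number of seconds or an HTTP date. The helper below is a sketch that accepts both forms. The function name and the 1-second default are assumptions, and this makes no claim about which form a given provider actually sends.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_after_seconds(value, now=None, default: float = 1.0) -> float:
    """Convert a Retry-After header value into a non-negative number of seconds."""
    if value is None:
        return default
    value = value.strip()
    if value.isdigit():  # delta-seconds form, e.g. "120"
        return float(value)
    try:
        when = parsedate_to_datetime(value)  # HTTP-date form
    except (TypeError, ValueError):
        return default  # unparseable: fall back rather than crash
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())
```

A date in the past yields 0.0, so a stale header never produces a negative sleep.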

How does the CLI handle rate limits internally?

Nylas CLI handles rate limits at the platform layer so you don't write retry logic yourself. When the underlying Nylas API receives a 429 from Gmail, Graph, or any connected provider, it retries with exponential backoff automatically. The CLI inherits this behavior. A single nylas email list --limit 500 command may trigger dozens of paginated API calls behind the scenes, and every one respects the provider's throttle signals without surfacing errors to you.

The platform processes over 1.2 billion API calls per month across Gmail, Outlook, Exchange, Yahoo, iCloud, and IMAP providers. That volume means the retry logic has been tested against every rate limit pattern in the table above. Provider-side behavior described here is based on documented provider behavior and our testing with Gmail and Outlook; verify locally before deploying provider-specific assumptions for Yahoo, iCloud, or EWS.

To see the raw API response including rate limit headers, add --json to any command. The JSON output includes metadata fields that surface the provider's response headers. This is useful for debugging scripts that chain multiple CLI commands together, because you can inspect whether a 429 was encountered and how long the retry waited.

# Fetch 500 messages with full JSON output including response metadata
nylas email list --limit 500 --json | jq '.[0:3]'

For bulk operations like syncing an entire mailbox or sending a campaign of 2,000 messages, the CLI queues requests internally and stays within provider limits. A full Gmail inbox sync of 10,000 messages takes about 3 minutes at the sustained rate the quota allows, compared to roughly 40 seconds if you could ignore limits entirely.

What are the batch operation limits per provider?

Batch operations let you group multiple API calls into a single HTTP request, reducing round-trip overhead and sometimes sidestepping per-request throttles. Gmail supports batch requests of up to 100 calls per batch, while Microsoft Graph caps batches at 20 operations. Exchange EWS doesn't have a formal batch endpoint but supports grouped operations through FindItem with paging.

According to Google's batch request documentation, each individual operation inside a Gmail batch still costs its full quota units. A batch of 100 messages.get calls costs 500 units, the same as 100 individual calls. The benefit is latency, not quota savings. Microsoft's $batch endpoint is different: the batch itself counts as a single request against the 10,000/10-min throttle, but each operation inside still runs independently and can fail individually.

| Provider | Max ops per batch | Quota savings? | Throttle counting |
|---|---|---|---|
| Gmail API | 100 | No (full unit cost per op) | Each op counted individually |
| Microsoft Graph | 20 | Yes (1 request for throttle) | Batch counts as 1 request |
| Exchange EWS | No formal batch | N/A | Each request counts individually |
| Yahoo IMAP | N/A (IMAP protocol) | N/A | Connection-level throttle |
| iCloud IMAP | N/A (IMAP protocol) | N/A | Connection-level throttle |

The CLI uses batching where the provider supports it. For Gmail, it groups messages.get calls into batches of 100 to reduce latency during pagination. For Graph, it packs up to 20 operations per $batch request to maximize the throttle advantage. You don't need to configure this; it happens automatically behind nylas email list.
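For readers batching by hand, the grouping step is a plain chunking loop. The sketch below is not the CLI's internal code: the helper name and the message IDs are made up, and only the batch sizes (100 for Gmail, 20 for Graph) come from the table above.

```python
from typing import Iterator, Sequence

GMAIL_MAX_BATCH = 100  # max calls per Gmail batch request
GRAPH_MAX_BATCH = 20   # max operations per Graph $batch request


def chunked(items: Sequence[str], size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of `items`, each with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


if __name__ == "__main__":
    message_ids = [f"msg-{i}" for i in range(1_000)]  # hypothetical IDs
    gmail_batches = list(chunked(message_ids, GMAIL_MAX_BATCH))
    graph_batches = list(chunked(message_ids, GRAPH_MAX_BATCH))
    print(f"Gmail: {len(gmail_batches)} HTTP requests")
    print(f"Graph: {len(graph_batches)} HTTP requests")
```

For 1,000 messages that is 10 Gmail batches, still costing 5,000 quota units, versus 50 Graph batches.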

How do you monitor your rate limit usage?

Gmail exposes quota usage through the Google Cloud Console under APIs & Services > Gmail API > Quotas, where you can see real-time unit consumption broken down by method. Microsoft Graph provides usage data in the Azure portal under App Registrations > your app > Performance. Both dashboards refresh every 5 minutes and show 30 days of history, which is enough to spot patterns in throttle events.

For CLI-based workflows, add --json to any command and pipe the output through jq to count results and estimate API call volume. The command below lists 100 messages, counts them, and calculates the approximate Gmail quota units consumed. At 5 units per messages.list page (100 results per page) plus 5 units per messages.get for each message, 100 messages costs roughly 505 quota units.

# Count messages returned and estimate Gmail quota cost
COUNT=$(nylas email list --limit 100 --json | jq 'length')
echo "$COUNT messages fetched"
echo "Estimated quota cost: $((5 + COUNT * 5)) units"

If you're running periodic sync scripts, log the count and timestamp after each run. A cron job that syncs every 5 minutes and pulls 200 messages per run consumes about 1,010 units per cycle (two list pages plus 200 gets), or roughly 290,880 units per day across 288 cycles. That's well within the 1-billion-unit daily project limit but worth tracking if you scale to thousands of users.
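To see where "thousands of users" starts to matter, you can project a per-cycle estimate onto a full day and divide it into the project limit. The function names below are hypothetical, and the example uses a rounded 1,000 units per cycle purely for illustration.

```python
def daily_units(units_per_cycle: int, interval_minutes: int = 5) -> int:
    """Quota units one user consumes per day at a fixed sync interval."""
    cycles_per_day = (24 * 60) // interval_minutes
    return units_per_cycle * cycles_per_day


def users_within_project_limit(units_per_user_per_day: int,
                               project_limit: int = 1_000_000_000) -> int:
    """How many such users fit inside the per-project daily limit."""
    return project_limit // units_per_user_per_day


if __name__ == "__main__":
    per_user = daily_units(1_000)  # rounded per-cycle cost, for illustration
    print(f"{per_user} units/user/day")
    print(f"{users_within_project_limit(per_user)} users fit in the daily limit")
```

At that rate the project-wide limit is reached at a few thousand users, well before any single user gets close to their per-second budget.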

Next steps