Gmail API Quotas in 2026

Gmail API quotas changed on May 1, 2026. This guide explains the new per-minute limits, daily billing threshold, per-method costs, and when a CLI workflow is safer than direct API plumbing.

Written by Qasim Muhammad, Staff SRE

Reviewed by Qasim Muhammad

Verified · CLI 3.1.1 · Gmail · last tested May 14, 2026

What changed in Gmail API quotas on May 1, 2026?

Gmail API quota limits changed on May 1, 2026 for new Cloud projects, while projects that used Gmail API between November 2025 and April 2026 keep their existing quotas for now. Google now documents 1,200,000 quota units per minute per project and 6,000 per minute per user per project.

Google also added an 80,000,000 quota-unit daily billing threshold per project. The threshold does not trigger billing yet; Google says full billing details will come later in 2026 with at least 90 days of notice. Read the current Gmail API usage limits before sizing a sync job.
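A rough way to read those three limits together is to turn them into ratios. The sketch below uses only the numbers quoted in this guide; the function names are illustrative, and treating the daily threshold as a flat per-minute average is a simplifying assumption rather than anything Google specifies.

```python
# Limits quoted in this guide for new Cloud projects (May 1, 2026).
PROJECT_UNITS_PER_MINUTE = 1_200_000
USER_UNITS_PER_MINUTE = 6_000
DAILY_BILLING_THRESHOLD = 80_000_000
MINUTES_PER_DAY = 24 * 60


def users_at_full_rate() -> int:
    """Users who can each spend their full per-minute allowance at once."""
    return PROJECT_UNITS_PER_MINUTE // USER_UNITS_PER_MINUTE


def sustained_units_per_minute() -> int:
    """Average per-minute spend that stays under the daily threshold."""
    return DAILY_BILLING_THRESHOLD // MINUTES_PER_DAY


def minutes_to_threshold(units_per_minute: int) -> float:
    """Minutes of constant spend before the daily threshold is crossed."""
    return DAILY_BILLING_THRESHOLD / units_per_minute


print(users_at_full_rate())          # 200
print(sustained_units_per_minute())  # 55555
```

Under these assumptions, a project running flat out at the per-project limit would cross the daily threshold in a little over an hour, so the daily number is likely to be the binding one for large sync jobs.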

Which Gmail methods cost the most units?

Gmail method costs now make body reads and sends much more expensive than simple list calls. The current table prices messages.list at 5 units, history.list at 2, messages.get at 20, threads.get at 40, and messages.send at 100 units per request.

A naive sync that lists 10,000 message IDs and then calls messages.get for each body can spend 200,000 units on reads after the list phase. That is why mailbox agents should search narrowly, page carefully, and avoid fetching full bodies unless the task needs them.

The method-specific behavior matters as much as the quota table. Google documents messages.list as the ID-listing step and messages.get as the body-read step, so the expensive part starts after pagination succeeds.

| Gmail API method | Quota units | Planning note |
| --- | --- | --- |
| history.list | 2 | Best for incremental sync after a checkpoint |
| messages.list | 5 | Cheap, but returns IDs rather than full bodies |
| messages.get | 20 | Use only for messages the agent must inspect |
| threads.get | 40 | Useful for conversations, costly at scale |
| messages.send | 100 | Protect with human approval or policy gates |
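The naive sync described earlier in this section can be priced directly from the table. This is a minimal sketch: the unit costs come from the table, while the page size of 500 IDs per messages.list call is an assumption about how the sync paginates, and the function names are illustrative.

```python
import math

# Per-request costs from the table above.
UNIT_COST = {
    "history.list": 2,
    "messages.list": 5,
    "messages.get": 20,
    "threads.get": 40,
    "messages.send": 100,
}

USER_UNITS_PER_MINUTE = 6_000  # per-user limit quoted in this guide


def naive_sync_units(message_count: int, page_size: int = 500) -> dict:
    """Cost of listing every ID, then fetching every body.

    page_size is an assumption about pagination, not a documented default.
    """
    pages = math.ceil(message_count / page_size)
    list_units = pages * UNIT_COST["messages.list"]
    read_units = message_count * UNIT_COST["messages.get"]
    return {"list": list_units, "reads": read_units, "total": list_units + read_units}


def minutes_at_user_limit(total_units: int) -> int:
    """Whole minutes needed if one user spends at the per-minute limit."""
    return math.ceil(total_units / USER_UNITS_PER_MINUTE)


cost = naive_sync_units(10_000)
print(cost)                                  # {'list': 100, 'reads': 200000, 'total': 200100}
print(minutes_at_user_limit(cost["total"]))  # 34
```

The list phase is a rounding error next to the reads, which is the point of the table: the decision that matters is how many bodies get fetched.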

Why do quota numbers matter for agents?

Agents turn quota mistakes into loops faster than hand-written scripts do, because a planner may retry, inspect adjacent threads, or request full bodies for every candidate. A 25-message search that reads every body spends 500 units on reads (25 × 20) before the agent writes a single reply, and repeated runs multiply that cost.
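To see how quickly repeated runs add up, the sketch below compares reading all 25 candidates with reading a small selection. It assumes each search maps to a single messages.list call and each body read to one messages.get call; the attempt count and the selection size of 3 are illustrative.

```python
SEARCH_UNITS = 5  # messages.list, per the quota table
READ_UNITS = 20   # messages.get, per the quota table


def run_units(bodies_read: int, searches: int = 1) -> int:
    """Units for one agent run: the search plus every body it reads."""
    return searches * SEARCH_UNITS + bodies_read * READ_UNITS


def loop_units(bodies_read: int, attempts: int) -> int:
    """Units when the planner repeats the same run `attempts` times."""
    return run_units(bodies_read) * attempts


print(loop_units(bodies_read=25, attempts=4))  # 2020
print(loop_units(bodies_read=3, attempts=4))   # 260
```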

The safest pattern is to search first, cap result counts, and only read the messages the model needs. Nylas CLI makes that pattern explicit with --limit, JSON output, and a separate read command, so the agent can inspect IDs and subjects before deciding to fetch bodies.

This command searches only matching invoice messages after a fixed 2026 date. It keeps the first pass small, returns JSON for ranking, and avoids a full mailbox scan before the agent decides whether any message deserves a body read.

The exact command reference is nylas email search, and its flags are the contract: --after bounds the time window, --limit bounds the result count, and --json keeps the output parseable for an agent loop.

nylas email search "invoice" --after 2026-05-01 --limit 25 --json

What does a quota-safe agent loop look like?

A quota-safe Gmail agent loop uses 4 steps: search narrowly, list only metadata, read one selected message, then send or defer. The expensive calls happen after the model has a reason to inspect a message. That ordering avoids the common mistake where an agent reads 100 full bodies just to decide that 97 were irrelevant.

The CLI command map mirrors that loop. Use nylas email list for a bounded inbox sample, nylas email search for server-side narrowing, nylas email read for the selected body, and nylas email send only after approval or policy checks.

# 1. Find candidates
nylas email search "invoice" --after 2026-05-01 --limit 25 --json

# 2. Read only the selected message
nylas email read <message-id> --json

# 3. Send only after the workflow approves the reply
nylas email send --to finance@example.com --subject "Invoice received" --body "Thanks, received." --yes --json
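The step between search and read is where the agent decides which bodies are worth paying for. The sketch below is one possible selection pass. It assumes the search output is a JSON array of objects with `id` and `subject` fields, which is a hypothetical shape chosen for illustration, so check the real output of your CLI version before relying on it.

```python
import json


def select_candidates(search_output: str, keywords: list[str], max_reads: int = 3) -> list[str]:
    """Pick at most max_reads message IDs whose subjects match the keywords.

    The `id` and `subject` field names are assumptions about the JSON shape.
    """
    messages = json.loads(search_output)
    scored = []
    for message in messages:
        subject = message.get("subject", "").lower()
        score = sum(1 for word in keywords if word.lower() in subject)
        if score > 0:
            scored.append((score, message["id"]))
    # Highest score first; the sort is stable, so ties keep search order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [message_id for _, message_id in scored[:max_reads]]


sample = json.dumps([
    {"id": "m1", "subject": "Invoice 1042 overdue"},
    {"id": "m2", "subject": "Lunch?"},
    {"id": "m3", "subject": "Invoice received"},
    {"id": "m4", "subject": "Overdue invoice reminder"},
])
print(select_candidates(sample, ["invoice", "overdue"], max_reads=2))  # ['m1', 'm4']
```

Only the returned IDs would go on to the read command, so the body-read count per run is bounded by max_reads rather than by the size of the search result.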

How can a CLI workflow avoid quota plumbing?

A CLI workflow does not remove provider quotas, but it removes the retry, pagination, OAuth refresh, and output-shaping code from the agent. The result is a smaller tool contract: list or search messages, choose a message ID, read only that message, and send only after approval.

Use this 3-command flow as the baseline for Gmail agents. It validates authentication, fetches a bounded inbox sample, and reads one selected message. The same commands also run against Outlook, Exchange, Yahoo, iCloud, and IMAP accounts, so the agent code does not fork by provider.

Authentication has two separate references. nylas auth status is the health check, while nylas auth config covers the API-key setup used by CI, cron jobs, and hosted agent sandboxes.

nylas auth status --json
nylas email list --limit 50 --json
nylas email read <message-id> --json

When should you avoid Gmail quota entirely?

Avoid Gmail quota when the agent does not need to act through a user's Gmail mailbox. For app-owned workflows such as support intake, QA signups, agent-to-agent messages, and transactional replies, a Nylas Agent Account gives the agent its own provider=nylas mailbox and calendar instead of consuming Gmail API quota from a Workspace user.

This is a different product decision from quota optimization. Connected Gmail grants are still right when the agent is reading a person's real inbox. Agent Accounts are right when the product owns the address and wants policies, rules, webhooks, thread history, and a separate identity. The creation command is documented at nylas agent account create.

What should you log before scaling?

Before scaling a Gmail agent past a pilot, log 5 counters for each run: search calls, list calls, body reads, sends, and retries. These numbers map directly to Gmail's quota table and reveal whether the agent is reading too much context. A daily dashboard with those 5 fields is enough to catch most quota regressions before users notice latency.

Also log the command surface, not only the model prompt. If the agent called nylas email read 400 times for one ticket, the bug is probably in retrieval planning. If it called nylas email send repeatedly, the write path needs an approval gate, a policy rule, or both. The command names become the fastest debugging signal.
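One way to keep those five counters is a small per-run log keyed by command name. This is a sketch, not part of the CLI: the `RunLog` class is hypothetical, and pricing both search and list at the messages.list cost is an assumption about how the commands map to Gmail methods.

```python
from collections import Counter
from dataclasses import dataclass, field

# Assumed mapping from command surface to Gmail unit costs.
UNIT_COST = {"search": 5, "list": 5, "read": 20, "send": 100}


@dataclass
class RunLog:
    """Counts search, list, read, send, and retry events for one agent run."""

    counts: Counter = field(default_factory=Counter)

    def record(self, command: str, retry: bool = False) -> None:
        """Count the call under its command, and flag it if it was a retry."""
        self.counts[command] += 1
        if retry:
            self.counts["retry"] += 1

    def estimated_units(self) -> int:
        """Retries cost nothing on their own; the retried call is already counted."""
        return sum(UNIT_COST.get(name, 0) * n for name, n in self.counts.items())


log = RunLog()
log.record("search")
log.record("read")
log.record("read")
log.record("read", retry=True)
log.record("send")
print(dict(log.counts))       # {'search': 1, 'read': 3, 'retry': 1, 'send': 1}
print(log.estimated_units())  # 165
```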

How should teams estimate quota before launch?

Estimate quota with a worksheet before the first production rollout. Multiply expected agent runs per user by searches, body reads, thread reads, and sends per run. A support agent that handles 200 tickets per day, searches once per ticket, reads 3 message bodies, and sends 1 reply spends 33,000 Gmail quota units per day before retries: 1,000 for search/list work (200 × 5), 12,000 for body reads (600 × 20), and 20,000 for sends (200 × 100).
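The worksheet fits in a few lines of code. The sketch below reproduces the support-agent example using the costs from the quota table; the operation names and the `daily_units` function are illustrative, and retries are left out, as in the text.

```python
# Unit costs from the quota table, keyed by worksheet row.
UNIT_COST = {"search": 5, "body_read": 20, "thread_read": 40, "send": 100}


def daily_units(tickets_per_day: int, per_ticket: dict) -> dict:
    """Units per operation type for one day, plus a total, before retries."""
    breakdown = {
        op: tickets_per_day * count * UNIT_COST[op]
        for op, count in per_ticket.items()
    }
    breakdown["total"] = sum(breakdown.values())
    return breakdown


support = daily_units(200, {"search": 1, "body_read": 3, "thread_read": 0, "send": 1})
print(support)
# {'search': 1000, 'body_read': 12000, 'thread_read': 0, 'send': 20000, 'total': 33000}
```

Changing thread_read from 0 to 1 adds 8,000 units per day in this model, which is the kind of jump the worksheet is meant to surface before launch.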

The estimate does not need to be perfect. It needs to expose the expensive operations before the model is allowed to explore freely. If the model prompt says "read every related thread," convert that sentence into a maximum number of messages.get or threads.get calls. If nobody can name the cap, the cap is not real.
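A cap becomes real once the code refuses to exceed it. The sketch below shows one way to enforce a per-run limit on the two expensive read methods. `ReadBudget` and `ReadBudgetExceeded` are hypothetical names, and the limits shown are placeholders to be replaced with whatever the worksheet supports.

```python
class ReadBudgetExceeded(RuntimeError):
    """Raised when the agent asks for more reads than the run allows."""


class ReadBudget:
    """Hypothetical per-run cap on messages.get and threads.get calls."""

    def __init__(self, max_message_reads: int, max_thread_reads: int) -> None:
        self.remaining = {
            "messages.get": max_message_reads,
            "threads.get": max_thread_reads,
        }

    def charge(self, method: str) -> None:
        """Spend one call from the budget, or refuse if none is left."""
        if method not in self.remaining:
            raise ValueError(f"uncapped method: {method}")
        if self.remaining[method] <= 0:
            raise ReadBudgetExceeded(f"{method} cap reached for this run")
        self.remaining[method] -= 1


budget = ReadBudget(max_message_reads=2, max_thread_reads=0)
budget.charge("messages.get")
budget.charge("messages.get")
# A third charge("messages.get") would raise ReadBudgetExceeded.
```

Calling charge before every read turns "read every related thread" into a request that stops at a known number, whatever the planner decides.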

Build the cap into command selection. nylas email search should run with --limit, nylas email list should run with a small default result count, and nylas email read should require a selected message ID.
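One way to build the cap into command selection is to have the agent's tool layer construct the argument list itself, so the model never supplies a raw limit. The sketch below uses only the subcommands and flags shown in this guide. The helper names and the ceiling of 25 are assumptions, and nothing here runs the CLI.

```python
MAX_SEARCH_LIMIT = 25  # illustrative ceiling; pick one your quota estimate supports


def search_argv(query: str, after: str, limit: int = MAX_SEARCH_LIMIT) -> list[str]:
    """Build a bounded search command; limit is clamped to 1..MAX_SEARCH_LIMIT."""
    bounded = max(1, min(limit, MAX_SEARCH_LIMIT))
    return [
        "nylas", "email", "search", query,
        "--after", after,
        "--limit", str(bounded),
        "--json",
    ]


def read_argv(message_id: str) -> list[str]:
    """Build a read command, refusing to run without a selected message ID."""
    if not message_id or not message_id.strip():
        raise ValueError("a selected message ID is required")
    return ["nylas", "email", "read", message_id, "--json"]


print(search_argv("invoice", "2026-05-01", limit=500))
# ['nylas', 'email', 'search', 'invoice', '--after', '2026-05-01', '--limit', '25', '--json']
```

The returned list can be passed to `subprocess.run(argv, capture_output=True, text=True, check=True)`. Passing a list rather than a shell string also keeps model-supplied query text from being interpreted by the shell.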

Next steps