Source: https://cli.nylas.com/guides/gmail-api-quotas-2026

# Gmail API Quotas in 2026

Gmail API quotas changed on May 1, 2026. This guide explains the new per-minute limits, daily billing threshold, per-method costs, and when a CLI workflow is safer than direct API plumbing.

Written by [Qasim Muhammad](https://cli.nylas.com/authors/qasim-muhammad), Staff SRE

Reviewed by [Qasim Muhammad](https://cli.nylas.com/authors/qasim-muhammad)

Updated May 14, 2026

> **TL;DR:** Google updated Gmail API usage limits on May 1, 2026. New projects get 1,200,000 quota units per minute per project, 6,000 per minute per user per project, and new per-method costs such as 20 units for `messages.get`.

## What changed in Gmail API quotas on May 1, 2026?

Gmail API quota limits changed on May 1, 2026 for new Cloud projects. Projects that used the Gmail API between November 2025 and April 2026 keep their existing quotas for now. For new projects, Google now documents 1,200,000 quota units per minute per project and 6,000 per minute per user per project.

Google also added an 80,000,000 quota-unit daily billing threshold per project. The threshold does not trigger billing yet; Google says full billing details will come later in 2026 with at least 90 days of notice. Read the current [Gmail API usage limits](https://developers.google.com/workspace/gmail/api/reference/quota) before sizing a sync job.
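To get a feel for how these three limits relate, the following minimal sketch does the arithmetic in shell. It uses only the numbers quoted above; the variable names are illustrative, and the results are planning figures, not anything Google publishes.

```bash
# Limits quoted above, in quota units. Adjust if Google revises them.
PROJECT_PER_MIN=1200000
USER_PER_MIN=6000
DAILY_THRESHOLD=80000000

# Users that can each run at the per-user cap before the project cap binds.
users_at_full_rate=$((PROJECT_PER_MIN / USER_PER_MIN))

# Whole minutes of traffic at the project cap needed to reach the daily threshold.
minutes_to_threshold=$((DAILY_THRESHOLD / PROJECT_PER_MIN))

# Average units per minute that would reach the threshold over 24 hours (1,440 minutes).
avg_per_min_at_threshold=$((DAILY_THRESHOLD / 1440))

echo "users at full per-user rate: $users_at_full_rate"
echo "minutes at project cap to reach threshold: $minutes_to_threshold"
echo "average units/minute that reaches threshold in a day: $avg_per_min_at_threshold"
```

Under these numbers, a project running flat out at its per-minute cap would pass the daily threshold in a little over an hour, so the daily figure is likely the one to watch for sustained sync jobs.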

## Which Gmail methods cost the most units?

Gmail method costs now make body reads and sends much more expensive than simple list calls. The current table prices `messages.list` at 5 units, `history.list` at 2, `messages.get` at 20, `threads.get` at 40, and `messages.send` at 100 units per request.

A naive sync that lists 10,000 message IDs and then calls `messages.get` for each body can spend 200,000 units on reads after the list phase. That is why mailbox agents should search narrowly, page carefully, and avoid fetching full bodies unless the task needs them.

The method-specific behavior matters as much as the quota table. Google documents [`messages.list`](https://developers.google.com/workspace/gmail/api/reference/rest/v1/users.messages/list) as the ID-listing step and [`messages.get`](https://developers.google.com/workspace/gmail/api/reference/rest/v1/users.messages/get) as the body-read step, so the expensive part starts after pagination succeeds.

| Gmail API method | Quota units | Planning note |
| --- | --- | --- |
| `history.list` | 2 | Best for incremental sync after a checkpoint |
| `messages.list` | 5 | Cheap, but returns IDs rather than full bodies |
| `messages.get` | 20 | Use only for messages the agent must inspect |
| `threads.get` | 40 | Useful for conversations, costly at scale |
| `messages.send` | 100 | Protect with human approval or policy gates |
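The table can be turned into a small lookup for back-of-the-envelope estimates. This is a sketch, not an official calculator: the function names are made up for illustration, and `PAGE_SIZE` (IDs returned per `messages.list` call) is an assumption you should replace with the page size your sync actually requests.

```bash
# Quota units per call, copied from the table above.
unit_cost() {
  case "$1" in
    history.list)  echo 2 ;;
    messages.list) echo 5 ;;
    messages.get)  echo 20 ;;
    threads.get)   echo 40 ;;
    messages.send) echo 100 ;;
    *) echo "unknown method: $1" >&2; return 1 ;;
  esac
}

# Units spent listing N message IDs and then reading every body.
# Runs in a subshell so its variables do not leak into the caller.
naive_sync_cost() (
  n=$1
  page_size=${PAGE_SIZE:-500}                      # assumed IDs per list page
  pages=$(( (n + page_size - 1) / page_size ))     # round up to whole pages
  list_units=$(( pages * $(unit_cost messages.list) ))
  get_units=$(( n * $(unit_cost messages.get) ))
  echo "$(( list_units + get_units ))"
)

naive_sync_cost 10000
```

With the assumed page size, the list phase for 10,000 IDs is a rounding error next to the 200,000 units spent on body reads, which is the point the section makes.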

## Why do quota numbers matter for agents?

Agents turn quota mistakes into loops faster than human-written scripts because a planner may retry, inspect adjacent threads, or request full bodies for every candidate. A 25-message search that reads every body can spend 500 units on `messages.get` alone before the agent writes a single reply, and repeated runs multiply that cost.

The safest pattern is to search first, cap result counts, and only read the messages the model needs. Nylas CLI makes that pattern explicit with `--limit`, JSON output, and a separate read command, so the agent can inspect IDs and subjects before deciding to fetch bodies.

The command below searches only for messages that match "invoice" after a fixed date, 2026-05-01. It keeps the first pass small, returns JSON for ranking, and avoids a full mailbox scan before the agent decides whether any message deserves a body read.

The exact command reference is [`nylas email search`](https://cli.nylas.com/docs/commands/email-search). Keep it at hand whenever this pattern appears, because the flags are the contract: `--after` bounds the time window, `--limit` bounds result count, and `--json` keeps the output parseable for an agent loop.

```bash
nylas email search "invoice" --after 2026-05-01 --limit 25 --json
```

## What does a quota-safe agent loop look like?

A quota-safe Gmail agent loop uses 4 steps: search narrowly, list only metadata, read one selected message, then send or defer. The expensive calls happen after the model has a reason to inspect a message. That ordering avoids the common mistake where an agent reads 100 full bodies just to decide that 97 were irrelevant.

The CLI command map mirrors that loop. Use [`nylas email list`](https://cli.nylas.com/docs/commands/email-list) for a bounded inbox sample, [`nylas email search`](https://cli.nylas.com/docs/commands/email-search) for server-side narrowing, [`nylas email read`](https://cli.nylas.com/docs/commands/email-read) for the selected body, and [`nylas email send`](https://cli.nylas.com/docs/commands/email-send) only after approval or policy checks.

```bash
# Steps 1-2. Search narrowly; rank candidates from the JSON metadata
nylas email search "invoice" --after 2026-05-01 --limit 25 --json

# Step 3. Read only the selected message
nylas email read <message-id> --json

# Step 4. Send only after the workflow approves the reply; otherwise defer
nylas email send --to finance@example.com --subject "Invoice received" --body "Thanks, received." --yes --json
```
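One way to make that ordering enforceable is a per-run unit budget that every call must clear first. The sketch below is illustrative only: `spend`, `BUDGET`, and the 200-unit figure are hypothetical, and it assumes a search is priced like `messages.list` (5), a read like `messages.get` (20), and a send like `messages.send` (100).

```bash
# Per-run budget in quota units. 200 is an illustrative value, not a Google limit.
BUDGET=${BUDGET:-200}
SPENT=0

# spend UNITS LABEL: record the cost if it fits the budget, otherwise refuse the call.
spend() {
  if [ $((SPENT + $1)) -gt "$BUDGET" ]; then
    echo "refused $2: would exceed budget ($SPENT + $1 > $BUDGET)" >&2
    return 1
  fi
  SPENT=$((SPENT + $1))
}

spend 5 search && echo "search allowed"        # find candidates
spend 20 read && echo "read allowed"           # one selected body
spend 100 send && echo "send allowed"          # approved reply
spend 100 send || echo "second send deferred"  # over budget, so the agent defers
echo "spent $SPENT of $BUDGET units"
```

In a real loop the `echo` after each `&&` would be replaced by the matching CLI command, so a refused `spend` means the command never runs.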

## How can a CLI workflow avoid quota plumbing?

A CLI workflow does not remove provider quotas, but it removes the retry, pagination, OAuth refresh, and output-shaping code from the agent. The result is a smaller tool contract: list or search messages, choose a message ID, read only that message, and send only after approval.

Use the 3-command flow below as the baseline for Gmail agents. It validates authentication, fetches a bounded inbox sample, and reads one selected message. The same commands also run against Outlook, Exchange, Yahoo, iCloud, and IMAP accounts, so the agent code does not fork by provider.

The authentication commands have their own reference pages. [`nylas auth status`](https://cli.nylas.com/docs/commands/auth-status) is the health check page, while [`nylas auth config`](https://cli.nylas.com/docs/commands/auth-config) covers the API-key setup used by CI, cron jobs, and hosted agent sandboxes.

```bash
nylas auth status --json
nylas email list --limit 50 --json
nylas email read <message-id> --json
```

## When should you avoid Gmail quota entirely?

Avoid Gmail quota when the agent does not need to act through a user's Gmail mailbox. For app-owned workflows such as support intake, QA signups, agent-to-agent messages, and transactional replies, a Nylas Agent Account gives the agent its own `provider=nylas` mailbox and calendar instead of consuming Gmail API quota from a Workspace user.

This is a different product decision from quota optimization. Connected Gmail grants are still right when the agent is reading a person's real inbox. Agent Accounts are right when the product owns the address and wants policies, rules, webhooks, thread history, and a separate identity. The creation command is documented at [`nylas agent account create`](https://cli.nylas.com/docs/commands/agent-account-create).

## What should you log before scaling?

Before scaling a Gmail agent past a pilot, log 5 counters for each run: search calls, list calls, body reads, sends, and retries. The first four map directly to Gmail's quota table, and the retry count shows how often a cost was paid more than once. Together they reveal whether the agent is reading too much context. A daily dashboard with those 5 fields is enough to catch most quota regressions before users notice latency.

Also log the command surface, not only the model prompt. If the agent called `nylas email read` 400 times for one ticket, the bug is probably in retrieval planning. If it called `nylas email send` repeatedly, the write path needs an approval gate, a policy rule, or both. The command names become the fastest debugging signal.
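A per-run log line with those counters might look like the sketch below. The field names and counter values are hypothetical, and the unit estimate assumes search and list calls are priced like `messages.list`, body reads like `messages.get`, and sends like `messages.send`.

```bash
# Per-run counters. The values are illustrative.
SEARCHES=1 LISTS=0 READS=3 SENDS=1 RETRIES=2

# Estimated units from the quota table: 5 per search or list, 20 per read, 100 per send.
# Retries are logged separately; add their cost once you know which call was retried.
UNITS=$(( (SEARCHES + LISTS) * 5 + READS * 20 + SENDS * 100 ))

# One JSON object per run, ready for a log pipeline or dashboard.
printf '{"searches":%d,"lists":%d,"reads":%d,"sends":%d,"retries":%d,"est_units":%d}\n' \
  "$SEARCHES" "$LISTS" "$READS" "$SENDS" "$RETRIES" "$UNITS"
```

Emitting one line per run keeps the dashboard query simple: sum `est_units` per day and alert when `reads` or `sends` per run drifts upward.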

## How should teams estimate quota before launch?

Estimate quota with a worksheet before the first production rollout. Multiply expected agent runs by the searches, body reads, thread reads, and sends per run, each weighted by its unit cost. A support agent that handles 200 tickets per day, searches once per ticket, reads 3 message bodies, and sends 1 reply spends roughly 33,000 Gmail quota units per day before retries: 1,000 for search/list work, 12,000 for body reads, and 20,000 for sends.
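The worksheet fits in a few lines of shell. This is a minimal sketch with a hypothetical `daily_units` helper; it uses the unit costs from the table in this guide and treats each search as one call priced like `messages.list`.

```bash
# daily_units RUNS SEARCHES READS THREAD_READS SENDS
# Per-run counts are multiplied by unit costs, then by runs per day.
# Runs in a subshell so its variables do not leak into the caller.
daily_units() (
  runs=$1; searches=$2; reads=$3; threads=$4; sends=$5
  per_run=$(( searches * 5 + reads * 20 + threads * 40 + sends * 100 ))
  echo $(( runs * per_run ))
)

# The support agent above: 200 tickets, 1 search, 3 body reads, 0 thread reads, 1 send.
daily_units 200 1 3 0 1
```

Changing one argument shows where the money goes: swapping the 3 body reads for 3 thread reads (`daily_units 200 1 0 3 1`) raises the read portion from 12,000 to 24,000 units.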

The estimate does not need to be perfect. It needs to expose the expensive operations before the model is allowed to explore freely. If the model prompt says "read every related thread," convert that sentence into a maximum number of `messages.get` or `threads.get` calls. If nobody can name the cap, the cap is not real.

Build the cap into command selection. [`nylas email search`](https://cli.nylas.com/docs/commands/email-search) should run with `--limit`, [`nylas email list`](https://cli.nylas.com/docs/commands/email-list) should run with a small default result count, and [`nylas email read`](https://cli.nylas.com/docs/commands/email-read) should require a selected message ID.
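A thin wrapper is one way to make those caps non-optional. The sketch below only prints the command it would run, so it is safe to try without credentials; `MAX_LIMIT`, `clamp_limit`, `bounded_search`, `bounded_read`, and the `msg_123` ID are all hypothetical, and the only CLI flags used are the ones already shown in this guide.

```bash
MAX_LIMIT=25   # hypothetical team-wide cap on results per search

# clamp_limit REQUESTED: never pass more than MAX_LIMIT to --limit.
clamp_limit() {
  if [ "$1" -gt "$MAX_LIMIT" ]; then echo "$MAX_LIMIT"; else echo "$1"; fi
}

# bounded_search QUERY REQUESTED_LIMIT: print the search command the agent may run.
bounded_search() {
  printf 'nylas email search "%s" --limit %s --json\n' "$1" "$(clamp_limit "$2")"
}

# bounded_read MESSAGE_ID: refuse a read unless a message ID was selected.
bounded_read() {
  if [ -z "$1" ]; then
    echo "refused: no message ID selected" >&2
    return 1
  fi
  printf 'nylas email read %s --json\n' "$1"
}

bounded_search "invoice" 500   # the requested 500 is clamped to 25
bounded_read "msg_123"         # placeholder ID for illustration
```

Exposing only the wrappers to the agent, rather than the raw commands, means the cap holds even when the prompt asks for more.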

## Next steps

- [Gmail API pagination and sync](https://cli.nylas.com/guides/gmail-api-pagination-sync) -- nextPageToken, historyId, and sync edge cases
- [Gmail API limits for AI agents](https://cli.nylas.com/guides/why-gmail-api-breaks-ai-agents) -- OAuth, MIME, retry, and provider lock-in details
- [Google Workspace MCP for AI agents](https://cli.nylas.com/guides/google-workspace-mcp-vs-nylas) -- where Google's new MCP preview fits
- [Best email infrastructure for AI agents](https://cli.nylas.com/guides/best-email-infrastructure-ai-agents) -- provider API, MCP, and CLI tradeoffs
- [Email search command](https://cli.nylas.com/docs/commands/email-search) -- exact flags for bounded Gmail searches
- [Email read command](https://cli.nylas.com/docs/commands/email-read) -- inspect one selected message body
- [Full command reference](https://cli.nylas.com/docs/commands) -- every flag and subcommand documented