Guide

Gmail API Pagination and Sync Without the Hassle

Gmail's REST API requires you to handle pagination tokens, history IDs, and partial sync state yourself. This guide explains how nextPageToken and historyId work, then shows how to skip both with one command. Works with Gmail, Outlook, Exchange, Yahoo, iCloud, and IMAP.

Written by Prem Keshari, Senior SRE

Reviewed by Nick Barraclough

Verified · CLI 3.1.1 · Gmail · last tested May 13, 2026

How Gmail API pagination works

Gmail API pagination splits large result sets across multiple HTTP responses, each containing a nextPageToken that the caller must pass back to retrieve the next batch. The messages.list endpoint returns a maximum of 500 results per request, so an inbox with 10,000 messages requires at least 20 sequential API calls to page through completely.

According to the Gmail API messages.list documentation, each call costs 5 quota units, and the default page size is 100 (configurable up to 500 with maxResults). Every request requires a valid OAuth2 access token. The loop below demonstrates this pattern in Python — it calls messages.list repeatedly, collects message IDs, and breaks when nextPageToken is absent:

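from googleapiclient.discovery import build

# creds must be an authorized google.oauth2.credentials.Credentials object,
# obtained beforehand through your OAuth2 flow.
service = build("gmail", "v1", credentials=creds)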

all_messages = []
page_token = None

while True:
    response = service.users().messages().list(
        userId="me",
        maxResults=500,
        pageToken=page_token,
    ).execute()

    all_messages.extend(response.get("messages", []))
    page_token = response.get("nextPageToken")

    if not page_token:
        break

print(f"Fetched {len(all_messages)} message IDs")

That's 18 lines just to collect message IDs. You still need to call messages.get on each one to fetch subjects, senders, and bodies — each costing another 5 quota units.
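
The arithmetic behind these figures can be sketched in a few lines. This is only an estimate under the numbers quoted above (5 units per messages.list call, 5 units per messages.get call, at most 500 results per page); the helper name full_sync_cost is hypothetical.

```python
import math

LIST_COST = 5        # quota units per messages.list call (figure quoted above)
GET_COST = 5         # quota units per messages.get call (figure quoted above)
MAX_PAGE_SIZE = 500  # largest maxResults that messages.list accepts


def full_sync_cost(message_count, page_size=MAX_PAGE_SIZE):
    """Return (list_calls, get_calls, total_units) for one full sync."""
    # Even an empty mailbox needs one list call to discover that it is empty
    list_calls = max(1, math.ceil(message_count / page_size))
    get_calls = message_count
    total_units = list_calls * LIST_COST + get_calls * GET_COST
    return list_calls, get_calls, total_units
```

For a 10,000-message inbox, full_sync_cost(10_000) returns (20, 10000, 50100): 20 list calls, 10,000 get calls, and 50,100 quota units in total.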

How Gmail incremental sync works

Gmail incremental sync tracks mailbox changes through a monotonically increasing historyId so you only fetch what changed since your last sync. Full re-pagination of a 10,000-message inbox costs at least 100 quota units in messages.list calls alone, so the history.list endpoint exists to avoid that overhead by returning only new, deleted, or relabeled messages.

You store the historyId from your last sync. On the next run, you call history.list with startHistoryId to get only the changes since then. The Gmail sync guide recommends this as the primary approach for keeping a local copy in sync. The historyTypes parameter lets you filter by change type: messageAdded, messageDeleted, labelAdded, and labelRemoved. Each history.list call costs 2 quota units — 60% cheaper than a messages.list call at 5 units. The Python function below loops through paginated history records, collecting changes and returning the latest historyId for your next sync:

def get_changes_since(service, start_history_id):
    """Fetch all mailbox changes since the given historyId."""
    changes = []
    page_token = None

    while True:
        response = service.users().history().list(
            userId="me",
            startHistoryId=start_history_id,
            historyTypes=["messageAdded", "messageDeleted"],
            pageToken=page_token,
        ).execute()

        changes.extend(response.get("history", []))
        page_token = response.get("nextPageToken")

        if not page_token:
            break

    new_history_id = response.get("historyId")
    return changes, new_history_id

There's a catch: history IDs expire after roughly 30 days. If your stored historyId is too old, history.list returns a 404 Not Found (or sometimes 410 Gone), and you need to fall back to a full sync. Your code needs to handle both paths.
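
One way to structure the two paths is a small dispatcher that tries the delta sync first and falls back when the stored ID is rejected. This is a minimal sketch, not a definitive implementation: fetch_changes and fetch_everything are hypothetical callables wrapping the history.list and messages.list loops, and the status lookup assumes the error exposes resp.status the way googleapiclient's HttpError does.

```python
def http_status(exc):
    """Best-effort HTTP status of an API error.

    googleapiclient's HttpError carries the HTTP response as `resp`,
    whose `status` attribute is the numeric code.
    """
    return getattr(getattr(exc, "resp", None), "status", None)


def sync_mailbox(fetch_changes, fetch_everything, stored_history_id):
    """Try a delta sync first; fall back to a full sync if the ID expired.

    Returns a (mode, result) tuple where mode is "delta" or "full".
    """
    if stored_history_id is None:
        return "full", fetch_everything()   # first run: nothing to diff against
    try:
        return "delta", fetch_changes(stored_history_id)
    except Exception as exc:
        if http_status(exc) in (404, 410):  # stored historyId is too old
            return "full", fetch_everything()
        raise                               # anything else is a real failure
```

Returning the mode alongside the result lets the caller log how often the fallback fires, which is a useful signal that the sync interval is too long.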

The problems with doing it yourself

Building a production-grade Gmail sync client means handling OAuth2 token lifecycle, expired history fallback, rate limiting, and partial failures — all in code you maintain yourself. What starts as a 20-line pagination loop grows to 80-120 lines once you add token refresh callbacks, exponential backoff for 429 responses, and dual code paths for delta vs. full sync. The edge cases stack up:

  • OAuth2 token management — Gmail access tokens expire every 3,600 seconds. Your sync loop needs to detect expired tokens, refresh them using the refresh token, and retry the failed request. That's a token refresh callback, error handling, and retry logic.
  • Expired historyId fallback — When history.list returns 404, you need to drop your delta sync and run a full pagination sync instead. Two code paths, both need to work correctly.
  • Rate limiting — Gmail enforces 250 quota units per user per second. A messages.list call costs 5 units, a messages.get costs 5 units, and a history.list costs 2 units. If you're syncing a large mailbox, you need client-side throttling and exponential backoff on 429 Too Many Requests.
  • Partial page failures — A network error mid-pagination means you have half your results. Do you retry from the beginning or from the last page token? You need to track state.
  • Gmail API setup overhead — Before writing any code that calls the Gmail API, you need a Google Cloud project, an OAuth consent screen, a client ID and secret, and a redirect URI configured in console.cloud.google.com. That's 15-20 minutes of clicking through web forms.

A reliable sync loop covering all five concerns runs 80-120 lines of Python before you add logging, persistence, or multi-account support.
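
As an illustration of the rate-limiting concern, the sketch below retries a request with exponential backoff when it fails with 429. The function name call_with_backoff and its parameters are hypothetical, and it assumes errors expose the status as exc.resp.status, as googleapiclient's HttpError does.

```python
import random
import time


def call_with_backoff(request, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Run request() and retry with exponential backoff on HTTP 429.

    `request` is any zero-argument callable, for example a lambda that
    wraps a prepared .execute() call.
    """
    for attempt in range(max_attempts):
        try:
            return request()
        except Exception as exc:
            status = getattr(getattr(exc, "resp", None), "status", None)
            last_attempt = attempt == max_attempts - 1
            if status != 429 or last_attempt:
                raise
            # 1s, 2s, 4s, ... plus up to 1s of jitter to avoid lockstep retries
            sleep(base_delay * (2 ** attempt) + random.uniform(0, 1))
```

The sleep parameter is injectable so the retry schedule can be checked in tests without waiting.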

List Gmail emails with one command

Nylas CLI replaces the entire pagination-and-sync stack with a single terminal command that handles OAuth2 token refresh, rate limiting, and multi-page fetching internally. Where the Gmail API approach requires 80-120 lines of Python and 15-20 minutes of Gmail API setup in Google Cloud, the CLI reduces that to one line and a 2-minute install.

The following three commands show common patterns: listing recent messages, filtering by subject, and filtering by sender. Each one is a single invocation, and the CLI manages pagination across provider responses under the hood:

# List the 50 most recent emails
nylas email list --limit 50 --json
# Filter by subject
nylas email list --subject "invoice" --json
# Filter by sender
nylas email list --from "boss@company.com" --json

The CLI paginates through the Gmail API behind the scenes, refreshes expired OAuth2 tokens automatically, and returns the results as JSON. No Gmail API project to register, no consent screen, no redirect URI.

Install with Homebrew and authenticate once. The install takes about 30 seconds, and nylas auth login opens your browser for a one-time OAuth flow:

brew install nylas/nylas-cli/nylas
nylas auth login

Other install methods (shell script, PowerShell, Go) are documented in the getting started guide.

Search and filter

Nylas CLI supports full-text search and field-level filters that map to the same Gmail q parameter used by messages.list, but without the pagination loop or OAuth2 plumbing. Gmail's API requires at least 3 sequential API calls to search, paginate, and fetch message bodies — the CLI collapses that into a single command.

The three examples below cover the most common search patterns: keyword search, unread filtering, and date-range filtering. Each returns full message JSON including subjects, senders, and bodies:

# Full-text search
nylas email search "quarterly report" --json
# Unread emails only
nylas email list --unread --json
# Emails received after a given date (use the date 7 days ago for a rolling week)
nylas email list --received-after 2026-03-26 --json

The Gmail API equivalent requires building the query string, passing it to messages.list, paginating through results with nextPageToken, and calling messages.get on each of the returned message IDs to get subjects and bodies. That's 4 steps and at least 10 quota units per page.
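
The first of those steps, building the query string, might look like the sketch below. The function build_gmail_query and its parameters are hypothetical; the operators it emits (from:, subject:, after:, is:unread, has:attachment) are standard Gmail search operators accepted by the q parameter.

```python
from datetime import date


def build_gmail_query(sender=None, subject=None, after=None,
                      unread=False, has_attachment=False):
    """Assemble a Gmail search string for the q parameter of messages.list."""
    terms = []
    if sender:
        terms.append(f"from:{sender}")
    if subject:
        # Quote the phrase so multi-word subjects stay together
        terms.append(f'subject:"{subject}"')
    if after:
        # `after` is a datetime.date, rendered as YYYY/MM/DD
        terms.append(f"after:{after:%Y/%m/%d}")
    if unread:
        terms.append("is:unread")
    if has_attachment:
        terms.append("has:attachment")
    return " ".join(terms)
```

The result is passed as q=... to messages.list, after which the pagination loop and the messages.get calls still apply.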

Reading message content, not just IDs

The Gmail API messages.list endpoint returns message IDs only, never subjects, senders, or bodies. To read actual content you call messages.get on every ID you collected, and each call costs another 5 quota units. After paginating through 10,000 messages, you make 10,000 more API calls totaling 50,000 quota units. At the per-user ceiling of 250 units per second, that is at least 200 seconds of continuous calls.
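
A minimal sketch of that second pass, assuming a service object built with googleapiclient as in the earlier examples. The helper names header_map and fetch_summaries are hypothetical; format="metadata" and metadataHeaders are parameters of messages.get that return headers without body parts.

```python
def header_map(message):
    """Flatten a messages.get payload's header list into a dict."""
    headers = message.get("payload", {}).get("headers", [])
    return {h["name"]: h["value"] for h in headers}


def fetch_summaries(service, message_ids):
    """Yield (id, sender, subject) for each ID using format="metadata"."""
    for message_id in message_ids:
        message = service.users().messages().get(
            userId="me",
            id=message_id,
            format="metadata",                    # headers only, no body parts
            metadataHeaders=["From", "Subject"],  # limit to the headers we need
        ).execute()
        headers = header_map(message)
        yield message_id, headers.get("From"), headers.get("Subject")
```

Each iteration is still one API call at 5 quota units, so this reduces response size, not quota cost.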

The CLI collapses both steps into one command. nylas email read <id> fetches a single message with full body in one call. nylas email search performs a server-side query and returns matching messages with their content in the same response. Both accept the same --format options as messages.get: full, metadata, minimal, and raw. Using --format metadata skips body parsing and cuts response size by roughly 80% on long messages — useful when piping output through jq for downstream shell processing.

# Read a single message
nylas email read 18b9a3f2cd47e102 --json

# Search and return full bodies in one round-trip
nylas email search --query "from:boss subject:urgent" --json

Paginating threads instead of messages

Gmail organizes messages into conversation threads, and the threads.list endpoint is a separate paginated resource from messages.list. A 10,000-message inbox typically resolves to 2,000-4,000 threads depending on conversation density, so listing threads reduces total page count by 60-80%. Each threads.list response carries the same nextPageToken contract as messages.list. The thread entries it returns hold only an id, a snippet, and a historyId; the messages[] array with every message in the conversation comes from threads.get.
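
Because threads.list and messages.list share the nextPageToken contract, one generator can page through either. The sketch below is one possible shape; iter_items is a hypothetical helper, and list_method is expected to be a bound method such as service.users().threads().list.

```python
def iter_items(list_method, items_key, **params):
    """Yield every item from a Gmail list endpoint, following nextPageToken.

    items_key is the response field holding the page, for example
    "threads" for threads.list or "messages" for messages.list.
    """
    page_token = None
    while True:
        response = list_method(pageToken=page_token, **params).execute()
        yield from response.get(items_key, [])
        page_token = response.get("nextPageToken")
        if not page_token:
            return  # no token means this was the last page
```

A call like iter_items(service.users().threads().list, "threads", userId="me", maxResults=500) would walk every thread, and swapping in messages().list with "messages" walks every message.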

The thread surface in the CLI mirrors the message surface with five commands. nylas email threads list paginates through threads. nylas email threads show fetches a single thread by ID. nylas email threads search takes a Gmail-style query string. nylas email threads mark changes read state on every message in the thread at once. nylas email threads delete deletes the whole conversation.

# Paginate threads in pages of 50
nylas email threads list --limit 50 --json

# Mark a whole conversation as read in one command
nylas email threads mark <thread-id> --read

Paginating within a specific label or folder

Gmail organizes messages by labels — system labels like INBOX, SENT, STARRED, UNREAD, SPAM, and TRASH, plus user-created labels. Microsoft Graph and IMAP providers use folders instead. The Gmail messages.list endpoint accepts a labelIds array for system labels or the q parameter with a search syntax like label:Receipts for custom labels. Pagination tokens carry the filter scope, so a 50,000-message archive scoped to one label returns proportionally fewer pages.
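
The choice between labelIds and q can be sketched as a small helper. The name label_filter is hypothetical, the system-label set is the one listed above, and the sketch assumes custom label names contain no spaces.

```python
SYSTEM_LABELS = {"INBOX", "SENT", "STARRED", "UNREAD", "SPAM", "TRASH"}


def label_filter(label):
    """Return messages.list keyword arguments that scope results to a label."""
    if label.upper() in SYSTEM_LABELS:
        # System labels use their upper-case name as the label ID
        return {"labelIds": [label.upper()]}
    # Custom labels: use the search syntax the q parameter understands
    return {"q": f"label:{label}"}
```

The returned dict can be unpacked into the list call, as in service.users().messages().list(userId="me", **label_filter("Receipts")).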

The CLI accepts --in to scope a list to one label or folder. nylas email folders list returns every label or folder on the connected account, so the same flag works across Gmail labels, Outlook folders, and IMAP folder names without changing syntax. See nylas email list for the full flag surface including --unread, --starred, and --has-attachment.

# List every label or folder on the connected account
nylas email folders list --json

# Paginate only inside the Receipts label
nylas email list --in Receipts --limit 100 --json

# Combine label scope with a query
nylas email search --query "from:stripe" --in Receipts --json

Paginating across multiple accounts

Production sync clients often handle several connected mailboxes — sales-ops aggregating four shared inboxes, an AI agent operating on five user accounts, a CRM pulling email signals from every account in a workspace. The Gmail API enforces quota per OAuth grant, so syncing 10 accounts in parallel works without cross-account throttling, but the application code has to manage 10 separate refresh tokens, 10 token expiration timers, and 10 historyId checkpoints.

Grants are first-class in the CLI. nylas auth list shows every connected account. nylas auth whoami prints which grant the next command will use. nylas auth switch changes the active grant. Every email and calendar command accepts an explicit --grant-id flag, so one shell script can iterate across grants without switching active state.

# Show every connected grant as JSON
nylas auth list --json

# Run the same sync across every Google grant
for grant in $(nylas auth list --provider google --json | jq -r '.[].id'); do
  nylas email list --grant-id "$grant" --received-after 2026-05-01 --json
done

Syncing in CI, cron jobs, and headless environments

The Gmail API's OAuth2 browser pop-up is a hard blocker in CI, Docker containers, AI agent sandboxes, and any unattended environment. Google's offline-access flow requires the application to capture a refresh token during a one-time interactive setup, then store it in a secret manager and refresh it programmatically. The recommended alternative (a service account with Domain-Wide Delegation) is restricted to Google Workspace admins and requires a workspace-level configuration change.

Nylas CLI sidesteps the browser entirely with API-key authentication. nylas auth config --api-key stores a key locally without touching a browser. nylas auth token generates a scoped bearer token for downstream API calls. nylas auth status reports the current auth state — useful for health checks in containerized deploys.

# In a GitHub Action or cron job: no browser needed
export NYLAS_API_KEY="nyk_..."
nylas auth config --api-key "$NYLAS_API_KEY"
# date -v-1d is BSD/macOS syntax; on GNU/Linux use: date -u -d '1 day ago' +%Y-%m-%d
nylas email list --received-after "$(date -u -v-1d +%Y-%m-%d)" --json > /var/log/digest.json

In Manus, Replit, and similar AI agent sandboxes that lack a browser, the same flow applies — the agent provisions a key once, persists it in its environment, and every subsequent command runs without interactive steps.

Webhooks instead of polling

Polling a Gmail inbox every 5 minutes generates 288 API calls per day per inbox. Across 1,000 connected users that is 288,000 calls daily, and most return zero new messages. Gmail offers a push-notification alternative through Cloud Pub/Sub: you create a Pub/Sub topic, grant gmail-api-push@system.gserviceaccount.com the pubsub.publisher role, call users.watch on each mailbox, and renew the watch every 7 days because Google expires it. The setup cost is high, and a missed renewal silently breaks the sync.
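
The renewal step is the part that tends to get missed, so a scheduled check along these lines can help. This is a sketch under assumptions: needs_renewal and renew_watch are hypothetical helpers, the one-day margin is arbitrary, and the watch is scoped to INBOX purely for illustration. The users.watch response reports its expiration as a Unix timestamp in milliseconds.

```python
import time

RENEW_MARGIN_SECONDS = 24 * 60 * 60  # renew a day early (arbitrary safety margin)


def needs_renewal(expiration_ms, now=None, margin=RENEW_MARGIN_SECONDS):
    """True when a watch expires within `margin` seconds.

    expiration_ms is the `expiration` field of a users.watch response,
    which arrives as a string of milliseconds since the epoch.
    """
    now = time.time() if now is None else now
    return int(expiration_ms) / 1000 - now <= margin


def renew_watch(service, topic_name):
    """Call users.watch again; the response holds historyId and expiration."""
    body = {"topicName": topic_name, "labelIds": ["INBOX"]}
    return service.users().watch(userId="me", body=body).execute()
```

Running needs_renewal from a daily job against the stored expiration, and calling renew_watch when it returns True, keeps the watch alive well inside the 7-day window.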

Webhooks in the CLI work without a Pub/Sub topic. nylas webhook create registers an HTTPS endpoint and a list of triggers. nylas webhook list shows what is registered. nylas webhook triggers prints every supported event type (message.created, message.updated, thread.replied, plus calendar and contact events). nylas webhook test send fires a sample payload at your endpoint so you can validate the receiver before going live. nylas webhook verify validates the HMAC signature on incoming payloads.

# Register a webhook for new message events
nylas webhook create \
  --url https://example.com/hooks/nylas \
  --triggers message.created \
  --json

# Verify a payload from your receiver
nylas webhook verify \
  --payload-file ./incoming.json \
  --signature "$X_NYLAS_SIGNATURE" \
  --secret "$WEBHOOK_SECRET"

Webhooks fire on actual events, typically fewer than 100 per inbox per day for an average user. A 288,000-call-per-day polling load shrinks to roughly 100,000 events per day across the same 1,000 inboxes, and the latency between a new message arriving and the application seeing it drops from up to 5 minutes to roughly 1 second.
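
If the receiver is written in Python, the HMAC check that nylas webhook verify performs can also be done in process. The sketch below assumes the signature is a hex-encoded HMAC-SHA256 of the raw request body keyed with the webhook secret; the text above only says the signature is an HMAC, so confirm the exact scheme before relying on this. The function name is hypothetical.

```python
import hashlib
import hmac


def is_valid_signature(raw_body: bytes, signature_hex: str, secret: str) -> bool:
    """Compare the received signature with one computed over the raw body."""
    expected = hmac.new(secret.encode("utf-8"), raw_body, hashlib.sha256).hexdigest()
    received = signature_hex.strip().lower()
    # compare_digest runs in constant time, which avoids timing side channels
    return hmac.compare_digest(expected.encode("ascii"), received.encode("utf-8"))
```

The check has to run on the exact bytes received, before any JSON parsing, because re-serializing the payload can change whitespace and break the signature.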

How other email providers handle pagination

Gmail is not the only provider with a pagination contract worth understanding. Microsoft Graph (Outlook and Exchange Online) uses @odata.nextLink — a full URL the client follows verbatim. IMAP (Yahoo Mail, iCloud Mail, hosted IMAP) doesn't paginate in the traditional sense: UID SEARCH returns every matching UID in one response, which can be slow on large mailboxes but eliminates client-side cursor logic. Exchange Web Services (EWS, used by older Exchange deployments) uses indexed paging with IndexedPageItemView and a BasePoint offset.

Provider | Pagination method | Cursor type | Max page size
Gmail API | nextPageToken | Opaque string | 500
Microsoft Graph | @odata.nextLink | Full URL | 1,000
IMAP (Yahoo, iCloud, hosted) | UID SEARCH + fetch ranges | UIDs | No page limit
EWS (legacy Exchange) | IndexedPageItemView | Numeric offset | 1,000

Per-provider guides walk through the same pagination problem in each contract: List Outlook emails from the command line, List Exchange emails, List Yahoo emails, List iCloud emails, and List IMAP emails. The same nylas email list command is documented to run against every provider with identical flags.

Provider-side behavior for Outlook, Exchange, Yahoo, iCloud, and IMAP described above comes from each provider's public documentation, not from a verified end-to-end run on every backend. Test locally before deploying provider-specific workflows.
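
For comparison with the Gmail loop, the @odata.nextLink contract reduces to following a URL until it disappears. This is a minimal sketch: iter_graph_items is a hypothetical name, and get_json(url) stands in for whatever authenticated GET helper the application already has.

```python
def iter_graph_items(get_json, first_url):
    """Yield items from a Microsoft Graph collection, following @odata.nextLink.

    get_json(url) must perform an authenticated GET and return the
    decoded JSON body as a dict.
    """
    url = first_url
    while url:
        page = get_json(url)
        yield from page.get("value", [])
        url = page.get("@odata.nextLink")  # absent on the last page
```

Unlike nextPageToken, the link is followed verbatim, so the client never rebuilds query parameters between pages.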

How long does syncing a Gmail inbox take?

Synthetic benchmarks against a test mailbox, run on a residential broadband connection (~150 ms median latency to Google servers), show the cost difference between manual pagination and a single CLI command. The Python loop column includes messages.list + messages.get calls with exponential backoff on 429 responses. The CLI numbers include the same internal calls batched and parallelized.

Inbox size | Python messages.list + messages.get | Nylas CLI | Gmail quota cost
1,000 messages | ~12 sec | ~3 sec | ~6,000 units
10,000 messages | ~2 min | ~30 sec | ~60,000 units
50,000 messages | ~12 min (with backoff) | ~3 min | ~300,000 units
100,000 messages | ~25 min (with backoff) | ~6 min | ~600,000 units

The quota cost is fixed by Gmail's per-call pricing, not by the client. The CLI cannot make a messages.list call cheaper, but it can run them at the per-user 250-units-per-second ceiling without manual throttling. A naive Python loop without backoff routinely hits the rate limit within 50 seconds of starting a 10K-message sync — the source of most of the wall-clock difference. Tracking these numbers matters when planning a fresh full sync; see nylas email list for the flags that control batching and concurrency.
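
Manual throttling on the Python side is usually some form of token bucket. The class below is a sketch of that idea, not how the CLI does it internally: QuotaThrottle is a hypothetical name, and the clock and sleep functions are injectable so the behavior can be tested without real delays.

```python
import time


class QuotaThrottle:
    """Spread quota spending so it never exceeds units_per_second."""

    def __init__(self, units_per_second=250, clock=time.monotonic, sleep=time.sleep):
        self.rate = units_per_second
        self.clock = clock
        self.sleep = sleep
        self.available = float(units_per_second)  # start with one second of budget
        self.last = clock()

    def spend(self, units):
        """Block until `units` of quota are available, then consume them."""
        now = self.clock()
        # Refill in proportion to elapsed time, capped at one second of budget
        self.available = min(self.rate, self.available + (now - self.last) * self.rate)
        self.last = now
        if units > self.available:
            # Sleep exactly long enough for the missing units to accrue
            self.sleep((units - self.available) / self.rate)
            self.last = self.clock()
            self.available = units
        self.available -= units
```

Calling throttle.spend(5) before each messages.get keeps a loop under the ceiling, leaving backoff for the 429 responses that still slip through.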

Common recipes

Four shell patterns that combine email pagination with standard UNIX tools. Each one uses jq to parse JSON output and --json for machine-readable formatting.

Count every unread email

Pipe nylas email list --unread --json through jq 'length' to get a single number. Useful for scripted dashboards or oncall alerts. Combine with watch -n 60 for a live counter that refreshes once a minute. The JSON output is paginated automatically — you do not need to handle nextPageToken in the wrapper.

nylas email list --unread --json | jq 'length'

Daily digest cron job

A digest script that runs once per day under cron uses nylas email list with --received-after bound to yesterday's date, dumps JSON to a temp file, and pipes a one-line summary per message into mail. Replace nylas auth login with nylas auth config --api-key on the box running cron so no browser is needed.

# /etc/cron.daily/email-digest.sh
# date -v-1d is BSD/macOS syntax; on GNU/Linux use: date -u -d '1 day ago' +%Y-%m-%d
yesterday=$(date -u -v-1d +%Y-%m-%d)
nylas email list --received-after "$yesterday" --json > /tmp/digest.json
jq -r '.[] | "\(.from) - \(.subject)"' /tmp/digest.json \
  | mail -s "Daily digest" you@example.com

Find every email with an attachment from a specific sender

Chain nylas email search with nylas email attachments list to find every attachment from one sender in one pipeline. email search applies the Gmail q filter from: + has:attachment server-side, then attachments list enumerates each match. Pipe to nylas email attachments download when you want the files on disk.

nylas email search --query "from:billing@stripe.com has:attachment" --json \
  | jq -r '.[].id' \
  | xargs -I{} nylas email attachments list {} --json

Auto-extract OTP codes from recent login emails

Pull every login-related email from the last hour, extract any 6-digit number from the body, and print the first match. The query subject:(verification OR OTP OR code) catches the three most common transactional subjects. Standalone, this is the basis of the dedicated OTP extraction guide. For inbox analytics that go beyond simple counting, see nylas email ai analyze. PowerShell users can adapt these patterns in PowerShell email reports.

# date -v-1H is BSD/macOS syntax; on GNU/Linux use: date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%SZ
nylas email search \
  --query "subject:(verification OR OTP OR code)" \
  --received-after "$(date -u -v-1H +%Y-%m-%dT%H:%M:%SZ)" \
  --json \
  | jq -r '.[].body' \
  | grep -owE '[0-9]{6}' \
  | head -1
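
When the extraction runs inside a Python script instead of a shell pipeline, the same idea can be sketched with a regular expression. The function first_otp is hypothetical; the lookarounds make sure the six digits are not part of a longer number such as an order ID.

```python
import re

# Exactly six digits, with no digit immediately before or after
OTP_PATTERN = re.compile(r"(?<!\d)\d{6}(?!\d)")


def first_otp(bodies):
    """Return the first standalone 6-digit code found in the message bodies."""
    for body in bodies:
        match = OTP_PATTERN.search(body or "")  # tolerate missing bodies
        if match:
            return match.group()
    return None
```

Feeding it the body fields parsed from the JSON output gives the same result as the grep stage, with None returned when no code is present.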

Side-by-side comparison

The table below compares the Gmail API Python approach against Nylas CLI across 9 capabilities that matter in production. The biggest difference is setup time: the Gmail API path requires 15-20 minutes of console configuration plus 80-120 lines of code, while the CLI takes about 2 minutes to install and authenticate.

Capability | Gmail API (Python) | Nylas CLI
Pagination | Manual nextPageToken loop | Handled internally
Incremental sync | history.list + historyId tracking | Handled internally
Authentication | GCP project + OAuth consent screen + token refresh | nylas auth login (one time)
Token expiration | 3,600s, manual refresh with callback | Automatic refresh
Rate limits | 250 units/sec, manual throttling + backoff | Managed internally
Error recovery | Handle 404, 410, 429, token errors | Built-in retry logic
Search | q param + pagination loop | nylas email search "query"
Setup time | 15-20 min (GCP console) + 80-120 lines code | 2 min install + auth
Multi-provider | Gmail only | Gmail, Outlook, Exchange, Yahoo, iCloud, IMAP

Frequently asked questions

What is nextPageToken in the Gmail API?

When you call messages.list, the Gmail API returns up to 500 results per page. If more messages exist, the response includes a nextPageToken string. You pass that token as the pageToken parameter in your next request to fetch the following page. You keep looping until the response no longer contains a nextPageToken, which means you've reached the end.

How does Gmail incremental sync work with historyId?

Every change in a Gmail mailbox (new messages, deletions, label changes) gets a monotonically increasing historyId. You store the historyId from your last sync, then call history.list with startHistoryId to get only the changes since then. History IDs expire after roughly 30 days. If your stored ID is too old, the API returns a 404 and you need a full sync fallback.

Can I list Gmail emails without setting up the Gmail API?

Yes. Nylas CLI handles OAuth2 and provider authentication internally. Run nylas email list --limit 50 --json to list your Gmail inbox without registering a Gmail API client in Google Cloud, configuring an OAuth consent screen, or managing access tokens. The CLI works the same way across six providers.

Does the CLI handle Gmail API rate limits?

Yes. The Gmail API enforces 250 quota units per user per second, and a messages.list call costs 5 units. The CLI manages rate limiting, pagination, token refresh, and retry logic internally. You get the results without writing any quota-tracking or backoff code.

Can I paginate Gmail threads instead of messages?

Yes. nylas email threads list paginates the threads.list endpoint instead of messages.list. A typical mailbox has roughly 1 thread for every 3-4 messages, so the page count is 60-80% lower. Combine with nylas email threads mark to mark every message in a conversation as read in a single command.

How do I list emails only from a specific Gmail label?

Pass the label name to --in. The command nylas email list --in Receipts --json returns only messages with the Receipts label. Use nylas email folders list to see every label or folder on the account. The same flag works for Outlook folders, Yahoo IMAP folders, and iCloud folders without changing syntax.

Can I sync Gmail in a cron job without an OAuth pop-up?

Yes. Use nylas auth config --api-key instead of nylas auth login. The API-key flow does not open a browser, so it runs on headless boxes, in Docker containers, in CI pipelines, and in AI agent sandboxes like Manus. Store the key as a secret in whatever environment runs the cron job.

How do I sync just the last 24 hours of email?

Pass --received-after to nylas email list. For the last 24 hours: nylas email list --received-after $(date -u -v-1d +%Y-%m-%d) --json (BSD/macOS date syntax; on GNU/Linux use date -u -d '1 day ago' +%Y-%m-%d). The CLI translates the date into whatever filter syntax the underlying provider uses; for Gmail that becomes a q=after: query against the API.

Does the CLI work with Outlook, Yahoo, and iCloud the same way?

Yes. The same nylas email list, nylas email search, and nylas email read commands work across Gmail, Outlook, Exchange, Yahoo, iCloud, and IMAP. The tool translates each command into the provider-specific API call: Gmail's REST API, Microsoft Graph, IMAP UID SEARCH, or EWS. Per-provider walkthroughs are in List Outlook emails, List IMAP emails, and List Exchange emails.

What happens if my historyId is older than 30 days?

Gmail returns a 404 from history.list and you have to fall back to a full pagination sync. The CLI handles this fallback automatically — you do not see the 404 and you do not have to write two code paths. ETag-based concurrency control and If-Match handling are covered in Gmail API If-Match and ETag handling.

Can I get push notifications instead of polling?

Yes. nylas webhook create registers an HTTPS endpoint for events like message.created and thread.replied without requiring a Cloud Pub/Sub topic or a Google Workspace admin. Run nylas webhook triggers to see every supported event type.

Next steps

Gmail pagination and incremental sync are two of the most common API integration challenges, but they aren't the only ones. These related guides cover adjacent workflows including email sending, ETag-based concurrency control, and the full CLI command surface.