Source: https://cli.nylas.com/guides/playwright-email-testing

# Test Email Flows with Playwright

Poll a real test inbox from Playwright with `nylas email search --json` wrapped in `expect.poll`. Verify signup confirmation, password reset links, and OTP codes without a mock SMTP server.

Written by [Aaron de Mello](https://cli.nylas.com/authors/aaron-de-mello), Senior Engineering Manager

Reviewed by [Qasim Muhammad](https://cli.nylas.com/authors/qasim-muhammad)

Updated June 9, 2026

> **TL;DR:** Playwright email testing works without a mock SMTP server: wrap `nylas email search --json` in `expect.poll` and assert the message your app actually sent. One helper covers signup confirmation, password reset, and OTP, including the OTP regex detail most suites get wrong, which the OTP section resolves.

## Why does Playwright email testing need a real inbox?

Playwright email testing means asserting on a message your application actually delivered, not on a mocked transport. A mock SMTP server proves your code called send; it can't prove the message was routed, rendered, and readable by a recipient. Polling a real inbox from inside the test closes that gap for signup, reset, and OTP flows.

The timing mismatch is the real obstacle. Playwright's web-first assertions retry for 5 seconds by default, which suits DOM state but not transactional email that typically lands 5 to 30 seconds after the form submit. The Nylas CLI fixes the read side: [`nylas email search`](https://cli.nylas.com/docs/commands/email-search) and [`nylas email read`](https://cli.nylas.com/docs/commands/email-read) emit JSON that a Node subprocess call can parse, so the spec stays in TypeScript and never touches IMAP or the Gmail API.

The payoff is coverage a mock can't give you. A real-inbox assertion catches a template variable that rendered as a literal placeholder, a background worker that silently dropped the send job, and a provider that rejected the message outright. Each of those ships to production undetected when the test stops at the transport boundary, and each is a support ticket within hours of a release.

## How do I create an isolated inbox for Playwright workers?

A dedicated Nylas Agent Account gives a Playwright suite a real, API-readable mailbox without a Google Workspace seat or an OAuth consent screen. Create one address per suite, then put a unique run marker in each subject or recipient so parallel workers never match each other's messages. Provisioning takes one command and under 60 seconds.

Isolation matters because of how the runner schedules work. The [Playwright parallelism docs](https://playwright.dev/docs/test-parallel) state: “Playwright Test runs tests in parallel. In order to achieve that, it runs several worker processes that run at the same time.” Two workers polling one shared inbox for the same subject will match each other's mail. Embed `testInfo.workerIndex` in the marker, or provision one address per worker.
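One way to build such a marker is sketched below. The `runMarker` and `markedRecipient` names are hypothetical, and the plus-addressed recipient assumes your test inbox delivers `local+tag@domain` mail to `local@domain`, which is worth confirming before you rely on it.

```typescript
// tests/helpers/marker.ts (hypothetical helper, not part of the Nylas CLI)
import { randomBytes } from "node:crypto"

// Build a marker unique to this worker and this run, e.g. "w2-1a2b3c4d".
export function runMarker(workerIndex: number): string {
  return `w${workerIndex}-${randomBytes(4).toString("hex")}`
}

// Assumption: the inbox accepts plus-addressing, so mail sent to
// local+marker@domain is delivered to local@domain.
export function markedRecipient(address: string, marker: string): string {
  const at = address.lastIndexOf("@")
  if (at < 1) throw new Error(`not an email address: ${address}`)
  return `${address.slice(0, at)}+${marker}${address.slice(at)}`
}
```

Inside a test, call `runMarker(testInfo.workerIndex)` once, then match on the marker wherever it ends up: in the subject if your template echoes user input, or in the recipient otherwise.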

The `nylas agent account create` command provisions the mailbox and its grant in a single step, documented in the [Nylas Agent Accounts quickstart](https://developer.nylas.com/docs/v3/getting-started/agent-accounts/). Follow it with a one-message smoke search so a bad credential fails in 2 seconds instead of after a 60-second poll. Other install methods are in the [getting started guide](https://cli.nylas.com/guides/getting-started).

```bash
brew install nylas/nylas-cli/nylas

# One real inbox for the suite
nylas agent account create playwright@yourapp.nylas.email --json

# Smoke check: the grant can read mail
nylas email search "*" --json --limit 1
```

## How do I poll the inbox with expect.poll?

`expect.poll` turns a one-shot inbox check into a retrying assertion. The [Playwright assertions docs](https://playwright.dev/docs/test-assertions) put it directly: “You can convert any synchronous expect to an asynchronous polling one using expect.poll.” Run the CLI search inside the polled function and assert a match eventually appears.

The defaults need overriding for email. Out of the box, `expect.poll` probes at 100, 250, 500, then 1,000 millisecond intervals and gives up after 5 seconds. Set a 60-second timeout with 3-second intervals instead; 20 probes is plenty, and each probe is one short-lived subprocess. The helper below caps results at 5 messages per probe (the search default is 20, auto-paginating past 200), which keeps every JSON payload small.

```typescript
// tests/helpers/inbox.ts
import { execFileSync } from "node:child_process"

export function runNylas<T>(args: string[]): T {
  return JSON.parse(execFileSync("nylas", args, { encoding: "utf8" })) as T
}

export function searchInbox(subject: string) {
  return runNylas<Array<{ id: string; subject: string }>>([
    "email", "search", "*", "--subject", subject, "--json", "--limit", "5",
  ])
}
```

The spec below submits a signup form, then polls until the confirmation subject shows up. The `message` option labels the failure, so a timeout reads as a missing email rather than a generic assertion error after 60 seconds.

```typescript
import { test, expect } from "@playwright/test"
import { searchInbox } from "./helpers/inbox"

test("signup confirmation arrives", async ({ page }) => {
  await page.goto("/signup")
  await page.fill("input[name=email]", "playwright@yourapp.nylas.email")
  await page.click("button[type=submit]")

  await expect
    .poll(() => searchInbox("Confirm your account").length, {
      intervals: [3_000],
      timeout: 60_000,
      message: "no confirmation email after 60s",
    })
    .toBeGreaterThan(0)
})
```
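One thing to watch: `execFileSync` throws when the CLI exits non-zero, and an exception thrown inside the polled function may end the assertion instead of scheduling another probe. A minimal sketch, assuming you would rather treat a transient CLI failure as "nothing yet", wraps the search so it reports zero matches on error. `countMatches` is a hypothetical name, and the smoke check still matters here, because this wrapper would otherwise hide a bad credential until the timeout.

```typescript
// tests/helpers/count.ts (hypothetical helper)

// Run a search callback and return how many messages it found.
// Any thrown error (for example a non-zero CLI exit) counts as zero
// matches, so the surrounding poll simply tries again.
export function countMatches(search: () => ReadonlyArray<unknown>): number {
  try {
    return search().length
  } catch {
    return 0
  }
}
```

In the spec above, the polled function would become `() => countMatches(() => searchInbox("Confirm your account"))`, with the same `intervals`, `timeout`, and `message` options.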

## How do I verify signup confirmation and password reset links?

Verify a confirmation or reset link by reading the delivered message with `nylas email read --json`, extracting the first URL that targets your app host, and passing it to `page.goto()`. The browser then completes the flow exactly as a user would, in the same context.

Two details break naive extraction. HTML bodies escape query separators, so normalize `&amp;` back to a plain ampersand before visiting the URL. And reset tokens commonly expire in 15 to 60 minutes, so read the message immediately after the poll succeeds rather than batching link checks at the end of the suite. Assert the landing page heading, not just the URL shape: an expired or invalid token can still return a 200 response that renders an error page.

Scope the assertion to what would break the user journey. For a reset link that means the host, the path, and the presence of a token parameter; for a confirmation link it means the account actually flips to verified after the visit. Avoid snapshot-asserting the whole body—one footer copy change would fail every spec in the group for no real regression.

```typescript
const [message] = searchInbox("Reset your password")
const full = runNylas<{ body: string }>(["email", "read", message.id, "--json"])

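// Stop at quotes, whitespace, or angle brackets so single-quoted hrefs
// and <https://...> plain-text links don't drag trailing markup along
const url = full.body.match(/https:\/\/yourapp\.com\/reset[^"'\s<>]*/)?.[0]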
expect(url, "reset URL in email body").toBeTruthy()

await page.goto(url!.replace(/&amp;/g, "&"))
await expect(
  page.getByRole("heading", { name: "Choose a new password" })
).toBeVisible()
```
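The host, path, and token checks described above can also be made explicit before the browser navigates. The sketch below is one possible shape: `findAppLink` and `LinkExpectation` are hypothetical names, and the query parameter that carries the token (`token` in the usage note) is an assumption about your app.

```typescript
// tests/helpers/link.ts (hypothetical helper)

export interface LinkExpectation {
  host: string       // e.g. "yourapp.com"
  pathPrefix: string // e.g. "/reset"
  param: string      // query parameter carrying the token (assumed name)
}

// Find the first https URL in an email body that matches the expectation.
// Returns the decoded URL string, or undefined when nothing qualifies.
export function findAppLink(body: string, want: LinkExpectation): string | undefined {
  const candidates = body.match(/https:\/\/[^"'\s<>]+/g) ?? []
  for (const raw of candidates) {
    let url: URL
    try {
      // HTML bodies escape "&" as "&amp;", so decode before parsing.
      url = new URL(raw.replace(/&amp;/g, "&"))
    } catch {
      continue // not a parseable URL, skip it
    }
    const token = url.searchParams.get(want.param) ?? ""
    if (url.host === want.host && url.pathname.startsWith(want.pathPrefix) && token !== "") {
      return url.toString()
    }
  }
  return undefined
}
```

A spec could call `findAppLink(full.body, { host: "yourapp.com", pathPrefix: "/reset", param: "token" })` and pass the result to `page.goto()`. An `undefined` result then tells you whether the link was missing or malformed before any navigation happens.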

## How do I extract OTP codes and run the suite in CI?

Extract an OTP by reading the newest matching message, stripping HTML tags, and matching a six-digit pattern with word boundaries. Skipping the strip step is the regex mistake teased above: raw HTML contains hex colors, pixel widths, and timestamps that a bare digit pattern happily matches, so suites fill the form with a fragment of a CSS rule.

```typescript
const [otpMail] = searchInbox("Your verification code")
const body = runNylas<{ body: string }>(["email", "read", otpMail.id, "--json"]).body

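// Drop <style>/<script> blocks (their CSS survives plain tag stripping),
// strip the remaining tags, then require word boundaries
const code = body
  .replace(/<(style|script)\b[^>]*>[\s\S]*?<\/\1\s*>/gi, " ")
  .replace(/<[^>]+>/g, " ")
  .match(/\b\d{6}\b/)?.[0]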
expect(code, "6-digit OTP in email body").toBeTruthy()

await page.fill("input[name=otp]", code!)
await page.click("button[type=submit]")
```
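Because the extraction is plain string work, it can be pulled into a pure function and unit-tested against saved email fixtures without touching an inbox. The sketch below is one way to do that. `extractOtp` is a hypothetical name, and it assumes the code appears as a standalone run of digits in the visible text.

```typescript
// tests/helpers/otp.ts (hypothetical helper)

// Pull a standalone numeric code of `digits` length out of an HTML email body.
export function extractOtp(html: string, digits = 6): string | undefined {
  const text = html
    // Drop <style> and <script> blocks together with their contents,
    // since CSS such as "#333333" survives plain tag stripping.
    .replace(/<(style|script)\b[^>]*>[\s\S]*?<\/\1\s*>/gi, " ")
    // Replace remaining tags with spaces so adjacent text stays separated.
    .replace(/<[^>]+>/g, " ")
  // Lookarounds reject hex colors ("#123456") and longer digit runs.
  const pattern = new RegExp(`(?<![#\\w])\\d{${digits}}(?!\\w)`)
  return text.match(pattern)?.[0]
}
```

The lookbehind is slightly stricter than `\b`: a word boundary still sits between `#` and a digit, so a hex color left in the text would pass `\b\d{6}\b` but not this pattern.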

In CI the suite needs three inputs: the CLI binary, an API key, and a grant ID. The [Playwright CI setup docs](https://playwright.dev/docs/ci-intro) cover the browser side; the inbox side is the GitHub Actions fragment below. Store both secrets encrypted, set `NYLAS_DISABLE_KEYRING` because Linux runners have no keychain, and keep the smoke check so credential failures surface in 2 seconds.

```yaml
- name: Install Nylas CLI
  run: |
    curl -fsSL https://cli.nylas.com/install.sh | bash
    echo "$HOME/.config/nylas/bin" >> "$GITHUB_PATH"

- name: Smoke-check the test inbox
  env:
    NYLAS_API_KEY: ${{ secrets.NYLAS_API_KEY }}
    NYLAS_GRANT_ID: ${{ secrets.NYLAS_TEST_GRANT_ID }}
    NYLAS_DISABLE_KEYRING: "1"
  run: nylas email search "*" --json --limit 1

- name: Run Playwright tests
  env:
    NYLAS_API_KEY: ${{ secrets.NYLAS_API_KEY }}
    NYLAS_GRANT_ID: ${{ secrets.NYLAS_TEST_GRANT_ID }}
    NYLAS_DISABLE_KEYRING: "1"
  run: npx playwright test
```
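A missing secret is easier to diagnose when it fails before any subprocess runs. The sketch below checks the two variables named in the workflow above. The helper names are hypothetical, and it assumes the suite relies on those same variables wherever it runs, which may not hold on a developer machine that authenticates through a keychain.

```typescript
// tests/helpers/preflight.ts (hypothetical helper)

// Variable names taken from the GitHub Actions workflow.
const REQUIRED = ["NYLAS_API_KEY", "NYLAS_GRANT_ID"] as const

// Return the names of required variables that are unset or blank.
export function missingNylasEnv(env: Record<string, string | undefined>): string[] {
  return REQUIRED.filter((name) => (env[name] ?? "").trim() === "")
}

// Throw one readable error listing everything that is missing.
export function assertNylasEnv(env: Record<string, string | undefined>): void {
  const missing = missingNylasEnv(env)
  if (missing.length > 0) {
    throw new Error(`missing environment variables: ${missing.join(", ")}`)
  }
}
```

Calling `assertNylasEnv(process.env)` from a Playwright `globalSetup` file would stop the run with a single clear message instead of three separate poll timeouts.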

Keep the real-inbox group small and focused. One signup spec, one reset spec, and one OTP spec cover the three highest-risk handoffs. Run serially, they add at most about 3 minutes of polling (three specs at a 60-second ceiling each), and usually less, since the mail typically lands within 5 to 30 seconds. Run template unit tests on every commit, and reserve inbox polling for the flows where a missed email costs you a user.

## Next steps

- [E2E email testing with Playwright](https://cli.nylas.com/guides/e2e-email-testing) — inbox setup, fixtures, and a complete worked example
- [Cypress email testing with a real inbox](https://cli.nylas.com/guides/cypress-email-testing) — the same polling pattern behind `cy.task`
- [Extract OTP codes from email](https://cli.nylas.com/guides/extract-otp-codes-from-email) — OTP-focused flows, including `nylas otp watch`
- [Twilio vs Nylas](https://cli.nylas.com/guides/twilio-vs-nylas) — where verification APIs fit next to inbox-level testing
- [EmailEngine vs Nylas](https://cli.nylas.com/guides/emailengine-vs-nylas) — self-hosted IMAP bridges compared with a managed inbox
- [Command reference](https://cli.nylas.com/docs/commands) — every email, agent, and auth flag used in this guide
