Source: https://cli.nylas.com/guides/organize-emails-by-company

Guide

# Group Inbox by Corporate Email Domain

The part after the @ sign is the most reliable company identifier in email. It doesn't change when someone updates their display name, it doesn't get truncated like signatures, and it's present on every single message. Group your inbox by domain and you instantly see which companies you communicate with most.

Written by [Qasim Muhammad](https://cli.nylas.com/authors/qasim-muhammad) • Staff SRE

Reviewed by [Hazik](https://cli.nylas.com/authors/hazik)

Updated April 11, 2026

Verified · CLI 3.1.1 · Gmail, Outlook · last tested April 11, 2026

> **TL;DR:** Pipe `nylas email list --json` through jq to extract sender domains. Filter out freemail (Gmail, Yahoo, Outlook), normalize subsidiary domains to parent companies, and export to CSV.

## Why domains beat every other company identifier

Display names are unreliable. “Sarah C” could be anyone. Signatures change across devices. But `sarah@acme.com` unambiguously identifies Acme Corp as the employer. According to the Radicati Group’s 2024 Email Statistics Report, 56% of email addresses use free providers like Gmail and Yahoo. The other 44% use corporate domains that map directly to company names. Filter out the freemail, and you have a clean corporate directory.
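As a minimal sketch of that filtering idea (the addresses and the truncated freemail set here are illustrative):

```python
# Sketch: split at the last "@", lowercase the domain, and keep only
# addresses whose domain is not a known freemail provider.
FREEMAIL = {"gmail.com", "yahoo.com", "outlook.com", "hotmail.com"}

def corporate_domain(address):
    """Return the lowercased domain if it looks corporate, else None."""
    if "@" not in address:
        return None
    domain = address.rsplit("@", 1)[1].lower()
    return None if domain in FREEMAIL else domain

senders = ["sarah@acme.com", "bob@GMAIL.com", "li@widgets.io"]
print(sorted({d for d in map(corporate_domain, senders) if d}))
# ['acme.com', 'widgets.io']
```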

## Extract unique domains from your inbox

Start with a flat list of every unique sender domain from your recent messages:

File: `extract-domains.sh`

```bash
# List every unique sender domain
nylas email list --json --limit 500 \
  | jq '[.[] | .from[0].email | split("@")[1]] | unique'

# Count messages per domain
nylas email list --json --limit 500 | jq '
  [.[] | .from[0].email | split("@")[1]] |
  group_by(.) | map({domain: .[0], count: length}) |
  sort_by(-.count)'
```

A typical business inbox with 500 messages yields 50-200 unique domains. Most of the volume comes from 10-15 domains at the top of the list.

## Filter out freemail providers

Consumer domains dominate raw counts. Define a blocklist and filter:

File: `filter-freemail.sh`

```bash
# Define freemail blocklist (covers 95%+ of consumer email)
FREEMAIL="gmail.com|yahoo.com|outlook.com|hotmail.com|icloud.com|aol.com|protonmail.com|live.com|mail.com|gmx.de|yandex.ru|qq.com"

nylas email list --json --limit 500 | jq --arg bl "$FREEMAIL" '
  [.[] | {
    email: .from[0].email,
    domain: (.from[0].email | split("@")[1]),
    name: .from[0].name,
    date: .date
  }] |
  [.[] | select(.domain | test("^(" + $bl + ")$") | not)] |
  group_by(.domain) |
  map({
    domain: .[0].domain,
    email_count: length,
    unique_senders: ([.[].email] | unique | length),
    senders: ([.[].name] | unique),
    last_seen: (map(.date) | sort | last),
    first_seen: (map(.date) | sort | first)
  }) |
  sort_by(-.email_count)'
```

The pattern is anchored with `^` and `$` so that a domain like `notgmail.com` isn't filtered by accident. The blocklist already includes the major regional providers (`gmx.de`, `yandex.ru`, `qq.com`); extend it with others to match your audience geography.
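If you post-process in Python rather than jq, an exact-match set sidesteps regex escaping entirely. A hedged sketch (the parent-suffix handling for subdomains like `mail.yahoo.com` is an assumption; drop it if you only see bare domains):

```python
FREEMAIL = {"gmail.com", "yahoo.com", "outlook.com", "hotmail.com",
            "icloud.com", "aol.com", "protonmail.com", "live.com"}

def is_freemail(domain):
    """True if the domain, or any parent suffix of it, is a known
    freemail provider (e.g. "mail.yahoo.com" matches "yahoo.com")."""
    parts = domain.lower().split(".")
    # Check "mail.yahoo.com", then "yahoo.com", then "com".
    return any(".".join(parts[i:]) in FREEMAIL for i in range(len(parts)))

print(is_freemail("GMAIL.com"))       # True
print(is_freemail("mail.yahoo.com"))  # True (parent-suffix match)
print(is_freemail("notgmail.com"))    # False
```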

## Map subsidiaries to parent companies

Large companies own many email domains. Google employees send from `google.com`, but YouTube employees use `youtube.com` and DeepMind uses `deepmind.com`. The 10 largest tech companies own an average of 12 email domains each. Without normalization, these show up as separate companies.

Create a domain alias CSV:

File: `domain-aliases.csv`

```text
# domain-aliases.csv — canonical domain, followed by known subsidiaries
google.com,youtube.com,waze.com,deepmind.com,fitbit.com
microsoft.com,linkedin.com,github.com,xbox.com,nuance.com
meta.com,facebook.com,instagram.com,whatsapp.com,oculus.com
salesforce.com,slack.com,tableau.com,mulesoft.com,heroku.com
amazon.com,aws.com,twitch.tv,ring.com,imdb.com
```

File: `group_by_company.py`

```python
#!/usr/bin/env python3
"""Group inbox emails by sender domain with subsidiary normalization."""

import csv
import json
import subprocess
from collections import defaultdict

FREEMAIL = {
    "gmail.com", "yahoo.com", "outlook.com", "hotmail.com",
    "icloud.com", "aol.com", "protonmail.com", "live.com",
    "mail.com", "gmx.de", "yandex.ru", "qq.com",
}

def load_aliases(path: str = "domain-aliases.csv") -> dict[str, str]:
    """Load subsidiary-to-parent domain mapping."""
    aliases = {}
    try:
        with open(path) as f:
            for row in csv.reader(f):
                if len(row) < 2:
                    continue
                canonical = row[0].strip()
                for alias in row[1:]:
                    aliases[alias.strip()] = canonical
    except FileNotFoundError:
        pass
    return aliases

def fetch_emails(limit: int = 500) -> list[dict]:
    result = subprocess.run(
        ["nylas", "email", "list", "--json", "--limit", str(limit)],
        capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)

def group_by_domain(emails: list[dict], aliases: dict[str, str]) -> list[dict]:
    groups: dict[str, dict] = defaultdict(lambda: {
        "senders": set(), "names": set(), "dates": [],
    })
    for msg in emails:
        sender = (msg.get("from") or [{}])[0]  # guard against a missing or empty "from"
        addr = sender.get("email", "")
        if not addr or "@" not in addr:
            continue
        raw_domain = addr.split("@")[1].lower()
        domain = aliases.get(raw_domain, raw_domain)
        if domain in FREEMAIL:
            continue
        groups[domain]["senders"].add(addr)
        if sender.get("name"):
            groups[domain]["names"].add(sender["name"])
        if msg.get("date"):
            groups[domain]["dates"].append(msg["date"])

    rows = []
    for domain, data in groups.items():
        dates = sorted(data["dates"])
        rows.append({
            "domain": domain,
            "email_count": len(data["dates"]),
            "unique_senders": len(data["senders"]),
            "senders": sorted(data["names"]),
            "first_seen": dates[0] if dates else "",
            "last_seen": dates[-1] if dates else "",
        })
    return sorted(rows, key=lambda r: r["email_count"], reverse=True)

if __name__ == "__main__":
    aliases = load_aliases()
    emails = fetch_emails()
    companies = group_by_domain(emails, aliases)

    # Export CSV
    with open("companies.csv", "w") as f:
        f.write("domain,email_count,unique_senders,first_seen,last_seen\n")
        for c in companies:
            f.write(f"{c['domain']},{c['email_count']},{c['unique_senders']},"
                    f"{c['first_seen']},{c['last_seen']}\n")

    print(f"Grouped into {len(companies)} companies")
    for c in companies[:10]:
        print(f"  {c['domain']:30s}  {c['email_count']:>4d} emails  "
              f"{c['unique_senders']:>3d} people")
```

## Export to CSV for CRM import

The jq `@csv` filter handles quoting and escaping automatically. Use this for quick exports:

File: `export-csv.sh`

```bash
nylas email list --json --limit 500 | jq -r '
  [.[] | {
    email: .from[0].email,
    domain: (.from[0].email | split("@")[1]),
    name: .from[0].name,
    date: .date
  }] |
  group_by(.domain) |
  map({
    domain: .[0].domain,
    email_count: length,
    unique_senders: ([.[].email] | unique | length),
    last_seen: (map(.date) | sort | last)
  }) |
  sort_by(-.email_count) |
  ["domain","email_count","unique_senders","last_seen"],
  (.[] | [.domain, .email_count, .unique_senders, .last_seen])
  | @csv' > companies.csv

echo "Exported $(( $(wc -l < companies.csv) - 1 )) companies"
```
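If you also want to export the sender-name list from the Python script, names can contain commas, so hand-rolled string formatting breaks. The stdlib `csv` module quotes such fields automatically; a sketch with hypothetical rows in the shape `group_by_company.py` produces:

```python
import csv

# Hypothetical rows in the shape produced by group_by_company.py.
companies = [
    {"domain": "acme.com", "email_count": 42, "unique_senders": 2,
     "senders": ["Chen, Sarah", "Raj Patel"]},
]

with open("companies.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["domain", "email_count", "unique_senders", "senders"])
    for c in companies:
        # csv.writer quotes the joined field because "Chen, Sarah" contains a comma
        writer.writerow([c["domain"], c["email_count"],
                         c["unique_senders"], "; ".join(c["senders"])])
```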

## Detect corporate vs. personal domains

Beyond the freemail blocklist, you can detect corporate domains programmatically. Corporate domains have MX records pointing to business email providers and SPF records listing enterprise tools:

File: `detect-corporate.sh`

```bash
# Check if a domain is corporate based on MX records
is_corporate() {
  local domain="$1"
  local mx
  mx=$(dig +short MX "$domain" 2>/dev/null)
  if [ -z "$mx" ]; then
    echo "no-mx"
  elif echo "$mx" | grep -Eqi 'google|outlook|microsoft'; then
    echo "corporate-hosted"
  elif echo "$mx" | grep -Eqi 'mimecast|proofpoint|barracuda'; then
    echo "enterprise-security"
  else
    echo "self-hosted"
  fi
}

# Check top domains from your inbox
nylas email list --json --limit 500 | jq -r '
  [.[] | .from[0].email | split("@")[1]] | unique | .[]
' | while read -r domain; do
  type=$(is_corporate "$domain")
  echo "$domain: $type"
done
```
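The same heuristic is easy to express as a pure function if you prefer to classify inside the Python script. The provider keywords below mirror the shell version and are illustrative; fetching the MX text is left to `dig` or a DNS resolver library:

```python
def classify_mx(mx_records):
    """Classify a domain from its raw MX record text.

    Keyword lists are illustrative, not exhaustive.
    """
    mx = mx_records.lower()
    if not mx.strip():
        return "no-mx"
    if any(p in mx for p in ("google", "outlook", "microsoft")):
        return "corporate-hosted"
    if any(p in mx for p in ("mimecast", "proofpoint", "barracuda")):
        return "enterprise-security"
    return "self-hosted"

print(classify_mx("10 aspmx.l.google.com."))    # corporate-hosted
print(classify_mx("10 mx1.acme-mail.example.")) # self-hosted
print(classify_mx(""))                          # no-mx
```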

## Analyze communication over time per company

Track how your communication with each company changes month to month. Rising email volume may indicate a deal heating up. Declining volume may signal churn:

File: `time-analysis.sh`

```bash
nylas email list --json --limit 1000 | jq '
  [.[] | {
    domain: (.from[0].email | split("@")[1]),
    month: (.date | split("T")[0] | split("-")[:2] | join("-"))
  }] |
  group_by(.domain) |
  map({
    domain: .[0].domain,
    by_month: (group_by(.month) | map({month: .[0].month, count: length}))
  }) |
  [.[] | select(.by_month | length >= 2)] |
  sort_by(-(.by_month | map(.count) | add))'
```
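To flag rising or declining accounts programmatically, compare the last two months of each company's `by_month` series. A hedged sketch; the 1.5x/0.5x thresholds are arbitrary and worth tuning:

```python
def trend(by_month):
    """Label a company's direction from per-month counts, e.g.
    [{"month": "2026-02", "count": 4}, {"month": "2026-03", "count": 9}]."""
    counts = [m["count"] for m in sorted(by_month, key=lambda m: m["month"])]
    if len(counts) < 2:
        return "insufficient-data"
    prev, last = counts[-2], counts[-1]
    if last >= prev * 1.5:      # arbitrary threshold: 50% jump
        return "rising"
    if last <= prev * 0.5:      # arbitrary threshold: 50% drop
        return "declining"
    return "steady"

print(trend([{"month": "2026-02", "count": 4},
             {"month": "2026-03", "count": 9}]))  # rising
```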

## Next steps

- [Parse email signatures for enrichment](https://cli.nylas.com/guides/enrich-contacts-from-email) — extract job titles and phone numbers for each contact at these companies
- [Visualize communication patterns](https://cli.nylas.com/guides/map-organization-contacts) — score relationship strength and detect single-threaded risk
- [Reconstruct org charts](https://cli.nylas.com/guides/contact-hierarchy-from-email) — infer reporting lines within the companies you’ve identified
- [Full command reference](https://cli.nylas.com/docs/commands) — every flag and subcommand documented
