Group Inbox by Corporate Email Domain

The part after the @ sign is the most reliable company identifier in email. It doesn't change when someone updates their display name, it doesn't get truncated like signatures, and it's present on every single message. Group your Gmail, Outlook, Exchange, Yahoo, iCloud, or IMAP inbox by domain and you instantly see which companies you communicate with most.

By Qasim Muhammad

Why domains beat every other company identifier

Display names are unreliable. “Sarah C” could be anyone. Signatures change across devices. But sarah@acme.com unambiguously identifies Acme Corp as the employer. According to the Radicati Group’s 2024 Email Statistics Report, 56% of email addresses use free providers like Gmail and Yahoo. The other 44% use corporate domains that map directly to company names. Filter out the freemail, and you have a clean corporate directory.
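
As a small illustration of this point, the domain can be pulled out of a raw From header with Python's standard library, whatever the display name says. This is only a sketch; the helper name `sender_domain` is hypothetical and not part of any tool used in this guide.

```python
from email.utils import parseaddr


def sender_domain(from_header: str) -> str:
    """Return the lowercased domain of a From header, or "" if none is found."""
    _display_name, address = parseaddr(from_header)
    if "@" not in address:
        return ""
    return address.rsplit("@", 1)[1].lower()


# Two different display names, same company identifier.
print(sender_domain("Sarah C <sarah@acme.com>"))
print(sender_domain("Sarah Connor <Sarah@ACME.com>"))
```

Both calls resolve to the same domain, which is what makes it usable as a grouping key.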

Extract unique domains from your inbox

Start with a flat list of every unique sender domain from your recent messages:

# List every unique sender domain (jq's unique already returns a sorted array)
nylas email list --json --limit 500 \
  | jq '[.[] | .from[0].email | split("@")[1] | ascii_downcase] | unique'

# Count messages per domain
nylas email list --json --limit 500 | jq '
  [.[] | .from[0].email | split("@")[1]] |
  group_by(.) | map({domain: .[0], count: length}) |
  sort_by(-.count)'

A typical business inbox with 500 messages yields 50-200 unique domains. Most of the volume comes from 10-15 domains at the top of the list.
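
For readers who prefer Python to jq, the per-domain count might be sketched as follows. It assumes the message list has already been loaded as a list of dicts in the shape the jq filters rely on (`from[0].email`); the function name `count_domains` and the sample data are hypothetical.

```python
from collections import Counter


def count_domains(messages: list[dict]) -> list[tuple[str, int]]:
    """Count messages per sender domain, most frequent first."""
    counts: Counter[str] = Counter()
    for msg in messages:
        senders = msg.get("from") or [{}]
        address = senders[0].get("email") or ""
        if "@" in address:
            counts[address.rsplit("@", 1)[1].lower()] += 1
    return counts.most_common()


# Hypothetical sample in the shape the jq filters assume: .from[0].email
sample = [
    {"from": [{"email": "sarah@acme.com"}]},
    {"from": [{"email": "Bob@Acme.com"}]},
    {"from": [{"email": "ops@example.org"}]},
    {"from": []},  # no sender: skipped
]
print(count_domains(sample))  # [('acme.com', 2), ('example.org', 1)]
```

Lowercasing the domain before counting keeps `Acme.com` and `acme.com` in one bucket.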

Filter out freemail providers

Consumer domains dominate raw counts. Define a blocklist and filter:

# Define freemail blocklist (dots escaped so each entry matches literally)
FREEMAIL='gmail\.com|yahoo\.com|outlook\.com|hotmail\.com|icloud\.com|aol\.com|protonmail\.com|live\.com|mail\.com|gmx\.de|yandex\.ru|qq\.com'

nylas email list --json --limit 500 | jq --arg bl "$FREEMAIL" '
  [.[] | {
    email: .from[0].email,
    domain: (.from[0].email | split("@")[1]),
    name: .from[0].name,
    date: .date
  }] |
  # Anchor the pattern so "mail.com" does not also match "acme-mail.com"
  [.[] | select(.domain | test("^(" + $bl + ")$"; "i") | not)] |
  group_by(.domain) |
  map({
    domain: .[0].domain,
    email_count: length,
    unique_senders: ([.[].email] | unique | length),
    senders: ([.[].name] | unique),
    last_seen: (map(.date) | sort | last),
    first_seen: (map(.date) | sort | first)
  }) |
  sort_by(-.email_count)'

The regex-based filter covers the most common providers, including a few regional ones (gmx.de, yandex.ru, qq.com). Extend the list with other regional providers, such as web.de, mail.ru, or 163.com, depending on where your correspondents are.
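
Pattern matching on domains can over-match when the pattern is not anchored: `mail.com` is also a substring of `acme-mail.com` and `mail.company.io`. An exact set lookup avoids that class of mistake. The sketch below uses a shortened blocklist and a hypothetical helper name, `is_freemail`.

```python
# Shortened blocklist for illustration; use the full list in practice.
FREEMAIL = {"gmail.com", "yahoo.com", "outlook.com", "hotmail.com", "mail.com"}


def is_freemail(domain: str) -> bool:
    """Exact, case-insensitive match against the blocklist."""
    return domain.strip().lower() in FREEMAIL


print(is_freemail("Gmail.com"))        # True
print(is_freemail("acme-mail.com"))    # False: contains "mail.com" but is not equal to it
print(is_freemail("mail.company.io"))  # False
```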

Map subsidiaries to parent companies

Large companies own many email domains. Google employees send from google.com, but YouTube employees use youtube.com and DeepMind uses deepmind.com. The 10 largest tech companies own an average of 12 email domains each. Without normalization, these show up as separate companies.

Create a domain alias CSV:

# domain-aliases.csv: canonical domain first, followed by known subsidiaries
google.com,youtube.com,waze.com,deepmind.com,fitbit.com
microsoft.com,linkedin.com,github.com,xbox.com,nuance.com
meta.com,facebook.com,instagram.com,whatsapp.com,oculus.com
salesforce.com,slack.com,tableau.com,mulesoft.com,heroku.com
amazon.com,aws.com,twitch.tv,ring.com,imdb.com

Then group by the canonical domain in a Python script:

#!/usr/bin/env python3
"""Group inbox emails by sender domain with subsidiary normalization."""

import csv
import json
import subprocess
from collections import defaultdict

FREEMAIL = {
    "gmail.com", "yahoo.com", "outlook.com", "hotmail.com",
    "icloud.com", "aol.com", "protonmail.com", "live.com",
    "mail.com", "gmx.de", "yandex.ru", "qq.com",
}

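def load_aliases(path: str = "domain-aliases.csv") -> dict[str, str]:
    """Load subsidiary-to-parent domain mapping."""
    aliases = {}
    try:
        with open(path, newline="") as f:
            for row in csv.reader(f):
                # Skip blank rows, comment rows, and rows with no subsidiaries
                if len(row) < 2 or row[0].lstrip().startswith("#"):
                    continue
                canonical = row[0].strip().lower()
                for alias in row[1:]:
                    if alias.strip():
                        aliases[alias.strip().lower()] = canonical
    except FileNotFoundError:
        pass  # No alias file: every domain maps to itself
    return aliases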

def fetch_emails(limit: int = 500) -> list[dict]:
    result = subprocess.run(
        ["nylas", "email", "list", "--json", "--limit", str(limit)],
        capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)

def group_by_domain(emails: list[dict], aliases: dict[str, str]) -> list[dict]:
    groups: dict[str, dict] = defaultdict(lambda: {
        "count": 0, "senders": set(), "names": set(), "dates": [],
    })
    for msg in emails:
        # "from" may be missing, null, or an empty list
        sender = (msg.get("from") or [{}])[0]
        addr = (sender.get("email") or "").lower()
        if "@" not in addr:
            continue
        raw_domain = addr.rsplit("@", 1)[1]
        domain = aliases.get(raw_domain, raw_domain)
        if domain in FREEMAIL:
            continue
        groups[domain]["count"] += 1  # count every message, dated or not
        groups[domain]["senders"].add(addr)
        if sender.get("name"):
            groups[domain]["names"].add(sender["name"])
        if msg.get("date"):
            groups[domain]["dates"].append(msg["date"])

    rows = []
    for domain, data in groups.items():
        dates = sorted(data["dates"])
        rows.append({
            "domain": domain,
            "email_count": data["count"],
            "unique_senders": len(data["senders"]),
            "senders": sorted(data["names"]),
            "first_seen": dates[0] if dates else "",
            "last_seen": dates[-1] if dates else "",
        })
    return sorted(rows, key=lambda r: r["email_count"], reverse=True)

aliases = load_aliases()
emails = fetch_emails()
companies = group_by_domain(emails, aliases)

# Export CSV (csv.writer handles quoting; newline="" avoids blank rows on Windows)
with open("companies.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["domain", "email_count", "unique_senders",
                     "first_seen", "last_seen"])
    for c in companies:
        writer.writerow([c["domain"], c["email_count"], c["unique_senders"],
                         c["first_seen"], c["last_seen"]])

print(f"Grouped into {len(companies)} companies")
for c in companies[:10]:
    print(f"  {c['domain']:30s}  {c['email_count']:>4d} emails  "
          f"{c['unique_senders']:>3d} people")

Export to CSV for CRM import

The jq @csv filter handles quoting and escaping automatically. Use this for quick exports:

nylas email list --json --limit 500 | jq -r '
  [.[] | {
    email: .from[0].email,
    domain: (.from[0].email | split("@")[1]),
    name: .from[0].name,
    date: .date
  }] |
  group_by(.domain) |
  map({
    domain: .[0].domain,
    email_count: length,
    unique_senders: ([.[].email] | unique | length),
    last_seen: (map(.date) | sort | last)
  }) |
  sort_by(-.email_count) |
  ["domain","email_count","unique_senders","last_seen"],
  (.[] | [.domain, .email_count, .unique_senders, .last_seen])
  | @csv' > companies.csv

# Subtract 1 so the header line is not counted as a row
echo "Exported $(( $(wc -l < companies.csv) - 1 )) rows"
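
The same automatic quoting is available in Python's standard `csv` module, which may be convenient if the grouped rows already live in a Python script. The sketch below is an illustration only: the helper name `to_csv` and the sample row are hypothetical, and the quoting style can differ from jq's while still being valid CSV.

```python
import csv
import io


def to_csv(rows: list[dict], fields: list[str]) -> str:
    """Serialize dict rows to CSV text, quoting values only where needed."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields, extrasaction="ignore",
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# A sender name containing a comma must be quoted to stay in one column.
rows = [{"domain": "acme.com", "email_count": 12, "senders": "Connor, Sarah"}]
print(to_csv(rows, ["domain", "email_count", "senders"]))
```

Writing fields by hand with string formatting would split `Connor, Sarah` into two columns; the writer wraps it in quotes instead.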

Detect corporate vs. personal domains

Beyond the freemail blocklist, you can classify domains programmatically. Corporate domains often have MX records pointing to business email providers or security gateways, and SPF records listing enterprise tools. The function below inspects MX records only:

# Check if a domain is corporate based on MX records
is_corporate() {
  local domain="$1"
  local mx=$(dig +short MX "$domain" 2>/dev/null)
  if [ -z "$mx" ]; then
    echo "no-mx"
  elif echo "$mx" | grep -qiE 'google|outlook|microsoft'; then
    echo "corporate-hosted"
  elif echo "$mx" | grep -qiE 'mimecast|proofpoint|barracuda'; then
    echo "enterprise-security"
  else
    echo "self-hosted"
  fi
}

# Check every unique sender domain from your inbox
nylas email list --json --limit 500 | jq -r '
  [.[] | .from[0].email | split("@")[1]] | unique | .[]
' | while read -r domain; do
  type=$(is_corporate "$domain")
  echo "$domain: $type"
done
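
The classification step can also be written as a pure function, which makes it testable without DNS access. The sketch below assumes the MX hostnames have already been resolved elsewhere (for example with `dig`) and reuses the same categories; the name `classify_mx` and the sample hostnames are illustrative.

```python
import re

# Substring patterns mirroring the shell function's categories.
HOSTED = re.compile(r"google|outlook|microsoft", re.IGNORECASE)
SECURITY = re.compile(r"mimecast|proofpoint|barracuda", re.IGNORECASE)


def classify_mx(mx_hosts: list[str]) -> str:
    """Classify a domain from its MX hostnames (already resolved elsewhere)."""
    if not mx_hosts:
        return "no-mx"
    joined = " ".join(mx_hosts)
    if HOSTED.search(joined):
        return "corporate-hosted"
    if SECURITY.search(joined):
        return "enterprise-security"
    return "self-hosted"


print(classify_mx(["10 aspmx.l.google.com."]))              # corporate-hosted
print(classify_mx(["10 us-smtp-inbound-1.mimecast.com."]))  # enterprise-security
print(classify_mx([]))                                      # no-mx
```

Consumer mailboxes such as gmail.com also resolve to Google-hosted MX records, so this check is best combined with the freemail blocklist rather than used in place of it.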

Analyze communication over time per company

Track how your communication with each company changes month to month. Rising email volume may indicate a deal heating up. Declining volume may signal churn:

nylas email list --json --limit 1000 | jq '
  [.[] | {
    domain: (.from[0].email | split("@")[1]),
    month: (.date | split("T")[0] | split("-")[:2] | join("-"))
  }] |
  group_by(.domain) |
  map({
    domain: .[0].domain,
    by_month: (group_by(.month) | map({month: .[0].month, count: length}))
  }) |
  [.[] | select(.by_month | length >= 2)] |
  sort_by(-(.by_month | map(.count) | add))'
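
To turn the monthly counts into a simple signal, one option is to compare the two most recent months in which a domain sent mail. The sketch below is one possible heuristic, not a definitive method. It assumes each message has already been reduced to a `domain` and an ISO 8601 `date` string (the same assumption the jq filter makes); the function name `monthly_trend` and the sample data are hypothetical.

```python
from collections import defaultdict


def monthly_trend(messages: list[dict]) -> dict[str, str]:
    """Label each domain from its last two active months."""
    counts: dict[str, dict[str, int]] = defaultdict(lambda: defaultdict(int))
    for msg in messages:
        domain = msg.get("domain")
        date = str(msg.get("date") or "")
        if domain and len(date) >= 7:
            counts[domain][date[:7]] += 1  # "YYYY-MM" prefix of an ISO 8601 date

    trends = {}
    for domain, by_month in counts.items():
        months = sorted(by_month)
        if len(months) < 2:
            continue  # not enough history to compare
        previous, latest = by_month[months[-2]], by_month[months[-1]]
        if latest > previous:
            trends[domain] = "rising"
        elif latest < previous:
            trends[domain] = "declining"
        else:
            trends[domain] = "flat"
    return trends


sample = [
    {"domain": "acme.com", "date": "2025-01-10T09:00:00Z"},
    {"domain": "acme.com", "date": "2025-02-03T09:00:00Z"},
    {"domain": "acme.com", "date": "2025-02-17T09:00:00Z"},
    {"domain": "example.org", "date": "2025-02-01T09:00:00Z"},
]
print(monthly_trend(sample))  # {'acme.com': 'rising'}
```

Months with no mail at all do not appear in the counts, so a domain that went silent is compared on its last two active months only; treat the label as a prompt to look closer, not as a verdict.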

Next steps