Building for Email Deliverability
At PDL, one of our core goals has always been to build the most useful workforce data in the world. Not just the biggest dataset or the one with the most fields, but the one that actually works in real-world workflows.
After spending the last 10 years in this space, we’ve started focusing more deeply on a problem that sits right at the center of many of those workflows: email deliverability.
On the surface, this doesn’t sound that complicated. You take a list of emails, run them through a validation tool, filter out the bad ones, and move on.
But in practice, validation and deliverability are two very different things. And that gap between something that looks good on paper and something that actually works in production is where most of the challenge lives.
In this post, we’ll walk through what email deliverability actually means (and how it differs from validation), why it’s harder than it seems, and how we’re approaching it differently.
Let’s dive in!
What Email Deliverability Actually Means
At a high level, email deliverability is about whether an email actually reaches the inbox - not just whether the address technically exists.
That outcome depends on a lot of factors working together:
- Sender reputation
- Authentication (SPF, DKIM, DMARC)
- How email providers filter and prioritize messages
- Content and engagement signals over time
When deliverability is strong, emails land where they’re supposed to and everything downstream improves: engagement, response rates, and overall conversion.
Email Validation
Email validation plays a role in deliverability, but it answers a narrower question: is this email address valid right now?
Validation typically checks:
- Syntax and formatting (catching obvious typos)
- Whether the domain exists at all and can receive mail
- Whether the mailbox appears to exist (when that’s detectable)
- Whether the address is known to be temporary, disposable, or fraudulent (a flag that some providers also offer)
Validation is often treated as something you can buy from off-the-shelf tools on the market (such as Bouncer). However, in practice, these tools still operate under meaningful constraints that can limit how actionable and reliable their results can be.
For example, many email providers (think Gmail or Outlook) intentionally obfuscate whether a mailbox exists or can receive mail. As a result, validation tools frequently return a meaningful segment of results as “unknown” or “risky”, rather than clearly valid or invalid.
This ambiguity isn’t a flaw in any one provider; instead, it’s more a reflection of how the entire email ecosystem has evolved.
All of this information is useful, as it helps reduce hard bounces and cleans up obvious issues in an email list. But ultimately this is just table stakes: on its own, pure validation leaves significant gaps between what appears valid and what actually performs in production.
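To make the checks in this section concrete, here is a minimal Python sketch of how those signals might combine into a three-way result. Everything in it is an assumption for illustration: the simplified syntax pattern, the `DISPOSABLE_DOMAINS` placeholder, and the `domain_accepts_mail` and `mailbox_exists` inputs (which a real tool would derive from DNS and SMTP probes) are hypothetical names, not any vendor's API.

```python
import re
from enum import Enum


class Status(Enum):
    VALID = "valid"
    INVALID = "invalid"
    UNKNOWN = "unknown"


# Deliberately simple pattern: one "@", no whitespace, a dot in the domain.
# The real address grammar is much broader than this.
_SYNTAX = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

# Hypothetical placeholder list, for illustration only.
DISPOSABLE_DOMAINS = {"disposable.example"}


def check_email(address, domain_accepts_mail=None, mailbox_exists=None):
    """Combine validation signals; None means the signal could not be determined."""
    address = address.strip().lower()
    if not _SYNTAX.match(address):
        return Status.INVALID
    domain = address.rsplit("@", 1)[1]
    if domain in DISPOSABLE_DOMAINS:
        return Status.INVALID
    if domain_accepts_mail is False or mailbox_exists is False:
        return Status.INVALID
    if domain_accepts_mail is None or mailbox_exists is None:
        # The provider did not reveal enough to decide either way.
        return Status.UNKNOWN
    return Status.VALID
```

When a provider hides mailbox status, the sketch returns `UNKNOWN` instead of guessing, which is where the "unknown" and "risky" segments described above come from.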
The Simple Distinction Between Email Deliverability and Email Validation
While email deliverability and email validation are closely related concepts, the easiest way to think about the difference is:
- Validation is a point-in-time technical check
- Deliverability is the real-world outcome over time
You can have an email that passes validation and still performs poorly when you actually try to send it.
This gap between what looks valid and what actually performs in production – between sending an email and truly reaching the inbox – is the one we’re solving for.
Why This is Harder Than It Looks
When it comes to ensuring email deliverability, there are a few underlying realities that make it much harder than it first appears.
1) Validation Is Point-In-Time, But Email Is Not
As we covered above, email validation tools answer a very specific question: “Is this email reachable, right now?”
However, people change jobs. Companies shut down or rebrand. Mailboxes get disabled. Email providers continuously update how they filter and accept messages.
Deliverability requires continued confidence in email quality, which means more than just repeatedly running emails through a validation tool (especially when those tools can’t always give definitive answers in the first place). Beyond confirming that an address is valid, it also means ensuring that every email belongs to the right person and actually reaches their inbox.
2) Scale Changes the Problem
When you work at the scale of hundreds of millions of emails, the edge cases start to matter a lot.
Tracking catch-all domains, managing domain rebrands and redirects, flagging greylisted domains, handling nuances with international domains, and interpreting large volumes of “unknown” or ambiguous validation results all become part of the baseline.
Addressing email deliverability across a comprehensive, global workforce dataset means ensuring you have solutions for these and the hundreds of other long-tail challenges that you could otherwise just ignore at smaller scales.
3) Tradeoffs Between Coverage and Quality
There’s also a fundamental tension when it comes to building and maintaining an email dataset:
- Include more emails → higher coverage, but more risk
- Filter more aggressively → higher quality, but less volume
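A small Python sketch can make this tension concrete. It assumes each email carries a confidence score and a known delivery outcome; the function name, the threshold values, and the `SAMPLE` records are all invented toy data for illustration, not measurements from any real dataset.

```python
def coverage_and_quality(records, threshold):
    """records: list of (confidence, delivered) pairs.

    Returns (coverage, quality): the share of emails kept at this threshold,
    and the share of kept emails that were actually delivered.
    """
    kept = [delivered for confidence, delivered in records if confidence >= threshold]
    coverage = len(kept) / len(records) if records else 0.0
    quality = sum(kept) / len(kept) if kept else 0.0
    return coverage, quality


# Toy data only: (confidence score, whether the email was delivered).
SAMPLE = [(0.95, True), (0.90, True), (0.70, True), (0.60, False), (0.40, False)]

loose = coverage_and_quality(SAMPLE, threshold=0.5)   # keeps 4 of 5 emails
strict = coverage_and_quality(SAMPLE, threshold=0.8)  # keeps 2 of 5 emails
```

On this toy data, raising the threshold from 0.5 to 0.8 lifts quality from 0.75 to 1.0 while coverage drops from 0.8 to 0.4.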
In our experience, many providers lean towards coverage because it’s easier to measure and sell. It’s also easier to inflate coverage by generating inferred emails (from pattern matching on existing emails).
However, the tradeoffs can be significant: inferred or loosely validated emails might not exist. They might not belong to the right person. And over time, they can negatively impact sender reputation.
Historically, we’ve taken a more conservative approach - avoiding inferred emails and aiming for a balance between coverage and quality.
But we haven’t been fully satisfied with where our deliverability lands today. So our goal now is to push forward on both coverage and quality - without cutting corners or inflating metrics.
Our Approach: Treating Deliverability as a System
The biggest shift in our strategy towards email deliverability is this:
Deliverability isn’t a series of new checks we’re adding; it’s a system we’re building from the ground up and integrating throughout our data pipeline.
Moving Beyond “Valid vs Invalid”
In practice, deliverability isn’t binary.
An email isn’t just good or bad; it exists on a spectrum.
We’re starting to think about emails in terms of:
- Confidence
- Recency
- Observed performance over time
We believe this gives a much more realistic and actionable view of how an email will behave inside actual customer workflows.
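As a rough illustration of what a spectrum could look like, the Python sketch below folds the three signals into a single score. The multiplicative form, the 180-day half-life, the smoothing, and every name here (`EmailSignal`, `deliverability_score`) are assumptions chosen for the example, not a description of our production scoring.

```python
from dataclasses import dataclass


@dataclass
class EmailSignal:
    confidence: float       # 0..1, e.g. from validation and source quality
    days_since_seen: float  # recency: days since the email was last observed
    sends: int              # observed performance: messages sent
    bounces: int            # observed performance: messages that bounced


def deliverability_score(signal, half_life_days=180.0):
    """Score in 0..1; higher means more likely to perform well."""
    # Recency decays by half every `half_life_days` (assumed value).
    recency = 0.5 ** (signal.days_since_seen / half_life_days)
    # Smoothed delivery rate, so an email with no send history starts at 0.5
    # instead of being treated as perfect or as failing.
    delivery_rate = (signal.sends - signal.bounces + 1) / (signal.sends + 2)
    return signal.confidence * recency * delivery_rate
```

Two emails that both pass validation can then land far apart: one seen last week with a clean send history scores well above one last seen two years ago with repeated bounces.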
Continuous Evaluation, Not One-Time Validation
Instead of validating emails once and moving on, we’re treating them as something that needs continuous evaluation.
That means:
- Rechecking emails over time
- Updating our confidence in them
- Demoting or removing those that degrade
- Refreshing the dataset as conditions change
The goal is for the data to evolve as the real world changes, rather than remaining a static snapshot from a single time point.
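One way to picture this loop is the Python sketch below, which nudges a confidence value after each recheck and then decides whether to keep, demote, or remove the email. The moving-average update, the weight, and the two cutoffs are hypothetical values picked for the example, not our actual parameters.

```python
def update_confidence(current, recheck_passed, weight=0.3):
    """Move confidence toward 1.0 on a passed recheck and toward 0.0 on a failed one."""
    target = 1.0 if recheck_passed else 0.0
    return (1 - weight) * current + weight * target


def next_action(confidence, demote_below=0.6, remove_below=0.2):
    """Translate the current confidence into a dataset action."""
    if confidence < remove_below:
        return "remove"
    if confidence < demote_below:
        return "demote"
    return "keep"


def recheck(confidence, results):
    """Apply a sequence of recheck results (True or False) and return the outcome."""
    for passed in results:
        confidence = update_confidence(confidence, passed)
    return confidence, next_action(confidence)
```

With these assumed settings, an email starting at 0.8 is demoted after one failed recheck and removed after four in a row, while a passing recheck moves it back toward "keep".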
Separating Use Cases
Not every email needs to do the same job.
Some emails are meant for outreach. Others are useful for matching, enrichment, or building a more complete profile.
So we’re being more explicit about that distinction:
- High-confidence emails for outreach use cases
- Broader sets of emails for matching and enrichment
This lets us optimize the data for how it’s actually being used, rather than forcing everything into a single standard.
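In code, that distinction might be as simple as a different minimum score per use case. The Python sketch below assumes each email already has a score between 0 and 1; the `THRESHOLDS` values and the `emails_for` helper are hypothetical, chosen only to show the idea.

```python
# Hypothetical minimum scores per use case.
THRESHOLDS = {"outreach": 0.9, "matching": 0.5}


def emails_for(use_case, scored_emails):
    """Return the emails whose score clears the bar for the given use case.

    scored_emails: list of (email, score) pairs.
    """
    minimum = THRESHOLDS[use_case]
    return [email for email, score in scored_emails if score >= minimum]


scored = [("a@example.com", 0.95), ("b@example.com", 0.70), ("c@example.com", 0.30)]
outreach = emails_for("outreach", scored)  # only the highest-confidence email
matching = emails_for("matching", scored)  # a broader set
```

The same underlying data serves both jobs; only the bar changes.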
Rethinking Inference (Carefully)
As mentioned above, we’ve historically avoided email inference to protect data quality.
But we’re revisiting that decision, this time with stricter guardrails.
At a high level, the approach looks like this:
- Generate likely emails based on known patterns
- Validate them rigorously
- Map validated emails to the correct person profile through layered checks
- Continuously improve these steps together
The easy part is generating the emails. The hard part is ensuring they actually exist and map to the correct person.
This only works if inference, validation, and association are tightly coupled and continuously improving.
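The coupling between these steps can be sketched in a few lines of Python. The name patterns are common conventions used purely as examples, and `is_valid` and `belongs_to_person` are hypothetical callables standing in for the validation and association layers; none of this reflects our actual pipeline.

```python
# Example patterns only; real domains use many more conventions.
DEFAULT_PATTERNS = ("{first}.{last}", "{f}{last}", "{first}")


def candidate_emails(first, last, domain, patterns=DEFAULT_PATTERNS):
    """Generate likely addresses for a person at a domain (the easy part)."""
    first, last, domain = first.strip().lower(), last.strip().lower(), domain.strip().lower()
    if not (first and last and domain):
        return []
    parts = {"first": first, "last": last, "f": first[0]}
    candidates = []
    for pattern in patterns:
        address = f"{pattern.format(**parts)}@{domain}"
        if address not in candidates:  # keep order, skip duplicates
            candidates.append(address)
    return candidates


def infer_emails(first, last, domain, is_valid, belongs_to_person):
    """Keep only candidates that pass both validation and association (the hard part)."""
    return [
        address
        for address in candidate_emails(first, last, domain)
        if is_valid(address) and belongs_to_person(address)
    ]
```

Generation on its own yields several plausible addresses per person; nothing is returned unless the two supplied checks both accept it.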
Building for Scale
Most providers treat deliverability as the final step in an outreach workflow.
That step still matters (and we’ll continue to recommend it). However, we also think deliverability should be built into the dataset itself.
That means:
- Applying these systems across all of our workforce datasets
- Running them continuously, not periodically
- Treating deliverability as a core property of the data, and not something bolted on at the end
Raising the Bar for What “Good Email Data” Means
We kicked off this work back in October 2025, and it’s now a core part of how we think about building, maintaining, and improving our dataset.
This isn’t a one-time improvement or a single feature release. Instead, it’s a core part of our Data Roadmap and an ongoing investment in improving our data for our customers.
Our goal is simply to build the most useful workforce datasets in the world.
There’s still more to do, and we’re excited about what’s ahead.
- If you’re already a customer, feel free to reach out to your account team to learn more about how these changes can improve your workflows.
- If you’re new to PDL, you can explore the dataset or get in touch with us to learn how we’re approaching email deliverability.