Introduction to AdTech: The Post-Cookie Frontier - Identity, Privacy, and What Comes Next

In the previous posts, we explored how an ad impression is auctioned in milliseconds and how machine learning models decide whether and how much to bid. However, every system we’ve discussed - from retargeting to lookalike modeling to budget pacing - rests on one fragile foundation:

the ability to recognize a user across time and context.

For over a decade, third-party cookies quietly solved this problem. They allowed the ecosystem to stitch together impressions, clicks, and conversions across websites with surprising effectiveness.

That era is now over.

With major browsers such as Safari and Firefox blocking or partitioning third-party cookies by default, AdTech is undergoing its most profound architectural shift in years - not just replacing an identifier, but redefining how identity, trust, and learning work on the open internet. This final post explores what replaces cookies, what becomes harder, and how the system adapts.


1. The Identity Puzzle: Deterministic vs. Probabilistic

At its core, identity answers a deceptively simple question:

“Are these two events generated by the same person?”

Historically, third-party cookies allowed platforms to answer this cheaply and at scale. Without them, identity must be reconstructed more deliberately.

Deterministic Matching (High Precision, Limited Reach)

Deterministic identity relies on explicit user authentication - a login event that directly links an identifier (email or phone) to a browser or device. If you log into Facebook on both your phone and laptop, Meta doesn’t infer that those devices belong to you; it knows.

Strengths

  • Extremely accurate
  • Stable across devices
  • Ideal for measurement and attribution

Limitations

  • Only works where users log in
  • Concentrated among large platforms

Probabilistic Matching (High Reach, Uncertain Truth)

When deterministic signals are absent, systems fall back to probabilistic identity.

Models correlate:

  • IP addresses
  • Device characteristics
  • Temporal usage patterns
  • Behavioral similarity

and estimate a likelihood (e.g., 92%) that two identifiers represent the same user or household.

This approach scales well - but introduces uncertainty. In a privacy-first world, confidence replaces certainty.
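To make the idea of a match likelihood concrete, here is a minimal sketch that combines a few yes/no signals into a probability with a logistic function. The signal names, weights, and bias are invented for illustration; a production system would learn them from labeled pairs and use far richer features.

```python
import math

# Hypothetical weights for illustration only; a real system would learn
# these from labeled pairs (for example, pairs confirmed by a shared login).
WEIGHTS = {
    "same_ip": 2.0,
    "same_device_model": 1.0,
    "overlapping_hours": 1.5,
    "similar_behavior": 1.2,
}
BIAS = -3.0  # prior: most random pairs of identifiers are NOT the same user


def match_probability(signals: dict) -> float:
    """Return a logistic score in (0, 1) that two identifiers belong together."""
    z = BIAS + sum(w for name, w in WEIGHTS.items() if signals.get(name, False))
    return 1.0 / (1.0 + math.exp(-z))


strong = match_probability({
    "same_ip": True,
    "same_device_model": True,
    "overlapping_hours": True,
    "similar_behavior": True,
})
weak = match_probability({"same_device_model": True})
print(f"strong evidence: {strong:.2f}, weak evidence: {weak:.2f}")
```

The output is a confidence, not a fact: downstream systems still have to pick a threshold and accept some false merges and some missed matches.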


2. The New Currency: Hashed Emails and UID 2.0

With third-party cookies gone, the industry shifted toward something more explicit: user-provided identifiers - signals that users knowingly share as part of a direct relationship with a publisher or platform.

Why Hashed Emails Work

A hashed email (e.g., using SHA-256) allows two independent systems to derive the same pseudonymous identifier from the same normalized address without ever exchanging the raw email.

Key properties:

  • The hash is deterministic (same input → same output)
  • The transformation is one-way (the hash cannot be directly inverted, although common addresses can still be guessed and hashed for comparison)
  • No cleartext email moves through the bidstream

This makes hashed emails a portable, privacy-aware bridge across publishers: if the same user logs into multiple sites with the same email, those systems can recognize consistency without revealing identity.
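A minimal sketch of that matching step, using Python's standard hashlib. The normalization shown (trim and lowercase) is an assumption for the example; real identity frameworks publish their own normalization rules, and both sides must apply exactly the same ones or the hashes will not match.

```python
import hashlib


def hash_email(raw_email: str) -> str:
    """Normalize an email and return its SHA-256 digest as lowercase hex."""
    # Assumed normalization: trim whitespace and lowercase the address.
    normalized = raw_email.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


# Two publishers hash the same address independently, with different casing.
publisher_a = hash_email("Jane.Doe@example.com ")
publisher_b = hash_email("jane.doe@example.com")
print(publisher_a == publisher_b)  # True: same input after normalization
```

Because the function is deterministic, anyone holding a list of candidate emails can hash them and compare, which is one reason a raw hash is better described as pseudonymous than anonymous.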

However, raw hashed emails still have limitations:

  • They are static (the identifier never changes)
  • They lack built-in expiry
  • They offer no standardized way to enforce user consent or revocation

This is where UID 2.0 comes in.

Unified ID 2.0 (UID2)

UID2 formalizes the hashed-email idea into a governed, privacy-aware identity framework rather than a simple identifier.

At a high level, UID2 separates identity generation from identity usage.

How UID2 Works (Conceptually)

  1. A user logs into a participating publisher using an email address.
  2. With explicit consent, that email is transformed into a UID2 token.
  3. Instead of sharing a static hash, the system issues:
    • An encrypted token
    • With a limited lifetime
    • That can be rotated or revoked

This token - not the email or its hash - is what enters the advertising workflow.

Why this Matters

UID2 improves on simple hashed emails in several important ways:

  • Encrypted tokens, not static hashes: Even if intercepted, UID2 tokens cannot be reused indefinitely or reverse-engineered.

  • Rotation and expiry: Tokens automatically expire and refresh, reducing long-term cross-site tracking.

  • Explicit consent: Identity exists only if the user has opted in through a logged-in experience.

  • Revocation support: If a user withdraws consent, their UID2 can be invalidated across the ecosystem.

In other words, UID2 introduces lifecycle management to identity - something cookies and raw hashes never had.
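The lifecycle idea (issue, expire, rotate, revoke) can be sketched with a toy in-memory service. To be clear, this is not the UID2 API or protocol: the class and method names are hypothetical, and it uses random opaque tokens with a lookup table instead of encryption, purely to show how an identifier with a lifetime behaves differently from a static hash.

```python
import secrets
import time
from typing import Dict, Optional, Set, Tuple


class ToyTokenService:
    """Hypothetical in-memory token issuer; not the UID2 API."""

    def __init__(self, ttl_seconds: float):
        self.ttl_seconds = ttl_seconds
        self._tokens: Dict[str, Tuple[str, float]] = {}  # token -> (user_key, expiry)
        self._opted_out: Set[str] = set()

    def issue(self, user_key: str, now: Optional[float] = None) -> Optional[str]:
        """Return a fresh opaque token, or None if the user has opted out."""
        if user_key in self._opted_out:
            return None
        now = time.time() if now is None else now
        token = secrets.token_urlsafe(16)
        self._tokens[token] = (user_key, now + self.ttl_seconds)
        return token

    def resolve(self, token: str, now: Optional[float] = None) -> Optional[str]:
        """Map a token back to its user key if it is still valid."""
        now = time.time() if now is None else now
        entry = self._tokens.get(token)
        if entry is None:
            return None
        user_key, expiry = entry
        if now >= expiry or user_key in self._opted_out:
            return None
        return user_key

    def revoke(self, user_key: str) -> None:
        """Record an opt-out; existing tokens for this user stop resolving."""
        self._opted_out.add(user_key)


service = ToyTokenService(ttl_seconds=3600)
first = service.issue("hashed-email-abc", now=0)
second = service.issue("hashed-email-abc", now=10)  # rotation: a different token
print(first != second)                    # True
print(service.resolve(first, now=100))    # hashed-email-abc
print(service.resolve(first, now=4000))   # None (expired)
service.revoke("hashed-email-abc")
print(service.resolve(second, now=100))   # None (revoked)
```

The point of the sketch is the contrast with a raw hash: the value that circulates changes over time, stops working on its own, and can be switched off when consent is withdrawn.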

How UID2 Fits into the Auction Flow

From the perspective of the auction:

  • The UID2 token is simply another signal attached to the bid request.
  • DSPs that are authorized to decrypt or interpret UID2 can use it for:
    • Frequency capping
    • Measurement
    • Audience targeting
  • DSPs without access still operate contextually.
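As a rough sketch of that branching on the DSP side: the function below uses identity for frequency capping when a token is present and readable, and falls back to contextual bidding otherwise. The `identity_token` field name, the cap value, and the resolver are all hypothetical stand-ins, not part of any real bid request schema.

```python
from typing import Callable, Dict, Optional

FREQUENCY_CAP = 2  # assumed cap for the example
bids_per_user: Dict[str, int] = {}


def decide(bid_request: dict,
           resolve: Optional[Callable[[str], Optional[str]]]) -> str:
    """Pick a bidding mode; `identity_token` is a hypothetical field name."""
    token = bid_request.get("identity_token")
    user_key = resolve(token) if (token and resolve) else None
    if user_key is None:
        return "bid_contextual"  # no usable identity: rely on page context
    if bids_per_user.get(user_key, 0) >= FREQUENCY_CAP:
        return "skip_frequency_capped"
    # Simplification: counts bids rather than won impressions.
    bids_per_user[user_key] = bids_per_user.get(user_key, 0) + 1
    return "bid_with_identity"


# Stand-in for an authorized DSP's token decryption step.
known_tokens = {"tok-123": "user-A"}
authorized = known_tokens.get

request = {"page": "https://example.com/article", "identity_token": "tok-123"}
print(decide(request, authorized))  # bid_with_identity
print(decide(request, authorized))  # bid_with_identity
print(decide(request, authorized))  # skip_frequency_capped
print(decide(request, None))        # bid_contextual (DSP without access)
print(decide({"page": "https://example.com/other"}, authorized))  # bid_contextual
```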

Crucially, UID2 does not guarantee universal reach. It only works where:

  • Users are logged in
  • Consent is granted
  • Publishers participate

That constraint is intentional.

A Shift in Philosophy

UID2 doesn’t attempt to recreate third-party cookies one-for-one. Instead, it reflects a deeper shift in how identity is treated:

  • From implicit tracking to explicit permission
  • From permanent identifiers to time-bound tokens
  • From assumed access to governed participation

Identity becomes less about silent observation and more about a contract between the user, the publisher, and the ecosystem.


3. The Identity Graph: From Identifiers to Relationships

Real users don’t behave cleanly.

They:

  • Use multiple emails
  • Switch devices
  • Clear cookies
  • Move between logged-in and anonymous states

To reconcile this, platforms build identity graphs - large, pseudonymized structures that connect identifiers into probabilistic clusters.

An identity graph may link:

  • Hashed emails
  • Phone numbers
  • Mobile advertising IDs (MAIDs)
  • Device-level signals

Importantly, identity graphs don’t just store IDs - they store relationships and confidence scores.

This allows systems to say:

“These three identifiers likely belong to the same individual, with 97% confidence.”

In the post-cookie world, identity becomes a graph problem, not a lookup table.
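Here is a small sketch of the graph view, assuming edges that carry a confidence score and a simple rule: identifiers joined by edges at or above a threshold fall into the same cluster. The identifiers and scores are made up, and the union-find clustering is just one straightforward way to do it, not a description of any particular vendor's graph.

```python
from collections import defaultdict

# Hypothetical edges: (identifier_a, identifier_b, confidence that they co-refer)
EDGES = [
    ("email:ab12", "maid:77f0", 0.97),
    ("maid:77f0", "device:laptop-9", 0.91),
    ("email:cd34", "device:laptop-9", 0.40),  # weak link, e.g. a shared device
    ("phone:5551", "email:cd34", 0.95),
]


def cluster(edges, threshold):
    """Group identifiers connected by edges at or above the threshold."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b, confidence in edges:
        root_a = find(a)  # registers the node even if the edge is too weak
        root_b = find(b)
        if confidence >= threshold:
            parent[root_a] = root_b

    groups = defaultdict(set)
    for node in list(parent):
        groups[find(node)].add(node)
    return sorted(sorted(group) for group in groups.values())


for members in cluster(EDGES, threshold=0.9):
    print(members)
```

With a threshold of 0.9 the weak 0.40 edge is ignored and two clusters emerge; lowering the threshold to 0.3 merges everything into one. Note that this rule treats links as fully transitive, so a chain of individually strong edges can connect identifiers that were never directly compared, which is one reason real graphs track confidence at the cluster level too.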


4. What Breaks Without Cookies (and How Systems Adapt)

Removing cookies doesn’t just remove an ID - it weakens several ML workflows:

  • Attribution becomes noisier
  • Frequency capping becomes harder
  • Lookalike models lose training signal
  • Long-term user learning degrades

To compensate, systems increasingly rely on:

  • Stronger first-party data
  • Shorter feedback loops
  • Cohort-level learning
  • Contextual signals

In practice, this shifts optimization from individual users toward populations and intent clusters.
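One way to picture cohort-level learning is to estimate a conversion rate per cohort instead of per user, with smoothing so that small cohorts do not produce extreme values. The sketch below assumes cohorts defined by context category and device type; the events, prior rate, and prior strength are invented for the example.

```python
from collections import defaultdict

# Hypothetical event log: (context_category, device_type, converted)
EVENTS = [
    ("finance", "mobile", True),
    ("finance", "mobile", False),
    ("finance", "mobile", False),
    ("finance", "desktop", True),
    ("sports", "mobile", False),
]

PRIOR_RATE = 0.1     # assumed global conversion rate
PRIOR_STRENGTH = 10  # pseudo-observations backing the prior


def cohort_rates(events):
    """Return a smoothed conversion rate per (category, device) cohort."""
    counts = defaultdict(lambda: [0, 0])  # cohort -> [conversions, impressions]
    for category, device, converted in events:
        cohort = (category, device)
        counts[cohort][0] += int(converted)
        counts[cohort][1] += 1
    return {
        cohort: (conversions + PRIOR_RATE * PRIOR_STRENGTH) / (n + PRIOR_STRENGTH)
        for cohort, (conversions, n) in counts.items()
    }


for cohort, rate in sorted(cohort_rates(EVENTS).items()):
    print(cohort, round(rate, 3))
```

No user identifier appears anywhere in the input: the model learns about populations, which is exactly the trade the post-cookie stack is making.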


5. The Future Beyond Identity: Context, Semantics, and LLMs

As third-party identity weakens, the industry’s focus is shifting from who the user is to what the user is doing right now.

This marks a deeper transition - not just identity replacement, but identity minimization. Instead of reconstructing individuals across the web, systems increasingly extract meaning from the moment itself.

From User-Centric to Moment-Centric Targeting

In an identity-rich world, targeting revolved around persistent user profiles. In a post-cookie world, those profiles are incomplete, probabilistic, or absent altogether.

The alternative is contextual understanding.

Rather than asking:

“Who is this user?”

Modern systems ask:

“What is the intent, tone, and relevance of this page or interaction?”

This is where semantic understanding becomes critical.

Semantic Contextual Intelligence

Earlier contextual targeting relied heavily on keywords - often crude and misleading.

Large Language Models enable a fundamentally richer approach:

  • Page-level semantic comprehension, not just keyword matching
  • Intent and sentiment detection, distinguishing concern from optimism
  • Disambiguation of similar topics with different meanings

For example, an article discussing market volatility and one focused on financial resilience may share vocabulary, but represent very different user mindsets. LLM-powered semantic models can distinguish between them without knowing anything about the individual reader.

In this sense, context becomes a proxy for intent - without requiring identity.

Where LLMs Actually Fit in the Stack

LLMs do not replace the real-time ML systems described earlier. They are too slow, too expensive, and too opaque for millisecond bidding decisions.

Instead, they operate around and above the core decision loop.

LLMs are increasingly used to:

  • Enrich contextual signals before auctions occur
  • Generate semantic embeddings for pages, creatives, and queries
  • Assist humans in configuring and interpreting campaigns

These enriched signals then feed into the same lightweight, fast models (GBMs, linear models) that execute under strict latency constraints.
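The split between slow enrichment and fast decisions might look roughly like this. The embeddings below are hand-written toy vectors standing in for what an LLM-based model would compute offline, and the URLs are placeholders; the bid-time path is only a dictionary lookup and a cosine similarity, which then becomes one more feature for the fast model.

```python
import math

# Offline: a slow model (an LLM in practice) would produce these embeddings.
# The vectors and URLs here are made up for illustration.
PAGE_EMBEDDINGS = {
    "https://example.com/market-volatility": [0.9, 0.1, 0.2],
    "https://example.com/financial-resilience": [0.2, 0.9, 0.3],
}
CREATIVE_EMBEDDING = [0.1, 0.95, 0.25]  # e.g. a savings-plan ad


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def context_score(page_url):
    """Online path: dictionary lookup plus arithmetic, no LLM call."""
    embedding = PAGE_EMBEDDINGS.get(page_url)
    if embedding is None:
        return 0.0  # unseen page: no semantic signal available
    return cosine(embedding, CREATIVE_EMBEDDING)


for url in PAGE_EMBEDDINGS:
    print(url, round(context_score(url), 3))
```

In this toy setup the savings-plan creative scores much higher against the resilience page than the volatility page, even though nothing about the reader is known.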

Assistive Intelligence, not Bidding Intelligence

LLMs also reduce friction on the human side of the system:

  • Translating natural-language campaign goals into structured targeting rules
  • Summarizing performance trends and anomalies
  • Helping creative teams generate variants aligned with contextual themes

Rather than making bidding decisions themselves, LLMs help align human intent, system configuration, and machine optimization.

A Coherent Shift, not a Collection of Tools

Taken together, these changes reflect a consistent direction:

  • Less reliance on persistent identity
  • More emphasis on meaning, context, and intent
  • Clearer separation between fast decision models and slow reasoning systems

LLMs matter in AdTech not because they replace existing ML - but because they make context rich enough to compensate for the loss of identity.


Series Conclusion: The Shape of the New Stack

Across this series, we followed an ad impression end to end:

  • from browser execution,
  • through real-time auctions,
  • into ML-driven bidding,
  • across identity stitching,
  • and finally into a privacy-first future.

What emerges is not a system in decline, but one being reshaped with intent.

The post-cookie world doesn’t eliminate personalization - it forces discipline.

Identity becomes:

  • Explicit instead of implicit
  • Probabilistic instead of absolute
  • Governed instead of assumed

AdTech remains one of the most demanding engineering domains: massive scale, extreme latency constraints, adversarial incentives, and constant regulatory pressure. Yet that challenge is exactly what makes it fascinating. The problems are harder, the constraints tighter, and the solutions - when they work - far more elegant.

What changes is not the need for intelligence, but how carefully that intelligence must be earned: through better modeling, clearer consent, richer context, and more thoughtful system design.

If you’ve read through all five parts, thank you. This ecosystem is complex, often opaque, and easy to oversimplify, and I hope this series helped demystify how the pieces actually fit together and was worth your time.

The stack will continue to evolve, but the core questions - trust, value, and decision-making under uncertainty - aren’t going away. If anything, the next phase of AdTech’s evolution will be even more interesting than the last.

This post is licensed under CC BY 4.0 by the author.