Department of Product

Department of Product

Why Semantic Data Layers matter to product teams

🧠 Practical ways you can use it for speeding up discovery, making analytics self-serve, building better in-product search capabilities and more.

Rich Holmes
Jun 17, 2026
∙ Paid

🔒 The Knowledge Series breaks down emerging AI technologies with practical playbooks designed specifically for product teams. Get 100+ guides and practical tutorials covering everything from Claude Code and MCP to agentic workflows, vibe coding, and more.


Anthropic’s data team recently revealed that they built a self-service reporting system where 95% of its data requests are now automated with Claude.

The teams who worked on it say the biggest challenge when building this system was this:

In other words, the hard part wasn’t writing the SQL - it was figuring out exactly what data the user actually meant when they asked a question. And semantic data layers play a huge role in this.

A semantic data layer lets you formalize core product concepts (user, workspace, subscription, session, feature, experiment, etc.) and their relationships as a knowledge graph or semantic model. This model then sits as a “semantic layer” between raw data and tools, so everyone queries with the same business terms and metrics.

AI and AI agent tools with natural language searches have brought semantic data back to the top of the agenda and according to recent reports1, enterprise companies like Microsoft, Databricks, SAP and other AI software providers are all fighting over control and access to this semantic layer as one of the next major battlegrounds in the AI race.

In this Knowledge Series, we’ll explore what semantic data is, why it matters to product teams and how exactly you can put it to use for practical use cases like:

  • Standardising how your company talks about important definitions

  • Speeding up product discovery and concept exploration

  • Building better in-product search features

  • Making data analytics truly self-serve so that you don’t have to ask your data teams for custom reports (using Anthropic’s set up as inspiration)

Plus, we’ll look at a real world case study and experiment with using tools like Claude Code to create visual representations of your data that you can use to present your data model back to non-technical users.


The Knowledge Series

What exactly is a Semantic Data Layer?

A semantic data layer is a translation layer that sits between your raw data warehouse and whoever (or whatever) is querying it. It takes raw table columns and turns them into named, governed business concepts.

Without one, “monthly active users” might mean five different things depending on which analyst wrote the SQL. With one, “monthly active users” is a single defined metric that always returns the same number no matter where you query it from. Anyone who has been tasked with producing a standardized report at any company can attest to just how difficult this is in companies with no pre-existing definitions set up.

A semantic data layer could store things like:

  • “Activation” - the specific milestone that marks a user as activated, per your product’s definition

  • “Feature adoption” - which specific events count, whether a single use counts or a threshold

  • “Customer PII” - identified semantically, so that everyone has a shared understanding of personally identifiable information that can be redacted for privacy reasons

  • “New user” - first 7 days, first 30, or first session only

(You can read more about SQL in previous Knowledge Series editions if you’re interested in that).

Why it matters now specifically

Semantic layers are having a bit of a renaissance in 2026 because precise definitions of data are essential for AI Agents and LLMs. Once definitions are set up, it allows non-technical parts of a business to self-serve analytics reports on demand through conversational interfaces - and it lets AI agents perform data and reporting tasks autonomously.

Not investing in a solid foundation including a semantic data layer can make it virtually impossible to build a self-serve analytics model. In Anthropic’s case, before investing in a semantic layer and complementary agent skills, analytics questions were stuck at around 21% accuracy on their evals. Investing in the infrastructure got these numbers consistently above 95% in aggregate and regularly around 99% in certain domains, meaning that their data science team can focus on more strategic work.

How a semantic layer actually works - and how to practically use them in product

This post is for paid subscribers

Already a paid subscriber? Sign in
© 2026 Department of Product · Privacy ∙ Terms ∙ Collection notice
Start your SubstackGet the app
Substack is the home for great culture