Databricks has entered the martech space with CustomerLake, causing a buzz in the industry. Marketers and technologists are now thinking about what this move could mean for the future of martech.
Ali Ghodsi, CEO and cofounder of Databricks, says buyers have changed but marketing infrastructure has not kept pace, even with 15,505 martech tools on the supergraphic. More consumers now use AI tools before making purchases, and this trend is only growing.
Tasso Argyros, former ActionIQ founder and now head of CustomerLake, adds that when agents buy across multiple channels in seconds, there’s no time to spend weeks designing a single campaign. This challenge led to a complete rethink of the campaign model.
With these changes in mind, CustomerLake is focused on answering one question:
What should marketing infrastructure look like when your customer could be a person, an AI agent acting for them, or both, and when interactions happen in milliseconds?
According to Databricks, this is where CustomerLake fits in.
As martech service providers, we’ve been following these developments closely. Today, we’ll explore what CustomerLake is, why it matters, how it can help you, and what your next steps are.
What was the need for Databricks CustomerLake?
The primary objective of CustomerLake is to break down silos. The story is familiar. Data engineers usually build pipelines to move data into CDPs, but governance controls often don’t cover these platforms. As a result, AI models remain in the lakehouse, and the CDP relies on slow, batched data copies.
Argyros, who spent ten years building a database company and later founded ActionIQ to address this problem, described today’s CDP model as middleware that will go away.
CustomerLake is built on the idea that copying data back and forth between the lakehouse and a CDP is both difficult and avoidable. When Customer 360 profiles, identity resolution, audience segmentation, and campaign activation all live in the Databricks lakehouse, your customer data never leaves the governed environment. The same Unity Catalog controls that protect your enterprise data also cover your marketing operations.
You don’t need a separate data contract or extra permissions.
In short, CustomerLake is Databricks’ take on a lakehouse CDP: a customer data platform built natively on top of the lakehouse rather than bolted onto it.
CustomerLake also supports Lakehouse Federation, so teams can access customer data wherever it lives without creating new data silos.
A note on the infrastructure layer beneath Databricks CustomerLake:
Databricks introduced LTAP (Lake Transactional/Analytical Processing), an architecture that consolidates transactional, analytical, streaming, and operational data into a single storage layer within the lakehouse.
Databricks also announced Genie Ontology, a self-improving context layer that continuously analyzes a company’s data, documents, applications, and conversations to create a dynamic model of business operations. While most CDPs have advanced customer context, Genie Ontology addresses the company context gap, which has traditionally been unstructured and difficult to program.
What’s in the box: Inside CustomerLake
CustomerLake ships with two core capabilities: Profile Agents and Campaign Agents.
1. Profile Agents
Profile Agents take care of the data unification layer. They bring in raw customer data using LakeFlow Connect, either from third-party apps or from data already stored in the lakehouse. The agents add semantic tags and correct issues like invalid emails and phone numbers. You can also enrich profiles with third-party data through the Databricks Marketplace, a step Databricks refers to as “data hydration.”
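To make the cleanup step concrete, here is a minimal sketch of the kind of email and phone checks a unification layer might run. It is not Databricks code: the function names, the regex, and the digit limits are our own assumptions, and real validation is far stricter.

```python
import re
from typing import Optional

# Deliberately loose pattern (an assumption for illustration):
# something@something.something, with no spaces and a single "@".
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def clean_email(raw: str) -> Optional[str]:
    """Trim and lowercase an email; return None if it is not plausible."""
    email = raw.strip().lower()
    return email if EMAIL_RE.match(email) else None


def clean_phone(raw: str, min_digits: int = 7, max_digits: int = 15) -> Optional[str]:
    """Strip formatting from a phone number; return None if the digit count is off."""
    digits = re.sub(r"\D", "", raw)
    if not (min_digits <= len(digits) <= max_digits):
        return None
    # Keep a leading "+" so international numbers stay recognizable.
    return "+" + digits if raw.strip().startswith("+") else digits


if __name__ == "__main__":
    print(clean_email("  Jane@Example.COM "))
    print(clean_phone("+1 (415) 555-0100"))
```

Records that come back as None would be flagged for correction or dropped from matching, depending on your own data quality rules.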
For privacy-sensitive identity resolution with third parties, CustomerLake uses Databricks clean rooms. This lets you match identities with external partners while keeping raw data inside the lakehouse. The identity resolution part, called Agentic Identity Resolution (AIR), combines deterministic rules, probabilistic matching, and LLM-driven arbitration for cases where the rules are not clear. If the agents cannot find an answer, the system turns to human review.
The system also learns from each resolution cycle.
This is the core mechanic that makes a lakehouse CDP different from a traditional one: identity resolution and profile-building happen where the data already lives.
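The tiered approach described above (rules first, then fuzzy matching, then arbitration, then a human) can be sketched in a few lines. This is only an illustration under our own assumptions: the `Record` fields, the thresholds, and the `arbiter` callable that stands in for an LLM are all hypothetical, and nothing here reflects how AIR is actually implemented.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Callable, Optional


@dataclass(frozen=True)
class Record:
    name: str
    email: str
    postcode: str = ""


def deterministic_match(a: Record, b: Record) -> bool:
    """Rule tier: identical, non-empty normalized email."""
    ea, eb = a.email.strip().lower(), b.email.strip().lower()
    return bool(ea) and ea == eb


def probabilistic_score(a: Record, b: Record) -> float:
    """Fuzzy tier: name similarity plus a small bonus for a shared postcode."""
    name_sim = SequenceMatcher(None, a.name.lower(), b.name.lower()).ratio()
    bonus = 0.1 if a.postcode and a.postcode == b.postcode else 0.0
    return min(1.0, name_sim + bonus)


def resolve(
    a: Record,
    b: Record,
    arbiter: Callable[[Record, Record], Optional[bool]],
    hi: float = 0.9,
    lo: float = 0.6,
) -> str:
    """Return 'match', 'no_match', or 'human_review' for a pair of records."""
    if deterministic_match(a, b):
        return "match"
    score = probabilistic_score(a, b)
    if score >= hi:
        return "match"
    if score < lo:
        return "no_match"
    # Ambiguous band: ask the arbiter (a stand-in for an LLM call).
    verdict = arbiter(a, b)
    if verdict is None:
        # The arbiter could not decide, so a person has to.
        return "human_review"
    return "match" if verdict else "no_match"
```

The design point is that the expensive steps (an LLM call, a human reviewer) only run on the narrow band of pairs that cheaper tiers cannot settle.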
2. Campaign Agents
Campaign Agents are the main workspace for marketers. When you have a business goal like growing loyalty enrollment, reactivating churned customers, or increasing average order value, Campaign Agents help you build audiences using natural language prompts powered by Genie, Databricks’ AI assistant. They recommend the next best actions for each customer, generate personalized content, and activate campaigns across different channels. The audience builder is especially easy to use: you can describe in simple terms who you want to reach, see the SQL it creates, adjust or accept it, and use the result in your campaign right away. You don’t need to submit a data engineering request.
All of these features come together in what Databricks calls infinity campaigns. These are ongoing, goal-driven engagement loops that take the place of the old launch-and-expire campaign model. Instead of defining a segment, building a journey, setting a send date, and moving on, marketers set an objective and agents keep working toward it, adjusting to new customer signals as they come in.
Infinity campaigns are interesting from a business perspective precisely because they are technically demanding. They need a lot of ongoing compute: agents are always testing, learning, and activating across a large customer base, running as a constant background process.
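One common way to think about an "always testing, always learning" loop is a multi-armed bandit. The sketch below uses a simple epsilon-greedy strategy; this is our own illustration of the general idea, not a description of how Databricks builds infinity campaigns. The class name, the action labels, and the reward signal are all hypothetical.

```python
import random
from typing import Dict, List, Optional


class CampaignLoop:
    """Epsilon-greedy loop: mostly use the best-known action, sometimes explore."""

    def __init__(self, actions: List[str], epsilon: float = 0.1,
                 rng: Optional[random.Random] = None) -> None:
        self.actions = list(actions)
        self.epsilon = epsilon
        self.rng = rng or random.Random(0)
        self.counts: Dict[str, int] = {a: 0 for a in self.actions}
        self.values: Dict[str, float] = {a: 0.0 for a in self.actions}

    def choose(self) -> str:
        """Pick the next action for an incoming customer signal."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)  # explore
        return max(self.actions, key=lambda a: self.values[a])  # exploit

    def record(self, action: str, reward: float) -> None:
        """Fold an observed outcome (e.g. 1.0 = converted) into the running mean."""
        self.counts[action] += 1
        n = self.counts[action]
        self.values[action] += (reward - self.values[action]) / n


if __name__ == "__main__":
    loop = CampaignLoop(["email", "push", "sms"])
    action = loop.choose()
    loop.record(action, reward=1.0)
    print(action, loop.values)
```

Every call to `choose` and `record` is a small unit of work, and the loop never ends. Multiply that by millions of customers and the steady compute demand becomes clear.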
This leads directly to questions about pricing.
How much does Databricks CustomerLake cost?
CustomerLake does not charge a platform fee. Databricks makes money by billing for compute, meaning you pay each time a profile is built or an identity is resolved. Greg Kihlstrom pointed out in his post-launch analysis that an independent CDP has to charge for its own product because that is its only source of revenue. Databricks, on the other hand, earns compute revenue whether or not it charges for the product itself.
This is not just a temporary discount. It is a pricing model that a pure-play CDP cannot match. Several analysts have explained that the idea is for CustomerLake to generate more compute usage than partner CDPs, which makes up for not charging for the product itself. Infinity campaigns, which run continuously and keep learning, are what keep the compute usage steady.
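The trade-off is easy to put into numbers. The sketch below compares a fee-plus-compute model with a compute-only model and works out how much extra compute would cancel out the missing fee. All figures are made-up placeholders, not real Databricks or CDP prices, and the function names are ours.

```python
def annual_cost(platform_fee: float, compute_units: float, unit_price: float) -> float:
    """Total yearly spend: a flat platform fee plus metered compute."""
    return platform_fee + compute_units * unit_price


def break_even_extra_units(platform_fee: float, unit_price: float) -> float:
    """Extra compute units at which a no-fee model costs as much as the fee it replaced."""
    return platform_fee / unit_price


if __name__ == "__main__":
    # Hypothetical inputs, for illustration only.
    fee, units, price = 100_000.0, 50_000.0, 0.5
    print(annual_cost(fee, units, price))      # fee-based CDP on top of the lakehouse
    print(annual_cost(0.0, units, price))      # compute-only model, same workload
    print(break_even_extra_units(fee, price))  # extra units that erase the saving
```

If always-on campaigns push your compute usage past that break-even point, the "free" product is no longer cheaper, which is worth modeling with your own numbers before a renewal conversation.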
To sum up, with CustomerLake included, here’s what a Databricks customer gets:
- No more data duplication or reconciliation. Customer data remains in the lakehouse, eliminating the costly and error-prone process of copying and re-governing data in separate systems.
- A single governance model for data and marketing. Security controls, permissions, and compliance rules extend seamlessly from the data platform to marketing operations, removing the need to manage or synchronize a second system.
- CDP-level capabilities at a fraction of the cost. Without a platform fee, enterprises gain identity resolution, audience building, and campaign activation within their existing infrastructure.
- Marketing and data teams are now aligned, at least technically. Unifying customer data, AI models, and campaign execution in a single platform reduces engineering overhead.
“Databricks CustomerLake is strategically interesting because it may be the first true test of enterprise marketers’ appetite for agentic AI, will push solution focus away from standalone CDPs and composability in favor of consolidated martech, and displays the true power of AI-enabled software engineering to accelerate innovation and functional expansion,” says Joe Stanhope, VP, Principal Analyst, Forrester.
What you should be doing now
If your company uses Databricks, there are some straightforward steps you can take:
- Take stock before you renew. Compare your current CDP’s features with what CustomerLake offers out of the box. Do this ahead of your renewal meeting, not during it. Being prepared gives you an advantage. Kihlstrom suggests starting here if your contract is up soon.
- Try the preview as a way to negotiate. CustomerLake is currently in Private Preview. Signing up doesn’t mean you have to switch; it’s a chance to learn and compare.
- Make governance your top priority. The chief risk with campaign agents is not having rules about what they can decide on their own, when people need to step in, and how results are checked.
- Make sure marketing and data teams work together. The platform itself is the easy part. As Kihlstrom says, the real challenge is the gap between data and marketing, and the different skills on each team. If these groups aren’t aligned, even the best tools won’t help.
- Consider where content fits in your overall plan. CustomerLake manages data and activation, but there isn’t yet a complete system that combines data and content. You should build your approach to content creation, variation, and personalization at the same time as your data strategy.
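On the governance point in the list above, it helps to write the rules down as something a machine can check. Here is a minimal sketch of such a guardrail, assuming a policy with three limits. The field names, limits, and decision labels are hypothetical and are not part of CustomerLake.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Guardrail:
    """Limits on what an agent may do without a person signing off."""
    allowed_channels: FrozenSet[str]
    max_audience_size: int
    max_discount_pct: float


def review_decision(policy: Guardrail, channel: str,
                    audience_size: int, discount_pct: float) -> str:
    """Classify a proposed agent action as auto-approved, escalated, or blocked."""
    if channel not in policy.allowed_channels:
        return "block"
    if audience_size > policy.max_audience_size or discount_pct > policy.max_discount_pct:
        return "needs_human_approval"
    return "auto_approve"


if __name__ == "__main__":
    policy = Guardrail(frozenset({"email", "push"}), 10_000, 15.0)
    print(review_decision(policy, "email", 2_500, 10.0))
    print(review_decision(policy, "email", 50_000, 10.0))
    print(review_decision(policy, "sms", 100, 0.0))
```

Logging every decision alongside the policy that produced it also gives you the audit trail you need to check results later.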
Whether or not you adopt CustomerLake specifically, the shift toward a lakehouse CDP model is likely to shape how vendors compete for the next few years.
The last word on Databricks CustomerLake
Databricks is not the first company to promise to simplify the stack.
CDPs made a similar promise: build a unified customer record, break down silos, and make data useful. Many kept that promise, but also became another layer to manage and connect.
CustomerLake offers a strong argument for avoiding the same mistakes, but there is still a gap between what is possible and what marketers will actually do.
Marketers are likely to adopt these changes more slowly and carefully than the keynote implied.
The data layer is starting to take over the application layer. To prepare for what is coming, take control of your data, manage it carefully, and remember that the tool is not the same as the strategy.




