The New DNA of the Data + AI Team
We all remember Jurassic Park, right? The classic story of a groundbreaking scientific and technological shift suddenly allowing the creation of entirely new species.
In the world of data + AI, Agents are having their Jurassic Park moment.
Like Spielberg’s dinosaurs, Agents are exciting – but they’re also operating in a system whose boundaries are still under construction. And behind every new Agentic initiative lies a data + AI team navigating this new reality.
I’ve spent the last several months speaking with the people who make up these teams: data scientists, platform engineers, machine learning architects, and more. While their outlooks toward AI development vary from skepticism to excitement, almost all feel pressure to ship something, and fast.
To meet this demand for AI, the traditional anatomy of the data organization is no longer sufficient. The old blueprint, built around reliable pipelines, dashboards, and predictive models, is being rewritten in real time. A new kind of data + AI team is emerging: fluent in unstructured data and AI agent evaluation, and aware that today’s AI applications demand not just intelligence, but trust.
The DNA of today’s data + AI organization is something new. Let’s dive in.
The New DNA of the Data Team
To meet the new requirements of building Agents, data + AI teams need a set of capabilities they did not previously require.
Unstructured data fluency
Unstructured data (call transcripts, internal PDFs, product reviews, raw footage, Slack messages, and more) is now just as important as its structured counterpart. Teams now need to be able to answer “What does unstructured data quality mean?” and “What are we going to use this for?”
Agent architecture, model selection & fine-tuning
Is it one LLM or an ensemble of LLMs? How do you iterate to sufficient precision and recall? Practitioners need expertise not just in data science, but in interpretation. Today’s best teams are expanding their level of fluency to move fluidly between formats and to parse, understand, and engineer the semantics and context.
Retrieval & prompt expertise
Making models work, and moving from demo to deployment, requires another new skill set: context and retrieval engineering. Teams are stitching together context-aware systems, often using retrieval-augmented generation (RAG) pipelines.
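To make the retrieval idea concrete, here is a minimal sketch of the “retrieve, then prompt” step in a RAG pipeline. It is an illustration under loud assumptions: the function names (tokenize, cosine, retrieve, build_prompt) and the sample documents are hypothetical, and plain word-count cosine similarity stands in for the embedding model and vector store a production system would more likely use.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, k=2):
    """Return up to k documents most similar to the query, best first."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(d))), d) for d in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Drop documents that share no terms with the query at all.
    return [d for score, d in scored[:k] if score > 0]


def build_prompt(query, documents, k=2):
    """Assemble a prompt that grounds the model in the retrieved context."""
    context = "\n".join(f"- {d}" for d in retrieve(query, documents, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


# Hypothetical knowledge-base snippets, for illustration only.
DOCS = [
    "Refunds are issued within 14 days of a return request.",
    "Support is available by chat on weekdays.",
    "Invoices are emailed on the first day of each month.",
]

if __name__ == "__main__":
    print(build_prompt("When are refunds issued?", DOCS, k=1))
```

The design point is the separation of concerns: retrieval quality can be measured and improved independently of the model that eventually reads the prompt.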
Evaluations muscle
Because output is probabilistic and answers are subjective, teams are now asking themselves: What does ‘good’ look like when tone matters more than accuracy? Who decides when an answer is ‘close enough’? The answers are a blend of automation and human-in-the-loop review.
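One way to picture that blend is a sketch that scores each answer automatically and routes only the ambiguous middle band to a human reviewer. Everything here is an assumption for illustration: the keyword-coverage metric, the thresholds, and the names (EvalResult, keyword_coverage, evaluate) are hypothetical, and real teams often substitute richer automated checks for the scoring step.

```python
from dataclasses import dataclass


@dataclass
class EvalResult:
    score: float
    verdict: str  # "pass", "fail", or "needs_human_review"


def keyword_coverage(answer, required_terms):
    """Fraction of required terms found in the answer (case-insensitive)."""
    if not required_terms:
        return 1.0
    text = answer.lower()
    hits = sum(1 for term in required_terms if term.lower() in text)
    return hits / len(required_terms)


def evaluate(answer, required_terms, pass_at=0.8, fail_below=0.4):
    """Auto-pass clear wins, auto-fail clear misses, escalate the rest."""
    score = keyword_coverage(answer, required_terms)
    if score >= pass_at:
        verdict = "pass"
    elif score < fail_below:
        verdict = "fail"
    else:
        # The "close enough?" zone is where a person makes the call.
        verdict = "needs_human_review"
    return EvalResult(score, verdict)


if __name__ == "__main__":
    terms = ["refund", "14 days"]  # hypothetical rubric for one question
    for answer in ["Refunds take 14 days.", "Refunds are possible.", "Hello!"]:
        print(answer, "->", evaluate(answer, terms))
```

The two thresholds are the policy knobs: tightening them sends more answers to people, loosening them trusts automation more.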
Culture of end-to-end observability
New data + AI teams are learning to monitor not just pipelines, but behaviors. Data + AI observability moves beyond logging errors or measuring uptime. Teams are building real-time monitoring systems that watch for drift, hallucination, prompt failure, and more.
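As a rough sketch of what watching for drift can mean in practice, the snippet below compares a recent window of some per-response metric (an evaluation score, say) against a baseline window and raises a flag when the recent mean moves too far. The function name drift_alert, the z-score threshold, and the sample numbers are all assumptions for illustration, not a description of any particular monitoring product.

```python
import statistics


def drift_alert(baseline, recent, z_threshold=3.0):
    """Return True when the recent mean sits far from the baseline mean.

    baseline needs at least two values so a standard deviation exists;
    recent needs at least one value.
    """
    base_mean = statistics.mean(baseline)
    base_stdev = statistics.stdev(baseline)
    recent_mean = statistics.mean(recent)
    if base_stdev == 0:
        # A perfectly flat baseline: any change at all counts as drift.
        return recent_mean != base_mean
    # Standard error of a mean taken over a window of len(recent) samples.
    standard_error = base_stdev / len(recent) ** 0.5
    z = abs(recent_mean - base_mean) / standard_error
    return z > z_threshold


if __name__ == "__main__":
    # Hypothetical evaluation scores, for illustration only.
    baseline = [0.90, 0.92, 0.88, 0.91, 0.89, 0.90, 0.93, 0.87]
    print(drift_alert(baseline, [0.90, 0.91, 0.89, 0.90]))
    print(drift_alert(baseline, [0.70, 0.72, 0.68, 0.71]))
```

The same comparison works for other behavioral signals, such as response length or refusal rate, as long as they can be reduced to a number per response.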
This isn’t just a new tech stack. It’s a new mindset. One that treats AI as a continuous evolution.
Structuring data + AI organizations that can scale
At first, AI projects started the way many experiments do: in the margins, with tiger teams assembled from engineers, data scientists, and ML engineers. Executives started taking notice, and suddenly, what began as an innovation sprint became a strategic imperative.
Data leaders were left with a question: Who owns AI?
From my conversations with data + AI organizations, three popular operating models are emerging to answer that question.
1. AI experts are embedded within cross-functional product teams
In product-led organizations, AI experts, like data scientists and ML engineers, are now embedded directly into product teams, joining designers, product managers, and backend engineers. The model is federated, and teams move fast, build fast, evaluate fast.
These AI experts might fine-tune a model for content ranking or build a customer co-pilot, but the standards and tooling come from a central data org. That core ensures shared definitions, unified evaluation criteria, and a platform that scales across use cases.
2. Dedicated AI teams partner with business domain leads
In industries like logistics, insurance, or manufacturing, AI typically lives closer to the data organization. Centralized AI teams build full-stack solutions, like invoice processing agents, internal knowledge assistants, and smart routing systems, by partnering with domain leaders in departments like legal, finance, HR, compliance, and more.

These domains hold tons of tacit knowledge, much of it locked in unstructured formats. But when AI meets these silos, it often discovers what the business hadn’t realized: the data is stale, outdated, or otherwise inaccurate. So, this operating model typically requires domain owners to audit their knowledge base or rewrite training data accordingly to maintain high-quality inputs.
3. Data organization develops internal AI use cases
These organizations treat AI as an accelerant within the data value chain by structuring the unstructured and automating cleansing, deduping, transforming, etc. This might look like using vision models to scan receipts and issue payments, deploying transformers to solve fuzzy-matching problems, or mining behavioral logs not for dashboards, but for decisions.
Across all three models, one truth remains: data orgs that treat AI as a side project risk becoming irrelevant to the most critical questions their company is about to ask.
Evolving the data organization and platform
To support an effective AI future, many data organizations are quietly undergoing additional infrastructure transformations. Beyond the operating models above, I typically see forward-looking data organizations making two clear moves to set themselves up for change.
1. Organizations expanding an established platform
These data organizations already invested in a platform capability of batch AND stream, structured AND unstructured, analytics AND machine learning. Now, they’re ensuring the extensibility of that platform to cover their new AI use cases.
Alternatively, if their underlying data platform is not “AI-ready”, they’re composing an AI platform that can support a variety of use cases and integrate neatly with their core data environment(s).
2. Organizations evolving the focus of the ML team
In this case, the machine learning team owns traditional AI/ML use cases, like classification and reinforcement learning, but they now also own agents, human-in-the-loop pipelines, model selection, fine-tuning, and real-world deployment. They’re the organization’s experts in neural networks, deep learning, and fine-tuning models.
This team of experts is now the AI team, and the data org becomes the data + AI org.
These new models represent a recognition that AI is not a subdomain of data—they’re two sides of the same coin.
From data organizations to data + AI organizations
In Jurassic Park, the danger was never the dinosaurs. It was the illusion of control.
Today’s data and AI teams are building systems that act, interpret, and respond in ways not always predictable or transparent. And like the fictional park’s creators, they’re doing so under pressure.
But the smartest teams I’ve met aren’t just building agents. They’re building architectures of accountability. They’re embedding evaluations into their processes. And they’re focusing on the trust and reliability of data, end to end.
The park is now open. The question is whether we’ll build it better this time.