Your Data Isn't Ready for AI. Here's What That Looks Like in Practice.
A media agency case study
The pitch for AI in marketing is compelling: faster insights, automated reporting, smarter attribution. But there's a prerequisite that doesn't make it into most vendor decks: your data has to be in good enough shape to support it.
I spent six weeks embedded with a premier media agency finding out what happens when it isn't. The high-level findings are below:
What was there
The agency had built real data capability. They had a capable engineering team, log-level impression data from every major channel, and a sound architecture (S3 for raw files, a processing layer, and a client-facing output layer). The data existed at the right granularity. The engineering team had improved meaningfully over the prior year.
The problem was that none of it was reliably accessible to the people who needed it. And the cost of working around that gap was consuming resources that should have been building the business.
The audit surfaced five numbers that told the story:
35-45 person-hours per week spent manually stitching data across client success and media teams in Excel
95% of naming convention errors caught after attribution models had already run, not at entry
50% of attribution data required rework in any given week
Three active clients were suppressed from error monitoring because a skip list no one owned had gone stale
12-24 hour gap between an alert firing and the client-facing team knowing about it
That last one drove the weekly crisis pattern. Every Tuesday morning, the client success team would discover data discrepancies right before client reporting. This was not because monitoring didn't exist, but because there was no connection between detection and action. An alert would fire, an engineer would read it, and then someone had to decide what to do. That decision lag cost a full day, every week.
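The second figure in that list, naming errors caught only after attribution had run, points to where a check would pay off: at entry. A minimal sketch of such a check follows. The agency's actual naming convention is not described here, so the pattern client_channel_campaign_YYYYMM and the function name validate_campaign_name are hypothetical, chosen only for illustration.

```python
from __future__ import annotations

import re

# Hypothetical convention for illustration: client_channel_campaign_YYYYMM
NAME_PATTERN = re.compile(r"[a-z0-9]+_[a-z0-9]+_[a-z0-9]+_\d{6}")


def validate_campaign_name(name: str, existing: set[str]) -> list[str]:
    """Return a list of problems; an empty list means the name can be accepted."""
    problems = []
    # Format check: reject anything that does not fit the convention.
    if not NAME_PATTERN.fullmatch(name):
        problems.append(f"{name!r} does not match client_channel_campaign_YYYYMM")
    # Collision check: compare case-insensitively so two names that differ
    # only by capitalization cannot both be registered.
    if name.lower() in {e.lower() for e in existing}:
        problems.append(f"{name!r} collides with an existing campaign name")
    return problems
```

Run at the point where a campaign name is first entered, a check like this rejects the name before any downstream model sees it, rather than flagging it after the attribution run.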
Three structural problems, not three technical bugs
These weren't isolated issues. They were patterns that would repeat indefinitely without deliberate intervention.
Detection without remediation. The pipeline generated alerts when things went wrong. What it didn't do was act on them. No ticket created, no backfill triggered, no client team notified. The gap between detection and response was measured in hours, on a good week.
Manual labor as infrastructure. The client reporting layer ran on individual spreadsheets maintained on individual laptops with no automated validation at any handoff point. Every team member had independently built the same workarounds and described the same unmet need: one place to pull spend and results, with reporting they could build themselves. The manual work wasn't just inefficient; it had become the product.
A foundation that couldn't support the tools they wanted to use. Leadership wanted to move toward AI-powered analytics and more marketing mix modeling. These are reasonable ambitions for an agency at this stage, but you can't put AI on top of a broken data layer and expect it to work. AI surfaces problems faster; it doesn't fix the underlying ones.
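The first of these patterns, detection without remediation, comes down to an alert that carries no owner and no deadline. A minimal sketch of what attaching both might look like is below. The check names, team names, and fix windows in the routing table are all hypothetical; the agency's real alerting setup is not described here.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical routing table: check name -> (owning team, fix window in hours).
ROUTES = {
    "naming_convention": ("client_success", 4),
    "missing_impressions": ("data_engineering", 8),
}
DEFAULT_ROUTE = ("data_engineering", 24)


@dataclass
class Action:
    check: str
    client: str
    owner: str
    fix_by: datetime
    notify: list


def route_alert(check: str, client: str, fired_at: datetime) -> Action:
    """Turn a fired alert into an owned action with a deadline."""
    owner, window_hours = ROUTES.get(check, DEFAULT_ROUTE)
    return Action(
        check=check,
        client=client,
        owner=owner,
        fix_by=fired_at + timedelta(hours=window_hours),
        # The client-facing team is always notified when the alert fires,
        # so they do not discover the discrepancy on reporting day.
        notify=sorted({owner, "client_success"}),
    )
```

The point of the sketch is the shape of the output: every alert leaves the function with a named owner, a time by which it should be fixed, and a list of teams that already know about it.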
The harder work was strategic, not technical
The technical recommendations were the easier part. The harder work was giving leadership the evidence base to make two decisions that had been open for over a year: what role their in-house attribution model should play going forward, and whether a third-party reporting tool was the right investment.
Both decisions required trust in the underlying data, and that trust didn't exist until the audit gave leadership a shared picture of what was actually there.
Four recommendations came out of the engagement, sequenced to build on each other:
Build the monitoring response layer first. Route validation failures to the right team with a defined fix window. Store validation history. Add post-join match rate checks. These changes were designed to eliminate the weekly crisis cycle within weeks of implementation, and several were already in progress before the report was delivered.
Govern the data inputs. The naming convention process had no owner, no collision detection, and no audit trail. The monitoring skip list had active clients on it, with no automated mechanism to update it when campaign status changed. These were governance failures, not engineering failures. The fixes were mostly process and tooling changes, not rebuilds.
Centralize the reporting layer. Every client success team member independently named the same pain point: one place to pull spend and results, without routing a request through engineering. The recommendation was a third-party ingestion and reporting platform, one that creates a self-serve data layer for CS teams with the governance in place to trust what they're seeing.
Sequence the AI implementation carefully. The agency's in-house attribution model was consuming significant engineering bandwidth for diminishing returns. The recommendation was to transition to a managed provider, but only after stabilizing the data foundation. The sequencing matters: a managed provider receiving clean, governed data produces reliable outputs. The same provider receiving the current data produces a more expensive version of the same problem.
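Two of the checks named in the recommendations above are small enough to sketch: a post-join match rate check and a reconciliation of the monitoring skip list against active clients. The function names, the key-based inputs, and the choice to treat an empty input as a failure are assumptions made for illustration, not the agency's implementation.

```python
def match_rate(left_keys, right_keys):
    """Share of distinct left-side keys that found a partner in the join."""
    left = set(left_keys)
    if not left:
        # An empty left side is treated as a failure so that it gets
        # looked at, rather than passing silently.
        return 0.0
    return len(left & set(right_keys)) / len(left)


def stale_skips(skip_list, active_clients):
    """Clients suppressed from monitoring that still have active campaigns."""
    return sorted(set(skip_list) & set(active_clients))
```

A scheduled job could compare match_rate against an agreed threshold after each join and raise a routed failure when it falls below, and could report any non-empty result from stale_skips to whoever owns the skip list.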
The numbers on the other side
A conservative 50% reduction in manual stitching, achievable with standardized dashboards for the top three clients, frees roughly 17-22 of the 35-45 hours spent each week. At blended rates, that's approximately $55-70K in annual labor savings, before accounting for error reduction, faster client reporting, and engineering time redirected from firefighting to infrastructure work.
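The arithmetic behind that range can be reproduced in a few lines. The blended rate of $60 per hour is an assumption: it is the rate that makes the stated hours and savings line up, not a figure taken from the audit.

```python
HOURS_PER_WEEK = (35, 45)  # manual stitching hours, from the audit figures
REDUCTION = 0.50           # the conservative reduction stated above
WEEKS_PER_YEAR = 52
BLENDED_RATE = 60          # USD per hour; assumed, not from the audit


def annual_savings(hours_per_week):
    """Annual labor savings in USD for a given weekly stitching load."""
    return hours_per_week * REDUCTION * WEEKS_PER_YEAR * BLENDED_RATE


low, high = (annual_savings(h) for h in HOURS_PER_WEEK)
print(f"${low:,.0f} to ${high:,.0f}")  # prints "$54,600 to $70,200"
```

Changing BLENDED_RATE or REDUCTION shows how sensitive the estimate is to each assumption.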
The more important number is harder to quantify: how many client conversations shift from explaining why this week's numbers look different from last week's, to actually advising on strategy.
That's the goal. Everything else was in service of it.
The broader pattern
This agency isn't unusual. Most organizations trying to adopt AI tools are sitting on the same foundational problems: siloed data, manual workarounds that became permanent, governance that was never built, and monitoring that detects but doesn't respond.
AI doesn't fix any of that; it amplifies it.
The companies that get the most value from AI in the next few years won't necessarily be the ones with the best models or the biggest budgets. They'll be the ones that did the unglamorous work first: cleaned up the data layer, built the governance, connected detection to action. And then put AI on top of a foundation that could actually support it.
That work is available to most organizations right now, without a major technology investment, if they're willing to look at what's actually there.