That MIT stat is going around again about how some enormous share of enterprise AI pilots never make it past proof of concept. I keep seeing it waved around as proof that the technology is overhyped, and it might be (at least when it comes to the expected timeframe and scope of ROI on AI at the moment), but I also think that’s the wrong lesson to take from it. The number I find more interesting is this one: roughly 84% of organizations haven’t redesigned a single workflow around the AI they bought. Those two numbers aren’t telling two different stories. They’re the same story told from opposite ends.
This is a classic tale of outputs over outcomes. Leadership invests in a “thing” (an output) and expects it to magically change for the better the way someone (a team member, a customer) behaves. They tout the amount of money they spent on the “thing” and then wonder why the ROI doesn’t come. Product managers already know why.
Here’s what I suspect happened in most of these companies. Someone brought in a tool that genuinely changes how the work gets done, a powerful one, and dropped it on top of a process that was designed back in 2019 if they were lucky. Nothing meaningful shifted in how the team behaved or in the business results, and the gap got quietly filed under “AI is the problem.” That isn’t AI transformation, not really. For the most part it’s an expensive demo, and an expensive demo is a very different thing from the return companies thought they were buying.
Why dropping AI on an old process is not AI integration
These AI pilots, if they’re even called that, stall because of the existing org design they are being dropped into. A tool that changes the work cannot pay off inside a process that refuses to change alongside it. You can run the most capable system with the best product teams, AI powered or otherwise, through an existing approval chain built for a slower era, and the approval chain will win every time. So the failure rate in the study isn’t really an indictment of the technology or the teams doing the work, ultimately, but rather of the operating system the company tried to run it on.
Buying the tool was the easy decision. It costs money and a signature. Changing how the work actually moves through the org costs something harder: a willingness to let the shape of the job change. Just because we’ve been doing product one way for a while doesn’t mean it stays that way forever. Everyone is “integrating AI” but very few, as the research suggests, are making the organizational design pivots necessary for these investments to bear fruit.
The people running the pilot were never allowed to change the work
Let’s not be too quick to blame the 84% who didn’t touch their ways of working. They aren’t lazy, and they aren’t behind. In most companies the people closest to the work, the PMs and team leads and middle managers actually running the pilot, were never handed the authority to change the workflow in the first place. They can run the experiment. They can’t redesign the job around what the experiment taught them, because that decision lives two or three layers up from them and several weeks of debate away.
So when an AI pilot “fails,” it’s far more interesting to diagnose whether anyone in that pilot was actually allowed to change the workflow around it. If the answer is no, then the company didn’t test the implementation of AI at all. It tested its existing culture and approval chain, and the approval chain behaved exactly the way it always has, slow and protective of itself. That’s not an indictment of the technology or of the product teams doing the work. That’s a mirror reflecting back the culture clash AI implementations are facing.
Cutting headcount with no ROI is a leadership problem, not an IC problem
There’s a companion finding in the research that says something like 80% of these companies cut headcount in connection with their (not especially profitable) AI push, with no real correlation to ROI. It’s genuinely frustrating to keep seeing this trend. These organizations removed the people (the product managers in many cases) who understood the work, kept the process that was already broken, bolted a tool onto the gap, and then waited for a number that was never going to arrive. You could try to pin that on the people doing the work, but in practice this is a leadership and culture issue disguised as a “tech problem.”
The companies actually getting returns on their AI integrations aren’t running better models than everyone else. For the most part they redesigned the work first, and then let the people doing that work make the call on how their ways of working should evolve. The differentiator in these stories is that the culture changed: permission was explicitly granted and encouraged. The choice of AI tool didn’t make a difference at all.
What to try this week: a small pilot team that’s allowed to change the rules
To solve for this we have to put our product management experience to use. We reduce risk when we build products by running experiments. We can do the same for new ways of working. Start small, collect evidence, scale the practice as positive signals come in. Sound familiar?
Start here. Stand up a small pilot team, three or four people close to a real workflow. Hand them a task that’s strategic enough that people will actually care whether it works, but not so critical that a failed experiment does real damage to the business. Keep the pilot team’s effort too trivial and nobody pays attention to the learnings. Make it too critical and the organization’s immune system shuts the experiment down before it can teach you anything.
Here’s the key differentiation. Give that team the authority to change the work around the tool, not just to use the AI. They should be free to redraw the steps, the hand-offs, the approval points, whatever has to move for them to get the full value out of this new tech. Let them tell you which parts of the old process were actually critical to success and which were just habit or theater. Their job isn’t to prove the AI works. It’s to figure out what the work has to become for the AI to be worth anything.
Set real success criteria for these teams. Make sure they are outcomes – measurable changes in human behavior that drive business results. Look at how they work, their speed and efficiency and the impact their work ultimately has on the product they’re building. Make sure, also, to look at team morale. How does this increased autonomy change what the team does and their attitude? Collect all of this data on a weekly basis.
Once you start to get real insight from the pilot team, roughly six weeks into the experiment, scale the learnings to a second team. Keep the same constraints in place – small time boxes, small teams, mid-risk initiatives. As the data comes in, scale what works and kill the changes that don’t. This is evidence-based decision making, except in this case we’re pointing the decisions at our methods, processes and ways of working (including our product management practices).
AI transformation and integration is the last-mile problem, again, and the last mile is never technical. It’s cultural. It’s the permission the people closest to the work need to redesign their ways of working to fit new models and tools. In the age of AI they really can’t sit and wait for the old ways of working to react on their own, because the old ways of working are precisely the thing the AI just made far more risky.





