One thing I love to do these days is talk to UX designers working with AI. It fascinates me to see how differently they work from how I did when I started my career. In many ways it’s a whole new discipline, certainly in the tooling and workflow. In other ways, though, it’s the same work I was doing in the late 1990s. Last week, I managed to chat with a designer at an AI-first startup. They’re working on a tool that will change how people interact with AI models. It’s a bold initiative with many as-yet unresolved variables in the user experience.
They told me about a pattern that had been bothering them for weeks. They were iterating on the design while, at the same time, one of the stakeholders on the project was also iterating on it. At times they were both working on the same design challenges, each with their own idea and solution approach. Both of them were producing polished mockups using AI design tools (e.g., Lovable, Claude Code) and then using their individual output to convince the rest of the team that their version was “better.” There was no coordination between the two tracks. There was no shared definition of what they were trying to achieve. Rather than a clear definition of “done,” both design tracks were working towards a nebulous version of “better.”
Who was right? Whose design was actually better? Nobody on the team knew. Both versions looked compelling. The iteration tracks ran in parallel and neither had a stop criterion.
This is what AI has done to product and design work. The cost of iterating is effectively zero. The output comes from natural language prompting. Seemingly, anyone can do that. Clearly missing from this (unlikely to be unique) phenomenon was an explicit, upfront decision about how to decide when to stop. Without that decision, iteration becomes the activity, not the means to an end. The team keeps producing polished alternatives because they can. And at some point they stop being able to tell whether they’re making something better or just making it different.
AI tools generate articulate, internally coherent options at speeds that compress what used to be days of work into minutes. Now throw in the fact that everyone on the team can produce finished-looking designs and you find yourself in an environment of endless “bake-offs” between prototypes. This made me realize that as we continue to figure out what product management and design look like in the coming years, it will be the teams that make the best decisions that win. Making good decisions, however, is something that AI tool fluency doesn’t give you. In fact, while tool fluency is the price of admission, decision quality is the moat.
So, how do we stave off endless design battles to ensure we’re choosing work that stands the best chance of serving our customers? Here are the three habits I think separate the teams that compound an advantage from the teams that drown in their own iterations.
1. Name the outcome before you write the prompt.
The designer and the stakeholder weren’t aligned on what “better UX” meant. “Better UX” doesn’t tell you what to optimize for, and it doesn’t tell you how to recognize it when you see it.
The named version of the desired outcome looks completely different: it’s a meaningful, observable change in user behavior. It could be something like users can complete X core task in Y seconds with Z confidence, or users prefer this version in 70%+ of head-to-head tests, or users return for a second session within 48 hours. Choose what matters most right now. Just make sure it’s specific. If you can’t articulate the outcome specifically, what you have is an output wishlist. And AI will happily generate beautiful outputs against your wishlist forever.
This is the practical version of the prompt-as-mirror idea I wrote about last week. The clarity of your prompt is downstream of the clarity of your outcome. Without one, the other is impossible.
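To make “specific” concrete, here’s a minimal sketch (every name and number below is hypothetical) of what a named outcome looks like when you force it into a structure. Notice that each field is a measurable claim about user behavior, not a property of the mockup.

```python
from dataclasses import dataclass

# A named outcome, sketched as data. All names and thresholds here are
# hypothetical; the point is that each field is checkable against real
# user behavior, not against how polished the mockup looks.
@dataclass
class Outcome:
    behavior: str       # what users will do differently
    metric: str         # how we will measure that behavior
    target: float       # the number that counts as success
    window_days: int    # how long we give ourselves to observe it

outcome = Outcome(
    behavior="first-time users finish the core task unaided",
    metric="completion rate in moderated usability tests",
    target=0.80,        # 80% of participants complete the task
    window_days=14,
)
```

If you can’t fill in all four fields, you’re still holding a wishlist.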
2. Decide the stop criteria up front.
These design prompting pissing contests can go on for weeks because nobody decides what “done” looks like. Without explicit stop criteria, iteration becomes its own justification. You keep going because you can.
The fix is small: before you start iterating, write down what would make you stop. Be specific. We stop when version X tests significantly better than version Y with three customers. We stop when we’ve made three iterations in a row that don’t change the core flow. We stop after Friday at noon, regardless of where we are. Any explicit stop criterion is better than the implicit one which, in most teams, is we stop when the highest paid person (prompter?) decides we’re done.
When iteration costs nothing, polish stops being a stop signal. AI will polish forever. There’s always one more refinement available, one more variant, one more version. The decision about when to stop has to come from you, written down, before the loop starts. Otherwise you’re just running a polish factory.
The other half of this habit is a culture change. Stop criteria need to be agreed to by everyone working on the artifact before anyone touches it. The designer and the stakeholder weren’t running parallel tracks because they were stubborn. They were running parallel tracks because nobody had decided, out loud, what would make either of them stop.
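For teams that like their agreements executable, here’s a sketch of the example criteria above as explicit checks, run after every iteration. Every threshold and date is hypothetical; what matters is that the loop ends for a written-down reason rather than by seniority.

```python
from datetime import datetime

# Stop criteria written down before the loop starts. The thresholds and
# deadline below are hypothetical stand-ins for whatever the team agrees on.
WIN_RATE_TO_BEAT = 0.70        # version X clearly beats version Y
MAX_UNCHANGED_ROUNDS = 3       # iterations in a row that don't change the core flow
HARD_DEADLINE = datetime(2025, 6, 6, 12, 0)  # Friday at noon, regardless

def should_stop(win_rate: float, unchanged_rounds: int) -> bool:
    # Any one criterion is enough: beat the baseline, converge, or run out of time.
    return (
        win_rate >= WIN_RATE_TO_BEAT
        or unchanged_rounds >= MAX_UNCHANGED_ROUNDS
        or datetime.now() >= HARD_DEADLINE
    )
```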
3. Design the test that would prove you wrong.
The designer and the stakeholder had no shared way to falsify either approach. Their iteration loop was effectively a competition over which version looked better, judged internally by team members who had their own preferences and politics.
Instead of competing with each other endlessly, put the prototypes in front of users and look for your specific success criteria. Treat each prototype as a hypothesis and test it. Collect the data and then, quickly, make a kill, pivot, or persevere decision. The only benefit of having multiple prototypes to test is the ability to learn from each of them and then combine those findings into the next version.
This is the hypothesis-testing habit applied to AI-augmented product work. Every iteration is an implicit hypothesis. Your job as a product manager, designer or stakeholder isn’t to be impressed by the iteration. Your job as a team is to design the cheapest possible test that would prove the hypothesis wrong and run that test before you produce another iteration.
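When the hypothesis is a preference claim, like the 70%+ head-to-head target from habit one, the cheapest falsifying test can be surprisingly small. Here’s a sketch of the arithmetic (the counts are hypothetical): you’re asking how surprising your result would be if users actually had no preference at all.

```python
from math import comb

# One-sided exact binomial test against a "no preference" null: if users
# were really flipping a coin between versions, how likely is a result at
# least this lopsided? A small p-value means the preference is probably real.
def preference_p_value(wins: int, trials: int) -> float:
    return sum(comb(trials, k) for k in range(wins, trials + 1)) / 2 ** trials

# Hypothetical result: version A preferred in 16 of 20 head-to-head sessions.
p = preference_p_value(wins=16, trials=20)
print(f"p = {p:.4f}")  # p = 0.0059 -- very unlikely under "no preference"
```

Twenty sessions and a few lines of arithmetic will settle an argument that weeks of parallel iteration never could.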
Here is a way to ship, sense and respond with your team this week
Pick one thing your team is currently iterating on with AI. Before the next session, write three things on a sheet of paper:
- The outcome this iteration is supposed to serve, in your specific terms. If your prototype works, what will people be doing differently?
- The specific criterion (read: a number) that would tell you to stop iterating.
- The test you’ll run to help prove the current direction wrong or right.
Then run the next session with those three things on the table. In theory, this empowers your team (stakeholders and all) to have an objective conversation about the pros and cons of each design approach.
Tool fluency is going to be table stakes within eighteen months. Every team will have it. The teams that compound an advantage from here will be the ones that build decision-quality habits while the rest of the market is still impressed by how polished the AI output looks.
Bonus tip: Once you have your success criteria and test in place, consider prompting your AI for that outcome instead of a specific output and see what happens.
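For instance (an illustrative prompt, not a recipe): instead of “make the onboarding screen cleaner,” try “design an onboarding flow a first-time user can complete in under 60 seconds without help text.” The first asks the model for an output; the second hands it your outcome.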