I have shipped AI features across marketplace infrastructure processing hundreds of millions in GMV, anomaly detection systems that saved more than a million dollars in a single engagement, and discovery engines that lifted conversion five times over. Some of those features worked. Some did not. The difference was never the model.
The features that worked were never the ones where the AI was most impressive. They were the ones where the spec was most precise.
That realization took me years to internalize. I kept seeing the same failure pattern: a team gets excited about what the technology can do, skips the hard questions about what it should do, and builds something that was never going to survive contact with real users. Not because the engineering was bad. Because nobody defined the intent.
The pattern
It usually starts in a conference room. Someone with budget says "we should use AI for this." The room nods. A vendor shortlist appears. Engineers spin up infrastructure. Three months later there is a proof of concept that nobody asked for, solving a problem that nobody quantified, for users that nobody interviewed.
I have watched this happen at mid-market logistics companies, enterprise SaaS platforms, consumer marketplaces, and internal tooling teams. Smart people. Real budgets. Zero specificity about what "use AI" actually means in measurable terms.
The gap between "we should use AI" and "here is exactly what we are building, for whom, measured how, at what cost" is where the money goes to die.
The economics nobody runs
Artem Chigrinets put this well on Mind the Product: "An AI feature should create measurable value at least three times greater than its direct compute cost." He calls it the 3x rule. It sounds obvious. Almost nobody runs the check.
I have seen teams ship a "chat with your data" feature bundled at no extra charge, then discover that heavy users cost forty dollars a month in compute against a thirty-dollar subscription. The feature was impressive. The unit economics were upside down. They killed it six months later after burning through runway trying to optimize prompts that were never going to close the gap.
Running the numbers first is not conservative. It is the highest-leverage thing you can do before writing a single line of code. If an automated invoice review costs fifteen cents per run, you need a credible case that it saves the user at least forty-five cents in time or risk reduction. If you cannot make that case, you do not have a product. You have a demo.
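The check itself fits in a few lines. Here is a minimal sketch in Python; the function name and the numbers are illustrative, not from any specific engagement, and you would plug in your own per-run compute cost and your own estimate of the value delivered per run.

```python
def passes_3x_rule(compute_cost_per_run: float,
                   value_per_run: float,
                   multiple: float = 3.0) -> bool:
    """Return True if the estimated value per run covers the compute cost
    at least `multiple` times over (the 3x rule)."""
    return value_per_run >= multiple * compute_cost_per_run


# Invoice-review example from above: $0.15 of compute per run needs a
# credible case for at least $0.45 of value in saved time or reduced risk.
print(passes_3x_rule(compute_cost_per_run=0.15, value_per_run=0.45))  # True, just barely

# The "chat with your data" failure mode: $40/month of compute against a
# $30/month subscription is upside down before you even apply the multiple.
print(passes_3x_rule(compute_cost_per_run=40.0, value_per_run=30.0))  # False
```

The hard part is not the arithmetic. It is being honest about the value number, because compute cost is a fact and value is an estimate you are tempted to inflate.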
Spec precision over model selection
The second thing I learned the hard way: model selection is a downstream decision. The upstream decision is spec quality. When I built the anomaly detection system that saved over a million dollars, the breakthrough was not the algorithm. It was spending three weeks with the operations team mapping every edge case in their existing workflow before we wrote any code.
When I built discovery engines that lifted conversion five times, the win was not the recommendation model. It was defining "conversion" precisely enough that every team — product, engineering, data science, finance — agreed on what we were measuring and why it mattered.
Precision compounds. Vagueness compounds too, just in the wrong direction.
Seven gates
This pattern repeated enough times that I built a framework around it. The Intent Stack is seven gates between a raw idea and the first line of AI-generated code. Each gate asks a question that most teams skip:
- What will this cost?
- Should we build this at all?
- What exactly are we building?
- Is the spec complete?
- Is it AI-ready?
- Was it built right?
- Is it working and learning?
Gates one and two are where the 3x rule lives. You run the economics and the profitability analysis before you write a spec. Most teams start at gate three. That is why most teams fail.
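To make the ordering concrete, here is a rough sketch in Python. The gate names and the helper function are my shorthand for the list above, not a tool or published API; the only thing the sketch asserts is that you walk the gates in order and stop at the first one you cannot answer.

```python
# The seven gates, in order. Each question comes from the list above;
# the short names are illustrative labels only.
GATES = [
    ("economics",    "What will this cost?"),
    ("viability",    "Should we build this at all?"),
    ("definition",   "What exactly are we building?"),
    ("completeness", "Is the spec complete?"),
    ("ai_readiness", "Is it AI-ready?"),
    ("verification", "Was it built right?"),
    ("learning",     "Is it working and learning?"),
]


def first_open_gate(answers: dict) -> str:
    """Walk the gates in order and return the first one without a real answer.
    Starting at gate three means skipping economics and viability, which is
    exactly where the 3x rule lives."""
    for name, question in GATES:
        if not answers.get(name, False):
            return f"{name}: {question}"
    return "all gates answered; go build"


# A team that jumped straight to the spec still owes the first question.
print(first_open_gate({"definition": True}))  # -> "economics: What will this cost?"
```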
The gates are not bureaucracy. They are insurance. Every dollar spent answering these questions saves ten in wasted engineering and a hundred in opportunity cost.
The teams that move fastest are not the ones that skip the hard questions. They are the ones that answer them first.
Three questions before you build
If you are about to start an AI project, stop. Before you evaluate vendors, before you hire, before you spin up a single GPU instance, answer these:
- Does this feature pass the 3x rule? Can you show that it generates at least three times its compute cost in measurable value?
- Who specifically will use this, and what does their workflow look like today without it?
- What is the smallest version that could deliver measurable value?
If you cannot answer all three with specificity, you are not ready to build. You are ready to think. And thinking, done well, is the highest-leverage activity in AI product development.