Model routing is a solution to AI overspending. This is a problem for OpenAI and Anthropic


A new spending discipline is taking hold within corporate America, as CFOs and boards of directors begin to crack down on inefficient spending on artificial intelligence. This shift could reshape the AI business.

For the past two years, the playbook has been to default to the most powerful AI model and direct all queries through it, regardless of complexity. Now, with AI bills running well ahead of budget, companies are starting to question whether every task actually requires the frontier. Two executives at the center of AI development told CNBC this week that a solution is emerging: model routing.

What is model routing?

Routing is a tool that matches the task to the model, sending difficult problems to expensive frontier models and easier ones to cheaper, faster alternatives.
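To make the idea concrete, here is a minimal sketch of a router in Python. It is not how Cognition, Glean, or any vendor implements routing; the model names, prices, keyword list, and threshold are all hypothetical, and a production router would likely use a trained classifier rather than keyword matching.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    cost_per_1k_tokens: float  # hypothetical price in USD, for illustration only


# Hypothetical tiers; neither name refers to a real product.
CHEAP = Model("small-fast-model", 0.0005)
FRONTIER = Model("frontier-model", 0.015)

# Assumed keywords that hint at a hard task.
HARD_HINTS = ("prove", "refactor", "debug", "architecture", "multi-step")


def estimate_difficulty(prompt: str) -> float:
    """Crude score in [0, 1]: long prompts and 'hard' keywords score higher."""
    length_score = min(len(prompt.split()) / 200, 1.0)
    text = prompt.lower()
    hint_score = sum(hint in text for hint in HARD_HINTS) / len(HARD_HINTS)
    return max(length_score, hint_score)


def route(prompt: str, threshold: float = 0.15) -> Model:
    """Send hard prompts to the frontier tier and everything else to the cheap tier."""
    return FRONTIER if estimate_difficulty(prompt) >= threshold else CHEAP


if __name__ == "__main__":
    for prompt in (
        "Who was the third U.S. president?",
        "Debug and refactor this multi-step payment pipeline",
    ):
        print(f"{route(prompt).name}: {prompt}")
```

Under these assumptions, a simple factual question goes to the cheap tier, while a debugging and refactoring request goes to the frontier tier.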

Scott Wu, CEO of Cognition, which makes the Devin coding agent, said the gains on routine work are huge. For much standard work, he said, companies can achieve five to 10 times the cost efficiency by using models that are still good enough for the task.

Today, most companies don’t do routing at all. Arvind Jain, CEO of Glean, estimated that about 95% of enterprise AI use is still on the most expensive frontier models, even for tasks that lower-cost alternatives could easily handle. Wu gave the example of asking a model to name the third U.S. president. Every model, whatever its price, will tell you it was Thomas Jefferson.

Arvind Jain, CEO of Glean, on the SaaS Monster stage during the first day of Web Summit 2022 at Altice Arena in Lisbon, Portugal on November 2, 2022.

Harry Murphy | Sportsfile | Getty Images

The pressure behind this change is a cost curve that has surprised even the biggest tech companies. Jeetu Patel, president and chief product officer at Cisco, outlined the calculations. At around $200 in token usage per employee per week, that’s around $10,000 per year per person. With 90,000 employees, a company is looking at an annual bill of about $900 million.
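The arithmetic behind those figures can be checked in a few lines. This sketch assumes a 52-week year and uses the $200 and 90,000 figures quoted above; the function name is illustrative only.

```python
WEEKS_PER_YEAR = 52  # assumption: spending continues every week of the year


def annual_token_budget(weekly_spend_per_employee: int, employees: int) -> tuple[int, int]:
    """Return (annual spend per person, annual spend for the whole company) in USD."""
    per_person = weekly_spend_per_employee * WEEKS_PER_YEAR
    return per_person, per_person * employees


per_person, total = annual_token_budget(200, 90_000)
print(f"Per person: ${per_person:,}")  # Per person: $10,400
print(f"Company:    ${total:,}")       # Company:    $936,000,000
```

The exact results, $10,400 per person and $936 million in total, sit slightly above the rounded figures of about $10,000 and $900 million.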

Patel said Cisco had gone way over its own budget and had to adapt, with 30,000 engineers now building products written largely with AI. Cisco reallocated its resources, prioritizing tokens over other expenses.

Vendors under pressure

AI companies recognize this anxiety.

Cognition has announced what it calls an AI Productivity Guarantee. If Devin delivers less technical value than a customer is paying for, Cognition will fund its use up to $10 million until it measures up. Wu pitched it as a way to cut through the noise on a metric the industry cares about: return on investment.

Rather than measuring activity such as tokens consumed or lines of code, Wu said, Cognition estimates how many human engineering hours its agent actually saves and backs that estimate with reimbursement. You can spend billions of tokens and do nothing with them, he said. Businesses should aim for output, not activity.

If companies start steering easy, high-volume jobs toward cheaper open source models from China or elsewhere, then OpenAI and Anthropic will stop getting paid for every job. They would get only the most complex ones. Both companies have built their businesses, and the IPO expectations around them, on the assumption of huge demand at high prices.

Patel doesn’t think this will sink frontier labs and says cutting-edge technology will remain valuable. But he sees the pricing model changing. Labs will need to become more efficient in how models are used rather than simply charging more, which Patel says will lead to a concerted industry effort.

The question was whether companies would continue to spend as their AI bills rose. It now appears that many will simply find a way to spend more smartly. Pricing power is shifting from companies that sell high-end AI to those that buy it.

Frontier labs will always command a premium for the hardest work. But what share of the market does everything else represent? The answer could go a long way in determining the valuations of leading AI companies.
