The AI revolution is hitting a speed bump no one saw coming – and it’s showing up on the balance sheet. A Silicon Valley software company and a major ecommerce player have opened their books to WIRED, revealing what insiders are calling ‘pretty crazy’ token consumption rates that are forcing executives to rethink their AI-first strategies. The issue isn’t whether AI works; it’s whether companies can afford to run it at scale.

Silicon Valley’s AI deployment reality check has arrived, and it’s written in tokens per dollar. In rare candor, enterprise software provider 8×8 and an ecommerce company have pulled back the curtain on what they’re calling the ‘tokenomics’ problem – the emerging challenge of managing AI costs that are growing faster than anyone budgeted for.

The issue centers on tokens, the basic units of text that large language models process. Every API call to systems like Anthropic’s Claude racks up token charges for both input and output. What seemed manageable in pilot programs has turned ‘pretty crazy’ in production, according to sources speaking with WIRED.

8×8, which provides cloud communications software to enterprises, has become an unexpected case study in AI cost management. The company integrated AI features expecting modest compute bills, but real-world usage patterns told a different story. Engineers are now tracking token consumption with the same rigor they once reserved for bandwidth and storage costs.

The math gets uncomfortable fast. A single complex customer service interaction might consume thousands of tokens. Multiply that across millions of daily interactions, and what looked like a competitive advantage starts looking like a budget crisis. Companies that rushed to add ‘AI-powered’ to their feature lists are now discovering that compute costs don’t scale the way traditional software costs do.
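To make that multiplication concrete, here is a minimal back-of-the-envelope sketch. The function name, the traffic figures, and the per-million-token prices are illustrative assumptions, not numbers reported by either company or quoted from any vendor's price list.

```python
def monthly_token_cost(interactions_per_day: int,
                       input_tokens: int,
                       output_tokens: int,
                       input_price_per_mtok: float,
                       output_price_per_mtok: float,
                       days: int = 30) -> float:
    """Estimate monthly spend in dollars for one AI feature.

    Prices are expressed in dollars per million tokens, and input and
    output tokens are billed at separate rates.
    """
    # Dollar cost of one interaction, scaled by one million to stay exact.
    scaled_cost = (input_tokens * input_price_per_mtok
                   + output_tokens * output_price_per_mtok)
    return scaled_cost * interactions_per_day * days / 1_000_000


# Hypothetical workload: one million interactions a day, each using
# 3,000 input tokens and 500 output tokens, at assumed prices of
# $3 and $15 per million tokens.
estimate = monthly_token_cost(1_000_000, 3_000, 500, 3.0, 15.0)
print(f"${estimate:,.0f} per month")
```

Under these assumed inputs each interaction costs well under two cents, yet the monthly total lands in the hundreds of thousands of dollars, which is the pattern the companies describe.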

This isn’t just an 8×8 problem. The ecommerce company that spoke to WIRED – choosing to remain unnamed – described similar dynamics. Product recommendation engines, customer service chatbots, and inventory optimization tools all seemed like obvious AI wins until the monthly bills arrived. The company is now implementing strict token budgets for different use cases and killing features that don’t generate enough value to justify their token consumption.
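A per-use-case token budget of the kind described above could be sketched roughly as follows. This is not the ecommerce company's system; the class, its method names, and the limits are hypothetical and meant only to show the shape of the idea.

```python
from dataclasses import dataclass, field


@dataclass
class TokenBudget:
    """Tracks token spend per use case against a fixed limit (hypothetical)."""
    limits: dict[str, int]
    used: dict[str, int] = field(default_factory=dict)

    def try_spend(self, use_case: str, tokens: int) -> bool:
        """Record the spend and return True if it fits the budget.

        Use cases without a configured limit are treated as having no
        budget at all, so their requests are refused.
        """
        limit = self.limits.get(use_case, 0)
        spent = self.used.get(use_case, 0)
        if spent + tokens > limit:
            return False
        self.used[use_case] = spent + tokens
        return True

    def remaining(self, use_case: str) -> int:
        """Tokens still available for this use case."""
        return self.limits.get(use_case, 0) - self.used.get(use_case, 0)


# Assumed limits, for illustration only.
budget = TokenBudget({"chatbot": 1_000, "recommendations": 500})
print(budget.try_spend("chatbot", 600))   # fits within the limit
print(budget.try_spend("chatbot", 500))   # would exceed the limit
print(budget.remaining("chatbot"))
```

A refused request does not have to mean a failed one; a real deployment might fall back to a cheaper model or a canned response, which is a design choice this sketch leaves open.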

The challenge extends beyond raw costs. Traditional software scales predictably – more users mean more revenue, while the marginal cost of serving each additional user stays low. AI applications flip that model. More usage means proportionally higher compute costs, and there’s no Moore’s Law coming to save margins. Anthropic and OpenAI have both cut API pricing over the past year, but not fast enough to keep pace with enterprise consumption growth. As a purely arithmetic illustration, halving the per-token price while tripling consumption still raises the bill by 50 percent.

Companies are responding with a toolkit of optimization strategies. Prompt engineering has evolved from an art to a cost-cutting discipline, with teams dedicated to extracting maximum value from minimum tokens. Some are implementing tiered systems – using expensive frontier models only for complex queries while routing simpler tasks to cheaper alternatives. Others are building their own smaller models for specific use cases, trading capability for cost predictability.
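The tiered approach can be sketched as a small router that decides which class of model should handle a request. The heuristic below (query length plus a keyword list) and the tier names are assumptions made for illustration; the article does not say how any company actually classifies its queries.

```python
# Hypothetical keywords that suggest a query needs the stronger model.
COMPLEX_HINTS = ("refund", "legal", "escalate", "contract")


def pick_tier(query: str, max_simple_words: int = 40) -> str:
    """Return "frontier" for queries that look complex, else "small".

    A query counts as complex if it is long or mentions any keyword
    in COMPLEX_HINTS. Both rules are crude placeholders.
    """
    text = query.lower()
    if len(text.split()) > max_simple_words:
        return "frontier"
    if any(hint in text for hint in COMPLEX_HINTS):
        return "frontier"
    return "small"


print(pick_tier("Where is my order?"))
print(pick_tier("I want a refund and need to escalate this."))
```

In practice the classifier would probably be something more robust than keyword matching, but the cost logic is the same: every request kept off the frontier tier is billed at the cheaper rate.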

The tokenomics crunch is also changing how enterprises evaluate AI vendors. Pricing transparency has become a key vendor selection criterion. Companies want predictable costs, volume discounts, and clear metrics on token efficiency. Some are negotiating custom enterprise agreements that cap monthly spending or provide token pools, essentially bringing SaaS-style pricing to AI consumption.

What’s emerging is a new discipline within enterprise IT – AI cost optimization. It sits somewhere between DevOps and financial planning, requiring technical understanding of model behavior and hard-nosed analysis of ROI. Companies are hiring specialists who can analyze token flows, identify waste, and architect systems that deliver AI value without blowing up budgets.

The irony isn’t lost on anyone – AI was supposed to reduce costs through automation. Instead, it’s creating a new category of operational expense that many companies weren’t prepared for. The ones succeeding are treating tokenomics as seriously as they once treated cloud cost optimization, with dedicated teams, monitoring tools, and executive-level oversight.

The tokenomics challenge reveals a fundamental truth about the current AI wave – deployment at scale requires rethinking cost structures, not just capabilities. The companies willing to share their struggles are doing the industry a favor, exposing a reality that pilot programs and vendor demos conveniently gloss over. As AI moves from experimental to operational, the winners won’t just be the ones with the best models – they’ll be the ones that figure out how to run them profitably. For every executive betting big on AI transformation, there’s now a CFO asking harder questions about the monthly compute bill.