Ryan Fedasiuk
The U.S.-China AI competition is becoming less about who can build the best models and more about who can deliver AI services, reliably and cheaply, to global publics. Absent federal intervention, the United States is at risk of losing its ability to steer the development and adoption of this extremely consequential technology, as Chinese open-weight models become “good enough” for large segments of white-collar work at significantly lower cost.
For most of the past decade, the “AI race” was defined by training—the one-time, massively expensive process of building a frontier model. Training is a discrete engineering project: a lab assembles thousands of chips in a centralized data center, runs them for weeks or months, and produces a model. It is a costly but finite event.

The new phase of AI competition is about inference—the continuous, 24/7 process of serving a trained model to users, wherever they might be. Every time a customer asks a question of ChatGPT or Claude, generates a document, or debugs a line of code, a computer is processing that request—making an inference about how to respond. Unlike training, inference cannot be centralized. It must be geographically distributed to minimize latency, and it must run reliably around the clock. In 2026, the computational power required to serve a popular model to hundreds of millions of users dwarfs the compute required to train it in the first place—by one to two orders of magnitude.
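To see why serving can dwarf training, consider a rough back-of-envelope estimate. Every figure below is an illustrative assumption, not a number from this piece: a frontier training run on the order of 10^25 FLOPs, roughly 2 FLOPs per model parameter per token processed, a user base of a few hundred million, and modest daily usage per person. Under those assumptions, a single year of inference lands one to two orders of magnitude above the one-time training cost.

```python
# Back-of-envelope: one-time training compute vs. a year of inference.
# Every constant here is an illustrative assumption, not a sourced figure.

TRAINING_FLOPS = 1e25            # assumed frontier-scale training run
PARAMS = 1e11                    # assumed ~100B active parameters at serving time
FLOPS_PER_TOKEN = 2 * PARAMS     # rough rule of thumb: ~2 FLOPs per param per token

USERS = 3e8                      # assumed ~300M active users
QUERIES_PER_USER_PER_DAY = 10    # assumed average usage
TOKENS_PER_QUERY = 1000          # assumed prompt + response tokens processed

daily_tokens = USERS * QUERIES_PER_USER_PER_DAY * TOKENS_PER_QUERY
annual_inference_flops = daily_tokens * FLOPS_PER_TOKEN * 365

ratio = annual_inference_flops / TRAINING_FLOPS
print(f"Annual inference compute: {annual_inference_flops:.2e} FLOPs")
print(f"One-time training run:    {TRAINING_FLOPS:.2e} FLOPs")
print(f"Inference is ~{ratio:.0f}x the training cost per year")
```

With these particular assumptions the ratio comes out around 20x; more aggressive but still plausible usage assumptions push it toward 100x, which is the "one to two orders of magnitude" range described above. The structural point survives any reasonable choice of constants: training is paid once, while inference compounds with every user, every query, every day.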