25 May 2025

The Hidden Cost of AI: Extractive AI Is Bad for Business

Ali Crawford, Matthias Oschinski, and Andrew J. Lohn

The Chinese AI company DeepSeek recently sent shockwaves through the financial world, causing market chaos and sparking uncertainty among tech policymakers. OpenAI released a statement acknowledging potential evidence that DeepSeek trained its model on outputs from OpenAI’s GPT-4o model, a process known as distillation. Simply put, DeepSeek is accused of training its model on OpenAI’s model and benefiting from that transfer of knowledge. But before we ask whether DeepSeek stole from OpenAI, we should ask a deeper question: who did OpenAI take from?

OpenAI has itself been accused of illegally appropriating data in the form of news articles, stories, and even YouTube video transcriptions to power its models. Those models are trained on vast amounts of human-generated data, often without compensation for or acknowledgement of the human creators. These practices receive only light discussion at major international AI safety summits, such as those in the United Kingdom, South Korea, and most recently this past February in France, which tend to focus on whether AI might invent biological weapons, develop new cyberattacks, or harbor unseen biases that threaten humanity. The silent transfer of value from creators to algorithms is emerging as one of the most overlooked economic risks of the AI boom. Indeed, people have already begun to report harm from decisions to use or deploy AI.

One of the first major labor disputes over the use of AI arose during the 2023 Writers’ Guild of America (WGA) strike. While the central issue was streaming services and the residuals owed to writers, negotiations over the use of generative AI prolonged the strike, and generative AI now has its own section in the WGA’s Minimum Basic Agreement (MBA). Essentially, the WGA argued that companies and studios should not use AI to write or rewrite literary material, and that AI-generated content cannot be used as source material, which has implications for how writers receive credit for their original work. The MBA also gives the WGA the right to assert that exploiting writers’ material to train an AI model is prohibited.
