OpenAI: Copyrighted data ‘impossible’ to avoid for AI training

  • Source: AI News
  • 01/11/2024
OpenAI made waves this week with its bold assertion to a UK parliamentary committee that it would be “impossible” to develop today’s leading AI systems without using vast amounts of copyrighted data.

The company argued that advanced AI tools like ChatGPT require such broad training data that excluding copyrighted material would be unworkable.

In written testimony, OpenAI stated that between expansive copyright laws and the ubiquity of protected online content, “virtually every sort of human expression” would be off-limits for training data. From news articles to forum comments to digital images, little online content can be utilised freely and legally.

According to OpenAI, attempts to create capable AI while avoiding copyright infringement would fail: “Limiting training data to public domain books and drawings created more than a century ago … would not provide AI systems that meet the needs of today’s citizens.”

While defending its practices as compliant with copyright law, OpenAI conceded that partnerships and compensation schemes with publishers may be warranted to “support and empower creators.” But the company gave no indication that it intends to significantly restrict its harvesting of online data, including paywalled journalism and literature.

This stance has opened OpenAI up to multiple lawsuits, including from media outlets like The New York Times alleging copyright breaches.

Nonetheless, OpenAI appears unwilling to fundamentally alter its data collection and training processes, given the constraints it describes as “impossible” that self-imposed copyright limits would bring. The company instead hopes to rely on a broad interpretation of fair use to legally draw on vast swathes of copyrighted data.

As advanced AI continues to demonstrate an uncanny ability to emulate human expression, legal experts expect vigorous courtroom battles over infringement by systems intrinsically designed to absorb enormous volumes of protected text, media, and other creative output.

Image: "Open AI" by Levart_Photographer, licensed under Unsplash (unsplash.com)

