OpenAI strikes multi-million dollar deals with publishers for AI training data

OpenAI is reportedly in discussions with various news publishers to gain access to their extensive news archives, offering between $1 million and $5 million per year for the data.

The pursuit of training data has led OpenAI to explore partnerships with press publishers. One notable collaboration is with Axel Springer, a deal that not only provides OpenAI with training data but also allows the real-time integration of news from the publisher’s outlets, including prominent names like Bild, Welt, Business Insider, and Politico, into ChatGPT. Although the financial details of this deal remain undisclosed, reports from the Financial Times suggest a figure exceeding $10 million.

This trend isn’t limited to OpenAI. Apple has also shown interest in similar data, reportedly offering up to $50 million for multi-year licenses. However, publishers remain cautious, particularly because Apple is demanding extensive rights and has not made clear how it plans to use generative AI in news reporting.

The AI industry faces increasing pressure, not just in acquiring training data but also in navigating copyright issues. Authors, artists, and actors filed high-profile lawsuits throughout 2023, culminating in a significant lawsuit by the New York Times against OpenAI and Microsoft. The suit alleges unauthorized use of the publisher’s content to train AI models. Despite ongoing negotiations, a resolution remains elusive, with OpenAI expressing a commitment to respecting copyright laws and seeking mutually beneficial agreements.

A critical aspect of these developments is the potential exclusion of smaller media companies, despite their content also being utilized by AI developers. This raises questions about equitable solutions, possibly involving negotiations with collective rights organizations.

Both OpenAI and Microsoft have expressed intentions to involve content creators in their AI business models. However, the strategy for including authors and creators outside major media companies is still under development. Such a strategy matters all the more in light of lawsuits from cultural figures like George R.R. Martin and Sarah Silverman.

Written by Vishak

Vishak is a skilled editor-in-chief at Code and Hack with a passion for AI and coding. He has a deep understanding of the latest trends and advancements in both fields. He creates engaging and informative content on various topics related to AI, including machine learning, natural language processing, and coding. He stays up to date with the latest news and breakthroughs in these areas and delivers insightful articles and blog posts that help his readers stay informed and engaged.
