Stack Overflow, one of the world’s largest developer communities, plans to start charging big AI developers for access to its data. The move comes in response to the growing demand for large language models (LLMs) and other generative AI systems, such as ChatGPT and DALL-E, which require massive amounts of data for training. Platforms such as Stack Overflow and Reddit have long been a source of this data for AI engineers, and they are now attempting to monetise that content.
According to reports, Stack Overflow plans to start charging AI algorithm developers in the middle of this year, offering access to more than 50 million questions and answers in return. Currently, the site has over 20 million registered users.
The platform’s CEO, Prashanth Chandrasekar, stated that community platforms that fuel the development of LLMs need to be compensated for their contributions so that companies like Stack Overflow can reinvest in their communities and keep them thriving.
Stack Overflow is following in the footsteps of Reddit, which also recently announced plans to ban the free use of its content for training neural networks and to start charging AI developers. The move reflects the rising cost of developing AI systems like ChatGPT and DALL-E, which can run to hundreds of millions of dollars. Companies like Microsoft, Google, and OpenAI rely heavily on platforms like Stack Overflow and Reddit for their data, and they currently pay nothing for it.
However, some experts fear that introducing a commercial element may slow down the development of AI and the improvement of LLMs. As the cost of accessing data from community platforms like Stack Overflow and Reddit increases, smaller AI developers may be priced out of the market, leading to a concentration of power among big tech companies.
The move by Stack Overflow to start charging AI developers for access to its data marks a significant shift in the AI industry’s landscape. While it is important to reward community platforms for their contributions, it is equally important to ensure that data costs do not impede AI development and that data remains accessible on equitable terms for all players.