OpenAI is in court again. This time over an encyclopedia

Another high-profile legal dispute is under way in the world of artificial intelligence, and it may have important consequences for the entire industry. On Friday, Encyclopedia Britannica and Merriam-Webster filed a lawsuit against OpenAI, accusing the company of unlawful use of their content. Now the Reuters news agency has revealed more details.

This is neither the first nor the last such case

The models used in ChatGPT, among other products, allegedly copied copyrighted material many times. The lawsuit states that GPT-4 “memorized” much of the encyclopedia and can reproduce large fragments word for word. According to the plaintiffs, this means that the data was used illegally as early as the model training stage.

The court documents include specific examples of AI-generated responses juxtaposed directly with the original Britannica texts. In some cases, entire paragraphs are said to be identical or very close. This is one of the strongest allegations in the case, because it concerns not only the use of training data itself but also the end result visible to users.

Britannica also highlights another problem, one that increasingly appears in the context of generative AI. According to the company, tools like ChatGPT cannibalize internet traffic, because instead of directing users to sources, they provide ready-made answers that compete directly with the content published by publishers. In practice, this may mean a decline in visits and real financial losses.

Finally, it is worth recalling that OpenAI is already facing similar allegations from The New York Times. And in September, rival Anthropic decided to settle a class action lawsuit over the use of books to train models. The agreement resulted in a payment of as much as $1.5 billion to the authors. If this continues, the AI market will face major changes.