NVIDIA is in trouble. It’s about piracy
NVIDIA has been accused of piracy. The company allegedly trained its artificial intelligence on illegally obtained data.
When we read about the fight against online piracy, it is usually large corporations accusing smaller players. The reverse situation is a nice change in its own way. This time it is NVIDIA that has come under fire: the company allegedly used books from illegal sources to train its artificial intelligence models.
NVIDIA used pirated books
A group of American writers, Abdi Nazemian, Brian Keene and Stewart O’Nan, has filed a lawsuit against NVIDIA in a California federal court. The authors accuse the giant of using the “Books3” dataset, which includes illegal copies of their works, in its artificial intelligence development. The dataset was allegedly used, among other things, to train the NeMo Megatron language model.
The “Books3” collection was compiled by Shawn Presser, an artificial intelligence researcher, from the holdings of the pirate website Bibliotik. It was created in 2020 and later became part of a larger collection, “The Pile”, prepared by EleutherAI. It is this larger set that NVIDIA allegedly used in its work on artificial intelligence. The giant from Santa Clara is not the only company to have used it; Meta, Microsoft and OpenAI, among others, used it as well.
The authors are demanding compensation from NVIDIA for the damage suffered. The lawsuit does not specify the amount of the claim.
The case is certainly worth following, although at the moment the situation does not look particularly promising for the plaintiffs. In February 2024, a California court dismissed a similar lawsuit against OpenAI, finding no copyright infringement in the situation presented.