Meta has been accused of torrenting an astonishing 81.7TB of pirated books to train its Llama AI models, according to a new filing in a lawsuit before the US District Court for the Northern District of California.
![Meta accused of downloading torrents of 81.7TB of pirated books to train its Llama AI models 29](https://static.tweaktown.com/news/1/0/103101_29_meta-torrented-over-81-7tb-of-pirated-books-to-train-ai-authors-say.jpg)
The social networking giant has been accused of illegally torrenting copyrighted materials from sources including Z-Library and LibGen. The plaintiffs, led by author Richard Kadrey and representing a proposed class, have filed a motion objecting to a pre-trial discovery ruling that they argue limits their ability to gather critical evidence against Meta.
The authors claim that Meta's last-minute disclosure of over 2,000 documents on December 13, 2024, just hours before the close of fact discovery, revealed admissions from Meta employees about using pirated materials for its AI training. The authors say the newly unsealed emails are damning evidence in their copyright lawsuit, which claims that Meta unlawfully trained its AI models using pirated books downloaded over torrents.
According to the authors' court filing, the new evidence shows that Meta torrented "at least 81.7 terabytes of data across multiple shadow libraries through the site Anna's Archive, including at least 35.7 terabytes of data from Z-Library and LibGen". The filing adds that "Meta also previously torrented 80.6 terabytes of data from LibGen".
The authors' filing alleges that "the magnitude of Meta's unlawful torrenting scheme is astonishing", insisting that "vastly smaller acts of data piracy - just .008 percent of the amount of copyrighted works Meta pirated - have resulted in Judges referring the conduct to the US Attorneys' office for criminal investigation".
One Meta staffer reportedly said: "I feel that using pirated material should be beyond our ethical threshold", while another document alleges that Meta's decision to use LibGen was escalated to Meta CEO Mark Zuckerberg. The authors claim that internal emails about torrenting prove that Meta was well aware its actions were illegal, pointing to warnings from employees that they say were ignored.
The plaintiffs are challenging several aspects of a recent discovery ruling:
- Reopening Depositions: They argue that the late-disclosed documents contradict prior testimony from key Meta witnesses and justify reopening depositions to question them about these revelations.
- Torrenting Data: Plaintiffs are seeking access to Meta's torrenting logs and peer-sharing records to demonstrate how much pirated material was downloaded and redistributed.
- Llama 4 and 5 Training Datasets: The plaintiffs claim that the datasets used to train upcoming versions of Llama are relevant to their case and should be produced.
- Crime-Fraud Exception: They allege that Meta's attorneys were involved in decisions to use pirated materials despite knowing it was illegal, warranting an in-camera review of privileged communications under the crime-fraud exception.