Thursday, June 04, 2026

Zuckerberg Personally Authorized Massive Copyright Infringement to Train AI, Multiple Publishers Allege

Photo Credit: Snowscat

Five book publishers and an author allege that Meta CEO Mark Zuckerberg personally authorized mass copyright infringement to train the company’s AI. Now, they’re suing.

Meta and its founder and CEO Mark Zuckerberg now face a copyright infringement lawsuit by five book publishers and author Scott Turow, alleging that the company torrented millions of copyrighted works from notorious pirate sites to train its AI—at Zuckerberg’s direct instruction.

The lawsuit was filed on Tuesday (May 5) in the U.S. District Court for the Southern District of New York by publishers Hachette, Macmillan, McGraw Hill, Elsevier, and Cengage, as well as Turow individually. The suit is a proposed class action and seeks an unspecified amount in monetary damages for the alleged infringement.

“In their efforts to win the AI ‘arms race’ and build a functional generative AI model, Defendants Meta and Zuckerberg followed their well-known motto: ‘move fast and break things,’” the lawsuit reads. “They first illegally torrented millions of copyrighted books and journal articles from notorious pirate sites and downloaded unauthorized web scrapes of virtually the entire internet.”

“They then copied those stolen fruits many times over to train Meta’s multi-billion-dollar generative AI system called Llama,” the filing continues. “In doing so, Defendants engaged in one of the most massive infringements of copyrighted materials in history.”

A Meta spokesperson told Variety: “AI is powering transformative innovations, productivity, and creativity for individuals and companies, and courts have rightly found that training AI on copyrighted materials can qualify as fair use. We will fight this lawsuit aggressively.”

To Meta’s point, authors have sued AI companies for copyright infringement and lost. In June 2025, a federal judge rejected a claim brought by 13 authors, including comedian Sarah Silverman. They alleged that Meta violated their copyrights by training its AI model on their books, but the judge ruled that it constituted “fair use”—despite allegations that Meta pirated some 200,000 books to train its generative AI model.

What makes this go-around different is that Turow and the publishers specifically allege that Meta employees deliberately circumvented copyright protections at Zuckerberg’s behest after initially considering going through the correct channels and paying to license those works. The lawsuit therefore argues that such conduct falls outside fair use protection.

The lawsuit claims that, from January to April 2023, Meta considered licensing deals with major publishers, and that doing so would have required increasing the company’s “dataset licensing” budget to as much as $200 million. However, the company then “abruptly stopped its licensing strategy” once the question of licensing was “escalated” to Zuckerberg.

At this point, Meta’s business development team was given verbal instructions to cease licensing efforts, with one employee quoted as saying, “If we license [one] single book, we won’t be able to lean into the fair use strategy.”

“Meta—at Zuckerberg’s direction—copied millions of books, journal articles, and other written works without authorization, including those owned or controlled by Plaintiffs and the Class, and then made additional copies of those works to train Llama,” the suit asserts. “Zuckerberg himself personally authorized and actively encouraged the infringement. Meta also stripped [copyright management information] from the copyrighted works it stole. It did this to conceal its training sources and facilitate their unauthorized use.”

In December 2023, Meta employees began passing around a memo concerning the legal risks of using the online repository LibGen to source copyrighted material. The memo described the dataset as one “we know to be pirated” and stated that the company “would not disclose use of LibGen datasets used to train.”

“Ultimately, however, those concerns went unheeded,” the suit reads. “Zuckerberg and other Meta executives authorized and directed the torrenting of over 267 TB of pirated material—equivalent to hundreds of millions of publications and many times the size of the entire print collection of the Library of Congress.”

As a result, the lawsuit claims that Meta’s AI system “readily generates, at speed and scale, substitutes for Plaintiffs’ and the Class’ works on which it was trained.” These can take many forms, such as “verbatim and near-verbatim copies, replacement chapters of academic textbooks, summaries and alternative versions of famous novels and journal articles, inferior knockoffs that copy creative elements of original works, and derivative works.”
