Anthropic to pay authors $1.5B to settle lawsuit over pirated chatbot training material

NEW YORK — Artificial intelligence company Anthropic has agreed to pay $1.5 billion to settle a class-action lawsuit by book authors who say the company took pirated copies of their works to train its chatbot.

The landmark settlement, if approved by a judge as soon as Monday, could mark a turning point in legal battles between AI companies and the writers, visual artists and other creative professionals who accuse them of copyright infringement.

The company has agreed to pay authors about $3,000 for each of an estimated 500,000 books covered by the settlement.

“As best as we can tell, it’s the largest copyright recovery ever,” said Justin Nelson, a lawyer for the authors. “It is the first of its kind in the AI era.”

A trio of authors — thriller novelist Andrea Bartz and nonfiction writers Charles Graeber and Kirk Wallace Johnson — sued last year and now represent a broader group of writers and publishers whose books Anthropic downloaded to train its chatbot Claude.

The Anthropic website and mobile phone app are shown in this photo in New York on July 5, 2024. (Richard Drew | AP)

A federal judge dealt the case a mixed ruling in June, finding that training AI chatbots on copyrighted books wasn’t illegal but that Anthropic wrongfully acquired millions of books through pirate websites.

If Anthropic had not settled, experts say losing the case after a scheduled December trial could have cost the San Francisco-based company even more money.

“We were looking at a strong possibility of multiple billions of dollars, enough to potentially cripple or even put Anthropic out of business,” said William Long, a legal analyst for Wolters Kluwer.

U.S. District Judge William Alsup of San Francisco has scheduled a Monday hearing to review the settlement terms.

Anthropic said in a statement Friday that the settlement, if approved, “will resolve the plaintiffs’ remaining legacy claims.”

“We remain committed to developing safe AI systems that help people and organizations extend their capabilities, advance scientific discovery, and solve complex problems,” said Aparna Sridhar, the company’s deputy general counsel.

As part of the settlement, the company has also agreed to destroy the original book files it downloaded.

Books are known to be important sources of data — in essence, billions of words carefully strung together — that are needed to build the AI large language models behind chatbots like Anthropic’s Claude and its chief rival, OpenAI’s ChatGPT.

Alsup’s June ruling found that Anthropic had downloaded more than 7 million digitized books that it “knew had been pirated.” It started with nearly 200,000 from an online library called Books3, assembled by AI researchers outside of OpenAI to match the vast collections on which ChatGPT was trained.

The debut thriller novel "The Lost Night" by Bartz, a lead plaintiff in the case, was among the books found in the Books3 dataset.

Anthropic later took at least 5 million copies from the pirate website Library Genesis, or LibGen, and at least 2 million copies from the Pirate Library Mirror, Alsup wrote.

The Authors Guild told its thousands of members last month that it expected “damages will be minimally $750 per work and could be much higher” if Anthropic was found at trial to have willfully infringed their copyrights. The settlement’s higher award — approximately $3,000 per work — likely reflects a smaller pool of affected books, after taking out duplicates and those without copyright.

On Friday, Mary Rasenberger, CEO of the Authors Guild, called the settlement “an excellent result for authors, publishers, and rightsholders generally, sending a strong message to the AI industry that there are serious consequences when they pirate authors’ works to train their AI, robbing those least able to afford it.”

The Danish Rights Alliance, which successfully fought to take down one of those shadow libraries, said Friday that the settlement would be of little help to European writers and publishers whose works aren’t registered with the U.S. Copyright Office.

“On the one hand, it’s comforting to see that compiling AI training datasets by downloading millions of books from known illegal file-sharing sites comes at a price,” said Thomas Heldrup, the group’s head of content protection and enforcement.

On the other hand, Heldrup said it fits a tech industry playbook to grow a business first and later pay a relatively small fine, compared to the size of the business, for breaking the rules.

“It is my understanding that these companies see a settlement like the Anthropic one as a price of conducting business in a fiercely competitive space,” Heldrup said.

The privately held Anthropic, founded by ex-OpenAI leaders in 2021, earlier this week put its value at $183 billion after raising another $13 billion in investments.

Anthropic also said it expects to make $5 billion in sales this year, but, like OpenAI and many other AI startups, it has never reported making a profit, relying instead on investors to back the high costs of developing AI technology in the expectation of future payoffs.

The settlement is likely to influence other disputes, including an ongoing lawsuit by authors and newspapers against OpenAI and its business partner Microsoft.

“This indicates that maybe for other cases, it’s possible for creators and AI companies to reach settlements without having to essentially go for broke in court,” said Long, the legal analyst.

The industry, including Anthropic, had largely praised Alsup’s June ruling because he found that training AI systems on copyrighted works so chatbots can produce their own passages of text qualified as “fair use” under U.S. copyright law because it was “quintessentially transformative.”

Comparing the AI model to “any reader aspiring to be a writer,” Alsup wrote that Anthropic “trained upon works not to race ahead and replicate or supplant them — but to turn a hard corner and create something different.”

But documents disclosed in court showed Anthropic employees’ internal concerns about the legality of their use of pirate sites. The company later shifted its approach and hired Tom Turvey, the former Google executive in charge of Google Books, a searchable library of digitized books that successfully weathered years of copyright battles.

With his help, Anthropic began buying books in bulk, tearing off the bindings and scanning each page before feeding the digitized versions into its AI model, according to court documents. But that didn’t undo the earlier piracy, according to the judge.

 
