Latest news with #fairuse


South China Morning Post
2 hours ago
- Business
- South China Morning Post
Anthropic wins ruling on use of copyrighted books to train AI, but not pirated books
In a test case for the artificial intelligence (AI) industry, a federal judge has ruled that AI company Anthropic did not break the law by training its chatbot Claude on millions of copyrighted books. But the company is still on the hook and must now go to trial over how it acquired those books by downloading them from online 'shadow libraries' of pirated copies. US District Judge William Alsup of San Francisco said in a ruling filed late Monday that the AI system's distilling from thousands of written works to be able to produce its own passages of text qualified as 'fair use' under US copyright law because it was 'quintessentially transformative'. 'Like any reader aspiring to be a writer, Anthropic's [AI large language models] trained upon works not to race ahead and replicate or supplant them – but to turn a hard corner and create something different,' Alsup wrote. But while dismissing a key claim made by the group of authors who sued the company for copyright infringement last year, Alsup also said Anthropic must still go to trial in December over its alleged theft of their works. 'Anthropic had no entitlement to use pirated copies for its central library,' Alsup wrote.

Malay Mail
5 hours ago
- Business
- Malay Mail
US judge rules Anthropic can train AI on books without permission, but says pirated library still copyright breach
SAN FRANCISCO, June 25 — A US federal judge has sided with Anthropic regarding training its artificial intelligence models on copyrighted books without authors' permission, a decision with the potential to set a major legal precedent in AI deployment. District Court Judge William Alsup ruled on Monday that the company's training of its Claude AI models with books bought or pirated was allowed under the 'fair use' doctrine in the US Copyright Act. 'Use of the books at issue to train Claude and its precursors was exceedingly transformative and was a fair use,' Alsup wrote in his decision. 'The technology at issue was among the most transformative many of us will see in our lifetimes,' Alsup added in his 32-page decision, comparing AI training to how humans learn by reading books. Tremendous amounts of data are needed to train large language models powering generative AI. Musicians, book authors, visual artists and news publications have sued various AI companies that used their data without permission or payment. AI companies generally defend their practices by claiming fair use, arguing that training AI on large datasets fundamentally transforms the original content and is necessary for innovation. 'We are pleased that the court recognised that using works to train LLMs was transformative,' an Anthropic spokesperson said in response to an AFP query. The judge's decision is 'consistent with copyright's purpose in enabling creativity and fostering scientific progress,' the spokesperson added.

Blanket protection rejected

The ruling stems from a class-action lawsuit filed by authors Andrea Bartz, Charles Graeber, and Kirk Wallace Johnson, who accused Anthropic of illegally copying their books to train Claude, the company's AI chatbot that rivals ChatGPT.
However, Alsup rejected Anthropic's bid for blanket protection, ruling that the company's practice of downloading millions of pirated books to build a permanent digital library was not justified by fair use protections. Along with downloading books from websites offering pirated works, Anthropic bought copyrighted books, scanned the pages and stored them in digital formats, according to court documents. Anthropic's aim was to amass a library of 'all the books in the world' for training AI models on content as it deemed fit, the judge said in his ruling. While training AI models on the pirated content posed no legal violation, downloading pirated copies to build a general-purpose library constituted copyright infringement, the judge ruled, regardless of eventual training use. The case will now proceed to trial to determine damages related to the pirated library copies. Anthropic said it disagreed with going to trial on this part of the decision and was evaluating its legal options. 'Judge Alsup's decision is a mixed bag,' said Keith Kupferschmid, chief executive of US nonprofit Copyright Alliance. 'In some instances AI companies should be happy with the decision and in other instances copyright owners should be happy.' Valued at $61.5 billion and heavily backed by Amazon, Anthropic was founded in 2021 by former OpenAI executives. The company, known for its Claude chatbot and AI models, positions itself as focused on AI safety and responsible development. — AFP
Yahoo
6 hours ago
- Business
- Yahoo
Bad News for Movie Studios: Authors Just Lost on a Key Issue In a Major AI Lawsuit
Transformative. That's how a federal court characterized Amazon-backed Anthropic's use of millions of books across the web to teach its artificial intelligence system. It's the first decision to consider the issue and will serve as a template for other courts overseeing similar cases. And studios, now that some have entered the fight over the industry-defining technology, should be uneasy about the ruling. The thrust of these cases will be decided by one question: Are AI companies covered by fair use, the legal doctrine in intellectual property law that allows creators to build upon copyrighted works without a license? On that issue, a court found that Anthropic is on solid legal ground, at least with respect to training. The technology is 'among the most transformative many of us will see in our lifetimes,' wrote U.S. District Judge William Alsup. Still, Anthropic will face a trial over illegally downloading seven million books to create a library that was used for training. That it later purchased copies of the books it had earlier stolen off the internet to cover its tracks doesn't absolve it of liability, the court concluded. The company faces potential damages of hundreds of millions of dollars stemming from the decision, which could lead to Disney and Universal getting a similar payout, depending on what they unearth in discovery over how Midjourney allegedly obtained copies of thousands of films that were repurposed to teach its image generator. Last year, authors filed a lawsuit against Anthropic accusing it of illegally downloading and copying their books to power its AI chatbot Claude. The company chose not to move to dismiss the complaint and instead moved straight to a decision on fair use.
In the ruling, the court found that authors don't have the right to exclude Anthropic from using their works to train its technology, much in the same way they don't have the right to exclude any person from reading their books to learn how to write. 'Everyone reads texts, too, then writes new texts,' reads the order. 'They may need to pay for getting their hands on a text in the first instance. But to make anyone pay specifically for the use of a book each time they read it, each time they recall it from memory, each time they later draw upon it when writing new things in new ways would be unthinkable.' If someone were to read all the modern-day classics, memorize them and emulate a blend of their writing, that wouldn't constitute copyright infringement, the court concluded. Like any reader who wants to be a writer, Anthropic's technology draws upon works not to replicate or supplant them but to create something entirely different, according to the order. Those aren't the findings that Disney or Universal — both of whom are suing Midjourney for copyright infringement — wanted or expected. For them, there's reason to worry that Alsup's analysis will shape how the judge overseeing their case weighs claims that could undermine the development of a technology another court has found to be revolutionary (or something close to it). More broadly, it could be found that AI video generators, like Sora, are simply distilling every movie ever made to create completely new works. 'This Anthropic decision will likely be cited by all creators of AI models to support the argument that fair use applies to the use of massive datasets to train foundational models,' says Daniel Barsky, an intellectual property lawyer at Holland & Knight. Important to note: The authors didn't allege that responses generated by Anthropic infringed upon their works.
And if they had, they would've lost that argument under the court's finding that guardrails are in place to ensure that no infringing output ever reached users. Alsup compared it to Google imposing limits on how many snippets of text from any one book could be seen by a user through its Google Books service, preventing its search function from being misused as a way to access full books for free. 'Here, if the outputs seen by users had been infringing, Authors would have a different case,' Alsup writes. 'And, if the outputs were ever to become infringing, Authors could bring such a case. But that is not this case.' But that could be the case for Midjourney, which returns nearly exact replicas of frames from films in some instances. When prompted with 'Thanos Infinity War,' Midjourney — an AI program that translates text into hyper-realistic graphics — replied with an image of the purple-skinned villain in a frame that appears to be taken from the Marvel movie or promotional materials, with few to no alterations made. A shot of Tom Cruise in the cockpit of a fighter jet, from Top Gun: Maverick, was produced when the tool was asked for a frame from the film. The tool can seemingly replicate almost any animation style, generating startlingly accurate characters from titles ranging from DreamWorks' Shrek to Pixar's Ratatouille to Warner Bros.' The Lego Movie. 'The fact that Midjourney generates copies and derivatives of' films from Disney and Universal proves that the company, without their knowledge or permission, 'copied plaintiffs' copyrighted works to train and develop' its technology, states the complaint. Also at play: the possibility that Midjourney pirated the studios' movies. In the June 23 ruling, Alsup found that Anthropic's illegal downloading of seven million books to build a library to be used for training isn't covered by fair use. He said the company could've instead paid for the copies.
Such piracy, the court concluded, is 'inherently, irredeemably infringing.' With statutory damages for willful copyright infringement reaching up to $150,000 per work, massive payouts are a possibility.


Al Jazeera
7 hours ago
- Business
- Al Jazeera
US judge allows company to train AI using copyrighted literary materials
A United States federal judge has ruled that the company Anthropic made 'fair use' of the books it utilised to train artificial intelligence (AI) tools without the permission of the authors. The favourable ruling comes at a time when the impacts of AI are being discussed by regulators and policymakers, and the industry is using its political influence to push for a loose regulatory framework. 'Like any reader aspiring to be a writer, Anthropic's LLMs [large language models] trained upon works not to race ahead and replicate or supplant them — but to turn a hard corner and create something different,' US District Judge William Alsup said. A group of authors had filed a class-action lawsuit alleging that Anthropic's use of their work to train its chatbot, Claude, without their consent was illegal. But Alsup said that the AI system had not violated the safeguards in US copyright laws, which are designed for 'enabling creativity and fostering scientific progress'. He accepted Anthropic's claim that the AI's output was 'exceedingly transformative' and therefore fell under the 'fair use' protections. Alsup, however, did rule that Anthropic's copying and storage of seven million pirated books in a 'central library' infringed author copyrights and did not constitute fair use. The fair use doctrine, which allows limited use of copyrighted materials for creative purposes, has been employed by tech companies as they create generative AI. Technology developers often sweep up large swaths of existing material to train their AI models. Still, fierce debate continues over whether AI will facilitate greater artistic creativity or allow the mass-production of cheap imitations that render artists obsolete to the benefit of large companies.
The writers who brought the lawsuit — Andrea Bartz, Charles Graeber and Kirk Wallace Johnson — alleged that Anthropic's practices amounted to 'large-scale theft', and that the company had sought to 'profit from strip-mining the human expression and ingenuity behind each one of those works'. While Tuesday's decision was considered a victory for AI developers, Alsup nevertheless ruled that Anthropic must still go to trial in December over the alleged theft of pirated works. The judge wrote that the company had 'no entitlement to use pirated copies for its central library'.


CBS News
9 hours ago
- Business
- CBS News
Anthropic wins key AI copyright case, but remains on the hook for using pirated books
Anthropic has won a major legal victory in a case over whether the artificial intelligence company was justified in hoovering up millions of copyrighted books to train its chatbot. In a ruling that could set an important precedent for similar disputes, Judge William Alsup of the United States District Court for the Northern District of California on Tuesday said Anthropic's use of legally purchased books to train its AI model, Claude, did not violate U.S. copyright law. Anthropic, which was founded by former executives with ChatGPT developer OpenAI, introduced Claude in 2023. Like other generative AI bots, the tool lets users ask natural language questions and then provides neatly summarized answers using AI trained on millions of books, articles and other material. Alsup ruled that Anthropic's use of copyrighted books to train its large language model, or LLM, was "quintessentially transformative" and did not violate "fair use" doctrine under copyright law. "Like any reader aspiring to be a writer, Anthropic's LLMs trained upon works not to race ahead and replicate or supplant them, but to turn a hard corner and create something different," his decision states. By contrast, Alsup also found that Anthropic may have broken the law when it separately downloaded millions of pirated books and said it will face a separate trial in December over this issue. Court documents revealed that Anthropic employees expressed concern about the legality of using pirate sites to access books. The company later shifted its approach and hired a former Google executive in charge of Google Books, a searchable library of digitized books that successfully weathered years of copyright battles.

Authors had filed suit

Anthropic cheered the ruling. "We are pleased that the Court recognized that using works to train LLMs (large language models) was transformative — spectacularly so," an Anthropic spokesperson told CBS News in an email.
The ruling stems from a case filed last year by three authors in federal court. After Anthropic used copies of their books to train Claude, Andrea Bartz, Charles Graeber and Kirk Wallace Johnson sued Anthropic for alleged copyright infringement, claiming the company's practices amounted to "large-scale theft." The authors also alleged that Anthropic "seeks to profit from strip-mining the human expression and ingenuity behind each one of those works." The authors' attorneys declined to comment. Other AI companies have also come under fire over the material they use to build their large language models. The New York Times, for example, sued OpenAI and Microsoft in 2023, claiming that the tech companies used millions of its articles to train their automated chatbots. At the same time, some media companies and publishers are also seeking compensation by licensing their content to companies like Anthropic and OpenAI.