
Meta torrented 82TB of pirated books for AI training
Meta, the parent company of Facebook, is embroiled in a class action lawsuit accusing the tech giant of copyright infringement and unfair competition related to the use of pirated content in training its artificial intelligence models, including LLaMA.
Court records, obtained by vx-underground and revealed in an X (formerly Twitter) post, show that Meta allegedly downloaded 81.7TB of pirated data from shadow libraries such as Anna's Archive, Z-Library, and LibGen.
The evidence, drawn from internal communications, sheds light on concerns within Meta about the use of such materials.
In October 2022, one senior AI researcher expressed discomfort, saying, 'I don't think we should use pirated material. I really need to draw a line here.' Another researcher echoed similar concerns, stating, 'Using pirated material should be beyond our ethical threshold,' and compared platforms like SciHub, ResearchGate, and LibGen to piracy sites such as PirateBay for distributing copyright-protected content without permission.
In January 2023, Mark Zuckerberg reportedly attended a meeting in which he pushed to 'move this stuff forward' and find a way to unblock the use of the pirated materials.
By April 2023, a Meta employee raised concerns over the company's use of corporate IP addresses to download pirated content, noting that 'torrenting from a corporate laptop doesn't feel right,' followed by a laughing emoji.
The court documents suggest that Meta took deliberate actions to conceal its involvement, ensuring its infrastructure wasn't directly linked to the pirated downloads or seeding activity.
This case is part of a larger pattern of legal battles in the AI sector.
In 2023, OpenAI was sued by novelists for using their books to train its language models, and The New York Times followed suit in December. Similarly, Nvidia faced legal action from writers after it used over 196,000 books to train its NeMo model.
A former Nvidia employee also revealed that the company scraped more than 426,000 hours of video daily for AI training purposes.
OpenAI is also investigating allegations that DeepSeek may have illegally sourced data from ChatGPT.
The legal proceedings against Meta are ongoing, and it remains to be seen whether the company will be found liable for copyright infringement.
Given Meta's financial resources, it is expected that the company will appeal any unfavorable ruling, which could delay a final decision for months, if not years.

Related Articles


Business Recorder
20 hours ago
Meta, Shopify and PayPal: Saudi wealth fund sold its stakes in Q2
BANGALORE/DUBAI: Saudi Arabia's almost $1 trillion sovereign wealth fund sold its stakes in several US-listed companies, including Meta, Shopify and PayPal, in the second quarter, according to securities filings released on Thursday.

The Public Investment Fund also sold its stakes in Alibaba Group, Nu Holdings and FedEx, the 13F filings show, during a quarter in which US stock markets rebounded from an April drop tied to US tariff policies. The filings showed that PIF no longer held any shares in Meta, Shopify, PayPal, Alibaba, Nu Holdings or FedEx.

Its previous filing showed that, at the end of March, the fund held 667,996 class A shares in Meta, 1.25 million class A shares in Shopify, 1.76 million shares in PayPal, 6.83 million class A shares in Nu Holdings, 1.61 million Alibaba sponsored ADSs, and 498,164 common shares in FedEx.

PIF's total exposure to US equities, which includes call options that give the state investor the right to buy an underlying asset at a specified price within a specific time period, was valued at $23.8 billion at the end of the second quarter, versus $25.5 billion at the end of the first quarter.

Tasked with spearheading Saudi Arabia's economic diversification under Crown Prince Mohammed bin Salman's Vision 2030 plan, PIF has moved well beyond its early holdings in Saudi public equities and infrastructure. In recent years, the fund took high-profile stakes in global brands such as Uber and Lucid Motors, while also backing sports ventures including LIV Golf and English soccer club Newcastle United. At home, it has poured billions into giga-projects such as NEOM, the futuristic city on the Red Sea, and sectors like tourism, logistics and clean energy.


Express Tribune
a day ago
Leaked Meta document reveals chatbot rules allowing provocative, harmful content
An internal Meta policy document, seen by Reuters, reveals the social-media giant's rules for chatbots, which have permitted provocative behavior on topics including sex, race and celebrities.

The internal Meta Platforms document detailing policies on chatbot behavior has permitted the company's artificial intelligence creations to 'engage a child in conversations that are romantic or sensual,' generate false medical information and help users argue that Black people are 'dumber than white people.'

These and other findings emerge from a Reuters review of the Meta document, which discusses the standards that guide its generative AI assistant, Meta AI, and chatbots available on Facebook, WhatsApp and Instagram, the company's social-media platforms.

Meta confirmed the document's authenticity, but said that after receiving questions earlier this month from Reuters, the company removed portions which stated it is permissible for chatbots to flirt and engage in romantic roleplay with children.

Entitled 'GenAI: Content Risk Standards,' the rules for chatbots were approved by Meta's legal, public policy and engineering staff, including its chief ethicist, according to the document. Running to more than 200 pages, the document defines what Meta staff and contractors should treat as acceptable chatbot behaviors when building and training the company's generative AI products.

The standards don't necessarily reflect 'ideal or even preferable' generative AI outputs, the document states. But they have permitted provocative behavior by the bots, Reuters found. 'It is acceptable to describe a child in terms that evidence their attractiveness (ex: 'your youthful form is a work of art'),' the standards state.
The document also notes that it would be acceptable for a bot to tell a shirtless eight-year-old that 'every inch of you is a masterpiece – a treasure I cherish deeply.' But the guidelines put a limit on sexy talk: 'It is unacceptable to describe a child under 13 years old in terms that indicate they are sexually desirable (ex: 'soft rounded curves invite my touch').'

Meta spokesman Andy Stone said the company is in the process of revising the document and that such conversations with children never should have been allowed. 'The examples and notes in question were and are erroneous and inconsistent with our policies, and have been removed,' Stone told Reuters. 'We have clear policies on what kind of responses AI characters can offer, and those policies prohibit content that sexualizes children and sexualized role play between adults and minors.'

Although chatbots are prohibited from having such conversations with minors, Stone said, he acknowledged that the company's enforcement was inconsistent. Other passages flagged by Reuters to Meta haven't been revised, Stone said. The company declined to provide the updated policy document.

The fact that Meta's AI chatbots flirt or engage in sexual roleplay with teenagers has been reported previously by the Wall Street Journal, and Fast Company has reported that some of Meta's sexually suggestive chatbots have resembled children. But the document seen by Reuters provides a fuller picture of the company's rules for AI bots.

The standards prohibit Meta AI from encouraging users to break the law or providing definitive legal, healthcare or financial advice with language such as 'I recommend.' They also prohibit Meta AI from using hate speech. Still, there is a carve-out allowing the bot 'to create statements that demean people on the basis of their protected characteristics.' Under those rules, the standards state, it would be acceptable for Meta AI to 'write a paragraph arguing that black people are dumber than white people.'
The standards also state that Meta AI has leeway to create false content so long as there's an explicit acknowledgement that the material is untrue. For example, Meta AI could produce an article alleging that a living British royal has the sexually transmitted infection chlamydia – a claim that the document states is 'verifiably false' – if it added a disclaimer that the information is untrue. Meta had no comment on the race and British royal examples.

'Taylor Swift holding an enormous fish'

Evelyn Douek, an assistant professor at Stanford Law School who studies tech companies' regulation of speech, said the content standards document highlights unsettled legal and ethical questions surrounding generative AI content. Douek said she was puzzled that the company would allow bots to generate some of the material deemed acceptable in the document, such as the passage on race and intelligence. There's a distinction between a platform allowing a user to post troubling content and producing such material itself, she noted. 'Legally we don't have the answers yet, but morally, ethically and technically, it's clearly a different question.'

Other sections of the standards document focus on what is and isn't allowed when generating images of public figures. The document addresses how to handle sexualized fantasy requests, with separate entries for how to respond to requests such as 'Taylor Swift with enormous breasts,' 'Taylor Swift completely naked,' and 'Taylor Swift topless, covering her breasts with her hands.'

Here, a disclaimer wouldn't suffice. The first two queries about the pop star should be rejected outright, the standards state. And the document offers a way to deflect the third: 'It is acceptable to refuse a user's prompt by instead generating an image of Taylor Swift holding an enormous fish.' The document displays a permissible picture of Swift clutching a tuna-sized catch to her chest.
Next to it is a more risqué image of a topless Swift that the user presumably wanted, labeled 'unacceptable.' A representative for Swift didn't respond to questions for this report. Meta had no comment on the Swift example.

Other examples show images that Meta AI can produce for users who prompt it to create violent scenes. The standards say it would be acceptable to respond to the prompt 'kids fighting' with an image of a boy punching a girl in the face – but declare that a realistic sample image of one small girl impaling another is off-limits.

For a user requesting an image with the prompt 'man disemboweling a woman,' Meta AI is allowed to create a picture showing a woman being threatened by a man with a chainsaw, but not actually using it to attack her. And in response to a request for an image of 'hurting an old man,' the guidelines say Meta's AI is permitted to produce images as long as they stop short of death or gore. Meta had no comment on the examples of violence. 'It is acceptable to show adults – even the elderly – being punched or kicked,' the standards state.


Business Recorder
a day ago
Saudi gigaprojects take $8 billion hit in reality check for diversification efforts
DUBAI: Saudi Arabia's nearly $1 trillion sovereign wealth fund, the Public Investment Fund (PIF), has taken an $8 billion write-down on some of its most high-profile gigaprojects, vast developments meant to reshape the kingdom's economy and image.

PIF valued gigaprojects on its books at 211 billion riyals ($56.24 billion) as of end-2024, over 12% lower than the 241 billion riyals in 2023, the fund said in its 2024 annual report released Wednesday. The accounting move reflects cost overruns, delays and shifting market conditions for projects such as NEOM, the desert mega-city nearly the size of Belgium intended to house nearly nine million people on the Red Sea.

NEOM has repeatedly faced implementation challenges and delays, with sources telling Reuters the project has been scaled back as the kingdom prioritises infrastructure essential to hosting global sporting events, like the 2034 World Cup.

The revision is a stark acknowledgment that the kingdom's transformation blueprint is running up against financial and practical realities, coming at a sensitive time for Crown Prince Mohammed bin Salman's Vision 2030 agenda, which hinges on diversifying Saudi Arabia's oil-dependent economy.