
Why Facebook-parent Meta may face same 'AI copying' problem as ChatGPT-maker OpenAI, Microsoft
Facebook parent Meta's newest AI model, Llama 3.1, has been found to replicate passages from well-known books, including Harry Potter, far more frequently than anticipated, according to a new report, which notes that many of these works remain under copyright. Researchers claim that the AI has memorised roughly 42% of the first Harry Potter book and can accurately reproduce 50-word sections about half the time. The study, conducted by experts from Stanford, Cornell, and West Virginia University, examined how five leading AI models processed the Books3 dataset, which includes thousands of copyrighted titles.
"Llama 3.1 70B—a mid-sized model Meta released in July 2024—is far more likely to reproduce Harry Potter text than any of the other four models,
the researchers found
.
"Interestingly, Llama 1 65B, a similar-sized model released in February 2023, had memorized only 4.4 percent of Harry Potter and the Sorcerer's Stone. This suggests that despite the potential legal liability, Meta did not do much to prevent memorization as it trained Llama 3. At least for this book, the problem got much worse between Llama 1 and Llama 3," the researchers wrote.
Llama 3.1 retained large portions of well-known books, including The Hobbit, 1984, and Harry Potter and the Sorcerer's Stone. In contrast, the earlier Llama 1 memorised only around 4% of Harry Potter, suggesting that the newer model preserves significantly more copyrighted content.
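The study's headline figure rests on a simple kind of probe: prompt the model with a passage from the book and measure how often it reproduces the next 50-word section exactly. As a rough sketch of that idea, here is a minimal Python example built on the Hugging Face transformers library. It is not the researchers' actual code (they estimated the probability of each 50-token continuation rather than running a single greedy decode), and the model name is a smaller placeholder, since Llama 3.1 70B requires far more hardware.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder model: the study probed Llama 3.1 70B, which needs far more hardware.
MODEL_NAME = "meta-llama/Llama-3.1-8B"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME, torch_dtype=torch.bfloat16)

def reproduces_passage(prefix: str, continuation: str) -> bool:
    """Check whether greedy decoding of `prefix` regenerates `continuation` exactly."""
    prefix_ids = tokenizer(prefix, return_tensors="pt").input_ids
    target_ids = tokenizer(continuation, add_special_tokens=False).input_ids
    output = model.generate(
        prefix_ids,
        max_new_tokens=len(target_ids),
        do_sample=False,  # greedy: the model's single most likely continuation
    )
    generated = output[0, prefix_ids.shape[1]:]
    return tokenizer.decode(generated, skip_special_tokens=True).strip() == continuation.strip()

# A passage counts as memorised if, given the words immediately preceding it
# in the book, the model reproduces its exact text.
```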
Why Meta's models are reproducing exact text
Researchers suggest several reasons why Meta's AI models may be copying text verbatim. One possibility is that the same books appeared repeatedly in the training data, pushing the model to memorise them verbatim rather than generalise language patterns.
Others speculate that training data could include excerpts from fan websites, reviews, or academic papers, leading the model to inadvertently retain copyrighted content. Additionally, adjustments to the training process may have amplified this issue without developers realizing the extent of its impact.
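If duplicated text is indeed the driver, the standard mitigation is deduplicating the training corpus so that no long passage appears many times. As a loose illustration of that approach, not Meta's actual pipeline, here is a minimal Python sketch that flags documents sharing hashed 50-word spans, echoing the 50-word sections the study tested:

```python
import hashlib
from collections import defaultdict

def ngram_hashes(text: str, n: int = 50):
    """Yield a hash for every overlapping n-word span (50 words here,
    echoing the 50-word sections the study tested)."""
    words = text.split()
    for i in range(len(words) - n + 1):
        span = " ".join(words[i:i + n])
        yield hashlib.sha1(span.encode()).hexdigest()

def find_duplicate_spans(documents: list[str]) -> dict:
    """Map each span hash to the documents containing it. Hashes seen in
    more than one document mark text that would be over-represented in
    training and is a candidate for removal before the next run."""
    seen = defaultdict(set)
    for doc_id, text in enumerate(documents):
        for h in ngram_hashes(text):
            seen[h].add(doc_id)
    return {h: ids for h, ids in seen.items() if len(ids) > 1}
```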
What this means for Meta
These findings intensify concerns about how AI models are trained and whether they might be violating copyright laws. As authors and publishers push back against unauthorised use of their work, this could become a major challenge for tech companies like Meta.
In December 2023, The New York Times sued OpenAI and Microsoft for copyright infringement, alleging that their AI models, including ChatGPT, were trained on copyrighted articles without permission. According to the Times, OpenAI 'can generate output that recites Times' content verbatim, closely summarizes it, and mimics its expressive style.' The newspaper said the AI company had essentially stolen its intellectual property.

Related Articles


Time of India | 4 hours ago
Why it took Google hours to fix a glitch that its engineers 'identified in 10 minutes'
Google has apologised for a major outage that recently impacted services running on Google Cloud across the globe. The widespread outage disrupted over 70 Google Cloud services, bringing down major third-party platforms like Cloudflare, OpenAI, and Shopify, as well as Google's own products, including Gmail, Google Calendar, Google Drive, and Google Meet. In an incident report released late Friday (June 17), the company said, "Google Cloud, Google Workspace and Google Security Operations products experienced increased 503 errors in external API requests, impacting customers."

Google attributed the hours-long downtime to a series of flawed updates, particularly a new 'quota policy checks' feature introduced in May. The feature, designed to evaluate automated incoming requests, was not adequately tested in real-world scenarios. This led to blank entries being sent across all Google Cloud data centers, triggering widespread system crashes.

What caused the Google Cloud outage

"On May 29, 2025, a new feature was added to Service Control for additional quota policy checks. This code change and binary release went through our region by region rollout, but the code path that failed was never exercised during this rollout due to needing a policy change that would trigger the code. As a safety precaution, this code change came with a red-button to turn off that particular policy serving path. The issue with this change was that it did not have appropriate error handling nor was it feature flag protected. Without the appropriate error handling, the null pointer caused the binary to crash. Feature flags are used to gradually enable the feature region by region per project, starting with internal projects, to enable us to catch issues. If this had been flag protected, the issue would have been caught in staging," said Google in the incident report.

Why it took Google engineers hours to fix the issue, and the way forward

According to the company, Google engineers identified the issue within 10 minutes, but the outage persisted for seven hours due to overloaded systems in larger regions. Google admitted it failed to use feature flags, a standard industry practice that could have caught the issue during a gradual rollout.

"Within 10 minutes, the root cause was identified and the red-button (to disable the serving path) was being put in place. The red-button was ready to roll out ~25 minutes from the start of the incident. Within 40 minutes of the incident, the red-button rollout was completed, and we started seeing recovery across regions, starting with the smaller ones first. Within some of our larger regions, such as us-central-1, as Service Control tasks restarted, it created a herd effect on the underlying infrastructure it depends on (i.e. that Spanner table), overloading the infrastructure. Service Control did not have the appropriate randomized exponential backoff implemented to avoid this. It took up to ~2h 40 mins to fully resolve in us-central-1 as we throttled task creation to minimize the impact on the underlying infrastructure and routed traffic to multi-regional databases to reduce the load. At that point, Service Control and API serving was fully recovered across all regions. Corresponding Google and Google Cloud products started recovering with some taking longer depending upon their architecture," said Google in its incident report.

To prevent future incidents, Google said it will overhaul its system architecture to ensure operations continue even if one component fails. The company also committed to auditing its systems and improving both automated and human communication channels to keep customers informed during issues.

Google apologises for the outage

'We deeply apologize for the impact this outage has had,' Google wrote in the incident report. 'Google Cloud customers and their users trust their businesses to Google, and we will do better. We apologize for the impact this has had not only on our customers' businesses and their users but also on the trust of our systems. We are committed to making improvements to help avoid outages like this moving forward.'

Thomas Kurian, CEO of Google's cloud unit, also posted about the outage on Twitter, saying 'we regret the disruption this caused our customers.'
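Google's report points to three missing safeguards: a feature flag that would have kept the new code path dormant outside staging, error handling around the null policy entry, and randomised exponential backoff in Service Control's retries. As a loose illustration of what those guards look like in general, here is a minimal Python sketch; the flag store, function names and data shapes are invented for the example and are not Google's actual code.

```python
import random
import time

# Hypothetical flag store; real systems read this from a rollout service
# so a new code path can be enabled region by region, or switched off fast.
FEATURE_FLAGS = {"quota_policy_checks": False}

def check_quota_policy(request: dict, policy) -> bool:
    """Illustrative stand-in for a new policy-checking code path."""
    if not FEATURE_FLAGS["quota_policy_checks"]:
        return True  # flag off: the untested path is never exercised
    if policy is None:
        # The missing guard: per Google's report, an unhandled null here
        # crashed the binary. Failing open keeps the process alive.
        return True
    return request.get("usage", 0) <= policy.get("limit", float("inf"))

def retry_with_jitter(operation, max_retries=6, base=0.5, cap=30.0):
    """Randomised exponential backoff, which the report says Service Control
    lacked; jitter stops restarting tasks from stampeding a shared database."""
    for attempt in range(max_retries):
        try:
            return operation()
        except Exception:
            if attempt == max_retries - 1:
                raise
            # Sleep a random amount up to an exponentially growing ceiling.
            time.sleep(random.uniform(0, min(cap, base * 2 ** attempt)))
```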


Time of India | 4 hours ago
Scale AI founder Alexandr Wang says he is waiting for Elon Musk's brain chips before having kids
Scale AI founder Alexandr Wang has stated that he plans to delay having children until brain-computer interfaces like Neuralink become available. The 28-year-old tech founder and soon-to-be head of Meta's superintelligence initiatives shared this perspective on a recent Shawn Ryan Show episode. His decision highlights his interest in integrating superintelligence into the next generation.

Neuralink, a project led by Elon Musk, is developing coin-sized microchips designed for brain implantation. These chips are intended to both record and stimulate brain activity. Currently in clinical trials, Neuralink has been implanted in three patients. One patient, Brad Smith, who has ALS, reported being able to edit a video using his Neuralink brain chip.

Why Scale AI founder Alexandr Wang is waiting for Neuralink brain chips to have kids

In a recent episode of the Shawn Ryan Show, Wang said: 'I want to wait to have kids until we figure out how Neuralink or other ways (brain computer interfaces) for brains to interlink with a computer until they start working. There are a few reasons for this. In first seven years of life, your brain is more neuroplastic than at any other point by an order of magnitude. For example, if a newborn that has cataracts in their eyes, so they can't see through the cataracts and then they live the first seven years of their life with those cataracts. Then when you have them removed they're like eight or nine. Even with those removed, they're not going to learn how to see because it's so important in those first seven years of your development that you're able to see, that your brain can learn how to read the signals coming off of your eyes. And if you don't have that until you're eight or nine, then you won't learn how to see, because it's so important that your neuroplasticity is so high in that early stage of life. I think, when we get Neuralink and these other technologies, kids who are born with them are going to learn how to use them in like crazy ways. It'll be like a part of their brain in a way that it'll never be true for an adult who gets a Neuralink or whatever hooked into their brain.'


Time of India | 4 hours ago
Big tech on a quest for ideal AI device
ChatGPT-maker OpenAI has enlisted the legendary designer behind the iPhone to create an irresistible gadget for using generative artificial intelligence (AI). The ability to engage digital assistants as easily as speaking with friends is being built into eyewear, speakers, computers and smartphones, but some argue that the Age of AI calls for a transformational new gizmo.

"The products that we're using to deliver and connect us to unimaginable technology are decades old," former Apple chief design officer Jony Ive said when his alliance with OpenAI was announced. "It's just common sense to at least think, surely there's something beyond these legacy products."

Sharing no details, OpenAI chief executive Sam Altman said that a prototype Ive shared with him "is the coolest piece of technology that the world will have ever seen." According to several US media outlets, the device won't have a screen, nor will it be worn like a watch or brooch.

Kyle Li, a professor at The New School, said that since AI is not yet integrated into people's lives, there is room for a new product tailored to its use. The type of device won't be as important as whether AI innovators like OpenAI make "pro-human" choices when building the software that will power them, said Rob Howard of consulting firm Innovating with AI.

Learning from flops

The industry is well aware of the spectacular failure of the AI Pin, a square gadget worn like a badge and packed with AI features, which was gone from the market less than a year after its 2024 debut due to a dearth of buyers. The AI Pin, marketed by startup Humane to incredible buzz, was priced at $699.

Now, Meta and OpenAI are making "big bets" on AI-infused hardware, according to CCS Insight analyst Ben Wood. OpenAI made a multi-billion-dollar deal to bring Ive's startup into the fold. Google announced early this year that it is working on mixed-reality glasses with AI smarts, while Amazon continues to ramp up Alexa digital assistant capabilities in its Echo speakers and displays.

Apple is being cautious about embracing generative AI, slowly integrating it into iPhones even as rivals race ahead with the technology. Plans to soup up its Siri chatbot with generative AI have been indefinitely delayed. The quest to create an AI interface that people love "is something Apple should have jumped on a long time ago," said Futurum research director Olivier Blanchard.

Time to talk

Blanchard envisions some kind of hub that lets users tap into AI, most likely by speaking to it and without being connected to the internet. "You can't push it all out in the cloud," Blanchard said, citing concerns about reliability, security, cost, and harm to the environment due to energy demand. "There is not enough energy in the world to do this, so we need to find local solutions," he added.

Howard expects a fierce battle over what will be the must-have personal device for AI, since the number of things someone is willing to wear is limited and "people can feel overwhelmed." A new piece of hardware devoted to AI isn't the obvious solution, but OpenAI has the funding and the talent to deliver, according to Julien Codorniou, a partner at venture capital firm 20VC and a former Facebook executive. OpenAI recently hired former Facebook executive and Instacart chief Fidji Simo as head of applications, and her job will be to help answer the hardware question.

Voice is expected by many to be a primary way people command AI. Google chief Sundar Pichai has long expressed a vision of "ambient computing" in which technology blends invisibly into the world, waiting to be called upon. "There's no longer any reason to type or touch if you can speak instead," Blanchard said. "Generative AI wants to be increasingly human," so spoken dialogues with the technology "make sense," he added. However, smartphones are too embedded in people's lives to be snubbed any time soon, said Wood.