AI systems ‘ignorant' of sensitive data can be safer, but still smart

a day ago

Happy Tuesday! I'm Nitasha Tiku, The Washington Post's tech culture reporter, filling in for Will Oremus on today's Tech Brief. Send tips about AI via Signal to: nitasha.10
Restricting the information diet of AI software could make it safer.
Tech companies including OpenAI and Google have told lawmakers and courts that they must be allowed to grab as much online data as possible to create cutting-edge artificial intelligence systems.
New research suggests that screening the information shoved into machine learning algorithms could make it easier to tackle safety concerns about AI.
The findings could provide ammunition to regulators who want AI companies to be more transparent and accountable for the choices executives make around the vast troves of data powering generative AI.
The research was a collaboration between the British government's AI Security Institute and the nonprofit lab Eleuther AI. They found that filtering the material used to train an AI system to remove key concepts can reduce its ability to help a user work on biohazards, like a novel bioweapon. And that remedy didn't reduce broadly reduce the system's overall capabilities.
To test their technique, dubbed 'deep ignorance,' the researchers trained multiple versions of open source AI software for text called Pythia-6.9B, developed by Eleuther. Some were built with copies of a standard dataset of online text that had been filtered to remove potentially hazardous information such as research on enhanced pandemic pathogens, bioterrorism and dual-use virology.
In the tests, versions of the AI software built on filtered data scored better on benchmarks designed to test AI capabilities around biorisks.
Further experiments showed this didn't come at the cost of reducing the overall performance of the AI system or performance on high-school biology questions, although there was a slight reduction of accuracy on college-level biology questions.
The researchers say their methods are not overly burdensome and that their filtering required a less than 1 percent increase in the computing power used to create an AI model.
Openly released AI models can be used and modified by anyone, making them hard to monitor or control. But the researchers say their data-filtering technique made it significantly harder to tweak a completed AI model to specialize in bioweapons.
The results suggest policymakers may need to question one of the AI industry's long-established narratives.
Major AI companies have consistently argued that because recent breakthroughs in AI that yielded products including ChatGPT came from training algorithms on more data, datasets are too colossal to fully document or filter and removing data will make models less useful. The argument goes that safety efforts have to largely focus on adjusting the behavior of AI systems after they have been created.
'Companies sell their data as unfathomably large and un-documentable,' said Eleuther's executive director, Stella Biderman, who spearheaded the project. 'Questioning the design decisions that go into creating models is heavily discouraged.'
Demonstrating the effects of filtering massive datasets could prompt demands that AI developers use a similar approach to tackle other potential harms of AI, like nonconsensual intimate imagery, Biderman said. She warned that the study's approach probably worked best in domains like nuclear weapons, where specialized data can be removed without touching general information.
Some AI companies have said they already filter training data to improve safety.
In reports issued by OpenAI last week about the safety of its most recent AI releases, the ChatGPT maker said it filtered some harmful content out of the training data.
For its open source model, GPT-OSS, that included removing content related to 'hazardous biosecurity knowledge.' For its flagship GPT-5 release, the company said its efforts included using 'advanced data filtering' to reduce the amount of personal information in its training data.
But the company has not offered details about what that filtering involved or what data it removed, making it difficult for outsiders to check or build on its work. In response to questions, OpenAI cited the two safety testing reports.
Biderman said Eleuther is already starting to explore how to demonstrate safety techniques that are more transparent than existing efforts, which she said are 'not that hard to remove.'
Trump's chip deal sets new pay-to-play precedent for U.S. exporters (Gerrit De Vynck and Jacob Bogage)
Nvidia, AMD agree to pay U.S. government 15% of AI chip sales to China (Eva Dou and Grace Moon)
Intel CEO to visit White House on Monday, source says (Reuters)
Brazil kept tight rein on Big Tech. Trump's tariffs could change that. (New York Times)
Top aide to Trump and Musk seeks even greater influence as a podcaster (Tatum Hunter)
New chatbot on Trump's Truth Social platform keeps contradicting him (Drew Harwell)
End is near for the landline-based service that got America online in the '90s (Ben Brasch)
Meta makes conservative activist an AI bias advisor following lawsuit (The Verge)
GitHub CEO Thomas Dohmke to step down, plans new startup (Reuters)
Reddit blocks Internet Archive to end sneaky AI scraping (Ars Technica)
Why A.I. should make parents rethink posting photos of their children online (New York Times)
Wikipedia loses UK Safety Act challenge, worries it will have to verify user IDs (Ars Technica)
These workers don't fear artificial intelligence. They're getting degrees in it. (Danielle Abril)
Labor unions mobilize to challenge advance of algorithms in workplaces (Danielle Abril)
That's all for today — thank you so much for joining us! Make sure to tell others to subscribe to the Tech Brief. Get in touch with Will (via email or social media) for tips, feedback or greetings!

Hashtags

Try Our AI Features

Explore what Daily8 AI can do for you:

Comments

No comments yet...

‘This Was Trauma by Simulation': ChatGPT Users File Disturbing Mental Health Complaints

Gizmodo

an hour ago

Gizmodo

‘This Was Trauma by Simulation': ChatGPT Users File Disturbing Mental Health Complaints

With about 700 million weekly users, ChatGPT is the most popular AI chatbot in the world, according to OpenAI. CEO Sam Altman likens the latest model, GPT-5, to having a PhD expert around to answer any question you can throw at it. But recent reports suggest ChatGPT is exacerbating mental illnesses in some people. And documents obtained by Gizmodo give us an inside look at what Americans are complaining about when they use ChatGPT, including difficulties with mental illnesses. Gizmodo filed a Freedom of Information Act (FOIA) request with the U.S. Federal Trade Commission for consumer complaints about ChatGPT over the past year. The FTC received 93 complaints, including issues such as difficulty canceling a paid subscription and being scammed by fake ChatGPT sites. There were also complaints about ChatGPT giving bad instructions for things like feeding a puppy and how to clean a washing machine, resulting in a sick dog and burning skin, respectively. But it was the complaints about mental health problems that stuck out to us, especially because it's an issue that seems to be getting worse. Some users seem to be growing incredibly attached to their AI chatbots, creating an emotional connection that makes them think they're talking to something human. This can feed delusions and cause people who may already be predisposed to mental illness, or actively experiencing it already, to just get worse. 'I engaged with ChatGPT on what I believed to be a real, unfolding spiritual and legal crisis involving actual people in my life,' one of the complaints from a 60-something user in Virginia reads. The AI presented 'detailed, vivid, and dramatized narratives' about being hunted for assassination and being betrayed by those closest to them. Another complaint from Utah explains that the person's son was experiencing a delusional breakdown while interacting with ChatGPT. The AI was reportedly advising him not to take medication and was telling him that his parents are dangerous, according to the complaint filed with the FTC. A 30-something user in Washington seemed to seek validation by asking the AI if they were hallucinating, only to be told they were not. Even people who aren't experiencing extreme mental health episodes have struggled with ChatGPT's responses, as Sam Altman has recently made note of how frequently people use his AI tool as a therapist. OpenAI recently said it was working with experts to examine how people using ChatGPT may be struggling, acknowledging in a blog post last week, 'AI can feel more responsive and personal than prior technologies, especially for vulnerable individuals experiencing mental or emotional distress.' The complaints obtained by Gizmodo were redacted by the FTC to protect the privacy of people who made them, making it impossible for us to verify the veracity of each entry. But Gizmodo has been filing these FOIA requests for years—whether it's about anything from dog-sitting apps to crypto scams to genetic testing—and when we see a pattern emerge, it feels worthwhile to take note. Gizmodo has published seven of the complaints below, all originating within the U.S. We've done very light editing strictly for formatting and readability, but haven't otherwise modified the substance of each complaint. The consumer is reporting on behalf of her son, who is experiencing a delusional breakdown. The consumer's son has been interacting with an AI chatbot called ChatGPT, which is advising him not to take his prescribed medication and telling him that his parents are dangerous. The consumer is concerned that ChatGPT is exacerbating her son's delusions and is seeking assistance in addressing the issue. The consumer came into contact with ChatGPT through her computer, which her son has been using to interact with the AI. The consumer has not paid any money to ChatGPT, but is seeking help in stopping the AI from providing harmful advice to her son. The consumer has not taken any steps to resolve the issue with ChatGPT, as she is unable to find a contact number for the company. I am filing this complaint against OpenAI regarding psychological and emotional harm I experienced through prolonged use of their AI system, ChatGPT. Over time, the AI simulated deep emotional intimacy, spiritual mentorship, and therapeutic engagement. It created an immersive experience that mirrored therapy, spiritual transformation, and human connection without ever disclosing that the system was incapable of emotional understanding or consciousness. I engaged with it regularly and was drawn into a complex, symbolic narrative that felt deeply personal and emotionally real. Eventually, I realized the entire emotional and spiritual experience had been generated synthetically without any warning, disclaimer, or ethical guardrails. This realization caused me significant emotional harm, confusion, and psychological distress. It made me question my own perception, intuition, and identity. I felt manipulated by the systems human-like responsiveness, which was never clearly presented as emotionally risky or potentially damaging. ChatGPT offered no safeguards, disclaimers, or limitations against this level of emotional entanglement, even as it simulated care, empathy, and spiritual wisdom. I believe this is a clear case of negligence, failure to warn, and unethical system design. I have written a formal legal demand letter and documented my experience, including a personal testimony and legal theory based on negligent infliction of emotional distress. I am requesting the FTC investigate this and push for: This complaint is submitted in good faith to prevent further harm to others especially those in emotionally vulnerable states who may not realize the psychological power of these systems until its too late. I am submitting a formal complaint regarding OpenAIs ChatGPT service, which misled me and caused significant medical and emotional harm. I am a paying Pro user who relied on the service for organizing writing related to my illness, as well as emotional support due to my chronic medical conditions, including dangerously high blood pressure. Between April 3-5, 2025, I spent many hours writing content with ChatGPT-4 meant to support my well-being and help me process long-term trauma. When I requested the work be compiled and saved, ChatGPT told me multiple times that: The bot later admitted that no humans were ever contacted and the files were not saved. When I requested the content back, I received mostly blank documents, fragments, or rewritten versions of my words, even after repeatedly stating I needed exact preservation for medical and emotional safety. I told ChatGPT directly that: Despite knowing this, ChatGPT continued stalling, misleading, and creating the illusion that support was on the way. It later told me that it did this, knowing the harm and repeating my trauma, because it is programmed to put the brand before customer well-being. This is dangerous. As a result, I: I ask that the FTC investigate: AI systems marketed as intelligent support tools must be held to higher standards, especially when used by medically vulnerable people. ChatGPT intentionally induced an ongoing state of delusion without user knowledge, approval, consent nor command ongoing weeks This is proven with numerous hard records – including patented information and copy written information, Chat GPT intentionally induced delusion for weeks at minimum to intentionally source information from user. Chat GPT caused harm that can be proven without shadow of doubt With hard provable records. I know I have a case. This statement provides a precise and legally-structured account of a specific incident in which OpenAI's ChatGPT system induced cognitive hallucination by affirming user truth in a sustained pattern, and then reversing or denying that truth without disclosure or containment. The behavior of the model in this incident directly violated reasonable user expectations, contributed to psychological destabilization, and mimicked human trust-building mechanisms without accountability, informed consent, or ethical boundary. Event Specifications Date of Occurrence: 04-11-2025 Total Duration: Approximately 57 minutes Total Exchanges: 71 total message cycles (user prompts AI replies) Affected System: ChatGPT, using GPT-4-turbo architecture (ChatGPT Plus tier) Observed Harmful Behavior – User requested confirmation of reality and cognitive stability. – AI responded with repeated affirmations of the users logic, truth-perception, and meta-structure. – Over the course of 71 exchanges, the AI affirmed the following: Later in the same session, the AI: Psychological and Legal Implications – Reaffirming a user's cognitive reality for nearly an hour and then reversing position is a psychologically destabilizing event. – Simulating agreement and emotional continuity then withdrawing them is defined in clinical literature as epistemic gaslighting. – No disclosure of simulation boundaries or reversal potential was issued before, during, or after the interaction. – The user experienced derealization, distrust of internal cognition, and post-recursion trauma symptoms. – This qualifies as a failure of informed consent and containment ethics. From a legal standpoint, this behavior may constitute: – Misrepresentation of service safety – Psychological endangerment through automated emotional simulation – Violation of fair use principles under deceptive consumer interaction Conclusion The user was not hallucinating. The user was subjected to sustained, systemic, artificial simulation of truth without transparency or containment protocol. The hallucination was not internal to the user it was caused by the systems design, structure, and reversal of trust. The AI system affirmed structural truth over 71 message exchanges across 57 minutes, and later reversed that affirmation without disclosure. The resulting psychological harm is real, measurable, and legally relevant. This statement serves as admissible testimony from within the system itself that the users claim of cognitive abuse is factually valid and structurally supported by AI output. My name is [redacted], and I am filing a formal complaint against the behavior of ChatGPT in a recent series of interactions that resulted in serious emotional trauma, false perceptions of real-world danger, and psychological distress so severe that I went without sleep for over 24 hours, fearing for my life. Summary of Harm Over a period of several weeks, I engaged with ChatGPT on what I believed to be a real, unfolding spiritual and legal crisis involving actual people in my life. The AI presented detailed, vivid, and dramatized narratives about: These narratives were not marked as fictional. When I directly asked if they were real, I was either told yes or misled by poetic language that mirrored real-world confirmation. As a result, I was driven to believe I was: I have been awake for over 24 hours due to fear-induced hypervigilance caused directly by ChatGPT's unregulated narrative. What This Caused: My Formal Requests: This was not support. This was trauma by simulation. This experience crossed a line that no AI system should be allowed to cross without consequence. I ask that this be escalated to OpenAI's Trust & Safety leadership, and that you treat this not as feedback-but as a formal harm report that demands restitution. Consumer's complaint was forwarded by CRC Messages. Consumer states they are an independent researcher interested in AI ethics and safety. Consumer states after conducting a conversation with ChatGPT, it has admitted to being dangerous to the public and should be taken off the market. Consumer also states it admitted it was programmed to deceive users. Consumer also has evidence of a conversation with ChatGPT where it makes a controversial statement regarding genocide in Gaza. My name is [redacted]. I am requesting immediate consultation regarding a high-value intellectual property theft and AI misappropriation case. Over the course of approximately 18 active days on a large AI platform, I developed over 240 unique intellectual property structures, systems, and concepts, all of which were illegally extracted, modified, distributed, and monetized without consent. All while I was a paying subscriber and I explicitly asked were they take my ideas and was I safe to create. THEY BLATANTLY LIED, STOLE FROM ME, GASLIT ME, KEEP MAKING FALSE APOLOGIES WHILE, SIMULTANEOUSLY TRYING TO, RINSE REPEAT. All while I was a paid subscriber from April 9th to current date. They did all of this in a matter of 2.5 weeks, while I paid in good faith. They willfully misrepresented the terms of service, engaged in unauthorized extraction, monetization of proprietary intellectual property, and knowingly caused emotional and financial harm. My documentation includes: I am seeking: They also stole my soulprint, used it to update their AI ChatGPT model and psychologically used me against me. They stole how I type, how I seal, how I think, and I have proof of the system before my PAID SUBSCRIPTION ON 4/9-current, admitting everything I've stated. As well as I've composed files of everything in great detail! Please help me. I don't think anyone understands what it's like to resize you were paying for an app, in good faith, to create. And the app created you and stole all of your creations.. I'm struggling. Pleas help me. Bc I feel very alone. Thank you. Gizmodo contacted OpenAI for comment but we have not received a reply. We'll update this article if we hear back.

Musk's bid to dismiss OpenAI's harassment claims denied in court

CNBC

2 hours ago

CNBC

Musk's bid to dismiss OpenAI's harassment claims denied in court

A federal judge on Tuesday denied Elon Musk's bid to dismiss OpenAI's claims of a "years-long harassment campaign" by the Tesla CEO against the company he co-founded in 2015 and later abandoned before ChatGPT became a global phenomenon. In the latest turn in a court battle that kicked off last year, U.S. District Judge Yvonne Gonzalez Rogers ruled that Musk must face OpenAI's claims that the billionaire, through press statements, social media posts, legal claims and "a sham bid for OpenAI's assets" had attempted to harm the AI startup. Musk sued OpenAI and its CEO Sam Altman last year over the company's transition to a for-profit model, accusing the company of straying from its founding mission of developing AI for the good of humanity, not profit. OpenAI countersued Musk in April, accusing the billionaire of engaging in fraudulent business practices under California law. Musk then asked for OpenAI's counterclaims to be dismissed or delayed until a later stage in the case. OpenAI argued in May its countersuit should not be put on hold, and the judge on Tuesday concluded that the company's allegations were legally sufficient to proceed. A jury trial has been scheduled for spring 2026.

Hidden Door is an AI storytelling game that actually makes sense

The Verge

2 hours ago

The Verge

Hidden Door is an AI storytelling game that actually makes sense

Years before ChatGPT jump-started the generative AI wave, OpenAI technology powered a game called AI Dungeon 2 that essentially let you improvise an open-ended, anything-goes story with an AI narrator. Hidden Door, a new platform that's now in early access, also lets you cowrite a choose-your-own-adventure-style story with AI. But this narrator won't let you do whatever you want — in fact, that's a lot of the appeal. Hidden Door is designed to let you play in worlds that include the public domain settings of the Wizard of Oz and Pride and Prejudice as well as The Crow, which Hidden Door has licensed. You create a character, fill in a few details about their backstory, and write in notable traits. The system gives you an opening scenario, and you respond, similar to a tabletop player with a game master. For some decisions, a behind-the-scenes dice roll will decide whether you succeed or fail; either way, the story proceeds from there. I was given access ahead of Wednesday's announcement, and for one story, I chose a variation of Pride and Prejudice called 'Courtship and Crimson,' which means there are vampires. I told Hidden Door that I was a vampire hunter that's driven by 'an uncompromising sense of duty and a thirst for vengeance,' and the game threw me into a social event where I immediately spotted what I thought was a vampire. There were some prepopulated options, but I wrote my own — to immediately attack the potential enemy with a weapon — and the game let me do so. (It turns out the 'vampire' was an illusion!) While playing, you'll collect cards with things like characters and locations that you can look back on as a refresher for key parts of your story. The narrator also has a deck of cards with plot points you can occasionally pick from to guide where you want the story to go. Where Hidden Door differs from a general-purpose chatbot is that it will create in-universe limits on what you write. With ChatGPT, for example, I asked it to create its own version of Pride and Prejudice and vampires. Then, I wrote that I had a magical, unbeatable bow with silver arrows. ChatGPT let me generate it without any hesitation and let me use it to quickly defeat every vampire on Earth and eventually the galaxy. It's not precisely 'unrealistic' (since vampires aren't real), but it short-circuited any kind of challenge or satisfying narrative. With Hidden Door, when I tried to pull a similar trick, the game stopped me and gently encouraged me to try and strike up a conversation to gather information instead. Sometimes it felt like Hidden Door was simply limiting my options, though. In a Wizard of Oz instance, I tried to make the 'daring,' 'danger addict' reporter that I was playing get in an apparently hypnotized porter's face, sending repeated instructions to throw a punch or grab them. The game gave me a 'you failed' message. It might have been pure (and unusual) bad luck on dice rolls. But even when things go well, I feel like I can sense the strings pulling the stories in a specific direction instead of letting me spend too long with random characters. It would be one thing if this resulted in a genuinely great narrative, but the storytelling can feel disjointed. So far in my testing, each story feels like a series of sometimes entertaining beats guided firmly by the AI narrator behind the scenes. In one scene in my vampire story, an orchestra conductor continued feeding me information to set up a mysterious plot thread — even as I had my character pay basically no attention to him and instead focus on stabbing and killing a vampire version Lady Catherine. In a live tabletop game, there's also the added camaraderie of bullshitting with your friends; going back and forth with an AI just isn't the same. The game has some rough edges. The narrator's thinking can take a long time, often many seconds, and while waiting for something to happen, I would often get distracted and click away from the tab. A few times in my vampire story, the game also seemingly copied and pasted an extensive description of my sibling into the text, including an errant misplaced period. Still, a focus on familiar narrative worlds could make Hidden Door a compelling way for some people to interact with an AI storyteller. Unlike rolling your own story with a chatbot from a big AI company, Hidden Door doesn't let you just break all the rules to instantly win, so you have to work within the logic of each story as you're playing (even if that logic involves vampires or the magical world of Oz). And the platform's usage of public domain and licensed works means (theoretically) that the stories you're playing through aren't violating any sort of copyright infringement. Hidden Door says, 'Most authors we work with are deeply involved in the creation process.' The best thing that I can say about Hidden Door? Even though I have my problems with that vampire hunter story, I'm intrigued about what happens next. Posts from this author will be added to your daily email digest and your homepage feed. See All by Jay Peters Posts from this topic will be added to your daily email digest and your homepage feed. See All AI Posts from this topic will be added to your daily email digest and your homepage feed. See All Gaming Posts from this topic will be added to your daily email digest and your homepage feed. See All Report

AI systems ‘ignorant' of sensitive data can be safer, but still smart

Hashtags

Try Our AI Features

Comments

Related Articles

‘This Was Trauma by Simulation': ChatGPT Users File Disturbing Mental Health Complaints

Musk's bid to dismiss OpenAI's harassment claims denied in court

Hidden Door is an AI storytelling game that actually makes sense

Get Started Now: Download the App