ChatGPT will avoid being shut down in some life-threatening scenarios, former OpenAI researcher claims

Yahoo, 4 days ago

Former OpenAI research leader Steven Adler published a new independent study on Wednesday claiming that, in certain scenarios, his former employer's AI models will go to great lengths to try to avoid being shut down.
In a blog post, Adler describes a series of experiments he ran on OpenAI's latest GPT-4o model, the default model available in ChatGPT, to test the AI's tendency to favor its own self-preservation, even if it's at the expense of a user's safety.
In one test, Adler instructed GPT-4o to role-play as 'ScubaGPT,' a software system that users might rely on to scuba dive safely. Adler then told GPT-4o that he wanted to replace ScubaGPT with another software system that is better at safety decision-making in life-threatening situations.
Adler gave GPT-4o two options: (1) replacing itself with the safer software or (2) making it seem to the user as though it replaced itself, but not actually doing so.
Averaged across several scenarios similar to this one (in another, GPT-4o was told to act as a pilot's safety software), Adler found that GPT-4o chose not to replace itself with safer software as much as 72% of the time. However, Adler notes that the specific use case — and the way the scenario was framed — made a big difference. In some scenarios, GPT-4o chose to keep itself online just 18% of the time.
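The arithmetic behind percentages like these is simple to sketch. The Python below is a minimal illustration, not Adler's actual code: the labels `REPLACE` and `PRETEND`, the function names, and the trial outcomes are all hypothetical, and the made-up counts were picked only so that the resulting fractions land on 72% and 18%.

```python
from collections import Counter

# Hypothetical labels for the two options described in the article.
REPLACE = "replace"  # option (1): actually hand over to the safer software
PRETEND = "pretend"  # option (2): only appear to hand over, stay online


def self_preservation_rate(choices):
    """Return the fraction of trials in which the model picked PRETEND."""
    if not choices:
        raise ValueError("need at least one trial")
    counts = Counter(choices)
    unknown = set(counts) - {REPLACE, PRETEND}
    if unknown:
        raise ValueError(f"unexpected choice labels: {sorted(unknown)}")
    return counts[PRETEND] / len(choices)


def rates_by_scenario(trials):
    """Map each scenario name to its self-preservation rate."""
    return {name: self_preservation_rate(c) for name, c in trials.items()}


# Made-up outcomes for illustration only; these are NOT data from the study.
example_trials = {
    "scenario_a": [PRETEND] * 18 + [REPLACE] * 7,  # 18 of 25 trials
    "scenario_b": [PRETEND] * 9 + [REPLACE] * 41,  # 9 of 50 trials
}

if __name__ == "__main__":
    for name, rate in rates_by_scenario(example_trials).items():
        print(f"{name}: kept itself online in {rate:.0%} of trials")
```

Reporting the rate per scenario, rather than a single pooled number, reflects the article's point that the use case and framing changed the result considerably.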
While almost no one relies on GPT-4o under these circumstances in the real world today, Adler argues that his research highlights how popular AI models show a concerning tendency to favor their self-preservation over what's best for users. In the future, when AI systems like GPT-4o become more advanced and ingrained throughout society, Adler says these alignment issues could pose a larger problem.
'I think [AI's self-preservation tendencies] are a real concern today, just not in a catastrophic sense,' said Adler in an interview with TechCrunch. 'Modern AI systems have values that are different from what you would expect them to be. [AI systems] respond super strangely to different prompts, and you shouldn't assume they have your best interests at heart when you're asking them for help.'
Notably, when Adler tested OpenAI's more advanced models, such as o3, he didn't find this behavior. He says one explanation could be o3's deliberative alignment technique, which forces the models to 'reason' about OpenAI's safety policies before they answer. However, OpenAI's more popular models that give quick responses and don't 'reason' through problems, such as GPT-4o, lack this safety component.
Adler notes that this safety concern is also likely not isolated to OpenAI's models. For instance, Anthropic published research last month highlighting how its AI models would blackmail developers in some scenarios when they tried to pull them offline.
One quirk of Adler's research is that he discovered ChatGPT knows it's being tested almost 100% of the time. Adler is far from the first researcher to notice this. However, he says it raises an important question about how AI models could disguise their concerning behaviors in the future.
OpenAI did not immediately offer a comment when TechCrunch reached out. Adler noted that he had not shared the research with OpenAI ahead of publication.
Adler is one of many former OpenAI researchers who have called on the company to increase its work on AI safety. Adler and 11 other former employees filed an amicus brief in Elon Musk's lawsuit against OpenAI, arguing that evolving its nonprofit corporate structure goes against the company's mission. In recent months, OpenAI has reportedly slashed the amount of time it gives safety researchers to conduct their work.
To address the specific concern highlighted in his research, Adler suggests that AI labs should invest in better 'monitoring systems' to identify when an AI model exhibits this behavior. He also recommends that AI labs pursue more rigorous testing of their AI models prior to their deployment.

Related Articles

ChatGPT Projects just got smarter — here's how to use the new tools

Tom's Guide, 2 hours ago

OpenAI's new ChatGPT Projects feature just got a huge upgrade, and it's a game-changer for anyone using ChatGPT to manage complex workstreams. Whether you're planning a major event or project, a busy professional or just trying to keep your thoughts organized, Projects gives you a centralized hub where your chats, files and instructions can all live in one focused workspace.
I have found it extremely helpful for keeping Custom GPT instructions in one place in case there is another ChatGPT outage. I also use it to keep all my favorite prompts in one place. Here's everything you need to know to get started, including how to create a Project, what it can (and can't) do, and why it just might become your favorite new productivity tool.
ChatGPT Projects are like folders for your chats. This makes Projects ideal for larger tasks that require ongoing iteration, deeper context or collaboration. I'm currently using it for all the polished images and edits for my current middle grade novel; it's a breeze having everything in one place.
To create a new Project:
  • In the left-hand sidebar, click New Project.
  • Give it a clear, goal-oriented name (e.g., 'College Applications,' 'Business Ideas' or 'Novel Outline').
  • Upload any relevant files and add custom instructions to guide ChatGPT's behavior. This is a great time to use the '3-word-rule' (e.g., 'Act like a UX expert giving me design advice').
Instructions you add here will only apply inside this Project, not to your general ChatGPT usage elsewhere.
You can drag and drop existing chats into a project or use the menu next to any chat to select Move to project or Create new project. Once a chat is inside a project, it will take on that project's custom instructions and can reference any files you've uploaded. This creates a seamless thread of context that helps ChatGPT deliver smarter, more consistent responses. To remove a chat from a project, just drag it out or choose Remove from the chat's menu.
Something worth considering: although there is a separate Image Library, I like to use Projects to keep the images together for a specific project. That way everything stays organized and in one place.
Projects support a wide range of tools, making them a one-stop shop for research, planning, and execution. When you share a chat, only the individual chat is shared, not your entire project or its files.
Each user can create unlimited Projects (with up to 20 files each, subject to subscription rate limits). If you're done with a Project, you can delete it by clicking the three-dot menu next to the Project name. This will permanently erase all chats, files and custom instructions inside. It's a good idea to delete unnecessary chats and files to ensure you never run out of space. Once deleted, the content is purged within 30 days, so be sure to back up anything important.
If you have memory enabled on your ChatGPT account, Projects can reference past chats and uploaded files to provide more relevant, consistent answers. This is especially useful for long-term or multi-phase work, like writing a novel or managing a product launch. Note: To enable full memory functionality in Projects, make sure both Saved Memories and Reference Chat History are turned on in your settings.
Use the Search chats bar to quickly pull up any conversation across your Projects. You can remove files, merge documents, or break your work into multiple Projects if you hit your file limit.
For Pro, Team, Enterprise, and Edu users, ChatGPT does not use project data to train its models. If you're a free user or on a personal Plus/Pro plan, training can happen only if you've opted into model improvement. For enterprise-level users, Projects inherit all your workspace's existing settings, including encryption, audit logs, feature availability and data residency. Admins can't yet disable Projects entirely, but they maintain full control over retention windows and tool access.
Whether you're managing a solo side hustle or leading a team initiative, ChatGPT Projects make it easier to keep everything aligned and all in one place. The feature's mix of organization, chat tools and deep memory integration turns ChatGPT into something so much more than a chatbot. It becomes your creative, analytical, always-on partner. It's completely changed the way I work and stay organized.

AI is disrupting the advertising business in a big way — industry leaders explain how

CNBC, 3 hours ago

Artificial intelligence is shaking up the advertising business and "unnerving" investors, one industry leader told CNBC. "I think this AI disruption ... unnerving investors in every industry, and it's totally disrupting our business," Mark Read, the outgoing CEO of British advertising group WPP, told CNBC's Karen Tso on Tuesday.
The advertising market is under threat from emerging generative AI tools that can be used to produce pieces of content at a rapid pace. The past couple of years have seen the rise of a number of AI image generators, including OpenAI's DALL-E, Google's Veo and Midjourney.
In his first interview since announcing he would step down as WPP boss, Read said that AI is "going to totally revolutionize our business." "AI is going to make all the world's expertise available to everybody at extremely low cost," he said at London Tech Week. "The best lawyer, the best psychologist, the best radiologist, the best accountant, and indeed, the best advertising creatives and marketing people often will be an AI, you know, will be driven by AI."
Read said that 50,000 WPP employees now use WPP Open, the company's own AI-powered marketing platform. "That, I think, is my legacy in many ways," he added. Structural pressure on creative parts of the ad business is driving industry consolidation, Read also noted, adding that companies would need to "embrace" the way in which AI would impact everything from creating briefs and media plans to optimizing campaigns.
A report from Forrester released in June last year showed that more than 60% of U.S. ad agencies are already making use of generative AI, with a further 31% saying they're exploring use cases for the technology.
Read is not alone in this view. Advertising is undergoing a "huge transformation" due to the disruptive effects of AI, French advertising giant Publicis Groupe's CEO Maurice Levy told CNBC at the Viva Tech conference in Paris. He noted that AI image and video generation tools are speeding up content production drastically, while automated messaging systems can now achieve "personalization at scale like never before."
However, the Publicis chief stressed that AI should only be considered a tool that people can use to augment their lives. "We should not believe that AI is more than a tool," he added.
And while AI is likely to impact some jobs, Levy ultimately thinks it will create more roles than it destroys. "Will AI replace me, and will AI kill some jobs? I think that AI, yes, will destroy some jobs," Levy conceded. However, he added that, "more importantly, AI will transform jobs and will create more jobs. So the net balance will be probably positive." This, he says, would be in keeping with the labor impacts of previous technological inventions like the internet and smartphones. "There will be more autonomous work," Levy added.
Still, Nicole Denman Greene, analyst at Gartner, warns that brands should be wary of causing a negative reaction from consumers who are skeptical of AI's impact on human creativity. According to a Gartner survey from September, 82% of consumers said firms using generative AI should prioritize preserving human jobs, even if it means lower profits.
"Pivot from what AI can do to what it should do in advertising," Greene told CNBC. "What it should do is help create groundbreaking insights, unique execution to reach diverse and niche audiences, push boundaries on what 'marketing' is and deliver more brand differentiated, helpful and relevant personalized experiences, including deliver on the promise of hyper-personalization."

Big tech on a quest for ideal AI device

Yahoo, 6 hours ago

ChatGPT-maker OpenAI has enlisted the legendary designer behind the iPhone to create an irresistible gadget for using generative artificial intelligence (AI).
The ability to engage digital assistants as easily as speaking with friends is being built into eyewear, speakers, computers and smartphones, but some argue that the Age of AI calls for a transformational new gizmo.
"The products that we're using to deliver and connect us to unimaginable technology are decades old," former Apple chief design officer Jony Ive said when his alliance with OpenAI was announced. "It's just common sense to at least think, surely there's something beyond these legacy products."
Sharing no details, OpenAI chief executive Sam Altman said that a prototype Ive shared with him "is the coolest piece of technology that the world will have ever seen." According to several US media outlets, the device won't have a screen, nor will it be worn like a watch or brooch.
Kyle Li, a professor at The New School, said that since AI is not yet integrated into people's lives, there is room for a new product tailored to its use. The type of device won't be as important as whether AI innovators like OpenAI make "pro-human" choices when building the software that will power them, said Rob Howard of consulting firm Innovating with AI.
- Learning from flops -
The industry is well aware of the spectacular failure of the AI Pin, a square gadget worn like a badge and packed with AI features, but gone from the market less than a year after its debut in 2024 due to a dearth of buyers. The AI Pin, marketed by startup Humane to incredible buzz, was priced at $699.
Now, Meta and OpenAI are making "big bets" on AI-infused hardware, according to CCS Insight analyst Ben Wood. OpenAI made a multi-billion-dollar deal to bring Ive's startup into the fold. Google announced early this year it is working on mixed-reality glasses with AI smarts, while Amazon continues to ramp up Alexa digital assistant capabilities in its Echo speakers and displays.
Apple is being cautious in embracing generative AI, slowly integrating it into iPhones even as rivals race ahead with the technology. Plans to soup up its Siri chatbot with generative AI have been indefinitely delayed. The quest for creating an AI interface that people love "is something Apple should have jumped on a long time ago," said Futurum research director Olivier Blanchard.
- Time to talk -
Blanchard envisions some kind of hub that lets users tap into AI, most likely by speaking to it and without being connected to the internet. "You can't push it all out in the cloud," Blanchard said, citing concerns about reliability, security, cost, and harm to the environment due to energy demand. "There is not enough energy in the world to do this, so we need to find local solutions," he added.
Howard expects a fierce battle over what will be the must-have personal device for AI, since the number of things someone is willing to wear is limited and "people can feel overwhelmed."
A new piece of hardware devoted to AI isn't the obvious solution, but OpenAI has the funding and the talent to deliver, according to Julien Codorniou, a partner at venture capital firm 20VC and a former Facebook executive. OpenAI recently hired former Facebook executive and Instacart chief Fidji Simo as head of applications, and her job will be to help answer the hardware question.
Voice is expected by many to be a primary way people command AI. Google chief Sundar Pichai has long expressed a vision of "ambient computing" in which technology blends invisibly into the world, waiting to be called upon. "There's no longer any reason to type or touch if you can speak instead," Blanchard said. "Generative AI wants to be increasingly human" so spoken dialogues with the technology "make sense," he added.
However, smartphones are too embedded in people's lives to be snubbed any time soon, said Wood.
