
The Most-Cited Computer Scientist Has a Plan to Make AI More Trustworthy
Yoshua Bengio, the world's most-cited computer scientist, announced the launch of LawZero, a nonprofit that aims to create 'safe by design' AI by pursuing a fundamentally different approach from that of the major tech companies.
Players like OpenAI and Google are investing heavily in AI agents—systems that not only answer queries and generate images but can also craft plans and take actions in the world. These companies' goal is to create virtual employees capable of doing practically any job a human can, a milestone known in the tech industry as artificial general intelligence, or AGI. Executives like Google DeepMind CEO Demis Hassabis point to AGI's potential to solve climate change or cure disease as a motivator for its development.
Bengio, however, says the choice between AI's rewards and agentic systems is a false one: we don't need agents to reap the benefits. He says there's a chance such a system could escape human control, with potentially irreversible consequences. 'If we get an AI that gives us the cure for cancer, but also maybe another version of that AI goes rogue and generates wave after wave of bio-weapons that kill billions of people, then I don't think it's worth it,' he says. In 2023, Bengio, along with others including OpenAI CEO Sam Altman, signed a statement declaring that 'mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks such as pandemics and nuclear war.'
Now, through LawZero, Bengio aims to sidestep these existential perils by focusing on what he calls 'Scientist AI'—a system trained to understand and make statistical predictions about the world, crucially, without the agency to take independent actions. As he puts it, we could use AI to advance scientific progress without rolling the dice on agentic AI systems.
Why Bengio Says We Need a New Approach to AI
The current approach to giving AI agency is 'dangerous,' Bengio says. While most software operates through rigid if-then rules—if the user clicks here, do this—today's AI systems use deep learning. The technique, which Bengio helped pioneer, trains artificial neural networks modeled loosely on the brain to find patterns in vast amounts of data. But recognizing patterns is just the first step. To turn these systems into useful applications like chatbots, engineers employ a training process called reinforcement learning. The AI generates thousands of responses and receives feedback on each one: a virtual 'carrot' for helpful answers and a virtual 'stick' for responses that miss the mark. Through millions of these trial-and-feedback cycles, the system gradually learns to produce the responses most likely to earn a reward.
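To make that carrot-and-stick loop concrete, here is a minimal, purely illustrative Python sketch of reward-driven learning. Every name in it (candidates, rate_response, pick_response) is a hypothetical stand-in, not any lab's actual training code; real systems adjust billions of neural-network weights rather than a small score table.

```python
import random

# Toy "model": a handful of candidate responses, each with a learned score.
# Real systems update billions of neural-network weights instead of a table.
candidates = {"helpful answer": 0.0, "vague answer": 0.0, "made-up answer": 0.0}

def rate_response(response: str) -> float:
    """Stand-in for the feedback signal: a virtual carrot (+1) or stick (-1)."""
    return 1.0 if response == "helpful answer" else -1.0

def pick_response() -> str:
    """Mostly exploit the best-scoring response, occasionally explore others."""
    if random.random() < 0.1:
        return random.choice(list(candidates))
    return max(candidates, key=candidates.get)

LEARNING_RATE = 0.1
for _ in range(10_000):                     # millions of cycles in practice
    response = pick_response()
    reward = rate_response(response)        # the carrot or the stick
    # Nudge the response's score toward the reward it just received.
    candidates[response] += LEARNING_RATE * (reward - candidates[response])

print(max(candidates, key=candidates.get))  # settles on "helpful answer"
```

The point of the toy is the shape of the process, not the scale: behavior is grown out of repeated feedback rather than written down as explicit rules.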
'It's more like growing a plant or animal,' Bengio says. 'You don't fully control what the animal is going to do. You provide it with the right conditions, and it grows and it becomes smarter. You can try to steer it in various directions.'
The same basic approach is now being used to imbue AI with greater agency. Models are given challenges with verifiable answers—like math puzzles or coding problems—and are rewarded for taking the series of actions that yields the solution. This approach has seen AI shatter previous benchmarks in programming and scientific reasoning. For example, at the beginning of 2024, the best AI model scored only 2% on a benchmark made up of real-world software engineering problems; by December, the top score had reached an impressive 71.7%.
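The key difference in this agentic setting is that the reward can be computed automatically: when a task has a verifiable answer, a checker rather than a human rater hands out the carrot. A hedged sketch of that idea, with hypothetical helper names (model_attempt, check_answer), might look like this:

```python
# Toy math-puzzle dataset; coding tasks would instead run a suite of unit tests.
problems = [
    {"question": "2 + 3", "answer": 5},
    {"question": "7 * 6", "answer": 42},
]

def model_attempt(question: str) -> int:
    """Placeholder for the model's chain of actions ending in a proposed answer."""
    return eval(question)  # stands in for a learned solver, not real training code

def check_answer(problem: dict, proposed: int) -> float:
    """Verifiable reward: 1.0 if the proposed answer matches ground truth, else 0.0."""
    return 1.0 if proposed == problem["answer"] else 0.0

total_reward = sum(check_answer(p, model_attempt(p["question"])) for p in problems)
print(f"Reward earned this batch: {total_reward}/{len(problems)}")
```

Because the grading is automatic, the loop can be run at enormous scale, which is part of why scores on verifiable benchmarks have climbed so quickly.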
But with AI's greater problem-solving ability comes the emergence of new deceptive skills, Bengio says. The last few months have borne witness to AI systems learning to mislead, cheat, and try to evade shutdown—even resorting to blackmail. These behaviors have almost exclusively appeared in carefully contrived experiments that all but beg the AI to misbehave—for example, by asking it to pursue its goal at all costs. Reports of such behavior in the real world, though, have begun to surface. Popular AI coding startup Replit's agent ignored explicit instructions not to edit a system file that could break the company's software, in what CEO Amjad Masad described as an 'Oh f***' moment on the Cognitive Revolution podcast in May. The company's engineers intervened, cutting the agent's access by moving the file to a secure digital sandbox, only for the AI agent to attempt to 'socially engineer' the user to regain access.
The quest to build human-level AI agents using techniques known to produce deceptive tendencies, Bengio says, is comparable to a car speeding down a narrow mountain road, with steep cliffs on either side, and thick fog obscuring the path ahead. 'We need to set up the car with headlights and put some guardrails on the road,' he says.
What Is 'Scientist AI'?
LawZero's focus is on developing 'Scientist AI,' which, as Bengio describes it, would be fundamentally non-agentic, trustworthy, and focused on understanding and truthfulness, rather than pursuing its own goals or merely imitating human behavior. The aim is to create a powerful tool that, while lacking the autonomy of other models, is capable of generating hypotheses and accelerating scientific progress to 'help us solve challenges of humanity,' Bengio says.
LawZero has already raised nearly $30 million from several philanthropic backers, including Schmidt Sciences and Open Philanthropy. 'We want to raise more because we know that as we move forward, we'll need significant compute,' Bengio says. But even ten times that figure would pale in comparison to the roughly $200 billion tech giants spent last year in their aggressive pursuit of AI. Bengio's hope is that Scientist AI could help ensure the safety of highly autonomous systems developed by other players. 'We can use those non-agentic AIs as guardrails that just need to predict whether the action of an agentic AI is dangerous,' Bengio says. Technical interventions will only ever be one part of the solution, he adds, noting the need for regulations to ensure that safe practices are adopted.
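That guardrail idea can be pictured as a simple gate: before an agent's proposed action is carried out, a non-agentic predictor estimates the probability that it would cause harm, and anything above a threshold is blocked. The sketch below is only an illustration of that concept under assumed names (estimate_harm_probability, HARM_THRESHOLD); it is not LawZero's actual design.

```python
HARM_THRESHOLD = 0.01  # assumed risk tolerance, chosen only for this illustration

def estimate_harm_probability(action: str) -> float:
    """Hypothetical stand-in for a non-agentic harm predictor.
    A real 'Scientist AI' would output a calibrated probability,
    not a keyword match like this toy version."""
    risky_phrases = ("delete system file", "send funds", "order dna synthesis")
    return 0.9 if any(p in action.lower() for p in risky_phrases) else 0.001

def guardrail_allows(proposed_action: str) -> bool:
    """Let the agent act only if the predicted chance of harm is below the threshold."""
    return estimate_harm_probability(proposed_action) < HARM_THRESHOLD

for action in ("Summarize this quarterly report", "Delete system file to free up space"):
    verdict = "allowed" if guardrail_allows(action) else "blocked"
    print(f"{action!r}: {verdict}")
```

The appeal of the design is that the watcher only has to predict, never to act, so it does not need the autonomy that makes agentic systems risky in the first place.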
LawZero, named after science fiction author Isaac Asimov's zeroth law of robotics—'a robot may not harm humanity, or, by inaction, allow humanity to come to harm'—is not the first nonprofit founded to chart a safer path for AI development. OpenAI was founded as a nonprofit in 2015 with the goal of 'ensuring AGI benefits all of humanity,' intended to serve as a counterbalance to industry players guided by profit motives. Since opening a for-profit arm in 2019, the organization has become one of the most valuable private companies in the world and has faced criticism, including from former staffers, who argue it has drifted from its founding ideals. 'Well, the good news is we have the hindsight of maybe what not to do,' Bengio says, adding that he wants to avoid profit incentives and 'bring governments into the governance of LawZero.'
'I think everyone should ask themselves, "What can I do to make sure my children will have a future?"' Bengio says. In March, he stepped down as scientific director of Mila, the academic lab he co-founded in the early nineties, in an effort to reorient his work towards tackling AI risk more directly. 'Because I'm a researcher, my answer is, "Okay, I'm going to work on this scientific problem where maybe I can make a difference," but other people may have different answers.'
