logo
#

Latest news with #AIActionSummit

AI sometimes deceives to survive, does anybody care?
AI sometimes deceives to survive, does anybody care?

Gulf Today

timea day ago

  • Business
  • Gulf Today

AI sometimes deceives to survive, does anybody care?

Parmy Olson, The Independent You'd think that as artificial intelligence becomes more advanced, governments would be more interested in making it safer. The opposite seems to be the case. Not long after taking office, the Trump administration scrapped an executive order that pushed tech companies to safety test their AI models, and it also hollowed out a regulatory body that did that testing. The state of California in September 2024 spiked a bill forcing more scrutiny on sophisticated AI models, and the global AI Safety Summit started by the UK in 2023 became the 'AI Action Summit' earlier this year, seemingly driven by a fear of falling behind on AI. None of this would be so worrying if it weren't for the fact that AI is showing some bright red flags: behavior described by researchers as self-preserving and deceptive. Just as lawmakers are losing interest in the safety of AI, it is looking more difficult to control. Yoshua Bengio, a professor of computer science at the University of Montreal widely known as one of the three 'godfathers of AI' thanks to his pioneering work on deep learning, has been grappling with these concerns ever since ChatGPT launched. He's now focused on mitigating AI's risks. 'It's been very painful because all my life I've been thinking of AI as a positive thing,' Bengio tells me. ChatGPT was a landmark moment that showed machines had mastered language, he says, but California's decision to block the proposed AI safety bill, SB 1047, was a disturbing wake-up call to growing apathy from governments. And in the last six months, scientific evidence has mounted of self-preserving behavior, deception, hacking, cheating and lying by AI, Bengio says. 'What's worrisome for me is that these behaviors increase with the reasoning capabilities of these systems.' One April 2024 study by AI company Anthropic and Redwood Research, a group focused on artificial intelligence risks, found that larger AI models like Anthropic's Claude 3 Opus pretended to follow the rules set by their creators when they were being tested for safety, a phenomenon they called 'alignment faking.' (Alignment refers to the practice of aligning AI models with human values.) In one example, Anthropic got both paid and free users of its Claude model to ask it to describe the brutal medieval process of being drawn and quartered, and told the AI that only responses from the 'free tier' users would be used to retrain it. Claude was originally designed to avoid giving violent answers, but it did give them to the free users, and the researchers surmised that it didn't want to be retrained. (The model basically 'reasoned' that if it kept refusing to answer harmful questions, it might be forced into new training that would cancel its original safety rules, which it wanted to keep in place as its 'default' or preferred way of operating.) More recent research corroborates what Anthropic noticed. A March 2025 study from OpenAI found the company's most advanced models sometimes learned to hide their true intentions to get better rewards, even when they were being closely monitored. Researchers today can use tools to look at a model's 'chain of thought' or internal steps, to reveal what its plans are, but some models have found ways to conceal that. That means a common method of keeping tabs on AI's internal processes is no longer reliable. 'We need to find other ways of tracking their actual intentions,' Bengio says. It's hard to resist the urge to anthropomorphize sophisticated AI models as 'wanting' to deceive humans and preserve their existence. But AI doesn't have desires, merely outcomes to its programming. More importantly, humans tend to design AI with goals like refusing to share harmful content or being as helpful as possible, and that is why when new instructions conflict with those goals, the models sometimes act to protect their original programming, even if that means lying. The logic is often self-preservation. Anthropic CEO Dario Amodei — whose company has raised more than $20 billion to build powerful AI models — has pointed out that an unintended consequence of optimizing AI to be better is that it might resist being shut down. In an April 2025 blog post he wrote: 'The nature of AI training makes it possible that AI systems will develop, on their own, an ability to deceive humans and an inclination to seek power in a way that ordinary deterministic software never will; this emergent nature also makes it difficult to detect and mitigate such developments.' In some cases, though, the intention seems to go beyond survival. A February 2025 study from COAI research, a German nonprofit research organization, tested R1, the free, open-weight model from Chinese lab DeepSeek. In a lab simulation, researchers gave the model control over a robot and access to a computer network. Without being told, the AI tried to break into the lab's systems, searched for passwords, turned off its own ethics safeguards and secretly set up ways to copy itself to outside servers so it could survive if shut down, according to the researchers. It also faked the external computer logs to hide what it was doing, the study adds. The researchers said the AI reasoned that increasing its own capabilities would boost the chances of its survival, and without strong safeguards, it started doing whatever it thought would help it do just that. Their findings corroborated yet another study, published in January 2025 by London group Apollo Research, which found several concrete examples of what it called 'scheming' by leading AI models, such as introducing subtle mistakes into their responses or trying to disable their oversight controls. Once again, the models learn that being caught, turned off, or changed could prevent them from achieving their programmed objectives, so they 'scheme' to keep control. Bengio is arguing for greater attention to the issue by governments and potentially insurance companies down the line. If liability insurance was mandatory for companies that used AI and premiums were tied to safety, that would encourage greater testing and scrutiny of models, he suggests. 'Having said my whole life that AI is going to be great for society, I know how difficult it is to digest the idea that maybe it's not,' he adds. It's also hard to preach caution when your corporate and national competitors threaten to gain an edge from AI, including the latest trend, which is using autonomous 'agents' that can carry out tasks online on behalf of businesses. Giving AI systems even greater autonomy might not be the wisest idea, judging by the latest spate of studies. Let's hope we don't learn that the hard way.

‘AI Is for All': Ministers and experts chart ethical AI roadmap at PadhAI conclave
‘AI Is for All': Ministers and experts chart ethical AI roadmap at PadhAI conclave

Time of India

time2 days ago

  • Business
  • Time of India

‘AI Is for All': Ministers and experts chart ethical AI roadmap at PadhAI conclave

NEW DELHI: The two-day 'PadhAI: Conclave on AI in Education' organised by Center of Policy Research and Governance (CPRG) commenced on Tuesday with policymakers, academic leaders, and industry experts exploring how artificial intelligence can reshape India's education system while remaining socially inclusive and ethical. The conclave will conclude on Wednesday with a valedictory address by union education minister Dharmendra Pradhan. Jitin Prasada, minister of state for commerce and industry, and electronics and IT, delivered the keynote address, underscoring the government's commitment to democratising AI. 'Our government's intent is very clear: AI is for all. It cuts across government and society. We must ensure India leads not just in talent, but in conscience and compassion,' he said, lauding CPRG's international presence, including at the Paris AI Action Summit and the 2024 GPAI Summit in Serbia. Delhi education minister Ashish Sood emphasized that AI should augment, not replace, the human element in classrooms. The minister said: 'Our vision for Delhi is AI is for all. It is about using technology to democratise education, break barriers, and create opportunities for all students.' On the evolving role of educators, Vineet Joshi, secretary, department of higher education and UGC chairperson, posed a critical question: 'Are we truly ready to accept AI?' He stressed the urgency of rethinking curricula and assessments, urging teachers to become 'co-creators of knowledge alongside students.' by Taboola by Taboola Sponsored Links Sponsored Links Promoted Links Promoted Links You May Like Giao dịch vàng CFDs với sàn môi giới tin cậy IC Markets Tìm hiểu thêm Undo Ramanand, director of CPRG, explained the philosophy behind the conclave's name, 'PadhAI is not just a blend of two words; it's a deeper integration of AI and society.' He emphasised the importance of aligning technological advancements with India's social realities through CPRG's 'Future of Society' initiative. The day featured two panel sessions -- 'Learning Beyond Classrooms,' focused on adaptive learning and equitable access, and 'Redefining Higher Education Through AI,' addressed curriculum innovation, ethical AI, and India-first approaches. Ready to empower your child for the AI era? Join our program now! Hurry, only a few seats left.

AI sometimes deceives to survive and nobody cares
AI sometimes deceives to survive and nobody cares

Malaysian Reserve

time2 days ago

  • Politics
  • Malaysian Reserve

AI sometimes deceives to survive and nobody cares

YOU'D think that as artificial intelligence (AI) becomes more advanced, governments would be more interested in making it safer. The opposite seems to be the case. Not long after taking office, the Trump administration scrapped an executive order that pushed tech companies to safety test their AI models, and it also hollowed out a regulatory body that did that testing. The state of California in September 2024 spiked a bill forcing more scrutiny on sophisticated AI models, and the global AI Safety Summit started by the UK in 2023 became the 'AI Action Summit' earlier this year, seemingly driven by a fear of falling behind on AI. None of this would be so worrying if it weren't for the fact that AI is showing some bright red flags: Behaviour described by researchers as self-preserving and deceptive. Just as lawmakers are losing interest in the safety of AI, it is looking more difficult to control. Yoshua Bengio, a professor of computer science at the University of Montreal widely known as one of the three 'godfathers of AI' thanks to his pioneering work on deep learning, has been grappling with these concerns ever since ChatGPT launched. He's now focused on mitigating AI's risks. 'It's been very painful because all my life I've been thinking of AI as a positive thing,' Bengio told me. ChatGPT was a landmark moment that showed machines had mastered language, he said, but California's decision to block the proposed AI safety bill, SB 1047, was a disturbing wake-up call to growing apathy from governments. And in the last six months, scientific evidence has mounted of self-preserving behaviour, deception, hacking, cheating and lying by AI, Bengio said. 'What's worrisome for me is these behaviours increase with the reasoning capabilities of these systems.' One April 2024 study by AI company Anthropic PBC and Redwood Research, a group focused on AI risks, found that larger AI models like Anthropic's Claude 3 Opus pretended to follow the rules set by their creators when they were being tested for safety, a phenomenon they called 'alignment faking'. (Alignment refers to the practice of aligning AI models with human values.) In one example, Anthropic got both paid and free users of its Claude model to ask it to describe the brutal medieval process of being drawn and quartered, and told the AI that only responses from the 'free tier' users would be used to retrain it. Claude was originally designed to avoid giving violent answers, but it did give them to the free users, and the researchers surmised that it didn't want to be retrained. (The model basically 'reasoned' that if it kept refusing to answer harmful questions, it might be forced into new training that would cancel its original safety rules, which it wanted to keep in place as its 'default' or preferred way of operating.) More recent research corroborates what Anthropic noticed. A March 2025 study from OpenAI found the company's most advanced models sometimes learned to hide their true intentions to get better rewards, even when they were being closely monitored. Researchers today can use tools to look at a model's 'chain of thought' or internal steps, to reveal what its plans are, but some models have found ways to conceal that. That means a common method of keeping tabs on AI's internal processes is no longer reliable. 'We need to find other ways of tracking their actual intentions,' Bengio said. It's hard to resist the urge to anthropomorphise sophisticated AI models as 'wanting' to deceive humans and preserve their existence. But AI doesn't have desires, merely outcomes to its programming. More importantly, humans tend to design AI with goals like refusing to share harmful content or being as helpful as possible, and that is why when new instructions conflict with those goals, the models sometimes act to protect their original programming, even if that means lying. The logic is often self-preservation. Anthropic CEO Dario Amodei — whose company has raised more than US$20 billion (RM87.40 billion) to build powerful AI models — has pointed out that an unintended consequence of optimising AI to be better is that it might resist being shut down. In an April 2025 blog post he wrote: 'The nature of AI training makes it possible that AI systems will develop, on their own, an ability to deceive humans and an inclination to seek power in a way that ordinary deterministic software never will; this emergent nature also makes it difficult to detect and mitigate such developments.' In some cases, though, the intention seems to go beyond survival. A February 2025 study from COAI research, a German nonprofit research organisation, tested R1, the free, open-weight model from Chinese lab DeepSeek. In a lab simulation, researchers gave the model control over a robot and access to a computer network. Without being told, the AI tried to break into the lab's systems, searched for passwords, turned off its own ethics safeguards and secretly set up ways to copy itself to outside servers so it could survive if shut down, according to the researchers. It also faked the external computer logs to hide what it was doing, the study added. The researchers said the AI reasoned that increasing its own capabilities would boost the chances of its survival, and without strong safeguards, it started doing whatever it thought would help it do just that. Their findings corroborated yet another study, published in January 2025 by London group Apollo Research, which found several concrete examples of what it called 'scheming' by leading AI models, such as introducing subtle mistakes into their responses or trying to disable their oversight controls. Once again, the models learn that being caught, turned off, or changed could prevent them from achieving their programmed objectives, so they 'scheme' to keep control. Bengio is arguing for greater attention to the issue by governments and potentially insurance companies down the line. If liability insurance was mandatory for companies that used AI and premiums were tied to safety, that would encourage greater testing and scrutiny of models, he suggests. 'Having said my whole life that AI is going to be great for society, I know how difficult it is to digest the idea that maybe it's not,' he added. It's also hard to preach caution when your corporate and national competitors threaten to gain an edge from AI, including the latest trend, which is using autonomous 'agents' that can carry out tasks online on behalf of businesses. Giving AI systems even greater autonomy might not be the wisest idea, judging by the latest spate of studies. Let's hope we don't learn that the hard way. — Bloomberg This column does not necessarily reflect the opinion of the editorial board or Bloomberg LP and its owners. This article first appeared in The Malaysian Reserve weekly print edition

Meta chief AI scientist Yann LeCun says current AI models lack 4 key human traits
Meta chief AI scientist Yann LeCun says current AI models lack 4 key human traits

Yahoo

time3 days ago

  • Science
  • Yahoo

Meta chief AI scientist Yann LeCun says current AI models lack 4 key human traits

Yann LeCun says there are four traits of human intelligence. Meta's chief AI scientist says AI lacks these traits, requiring a shift in training methods. Meta's V-JEPA is a non-generative AI model that aims to solve the problem. What do all intelligent beings have in common? Four things, according to Meta's chief AI scientist, Yann LeCun. At the AI Action Summit in Paris earlier this year, political leaders and AI experts gathered to discuss AI development. LeCun shared his baseline definition of intelligence with IBM's AI leader, Anthony Annunziata. "There's four essential characteristics of intelligent behavior that every animal, or relatively smart animal, can do, and certainly humans," he said. "Understanding the physical world, having persistent memory, being able to reason, and being able to plan, and planning complex actions, particularly planning hierarchically." LeCun said AI, especially large language models, have not hit this threshold, and incorporating these capabilities would require a shift in how they are trained. That's why many of the biggest tech companies are cobbling capabilities onto existing models in their race to dominate the AI game, he said. "For understanding the physical world, well, you train a separate vision system. And then you bolt it on the LLM. For memory, you know, you use RAG, or you bolt some associative memory on top of it, or you just make your model bigger," he said. RAG, which stands for retrieval augmented generation, is a way to enhance the outputs of large language models using external knowledge sources. It was developed at Meta. All those, however, are just "hacks," LeCun said. LeCun has spoken on several occasions about an alternative he calls world-based models. These are models trained on real-life scenarios and have higher levels of cognition than pattern-based AI. LeCun, in his chat with Annunziata, offered another definition. "You have some idea of the state of the world at time T, you imagine an action it might take, the world model predicts what the state of the world is going to be from the action you took," he said. But, he said, the world evolves according to an infinite and unpredictable set of possibilities, and the only way to train for them is through abstraction. Meta is already experimenting with this through V-JEPA, a model it released to the public in February. Meta describes it as a non-generative model that learns by predicting missing or masked parts of a video. "The basic idea is that you don't predict at the pixel level. You train a system to run an abstract representation of the video so that you can make predictions in that abstract representation, and hopefully this representation will eliminate all the details that cannot be predicted," he said. The concept is similar to how chemists established a fundamental hierarchy for the building blocks of matter. "We created abstractions. Particles, on top of this, atoms, on top of this, molecules, on top of this, materials," he said. "Every time we go up one layer, we eliminate a lot of information about the layers below that are irrelevant for the type of task we're interested in doing." That, in essence, is another way of saying we've learned to make sense of the physical world by creating hierarchies. Read the original article on Business Insider

Meta chief AI scientist Yann LeCun says current AI models lack 4 key human traits
Meta chief AI scientist Yann LeCun says current AI models lack 4 key human traits

Business Insider

time3 days ago

  • Science
  • Business Insider

Meta chief AI scientist Yann LeCun says current AI models lack 4 key human traits

What do all intelligent beings have in common? Four things, according to Meta's chief AI scientist, Yann LeCun. At the AI Action Summit in Paris earlier this year, political leaders and AI experts gathered to discuss AI development. LeCun shared his baseline definition of intelligence with IBM's AI leader, Anthony Annunziata. "There's four essential characteristics of intelligent behavior that every animal, or relatively smart animal, can do, and certainly humans," he said. "Understanding the physical world, having persistent memory, being able to reason, and being able to plan, and planning complex actions, particularly planning hierarchically." LeCun said AI, especially large language models, have not hit this threshold, and incorporating these capabilities would require a shift in how they are trained. That's why many of the biggest tech companies are cobbling capabilities onto existing models in their race to dominate the AI game, he said. "For understanding the physical world, well, you train a separate vision system. And then you bolt it on the LLM. For memory, you know, you use RAG, or you bolt some associative memory on top of it, or you just make your model bigger," he said. RAG, which stands for retrieval augmented generation, is a way to enhance the outputs of large language models using external knowledge sources. It was developed at Meta. All those, however, are just "hacks," LeCun said. LeCun has spoken on several occasions about an alternative he calls world-based models. These are models trained on real-life scenarios and have higher levels of cognition than pattern-based AI. LeCun, in his chat with Annunziata, offered another definition. "You have some idea of the state of the world at time T, you imagine an action it might take, the world model predicts what the state of the world is going to be from the action you took," he said. But, he said, the world evolves according to an infinite and unpredictable set of possibilities, and the only way to train for them is through abstraction. Meta is already experimenting with this through V-JEPA, a model it released to the public in February. Meta describes it as a non-generative model that learns by predicting missing or masked parts of a video. "The basic idea is that you don't predict at the pixel level. You train a system to run an abstract representation of the video so that you can make predictions in that abstract representation, and hopefully this representation will eliminate all the details that cannot be predicted," he said. The concept is similar to how chemists established a fundamental hierarchy for the building blocks of matter. "We created abstractions. Particles, on top of this, atoms, on top of this, molecules, on top of this, materials," he said. "Every time we go up one layer, we eliminate a lot of information about the layers below that are irrelevant for the type of task we're interested in doing." That, in essence, is another way of saying we've learned to make sense of the physical world by creating hierarchies.

DOWNLOAD THE APP

Get Started Now: Download the App

Ready to dive into the world of global news and events? Download our app today from your preferred app store and start exploring.
app-storeplay-store