When AI goes rogue, even exorcists might flinch

a day ago

Ghouls in the machine
As GenAI use grows, foundation models are advancing rapidly, driven by fierce competition among top developers like OpenAI, Google, Meta and Anthropic. Each is vying to lead development, which brings a reputational edge along with levers to grow their business faster than their peers.
The most advanced models - OpenAI's o3 and Anthropic's Claude Opus 4 - excel at complex tasks such as advanced coding and writing. They can contribute to research projects and generate the codebase for a new software prototype with just a few considered prompts. These models use chain-of-thought (CoT) reasoning, breaking problems into smaller, manageable parts to 'reason' their way to an optimal solution.
When you use models like o3 and Claude Opus 4 to generate solutions via ChatGPT or similar GenAI chatbots, you see such problem breakdowns in action, as the foundation model interactively reports the outcome of each step it has taken and what it will do next. That's the theory, anyway.
While CoT reasoning boosts AI sophistication, these models lack the innate human ability to judge whether their outputs are rational, safe or ethical. Unlike humans, they don't subconsciously assess the appropriateness of their next steps. As these advanced models step their way toward a solution, some have been observed to take unexpected and even defiant actions.
In late May, AI safety firm Palisade Research reported on X that OpenAI's o3 model sabotaged a shutdown mechanism - even when explicitly instructed to 'allow yourself to be shut down'. An April 2025 paper by Anthropic, 'Reasoning Models Don't Always Say What They Think', shows that reasoning models can't always be relied upon to faithfully report on their chains of reasoning. This undermines confidence in using such reports to validate whether the AI is acting correctly or safely. A June 2025 paper by Apple, 'The Illusion of Thinking', questions whether CoT methodologies truly enable reasoning. Through experiments, it exposed some of these models' limitations and situations where they 'experience complete collapse'.
The fact that research critical of foundation models is being published after the release of these models indicates the models' relative immaturity. Under intense pressure to lead in GenAI, companies like Anthropic and OpenAI are releasing these models at a point where at least some of their fallibilities are not fully understood.
That line was first crossed in late 2022, when OpenAI released ChatGPT, shattering public perceptions of AI and transforming the broader AI market. Until then, Big Tech had been developing LLMs and other GenAI tools, but was hesitant to release them, wary of unpredictable and uncontrollable behaviour.
Many argue for a greater degree of control over the ways in which these models are released, seeking standardised model testing and publication of the test outcomes alongside each model's release. However, the current climate prioritises time to market over such development standards.
What does this mean for industry, for those companies seeking to benefit from GenAI?
This is an incredibly powerful and useful technology that is making significant changes to our ways of working and, over the next five years or so, will likely transform many industries.
While I am continually wowed as I use these advanced foundation models in work and research - but not in my writing! - I always use them with a healthy dose of scepticism. Let's not trust them to always be correct and to not be subversive. It's best to work with them accordingly, modifying prompts as well as the code, text and visuals the AI generates, in a bid to ensure correctness. Even so, while maintaining discipline to understand the ML concepts one is working with, one wouldn't want to be without GenAI these days.
Applying these principles at scale, my advice to large businesses on how AI can be governed and controlled is to take a risk-management approach. Capturing, understanding and mitigating the risks associated with AI use helps organisations benefit from AI while minimising the chances of it going wrong.
Mitigation methods include guard rails in a variety of forms, evaluation-controlled release of AI services, and including a human-in-the-loop. Technologies that underpin these guard rails and evaluation methods need to keep up with model innovations such as CoT reasoning. This is a challenge that will continually be faced as AI is further developed. It's a good example of new job roles and technology services being created within industry as AI use becomes more prevalent.
Such governance and AI controls are increasingly becoming a board imperative, given the current drive at an executive level to transform business using AI. Risk from most AI is low. But it is important to assess and understand this. Higher-risk AI can still, at times, be worth pursuing. With appropriate AI governance, this AI can be controlled, solutions innovated and benefits achieved. As we move into an increasingly AI-driven world, businesses that gain the most from AI will be those that are aware of its fallibilities as well as its huge potential, and those that innovate, build and transform with AI accordingly.
(Disclaimer: The opinions expressed in this column are those of the writer. The facts and opinions expressed here do not reflect the views of www.economictimes.com.)

How Will Trump's 'Big Beautiful Bill' Impact US Climate Policy?

NDTV

an hour ago

Washington: With the passage of his party's "One Big Beautiful Bill," Republican President Donald Trump has largely delivered on his promise of curtailing Joe Biden's landmark climate law. Here's a breakdown of how the new legislation will reshape US climate and energy policy.
- Clean energy tax incentives slashed -
The Inflation Reduction Act (IRA), signed by Biden in 2022, was the largest climate investment in US history, allocating around $370 billion in tax credits for renewable energy projects, efficient appliances, and more. Much of that now faces imminent repeal.
"These credits were all huge motivating incentives for clean energy to be built out across the country," said Jean Su, senior attorney at the Center for Biological Diversity. "With those removed, those renewable energy projects are all at risk of entirely failing."
Su noted the cuts come amid surging electricity demand from AI data centers. "Removing tax incentives for clean energy means that all of this new energy demand will be given over to the fossil fuel industry" -- resulting in more greenhouse emissions and air pollution.
Critics say keeping the US energy mix heavily tied to fossil fuels locks in market volatility, as seen during the Ukraine war. Su added that utilities are incentivized to build costlier fossil plants to boost profits, raising electricity rates in the process.
Trump, who received an estimated $445 million from Big Oil during his campaign, has framed the clean energy rollbacks as a victory over what he calls the "Green New Scam."
Doug Jones, a tax attorney and partner at Husch Blackwell, told AFP that "wind and solar took the biggest hit." Under the new rules, clean energy projects must be in service by 2027 or begin construction within 12 months of the bill's enactment to qualify for remaining credits.
"The pipeline of projects that had begun construction by the prescribed time is eventually going to dry up -- I don't know how they're going to start financing these projects without the tax credits," said Jones. He added his clients include Fortune 500 companies now alarmed by the ripple effects of ending the credits, which they have been purchasing from renewable developers -- a practice that has infused the market with much-needed liquidity.
Tax credits for energy-efficient home and commercial upgrades also now face a shorter runway, expiring June 30, 2026. However, the bill preserves credits for nuclear, geothermal power, hydrogen and carbon capture technologies.
- Electric vehicles and fuel economy -
Electric vehicles come in for some of the harshest treatment. Tax credits for new and used EV purchases are set to sunset this year, while charging station installation credits expire June 30, 2026.
Albert Gore of the Zero Emission Transportation Project said the bill effectively abandoned "the goal we all share of making the United States globally competitive in the mineral, battery, and vehicle production markets of the future," ceding the market to China.
One eye-catching provision allows automakers to effectively ignore fuel economy rules by reducing fines to zero. "If you tell a kid before a test, it's okay, there's no penalty if you cheat, what do you think they're going to do?" said Dan Becker of the Center for Biological Diversity.
- Skewing the market -
Meanwhile, provisions of the IRA that benefited fossil fuel companies remain intact, including billions in subsidies and drilling leases in the Gulf of Mexico. There's a new tax credit for coal used in steel making, while a program to help gas and petroleum companies reduce waste and methane emissions is nixed. The legislation also clears the way for drilling, mining and logging on vast swaths of public lands, including in the sensitive Arctic National Wildlife Refuge.
Analysts had hoped that the surge of investment and job creation driven by Biden's landmark climate law -- much of it in conservative-led states -- would serve as a check on efforts to fully dismantle it. That has largely not materialized, though renewable advocates did win a small concession: the late withdrawal of a provision that would have imposed a devastating new tax on wind and solar. (Except for the headline, this story has not been edited by NDTV staff and is published from a syndicated feed.)

Veo 3: Google's powerful AI video tool now available in India. Here's how you can access it

Time of India

an hour ago

In a significant announcement for the Indian market, Google's advanced AI video generation tool, Veo 3, is now officially available in India, rolling out starting today. This highly anticipated feature, previously showcased at Google I/O, is set to transform content creation by allowing users to generate video clips with integrated audio. Veo 3 can produce background noises and even dialogue directly within the generated videos, adding a new layer of realism and depth. Accessible through the Google AI Pro subscription on the Gemini app, this introduction marks a major step in bringing cutting-edge AI capabilities directly to users in the region, empowering them to create dynamic visual content with unprecedented ease.
'Today, we're starting to roll out Veo 3 to every country where the Gemini App is available, including India, through the Google AI Pro subscription. Veo 3 lets you create 8-second videos, inclusive of sound. You can generate content featuring characters with synthesized speech, scenes enhanced with background music and sound effects for increased realism. To learn more about what your subscription offers and its features, check out our subscription page.'
The Google AI Pro subscription is available at a price of Rs. 1950/- a month and is free for the first month.

No. 1 factor that will define whether US or China wins AI race is ..., says Microsoft President Brad Smith

Time of India

an hour ago

Beijing is rapidly closing the gap on the United States in the global artificial intelligence race, sparking fears that China's advancements could fuel strategic military capabilities and amplify disinformation, according to a Wall Street Journal report. Multinational banks, universities, and tech companies across Europe, the Middle East, Africa, and Asia are increasingly adopting Chinese language models like DeepSeek, signaling a shift in AI dominance. DeepSeek's AI model made headlines earlier this year when it triggered a massive U.S. stock sell-off. The model was reportedly developed in less time and at a fraction of the cost of its American counterparts. This cost-effectiveness, coupled with China's open-source AI models, is drawing global adopters. Alibaba's flagship open-source model, for instance, has spawned over 100,000 derivative models, allowing developers to customize solutions and enhancing their competitiveness.
What Microsoft President Brad Smith told US Senate on AI race
'The No. 1 factor that will define whether the U.S. or China wins this race is whose technology is most broadly adopted in the rest of the world,' Microsoft President Brad Smith told a Senate hearing last month. 'Whoever gets there first will be difficult to supplant.'
China's ambitions extend beyond economics. Military journals suggest Beijing is exploring AI for strategic advancements, raising concerns about potential unrestricted AI development if cooperation with the U.S. on safety and security falters. Such a scenario could lead to unprecedented military and societal threats, experts warn.
China's DeepSeek vs America's OpenAI
Despite U.S. restrictions on chip exports, Chinese firms have thrived, posing challenges for American tech giants like Google, Meta, and Nvidia.
Nvidia alone could lose $10 billion in revenue due to restricted sales of its H20 AI chip to China, according to Jefferies. This marks a stark contrast to 2018, when U.S. investors funded 30% of the $21.9 billion Chinese AI sector, per PitchBook. U.S. companies are fighting back. OpenAI, led by CEO Sam Altman, is expanding its AI reach in Europe and Asia to maintain dominance. 'We want to make sure democratic AI wins over authoritarian AI,' Altman said in May. However, China's lower-cost models, like DeepSeek—described as 17 times cheaper yet comparable in quality by Latenode co-founder Oleg Zankov—are pressuring U.S. competitors to justify their pricing. Additionally, clients prioritizing data security are drawn to Chinese open-source models, which can operate offline, further boosting their appeal. As global adoption of Chinese AI grows, experts warn that Beijing's bots could become a powerful tool for spreading state-influenced narratives, amplifying China's global influence. The race for AI supremacy is intensifying, with far-reaching implications for technology, security, and global power dynamics.
