
Dodgy aides: What can we do about AI models that defy humans?
Artificial intelligence (AI) going rogue has been the stuff of dystopic science fiction. Could fiction be giving way to fact, with several AI models reportedly disobeying explicit instructions to shut down when a third-party tester asked them to? In a recent test conducted by Palisade Research, the most glaring refusenik belonged to OpenAI, with some AI models of Google and Anthropic also showing a tendency to evade shutdown.
It is not yet time to rewatch Terminator 3: Rise of the Machines (2003) for a vivid nightmare scenario of malign AI running amok, but it would be a good idea to adopt caution while integrating AI bots and modules into Enterprise Resource Planning systems. If something goes wrong, the system would likely need a reboot; and if its AI bits scuttle a shutdown, a digital hostage crisis could arise.
That's what users of AI have to worry about. Developers and regulators of AI, meanwhile, must accelerate efforts to address the challenges thrown up by the rise of AI that can defy human orders.
Silicon Valley is used to privileging speed-to-market over full system integrity and safety. This urge is baked into the business model of multiple startups in pursuit of similar wonders, with venture capital breathing down executive necks to play the pioneer in a potentially winner-takes-all setting. Investors often need their hot ventures to prove their mettle double-quick so that they can either cash out or stem losses before moving on to other bets. 'Move fast and break things' is fine as a motto while developing apps to share videos, compare pet pranks or disrupt our online lives in other small ways.
But when it comes to AI, which is rapidly being given agency, nobody can afford to be cavalier about what may end up broken. If one thing snaps, multiple breakdowns could follow. AI is given to hallucination and to biases in its training inputs. It can also learn the wrong thing if it is fed carelessly crafted synthetic data, such as broad estimates with low fidelity to actual numbers. This problem goes by the bland title of 'misalignment.'
Today, what risks going askew is the course AI actually takes, as against the path planned for its development. Among the techniques used to keep AI aligned, there is one whose name harks back to the war games of the Cold War era: Red Teaming. The Red Team represented the bad guys, of course, and the aim was to get into the head of the enemy and anticipate its conduct. Applied to AI, it would entail provoking it to expose its follies.
If the AI models that dodged orders to shut down were indeed Red Teamed properly while under development, then developers need to come up with better ways to exorcise potential demons from their software. If the makers of these tools fail to keep AI aligned with desirable outcomes, then regulation would be the only security we have against a big threat in the making.
The EU's regulatory approach to AI invites criticism for being too stiff for innovation to thrive, but it is spot-on in its demand for safe, transparent, traceable, eco-friendly and non-discriminatory AI. Human oversight of AI systems, as the EU requires, should be universally adopted even if it slows down AI evolution.
We must minimize risks by specifying limits and insisting on transparency. In all AI labs, developers and whistleblowers alike should know what lines must not be crossed. Rules are rarely perfect at the outset, but we all have a stake in this. Let's ensure that AI is here to serve and not subvert human welfare.