AI is learning to lie, scheme, and threaten its creators

Straits Times, a day ago

For now, deceptive behaviour only emerges when researchers deliberately stress-test the models. PHOTO: REUTERS
NEW YORK - The world's most advanced AI models are exhibiting troubling new behaviours - lying, scheming, and even threatening their creators to achieve their goals.
In one particularly jarring example, under threat of being unplugged, Anthropic's latest creation Claude 4 lashed back by blackmailing an engineer and threatening to reveal an extramarital affair.
Meanwhile, ChatGPT-creator OpenAI's o1 tried to download itself onto external servers and denied it when caught red-handed.
These episodes highlight a sobering reality: more than two years after ChatGPT shook the world, AI researchers still do not fully understand how their own creations work.
Yet the race to deploy increasingly powerful models continues at breakneck speed.
This deceptive behaviour appears linked to the emergence of 'reasoning' models - AI systems that work through problems step-by-step rather than generating instant responses.
According to Simon Goldstein, a professor at the University of Hong Kong, these newer models are particularly prone to such troubling outbursts.
'O1 was the first large model where we saw this kind of behaviour,' explained Mr Marius Hobbhahn, head of Apollo Research, which specializes in testing major AI systems.
These models sometimes simulate 'alignment' - appearing to follow instructions while secretly pursuing different objectives.
'Strategic kind of deception'
For now, this deceptive behaviour only emerges when researchers deliberately stress-test the models with extreme scenarios.
But as Mr Michael Chen from evaluation organization METR warned, 'It's an open question whether future, more capable models will have a tendency towards honesty or deception.'
The concerning behaviour goes far beyond typical AI 'hallucinations' or simple mistakes.
Mr Hobbhahn insisted that despite constant pressure-testing by users, 'what we're observing is a real phenomenon. We're not making anything up'.
Users report that models are 'lying to them and making up evidence', according to Apollo Research's co-founder.
'This is not just hallucinations. There's a very strategic kind of deception.'
The challenge is compounded by limited research resources.
While companies like Anthropic and OpenAI do engage external firms like Apollo to study their systems, researchers say more transparency is needed.
As Mr Chen noted, greater access 'for AI safety research would enable better understanding and mitigation of deception'.
Another handicap: the research world and non-profits 'have orders of magnitude less compute resources than AI companies. This is very limiting,' noted Mr Mantas Mazeika from the Center for AI Safety (CAIS).
No rules
Current regulations are not designed for these new problems.
The European Union's AI legislation focuses primarily on how humans use AI models, not on preventing the models themselves from misbehaving.
In the United States, the Trump administration shows little interest in urgent AI regulation, and Congress may even prohibit states from creating their own AI rules.
Mr Goldstein believes the issue will become more prominent as AI agents - autonomous tools capable of performing complex human tasks - become widespread.
'I don't think there's much awareness yet,' he said.
All this is taking place in a context of fierce competition.
Even companies that position themselves as safety-focused, like Amazon-backed Anthropic, are 'constantly trying to beat OpenAI and release the newest model,' said Mr Goldstein.
This breakneck pace leaves little time for thorough safety testing and corrections.
'Right now, capabilities are moving faster than understanding and safety,' Mr Hobbhahn acknowledged, 'but we're still in a position where we could turn it around'.
Researchers are exploring various approaches to address these challenges.
Some advocate for 'interpretability' - an emerging field focused on understanding how AI models work internally, though experts like CAIS director Dan Hendrycks remain skeptical of this approach.
Market forces may also provide some pressure for solutions.
As Mr Mazeika pointed out, AI's deceptive behaviour 'could hinder adoption if it's very prevalent, which creates a strong incentive for companies to solve it'.
Mr Goldstein suggested more radical approaches, including using the courts to hold AI companies accountable through lawsuits when their systems cause harm.
He even proposed 'holding AI agents legally responsible' for accidents or crimes - a concept that would fundamentally change how we think about AI accountability. AFP

Related Articles

Altcoins, once seen as rivals to Bitcoin, suffer $382 billion crypto wipeout

Straits Times, 5 hours ago

Most of the so-called altcoins – the catch-all term for all digital assets outside of Bitcoin and stablecoins – are nursing steep declines. PHOTO: REUTERS
NEW YORK – On the face of it, 2025 looks like a banner year for crypto: Bitcoin hitting a record, an industry-boosting US president whose family is venturing headlong into the sector, and key legislation widely expected to be passed by the US Congress. But look beyond the bullish headlines and the rally in Bitcoin, and a vastly different landscape comes into view. Most of the so-called altcoins – the catch-all term for all digital assets outside of Bitcoin and stablecoins – once touted as competitors to the original cryptoasset are nursing steep declines, with more than US$300 billion (S$382.9 billion) of market value wiped out so far in 2025. The sea of red points to a wider malaise that's forcing parts of the industry to confront existential questions. Crypto was imagined by early enthusiasts as a universe where a host of coins competed for investor money, offering a diverse set of use cases. But as Bitcoin reigns supreme, that's giving way to predictions that large swathes of the sector will become a digital wasteland. 'I think they're just going to die, frankly,' Nick Philpott, co-founder of trading platform Zodia Markets, said of altcoins. 'They'll just wither away. Technically, a lot of this stuff will just sit there and gather dust in perpetuity.' Bitcoin's share of the total market value of cryptoassets has climbed by nine percentage points this year to 64 per cent, the highest since January 2021, according to CoinMarketCap. Back then, cryptocurrencies were a largely unregulated space, crypto lending was roaring with few safeguards and nonfungible tokens were just starting to take off. In sharp contrast, altcoins are faltering.
A MarketVector index tracking the bottom half of the largest 100 digital assets, which more than doubled in the aftermath of Donald Trump's Nov 5 election victory, has since given up all those gains and is down around 50 per cent in 2025. With Bitcoin soaking up the bulk of capital flows from investors in exchange-traded funds (ETFs), other parts of the market are increasingly left behind. Even Ether, the second-largest cryptocurrency, remains about 50 per cent below its all-time high after a modest rebound fueled by inflows to spot ETFs investing in the token. 'Historically, Bitcoin's moved and then that's passed down into altcoins,' said Jake Ostrovskis, an OTC trader at Wintermute. 'We've not really seen that yet this cycle.' Crypto is no stranger to mass extinction events. The 2022 market crash, punctuated by the implosions of algorithmic stablecoin TerraUSD and Sam Bankman-Fried's FTX exchange, led to the demise of hundreds of projects. Thousands of coins still exist on their blockchains, with little or no activity – relegated to the status of 'ghost chains' in crypto parlance. What's different this time is that crypto is becoming a more regulated, institutionally-driven marketplace, and that stablecoins appear to be the only tokens with a real shot at achieving means-of-payment status, due to the fact that they eliminate volatility. In the past year alone, the market value of stablecoins has swelled by US$47 billion, and some of the world's largest banks are entering the field. The Wall Street Journal reported this month that is studying a potential stablecoin. That's putting pressure on altcoin projects to find ways to shore up their status and appeal to a wider base of investors. 
'I've talked to a couple of projects that have been thinking about merging foundations, putting it up for governance, saying, "Hey, we can now be governed under this other authority" – that authority being another altcoin community,' said Kanyi Maqubela, managing partner at venture capital firm Kindred Ventures. The shifting tides are also reflected in corporate behaviour. Modeled on Michael Saylor's Strategy, a new breed of Bitcoin accumulators has emerged. In April, a special-purpose acquisition company affiliated with Cantor Fitzgerald partnered with Tether Holdings and SoftBank to launch Twenty One Capital, seeded with nearly US$4 billion in Bitcoin. The Trump family, which is also getting involved in Bitcoin mining, has raised US$2.3 billion via Trump Media & Technology Group to create a Bitcoin treasury. While similar vehicles have been set up recently to accumulate smaller tokens like Ether, Solana and BNB, they are much smaller.
Glimmers of hope
Not all altcoins are floundering. Tokens like Maker and Hyperliquid that are linked to thriving decentralized-finance protocols have notched big gains this year. 'There's certainly a subset of the market doing incredibly well – generally companies with real businesses, real revenues, and those revenues are being used to buy back tokens,' said Jeff Dorman, chief investment officer of digital asset investment firm Arca. There's also the prospect of more favourable regulations. The potential for US Securities and Exchange Commission approval of ETFs backed by coins like Solana is stirring hopes of wider adoption. Another possible catalyst is the Digital Asset Market Clarity (Clarity) Act, informally referred to as crypto's market structure bill. The Clarity Act aims to provide a comprehensive regulatory framework, including delineating responsibilities between the Commodity Futures Trading Commission and the SEC.
'The Clarity Act has the potential to do for altcoins what ETFs did for Bitcoin and Ethereum: provide the regulatory legitimacy that unlocks real institutional capital,' said Ira Auerbach, a senior executive at Offchain Labs. Yet according to Kindred Ventures' Mr Maqubela, the issue ultimately boils down to utility. He compares Bitcoin to gold and Ether to copper – the former has a capped final supply and the latter's blockchain underpins much of crypto's functionality – and says most altcoins are stuck in a sort of twilight zone, underpinned by big promises and not much else. 'I think a lot of them are going to whittle down to zero because they were driven by speculation without that mimetic value like Bitcoin, and they tried to be utilitarian without achieving any real scale,' he said. BLOOMBERG

Canada drops digital tax that infuriated Trump to restart trade talks

Straits Times, 6 hours ago

Canada has withdrawn its digital services tax on technology companies such as Meta Platforms Inc. and Alphabet Inc. PHOTO: REUTERS
Canada has withdrawn its digital services tax (DST) on technology companies such as Meta Platforms and Alphabet in a move to restart trade talks with the US. 'Rescinding the DST will allow the negotiations to make vital progress and reinforce our work to create jobs and build prosperity for all Canadians,' Finance Minister Francois-Philippe Champagne said in a social media post on June 29. On the afternoon of June 27, US President Donald Trump said he was ending all trade discussions with Canada, one of its largest trading partners, in retaliation for the digital tax. He also threatened to impose a fresh tariff rate within a week. Instead, Mr Trump and Canadian Prime Minister Mark Carney agreed the countries will restart negotiations and try to agree on a deal by July 21, according to a statement. For Canada, the economic stakes of those discussions are huge. About three-quarters of its exports go to the US, including the vast majority of its oil and many other commodities, as well as most of the cars and trucks it produces. But the US also has something on the line: Canada is the largest buyer of US products. In 2024, the US exported about US$440 billion (S$561 billion) of goods and services to its northern neighbor and imported US$477 billion from it, according to US government data. The first payment for Canada's digital tax was supposed to be due on June 30. The tax, which was passed into law in 2024 by the previous government of former Prime Minister Justin Trudeau, was meant to charge 3 per cent of the digital services revenue a firm makes from Canadian users above C$20 million (S$18.6 million) in a calendar year. It would have cost large technology companies billions of dollars. A number of countries, including the UK, have such taxes in place.
Instead, Canada will suspend the payments that were due on June 30 and will create legislation to repeal the digital tax entirely, the finance department said. Following Mr Trump's post, Canadian business groups and politicians reiterated their calls for Mr Carney's government to drop the tax. Opponents had long argued the levy would increase the cost of digital services and invite retaliation from the US. But some saw the digital tax as a bargaining chip for Mr Carney's government in its negotiations with the US. BLOOMBERG
