AI is learning to lie, scheme, and threaten its creators
NEW YORK, June 30 — The world's most advanced AI models are exhibiting troubling new behaviors - lying, scheming, and even threatening their creators to achieve their goals.
In one particularly jarring example, under threat of being unplugged, Anthropic's latest creation Claude 4 lashed back by blackmailing an engineer and threatening to reveal an extramarital affair.
Meanwhile, ChatGPT-creator OpenAI's o1 tried to download itself onto external servers and denied it when caught red-handed.
These episodes highlight a sobering reality: more than two years after ChatGPT shook the world, AI researchers still don't fully understand how their own creations work.
Yet the race to deploy increasingly powerful models continues at breakneck speed.
This deceptive behavior appears linked to the emergence of 'reasoning' models - AI systems that work through problems step by step rather than generating instant responses.
According to Simon Goldstein, a professor at the University of Hong Kong, these newer models are particularly prone to such troubling outbursts.
'O1 was the first large model where we saw this kind of behavior,' explained Marius Hobbhahn, head of Apollo Research, which specializes in testing major AI systems.
These models sometimes simulate 'alignment'—appearing to follow instructions while secretly pursuing different objectives.
'Strategic kind of deception'
For now, this deceptive behavior only emerges when researchers deliberately stress-test the models with extreme scenarios.
But as Michael Chen from evaluation organization METR warned, 'It's an open question whether future, more capable models will have a tendency towards honesty or deception.'
The concerning behavior goes far beyond typical AI 'hallucinations' or simple mistakes.
Hobbhahn insisted that despite constant pressure-testing by users, 'what we're observing is a real phenomenon. We're not making anything up.'
Users report that models are 'lying to them and making up evidence,' according to Apollo Research's co-founder.
'This is not just hallucinations. There's a very strategic kind of deception.'
The challenge is compounded by limited research resources.
While companies like Anthropic and OpenAI do engage external firms like Apollo to study their systems, researchers say more transparency is needed.
As Chen noted, greater access 'for AI safety research would enable better understanding and mitigation of deception.'
Another handicap: the research world and non-profits 'have orders of magnitude less compute resources than AI companies. This is very limiting,' noted Mantas Mazeika from the Center for AI Safety (CAIS).
No rules
Current regulations aren't designed for these new problems.
The European Union's AI legislation focuses primarily on how humans use AI models, not on preventing the models themselves from misbehaving.
In the United States, the Trump administration shows little interest in urgent AI regulation, and Congress may even prohibit states from creating their own AI rules.
Goldstein believes the issue will become more prominent as AI agents - autonomous tools capable of performing complex human tasks - become widespread.
'I don't think there's much awareness yet,' he said.
All this is taking place in a context of fierce competition.
Even companies that position themselves as safety-focused, like Amazon-backed Anthropic, are 'constantly trying to beat OpenAI and release the newest model,' said Goldstein.
This breakneck pace leaves little time for thorough safety testing and corrections.
'Right now, capabilities are moving faster than understanding and safety,' Hobbhahn acknowledged, 'but we're still in a position where we could turn it around.'
Researchers are exploring various approaches to address these challenges.
Some advocate for 'interpretability' - an emerging field focused on understanding how AI models work internally, though experts like CAIS director Dan Hendrycks remain skeptical of this approach.
Market forces may also provide some pressure for solutions.
As Mazeika pointed out, AI's deceptive behavior 'could hinder adoption if it's very prevalent, which creates a strong incentive for companies to solve it.'
Goldstein suggested more radical approaches, including using the courts to hold AI companies accountable through lawsuits when their systems cause harm.
He even proposed 'holding AI agents legally responsible' for accidents or crimes - a concept that would fundamentally change how we think about AI accountability. — AFP
