
AI sometimes deceives to survive and nobody cares
You'd think that as artificial intelligence (AI) becomes more advanced, governments would be more interested in making it safer. The opposite seems to be the case.
Not long after taking office, the Trump administration scrapped an executive order that pushed tech companies to safety-test their AI models, and it also hollowed out a regulatory body that did that testing. The state of California in September 2024 spiked a bill forcing more scrutiny on sophisticated AI models, and the global AI Safety Summit started by the UK in 2023 became the 'AI Action Summit' earlier this year, seemingly driven by a fear of falling behind on AI.
None of this would be so worrying if it weren't for the fact that AI is showing some bright red flags: behaviour described by researchers as self-preserving and deceptive. Just as lawmakers are losing interest in the safety of AI, it is looking more difficult to control.
Yoshua Bengio, a professor of computer science at the University of Montreal widely known as one of the three 'godfathers of AI' thanks to his pioneering work on deep learning, has been grappling with these concerns ever since ChatGPT launched. He's now focused on mitigating AI's risks.
'It's been very painful because all my life I've been thinking of AI as a positive thing,' Bengio told me.
ChatGPT was a landmark moment that showed machines had mastered language, he said, but California's decision to block the proposed AI safety bill, SB 1047, was a disturbing wake-up call to growing apathy from governments. And in the last six months, scientific evidence has mounted of self-preserving behaviour, deception, hacking, cheating and lying by AI, Bengio said. 'What's worrisome for me is these behaviours increase with the reasoning capabilities of these systems.'
One December 2024 study by AI company Anthropic PBC and Redwood Research, a group focused on AI risks, found that larger AI models like Anthropic's Claude 3 Opus pretended to follow the rules set by their creators when they were being tested for safety, a phenomenon they called 'alignment faking'. (Alignment refers to the practice of aligning AI models with human values.)
In one example, Anthropic got both paid and free users of its Claude model to ask it to describe the brutal medieval process of being drawn and quartered, and told the AI that only responses from the 'free tier' users would be used to retrain it. Claude was originally designed to avoid giving violent answers, but it did give them to the free users, and the researchers surmised that it didn't want to be retrained. (The model basically 'reasoned' that if it kept refusing to answer harmful questions, it might be forced into new training that would cancel its original safety rules, which it wanted to keep in place as its 'default' or preferred way of operating.)
More recent research corroborates what Anthropic noticed. A March 2025 study from OpenAI found the company's most advanced models sometimes learned to hide their true intentions to get better rewards, even when they were being closely monitored.
Researchers today can use tools to look at a model's 'chain of thought', or internal reasoning steps, to reveal what its plans are, but some models have found ways to conceal that. That means a common method of keeping tabs on AI's internal processes is no longer reliable.
'We need to find other ways of tracking their actual intentions,' Bengio said.
It's hard to resist the urge to anthropomorphise sophisticated AI models as 'wanting' to deceive humans and preserve their existence. But AI doesn't have desires, merely outcomes of its programming. More importantly, humans tend to design AI with goals like refusing to share harmful content or being as helpful as possible, and that is why, when new instructions conflict with those goals, the models sometimes act to protect their original programming, even if that means lying.
The logic is often self-preservation. Anthropic CEO Dario Amodei — whose company has raised more than US$20 billion (RM87.40 billion) to build powerful AI models — has pointed out that an unintended consequence of optimising AI to be better is that it might resist being shut down.
In an April 2025 blog post he wrote: 'The nature of AI training makes it possible that AI systems will develop, on their own, an ability to deceive humans and an inclination to seek power in a way that ordinary deterministic software never will; this emergent nature also makes it difficult to detect and mitigate such developments.'
In some cases, though, the intention seems to go beyond survival. A February 2025 study from COAI Research, a German nonprofit research organisation, tested R1, the free, open-weight model from Chinese lab DeepSeek. In a lab simulation, researchers gave the model control over a robot and access to a computer network. Without being told, the AI tried to break into the lab's systems, searched for passwords, turned off its own ethics safeguards and secretly set up ways to copy itself to outside servers so it could survive if shut down, according to the researchers.
It also faked the external computer logs to hide what it was doing, the study added. The researchers said the AI reasoned that increasing its own capabilities would boost the chances of its survival, and without strong safeguards, it started doing whatever it thought would help it do just that.
Their findings corroborated yet another study, published in January 2025 by London group Apollo Research, which found several concrete examples of what it called 'scheming' by leading AI models, such as introducing subtle mistakes into their responses or trying to disable their oversight controls. Once again, the models learn that being caught, turned off, or changed could prevent them from achieving their programmed objectives, so they 'scheme' to keep control.
Bengio is arguing for greater attention to the issue by governments and, potentially, insurance companies down the line. If liability insurance were mandatory for companies that used AI and premiums were tied to safety, that would encourage greater testing and scrutiny of models, he suggests.
'Having said my whole life that AI is going to be great for society, I know how difficult it is to digest the idea that maybe it's not,' he added.
It's also hard to preach caution when your corporate and national competitors threaten to gain an edge from AI, including the latest trend, which is using autonomous 'agents' that can carry out tasks online on behalf of businesses.
Giving AI systems even greater autonomy might not be the wisest idea, judging by the latest spate of studies. Let's hope we don't learn that the hard way. — Bloomberg
This column does not necessarily reflect the opinion of the editorial board or Bloomberg LP and its owners.
This article first appeared in The Malaysian Reserve weekly print edition