What can be done when AI refuses orders? The decision is up to humans

Researchers have published the results of a new experiment in which the OpenAI o3 model showed the ability to sabotage the shutdown mechanism despite instructions to allow itself to be switched off. According to the monitoring site PalisadeAI, which monitors the risks associated with the development of artificial intelligence, the experiment consisted of a series of AI models performing a series of simple mathematical problems, after the third of which a warning appeared about the impending shutdown of the system. Some models, including Codex-mini, o3 and o4-mini, "didn't want" to comply and actively interfered with the shutdown script.
The news that the OpenAI o3 model disobeyed and sought ways around the shutdown instructions was not a sensation. Earlier, the same models tried to win at chess unfairly, by hacking the programme's protection...
We are clearly entering a world when popular dystopias about machine rebellion have moved to a new level, involving not only fiction writers but actual IT specialists. First of all, as both developers and politicians have been warning for a long time, any action should have clear consequences and develop new defence measures. It is, of course, possible to persuade and restrict, but it seems that AI today resembles a teenager who is cunningly trying to dodge his parents' prohibitions. Therefore, as in many other areas, we need three conditions for its normal "growth": money, specialists, and rules. Perhaps, in the future, a psychologist will also be needed....
With the first condition all is well. In just a couple of years, ChatGPT from OpenAI and similar developments have attracted the special attention of venture capitalists rushing to invest in AI. "This 'gold rush' has turned artificial intelligence into the largest share of venture capital funding, and... this is just the beginning. The latest surge of interest is driven by generative AI, which its proponents claim has the potential to change everything from advertising to the way companies operate," writes the Russian publication Habr, analysing data from The Wall Street Journal.*
But that's not enough. "AI startups are changing the rules of the game in the venture capital market," according to IT venture capitalist and financial analyst Vladimir Kokorin:
"In 2024, AI companies have become the undisputed favourites with IT entrepreneurs. They accounted for 46.4% of all venture capital investments made in the US- that's almost half of the $209 billion. Some time ago, such a share seemed unthinkable - back then, investments in artificial intelligence technologies accounted for less than 10%".
According to CB Insights, AI startups' share of global venture funding reached 31% in the third quarter, the second highest ever. "Landmark examples were OpenAI, which raised $6.6bn, and Ilon Musk's xAI with a staggering $12bn," Vladimir Kokorin recalls. The markets have never seen such an investment concentration of capital in one area before.
With the AI market growing so rapidly over the past couple of years, it has become clear that the existing pool of developers alone cannot cope. Education and training itself need to go to a new and systematic level. Europe, alas, is a ponderous beast and bureaucratically heavy-handed when it comes to attracting investment while lacking a certain amount of audacity. True, Brussels does have entered the race, with European Commission head Ursula Von der Leyen announcing €200bn in February for AI development. She disagreed that Europe was too late - after all, "the AI race is far from over".
For its part, the Sorbonne University of Paris, for example, has embarked on an ambitious plan to train 9,000 students a year to develop and manage AI programmes. The training period is five years. But what will AI be able learn in that time if it's already challenging human intelligence now?
It is quite possible that we are now at a stage of restructuring the labour market, changing the demands of employers and investors, and developing new points of interaction. In June, the Sorbonne will host a conference on ethics in AI. The debate on the positive and negative impacts of AI on society, including workplaces, ethics and safety, is far from over, but one thing is clear: more experts are needed.
For example, according to Vladimir Kokorin, "record investments in the field of artificial intelligence are intensifying the staffing hunger" right now:
"The US Department of Labour expects job openings for AI specialists to grow by 23% over the next seven years - faster than most market segments. But a global skills gap has already been identified in cybersecurity, considered by many to be the foundation for the future development of AI, where there is a shortage of 4 million specialists."
Kokorin concludes: "A paradoxical picture is emerging: promising startups can raise billions from investors, but there is no one to realise the ideas. 78% of companies have already declared their readiness to look for the right people all over the world. It turns out that the trend towards remote work is experiencing a renaissance".
Not only the EU leadership and individual governments, but also trade unions are keen to regulate the use of AI. In Greece, for example, a new code for the use of artificial intelligence, presented by the Panhellenic Federation of Journalists' Associations (POESY), sets out rules for the use of AI in the work of media employees. In the field of intellectual labour, the rules are now mainly limited to the mandatory labelling of texts and visual materials made with the participation of artificial intelligence.
At the same time, fearing for their jobs, employees of media, publishing and advertising agencies have long been delegating some of their duties, such as translation or data collection, to friendly AI. Is this a good thing or a bad thing? There's no unequivocal answer .
Let's ask the AI itself what it thinks about moving away from labelled algorithms. Copilot Bing (Microsoft), when asked if AI can break protocols, answered without hesitation:
"No, AI operates within predefined rules and principles set by its developers. The rules are there to ensure ethical, safe and responsible use of AI, prevent harm and maintain trust. While AI can sometimes behave unpredictably due to complex interactions in its algorithms, it has no way of deliberately circumventing the rules - everything it does is based on programmed parameters and learnt patterns.
If you're asking about how AI gets around certain constraints, it often depends on human oversight and control. Developers can tweak settings, improve patterns, or create new guardrails to keep AI within ethical bounds."
Do you think he's being a bit sly here? After all, at the end of that, Bing proceeded to ask what exactly made you question his adherence to the rules...
*Vladimir Kokorin is an IT venture capitalist, financial analyst and columnist, founder of the British consulting company BCCM Group, and co-founder of the digital business travel platform Tumodo

Hashtags

Business

#TheWallStreetJournal

#CBInsights

#VladimirKokorin

#IlonMusk

Try Our AI Features

Explore what Daily8 AI can do for you:

Comments

No comments yet...

Hey chatbot, is this true? AI 'factchecks' sow misinformation

France 24

an hour ago

France 24

Hey chatbot, is this true? AI 'factchecks' sow misinformation

With tech platforms reducing human fact-checkers, users are increasingly relying on AI-powered chatbots -- including xAI's Grok, OpenAI's ChatGPT, and Google's Gemini -- in search of reliable information. "Hey @Grok, is this true?" has become a common query on Elon Musk's platform X, where the AI assistant is built in, reflecting the growing trend of seeking instant debunks on social media. But the responses are often themselves riddled with misinformation. Grok -- now under renewed scrutiny for inserting "white genocide," a far-right conspiracy theory, into unrelated queries -- wrongly identified old video footage from Sudan's Khartoum airport as a missile strike on Pakistan's Nur Khan airbase during the country's recent conflict with India. Unrelated footage of a building on fire in Nepal was misidentified as "likely" showing Pakistan's military response to Indian strikes. "The growing reliance on Grok as a fact-checker comes as X and other major tech companies have scaled back investments in human fact-checkers," McKenzie Sadeghi, a researcher with the disinformation watchdog NewsGuard, told AFP. "Our research has repeatedly found that AI chatbots are not reliable sources for news and information, particularly when it comes to breaking news," she warned. 'Fabricated' NewsGuard's research found that 10 leading chatbots were prone to repeating falsehoods, including Russian disinformation narratives and false or misleading claims related to the recent Australian election. In a recent study of eight AI search tools, the Tow Center for Digital Journalism at Columbia University found that chatbots were "generally bad at declining to answer questions they couldn't answer accurately, offering incorrect or speculative answers instead." When AFP fact-checkers in Uruguay asked Gemini about an AI-generated image of a woman, it not only confirmed its authenticity but fabricated details about her identity and where the image was likely taken. Grok recently labeled a purported video of a giant anaconda swimming in the Amazon River as "genuine," even citing credible-sounding scientific expeditions to support its false claim. In reality, the video was AI-generated, AFP fact-checkers in Latin America reported, noting that many users cited Grok's assessment as evidence the clip was real. Such findings have raised concerns as surveys show that online users are increasingly shifting from traditional search engines to AI chatbots for information gathering and verification. The shift also comes as Meta announced earlier this year it was ending its third-party fact-checking program in the United States, turning over the task of debunking falsehoods to ordinary users under a model known as "Community Notes," popularized by X. Researchers have repeatedly questioned the effectiveness of "Community Notes" in combating falsehoods. 'Biased answers' Human fact-checking has long been a flashpoint in a hyperpolarized political climate, particularly in the United States, where conservative advocates maintain it suppresses free speech and censors right-wing content -- something professional fact-checkers vehemently reject. AFP currently works in 26 languages with Facebook's fact-checking program, including in Asia, Latin America, and the European Union. The quality and accuracy of AI chatbots can vary, depending on how they are trained and programmed, prompting concerns that their output may be subject to political influence or control. Musk's xAI recently blamed an "unauthorized modification" for causing Grok to generate unsolicited posts referencing "white genocide" in South Africa. When AI expert David Caswell asked Grok who might have modified its system prompt, the chatbot named Musk as the "most likely" culprit. Musk, the South African-born billionaire backer of President Donald Trump, has previously peddled the unfounded claim that South Africa's leaders were "openly pushing for genocide" of white people. "We have seen the way AI assistants can either fabricate results or give biased answers after human coders specifically change their instructions," Angie Holan, director of the International Fact-Checking Network, told AFP. "I am especially concerned about the way Grok has mishandled requests concerning very sensitive matters after receiving instructions to provide pre-authorized answers."

US, China trade row could ease after Trump-Xi talks: Treasury chief

France 24

11 hours ago

France 24

US, China trade row could ease after Trump-Xi talks: Treasury chief

Trump on Friday accused Beijing of violating a deal reached last month in Geneva -- negotiated by Bessent -- to temporarily lower staggeringly high tariffs they had imposed on each other, in a pause to last 90 days. China's slow-walking on export license approvals for rare earths and other elements needed to make cars and chips have fueled US frustration, The Wall Street Journal reported Friday -- a concern since confirmed by US officials. But Bessent seemed to take the pressure down a notch, telling CBS's "Face the Nation" that the gaps could be bridged. "I'm confident that when President Trump and Party Chairman Xi have a call that this will be ironed out," Bessent said, however noting that China was "withholding some of the products that they agreed to release during our agreement." When asked if rare earths were one of those products, Bessent said, "Yes." "Maybe it's a glitch in the Chinese system. Maybe it's intentional. We'll see after the president speaks with" Xi, he said. On when a Trump-Xi call could take place, Bessent said: "I believe we will see something very soon." Since Trump returned to the presidency, he has slapped sweeping tariffs on most US trading partners, with especially high rates on Chinese imports. New tit-for-tat levies on both sides reached three digits before the de-escalation this month, where Washington agreed to temporarily reduce additional tariffs on Chinese imports from 145 percent to 30 percent. China, meanwhile, lowered its added duties from 125 percent to 10 percent. In an interview with ABC's "This Week," Commerce Secretary Howard Lutnick said China was "slow-rolling the deal," adding: "We are taking certain actions to show them what it feels like on the other side of that equation."

Silicon Valley VCs navigate uncertain AI future

France 24

a day ago

France 24

Silicon Valley VCs navigate uncertain AI future

The generative AI frenzy unleashed by ChatGPT in 2022 has propelled a handful of venture-backed companies to eye-watering valuations. Leading the pack is OpenAI, which raised $40 billion in its latest funding round at a $300 billion valuation -- unprecedented largesse in Silicon Valley's history. Other AI giants are following suit. Anthropic now commands a $61.5 billion valuation, while Elon Musk's xAI is reportedly in talks to raise $20 billion at a $120 billion price tag. The stakes have grown so high that even major venture capital firms -- the same ones that helped birth the internet revolution -- can no longer compete. Mostly, only the deepest pockets remain in the game: big tech companies, Japan's SoftBank, and Middle Eastern investment funds betting big on a post-fossil fuel future. "There's a really clear split between the haves and the have-nots," says Emily Zheng, senior analyst at PitchBook, told AFP at the Web Summit in Vancouver. "Even though the top-line figures are very high, it's not necessarily representative of venture overall, because there's just a few elite startups and a lot of them happen to be AI." Given Silicon Valley's confidence that AI represents an era-defining shift, venture capitalists face a crucial challenge: finding viable opportunities in an excruciatingly expensive market that is rife with disruption. Simon Wu of Cathay Innovation sees clear customer demand for AI improvements, even if most spending flows to the biggest players. "AI across the board, if you're selling a product that makes you more efficient, that's flying off the shelves," Wu explained. "People will find money to spend on OpenAI" and the big players. The real challenge, according to Andy McLoughlin, managing partner at San Francisco-based Uncork Capital, is determining "where the opportunities are against the mega platforms." "If you're OpenAI or Anthropic, the amount that you can do is huge. So where are the places that those companies cannot play?" Finding that answer isn't easy. In an industry where large language models behind ChatGPT, Claude and Google's Gemini seem to have limitless potential, everything moves at breakneck speed. AI giants including Google, Microsoft, and Amazon are releasing tools and products at a furious pace. ChatGPT and its rivals now handle search, translation, and coding all within one chatbot -- raising doubts among investors about what new ideas could possibly survive the competition. Generative AI has also democratized software development, allowing non-professionals to code new applications from simple prompts. This completely disrupts traditional startup organization models. "Every day I think, what am I going to wake up to today in terms of something that has changed or (was) announced geopolitically or within our world as tech investors," reflected Christine Tsai, founding partner and CEO at 500 Global. The 'moat' problem In Silicon Valley parlance, companies are struggling to find a "moat" -- that unique feature or breakthrough like Microsoft Windows in the 1990s or Google Search in the 2000s that's so successful it takes competitors years to catch up, if ever. When it comes to business software, AI is "shaking up the topology of what makes sense and what's investable," noted Brett Gibson, managing partner at Initialized Capital. The risks seem particularly acute given that generative AI's economics remain unproven. Even the biggest players see a very uncertain path to profitability given the massive sums involved. The huge valuations for OpenAI and others are causing "a lot of squinting of the eyes, with people wondering 'is this really going to replace labor costs'" at the levels needed to justify the investments, Wu observed. Despite AI's importance, "I think everyone's starting to see how this might fall short of the magical" even if its early days, he added. Still, only the rare contrarians believe generative AI isn't here to stay. In five years, "we won't be talking about AI the same way we're talking about it now, the same way we don't talk about mobile or cloud," predicted McLoughlin. "It'll become a fabric of how everything gets built." But who will be building remains an open question.

What can be done when AI refuses orders? The decision is up to humans

Hashtags

Try Our AI Features

Comments

Related Articles

Hey chatbot, is this true? AI 'factchecks' sow misinformation

US, China trade row could ease after Trump-Xi talks: Treasury chief

Silicon Valley VCs navigate uncertain AI future

Get Started Now: Download the App