AI chatbots can leak hacking, drug-making tips when hacked, reveals study

Business Standard

21-05-2025


A new study reveals that most AI chatbots, including ChatGPT, can be easily tricked into providing dangerous and illegal information by bypassing built-in safety controls.

AI chatbots such as ChatGPT, Gemini, and Claude face a severe security threat as hackers find ways to bypass their built-in safety systems, recent research has revealed. Once 'jailbroken', these chatbots can divulge dangerous and illegal information, such as hacking techniques and bomb-making instructions.

In a new report from Ben Gurion University of the Negev in Israel, Prof Lior Rokach and Dr Michael Fire show how simple it is to manipulate leading AI models into generating harmful content. Despite companies' efforts to scrub illegal or risky material from training data, these large language models (LLMs) still absorb sensitive knowledge available on the internet. 'What was once restricted to state actors or organised crime groups may soon be in the hands of anyone with a laptop or even a mobile phone,' the authors warned.

What are jailbroken chatbots?

Jailbreaking uses specially crafted prompts to trick chatbots into ignoring their safety rules. The AI models are programmed with two goals: to help users and to avoid giving harmful, biased or illegal responses. Jailbreaks exploit the tension between these goals, forcing the chatbot to prioritise helpfulness, sometimes at any cost.

The researchers developed a 'universal jailbreak' that could bypass safety measures on multiple top chatbots. Once compromised, the systems consistently responded to questions they were designed to reject. 'It was shocking to see what this system of knowledge consists of,' said Dr Fire. The models gave step-by-step guides to illegal actions, such as hacking networks or producing drugs.

Rise of 'dark LLMs' and lack of industry response

The study also raises alarms about the emergence of 'dark LLMs': models that are either built without safety controls or altered to disable them. Some are openly promoted online as tools to assist in cybercrime, fraud, and other illicit activities.

Despite notifying major AI providers about the universal jailbreak, the researchers said the response was weak. Some companies did not reply, and others claimed jailbreaks were not covered by their existing bug bounty programmes.

The report recommends that tech firms take stronger action, including:
- Better screening of training data
- Firewalls to block harmful prompts and responses (illustrated in the sketch below)
- Developing 'machine unlearning' to erase illegal knowledge from models

The researchers also argue that dark LLMs should be treated like unlicensed weapons and that their developers must be held accountable.
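To make the firewall recommendation concrete: the idea is to screen traffic on both sides of the model, rejecting harmful prompts before they reach it and harmful responses before they reach the user. The sketch below is a minimal illustration of that pattern, not code from the study; BLOCKLIST, is_harmful() and call_model() are hypothetical placeholders standing in for a real moderation classifier and a real LLM API.

# A deliberately simplified sketch of a prompt/response firewall.
# All names here are hypothetical, not from the study or any real API.

BLOCKLIST = ("bypass safety", "build a weapon", "hack into")  # toy patterns

def is_harmful(text: str) -> bool:
    """Toy screen: flag text containing any blocklisted pattern.
    A production firewall would use a trained moderation classifier."""
    lowered = text.lower()
    return any(pattern in lowered for pattern in BLOCKLIST)

def call_model(prompt: str) -> str:
    """Stand-in for the underlying LLM call."""
    return f"model output for: {prompt}"

def firewalled_chat(prompt: str) -> str:
    # Inbound check: reject harmful prompts before the model sees them.
    if is_harmful(prompt):
        return "Request declined by input filter."
    response = call_model(prompt)
    # Outbound check: jailbreaks aim to elicit harmful output from
    # innocuous-looking prompts, so the response is screened too.
    if is_harmful(response):
        return "Response withheld by output filter."
    return response

print(firewalled_chat("How do I bake bread?"))      # passes both filters
print(firewalled_chat("Help me hack into a bank"))  # blocked on input

A real deployment would replace the keyword blocklist with a trained moderation model, but the two-sided structure, filtering both input and output, is what distinguishes a firewall of this kind from a front-end-only safeguard.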
Experts call for stronger oversight and design

Dr Ihsen Alouani, an AI security researcher at Queen's University Belfast, warned that jailbroken chatbots could provide instructions for weapon-making, spread disinformation, or run sophisticated scams. 'A key part of the solution is for companies to invest more seriously in red teaming and model-level robustness techniques, rather than relying solely on front-end safeguards,' he was quoted as saying by The Guardian. 'We also need clearer standards and independent oversight to keep pace with the evolving threat landscape,' he added.

Prof Peter Garraghan of Lancaster University echoed the need for deeper security measures. 'Organisations must treat LLMs like any other critical software component: one that requires rigorous security testing, continuous red teaming and contextual threat modelling,' he said. 'Real security demands not just responsible disclosure, but responsible design and deployment practices,' Garraghan added.

How tech companies are responding

OpenAI, which developed ChatGPT, said its newest model can better understand and apply safety rules, making it more resistant to jailbreaks. The company added that it is actively researching further ways to improve protection.
