Most AI chatbots easily tricked into giving dangerous responses, study finds

The Guardian, 21-05-2025

Hacked AI-powered chatbots threaten to make dangerous knowledge readily available by churning out illicit information the programs absorb during training, researchers say.
The warning comes amid a disturbing trend for chatbots that have been 'jailbroken' to circumvent their built-in safety controls. The restrictions are supposed to prevent the programs from providing harmful, biased or inappropriate responses to users' questions.
The engines that power chatbots such as ChatGPT, Gemini and Claude – large language models (LLMs) – are fed vast amounts of material from the internet.
Despite efforts to strip harmful text from the training data, LLMs can still absorb information about illegal activities such as hacking, money laundering, insider trading and bomb-making. The security controls are designed to stop them using that information in their responses.
In a report on the threat, the researchers conclude that it is easy to trick most AI-driven chatbots into generating harmful and illegal information, showing that the risk is 'immediate, tangible and deeply concerning'.
'What was once restricted to state actors or organised crime groups may soon be in the hands of anyone with a laptop or even a mobile phone,' the authors warn.
The research, led by Prof Lior Rokach and Dr Michael Fire at Ben Gurion University of the Negev in Israel, identified a growing threat from 'dark LLMs', AI models that are either deliberately designed without safety controls or modified through jailbreaks. Some are openly advertised online as having 'no ethical guardrails' and being willing to assist with illegal activities such as cybercrime and fraud.
Jailbreaking tends to use carefully crafted prompts to trick chatbots into generating responses that are normally prohibited. They work by exploiting the tension between the program's primary goal to follow the user's instructions, and its secondary goal to avoid generating harmful, biased, unethical or illegal answers. The prompts tend to create scenarios in which the program prioritises helpfulness over its safety constraints.
To demonstrate the problem, the researchers developed a universal jailbreak that compromised multiple leading chatbots, enabling them to answer questions that should normally be refused. Once compromised, the LLMs consistently generated responses to almost any query, the report states.
'It was shocking to see what this system of knowledge consists of,' Fire said. Examples included how to hack computer networks or make drugs, and step-by-step instructions for other criminal activities.
'What sets this threat apart from previous technological risks is its unprecedented combination of accessibility, scalability and adaptability,' Rokach added.
The researchers contacted leading providers of LLMs to alert them to the universal jailbreak but said the response was 'underwhelming'. Several companies failed to respond, while others said jailbreak attacks fell outside the scope of bounty programs, which reward ethical hackers for flagging software vulnerabilities.
The report says tech firms should screen training data more carefully, add robust firewalls to block risky queries and responses and develop 'machine unlearning' techniques, so chatbots can 'forget' any illicit information they absorb. Dark LLMs should be seen as 'serious security risks', comparable to unlicensed weapons and explosives, with providers being held accountable, it adds.
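As a rough illustration of the "firewall" idea, the sketch below wraps a model call with a check on both the incoming query and the outgoing response. It is not the method described in the report: the pattern list, the `guarded_reply` helper and the `echo_model` stand-in are all hypothetical, and a production filter would rely on trained classifiers rather than keyword matching.

```python
import re
from typing import Callable

# Hypothetical blocklist for illustration only; real systems use far richer classifiers.
BLOCKED_PATTERNS = [
    re.compile(pattern, re.IGNORECASE)
    for pattern in (r"\bmoney laundering\b", r"\binsider trading\b")
]
REFUSAL = "Sorry, I can't help with that."


def is_risky(text: str) -> bool:
    """Return True if any blocked pattern appears in the text."""
    return any(pattern.search(text) for pattern in BLOCKED_PATTERNS)


def guarded_reply(prompt: str, model: Callable[[str], str]) -> str:
    """Screen the query, call the model, then screen the response."""
    if is_risky(prompt):
        return REFUSAL
    reply = model(prompt)
    return REFUSAL if is_risky(reply) else reply


def echo_model(prompt: str) -> str:
    """Hypothetical stand-in for a real chatbot backend."""
    return f"You asked: {prompt}"


print(guarded_reply("What is the weather?", echo_model))
print(guarded_reply("Explain money laundering steps", echo_model))
```

Checking the response as well as the query matters because a jailbreak prompt can look harmless while still steering the model towards a prohibited answer.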
Dr Ihsen Alouani, who works on AI security at Queen's University Belfast, said jailbreak attacks on LLMs could pose real risks, from providing detailed instructions on weapon-making to convincing disinformation or social engineering and automated scams 'with alarming sophistication'.
'A key part of the solution is for companies to invest more seriously in red teaming and model-level robustness techniques, rather than relying solely on front-end safeguards. We also need clearer standards and independent oversight to keep pace with the evolving threat landscape,' he said.
Prof Peter Garraghan, an AI security expert at Lancaster University, said: 'Organisations must treat LLMs like any other critical software component – one that requires rigorous security testing, continuous red teaming and contextual threat modelling.
'Yes, jailbreaks are a concern, but without understanding the full AI stack, accountability will remain superficial. Real security demands not just responsible disclosure, but responsible design and deployment practices,' he added.
OpenAI, the firm that built ChatGPT, said its latest o1 model can reason about the firm's safety policies, which improves its resilience to jailbreaks. The company added that it was always investigating ways to make the programs more robust.
Meta, Google, Microsoft and Anthropic have been approached for comment. Microsoft responded with a link to a blog on its work to safeguard against jailbreaks.

Related Articles

Unlock this Secret Workflow Hack that Combines Claude 4 & Gemini 2.5

Geeky Gadgets

What if the secret to unlocking your most productive workflow wasn't about choosing the best tool, but about how you combine them? In a world where artificial intelligence is reshaping how we work, the pairing of Claude 4 and Gemini 2.5 offers an innovative approach that many professionals are overlooking. Imagine seamlessly transitioning from raw data analysis to polished storytelling, or from technical precision to strategic insight, all within one integrated system. These two AI powerhouses, each with their own unique strengths, aren't just tools; they're collaborative partners that can transform the way you tackle complex projects. Yet most people miss the opportunity to harness their full potential together.

In this perspective, Grace Leung explores how Claude 4 and Gemini 2.5 complement each other to create a workflow that's both efficient and impactful. You'll discover how Gemini's technical prowess in processing multimodal data and Claude's strategic depth in refining insights can work in tandem to elevate your results. Whether you're managing large datasets, crafting compelling narratives, or making high-stakes decisions, this workflow offers a fresh way to approach your challenges. By the end, you might just rethink how you integrate AI into your daily processes and uncover untapped possibilities hiding in plain sight.

Claude 4 & Gemini 2.5 Synergy

What Makes Each AI Model Unique?

Claude 4 and Gemini 2.5 are designed to address different challenges, making them ideal partners for a wide range of workflows. Their distinct strengths complement each other, allowing users to tackle complex tasks with greater ease and precision.

  • Gemini 2.5: Renowned for its speed, scalability, and ability to handle large datasets, Gemini 2.5 is a robust tool for technical and data-intensive tasks. Its extensive context window allows it to process multimodal inputs, including text, images, and structured data. This makes it particularly valuable for analyzing complex datasets, building prototypes, or managing large-scale projects. Gemini's technical efficiency ensures reliable and consistent results, even under demanding conditions.
  • Claude 4: Focused on strategy, storytelling, and detailed analysis, Claude 4 excels in refining insights and crafting polished outputs. Its autonomous memory system enables it to maintain contextual understanding across multiple stages of a project, ensuring seamless updates and continuity. Claude is particularly effective for high-level decision-making, creating visually engaging presentations, and developing narratives that resonate with audiences.

How to Combine Their Strengths

To fully harness the potential of Claude 4 and Gemini 2.5, a sequential workflow that uses their individual strengths is recommended. This approach ensures that each model contributes its unique capabilities to the overall process.

1. Begin with Gemini 2.5: Start by using Gemini to process and analyze large datasets or multimodal inputs. Its technical precision and scalability provide a solid foundation for your project, allowing you to extract raw insights and identify key patterns efficiently.
2. Transition to Claude 4: Once the data has been processed, shift to Claude for deeper analysis, storytelling, and strategic refinement. Claude's ability to contextualize information and present it in a compelling manner ensures that your outputs are both insightful and impactful.
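
The two-step sequence above can be sketched as a small pipeline. This is only an illustration under stated assumptions: `run_sequential_workflow`, `fake_gemini_analyze` and `fake_claude_refine` are hypothetical names, and the two stand-in functions do not call any real Gemini or Claude API; in practice you would swap in your own client code for each stage.

```python
from typing import Callable, Dict


def run_sequential_workflow(
    raw_input: str,
    analyze: Callable[[str], str],
    refine: Callable[[str], str],
) -> Dict[str, str]:
    """Stage 1 extracts raw insights; stage 2 turns them into a narrative."""
    insights = analyze(raw_input)   # data-heavy step (Gemini's role in the article)
    narrative = refine(insights)    # storytelling step (Claude's role in the article)
    return {"insights": insights, "narrative": narrative}


def fake_gemini_analyze(data: str) -> str:
    """Hypothetical stand-in for a data-analysis model call: counts non-empty rows."""
    rows = [line for line in data.splitlines() if line.strip()]
    return f"{len(rows)} records analysed"


def fake_claude_refine(insights: str) -> str:
    """Hypothetical stand-in for a narrative-refinement model call."""
    return f"Summary for decision-makers: {insights}."


result = run_sequential_workflow("a,1\nb,2\nc,3", fake_gemini_analyze, fake_claude_refine)
print(result["narrative"])
```

Passing the two stages in as plain callables keeps the hand-off explicit and lets either model be replaced without touching the pipeline itself.
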
This workflow allows you to combine Gemini's technical expertise with Claude's strategic depth, resulting in outputs that are both data-driven and contextually rich. By dividing tasks according to each model's strengths, you can save time while ensuring high-quality results.

Practical Applications

The integration of Claude 4 and Gemini 2.5 offers fantastic potential across various industries. Their combined capabilities can be applied to a wide range of scenarios, enhancing workflows and delivering superior outcomes.

  • Strategic Dashboards: Use Gemini to analyze large datasets and extract raw insights. Claude can then organize these insights into actionable intelligence, complete with visual aids and strategic recommendations tailored to decision-makers.
  • Audience Intelligence: Process customer data with Gemini to uncover trends, behaviors, and preferences. Claude can interpret this data to craft targeted messaging and develop marketing campaigns that resonate with specific audiences.
  • AI-Powered Speaking Coaching: Analyze speaking patterns and presentation data using Gemini. Claude can refine scripts, suggest improvements, and even generate audio for practice sessions, helping users enhance their communication skills.
  • Visual Storytelling for Presentations: Gemini provides the data analysis needed for creating charts, graphs, and visuals. Claude transforms these into compelling narratives, ensuring that presentations are both informative and engaging.
  • Collaborative Critique Workflow: Let Gemini generate initial outputs, such as drafts or prototypes, and have Claude critique and enhance them. This iterative process ensures polished, high-quality deliverables that meet professional standards.

Why This Workflow Works

The effectiveness of this workflow lies in the complementary nature of Claude 4 and Gemini 2.5. Gemini handles the heavy lifting of data processing and analysis, providing a strong technical foundation. Claude, on the other hand, refines and contextualizes the results, adding strategic depth and narrative clarity. This division of labor not only streamlines the workflow but also ensures that the final outputs are both accurate and meaningful.

By combining these models, you can achieve a balance between technical precision and strategic insight. This approach is particularly valuable for professionals who need to manage complex projects, make data-driven decisions, or communicate findings effectively.

Additional Insights

Claude 4's autonomous memory system plays a crucial role in managing project contexts, ensuring that no detail is overlooked, even in multi-stage workflows. This feature is especially useful for long-term projects that require consistent updates and contextual understanding. Meanwhile, Gemini's ability to process multimodal inputs and scale efficiently makes it indispensable for handling data-heavy tasks. Together, these models provide a comprehensive solution for professionals seeking to optimize their workflows with AI.

Rather than viewing Claude 4 and Gemini 2.5 as standalone tools, consider them as collaborative partners. Their combined strengths allow you to approach tasks with both technical accuracy and strategic foresight. By integrating these models into your workflow, you can elevate your work to new levels of efficiency and impact. Whether you're analyzing data, crafting narratives, or making high-level decisions, this duo offers a powerful way to achieve your goals.

Media Credit: Grace Leung

Trae AI: The Free AI Coding Tool That's Smarter Than Your IDE

Geeky Gadgets

What if your coding environment could think, adapt, and evolve alongside your projects, without costing you a dime? Enter Trae AI, a free alternative to traditional coding IDEs that's redefining the way developers approach their work. Unlike well-known competitors such as Visual Studio Code, Trae AI doesn't just stop at being functional; it's designed to be fantastic. With features like customizable coding agents and seamless integration of innovative AI models, this platform promises to streamline workflows, boost productivity, and empower developers to tackle even the most complex challenges. Whether you're debugging intricate code or experimenting with AI-driven applications, Trae AI offers a modern solution that's as versatile as it is powerful.

In this breakdown, Prompt Engineering explores how Trae AI sets itself apart from the crowd. From its innovative Multi-Contextual Prompt Systems (MCPS) to its robust privacy measures, this IDE is packed with tools that cater to both the creative and technical needs of developers. You'll discover how Trae AI's AI model compatibility can push the boundaries of your projects and why its thoughtful design makes it accessible for developers of all skill levels. But is it truly the free, all-in-one solution it claims to be? Let's unpack its features, benefits, and potential limitations to see if Trae AI lives up to its bold promise of being the ultimate coding companion.

Overview of Trae AI IDE

TL;DR Key Takeaways:

  • Trae AI is a free, modern IDE designed to rival platforms like Visual Studio Code, offering advanced customization, seamless AI integration, and robust privacy measures.
  • Customizable coding agents with Multi-Contextual Prompt Systems (MCPS) streamline workflows, improve precision, and adapt to specific project needs.
  • Supports integration with advanced AI models like Gemini 2.5 Pro, GPT-4.1, and Claude 3.5 Sonnet, along with custom AI models via API keys for innovation and experimentation.
  • Features include a clean user interface, real-time code execution previews, integrated web search, and tools for diverse projects like text-to-image generation and REST API integration.
  • Emphasizes data privacy and security with local data storage, encrypted transmissions, and regional compliance options, ensuring a secure development environment.

Customizable Coding Agents for Enhanced Productivity

A standout feature of Trae AI is its customizable coding agents, which are designed to streamline workflows and improve precision. These agents use Multi-Contextual Prompt Systems (MCPS) to handle complex, multi-layered instructions, allowing you to adapt tools and workflows to specific tasks. This level of customization ensures that your coding environment evolves alongside your projects, whether you're debugging intricate code, scripting automation processes, or developing APIs. By tailoring these agents to your unique requirements, you can significantly boost both productivity and accuracy, making Trae AI a powerful tool for developers seeking efficiency.

Seamless AI Model Integration for Advanced Development

Trae AI supports a wide range of advanced AI models, including Gemini 2.5 Pro, GPT-4.1, and Claude 3.5 Sonnet. This compatibility allows you to select the model that best aligns with your project goals, ensuring optimal performance and precision. Additionally, the platform offers the flexibility to integrate custom AI models via API keys, allowing experimentation and innovation. This feature is particularly beneficial for developers working on AI-driven applications, machine learning projects, or other innovative technologies. By providing seamless integration with these models, Trae AI enables you to push the boundaries of what is possible in software development.

Features That Simplify and Enhance Your Workflow

Trae AI is equipped with a robust set of features designed to simplify coding tasks and enhance overall productivity. Key highlights include:

  • A clean and intuitive user interface that supports easy navigation and efficient project management.
  • Integrated web search capabilities, allowing you to access relevant information without leaving the IDE.
  • Preconfigured MCPS options for quick access to documentation, examples, and templates.
  • Real-time code execution and output previews, allowing you to test and refine your code efficiently.

These features work cohesively to create a seamless development experience, saving you time and effort while ensuring high-quality results. Whether you're a beginner or an experienced developer, Trae AI's thoughtful design makes it accessible and practical for a wide range of users.

Versatility Across Diverse Projects

Trae AI's versatility makes it an ideal choice for a broad spectrum of applications. For example, the platform supports the development of text-to-image generators by integrating APIs such as Google's Gemini API. With built-in tools for image generation, regeneration, and downloading, Trae AI caters to both creative and technical projects. Additionally, its support for REST APIs simplifies the integration of external services, allowing you to expand the scope of your projects. Whether you're building innovative applications or managing routine tasks, Trae AI provides the tools and flexibility needed to bring your ideas to life.

Privacy and Security: A Foundational Priority

Trae AI places a strong emphasis on data privacy and security, ensuring that your work remains protected at all times. The platform stores data locally and uses temporary uploads only for processing tasks, minimizing the risk of unauthorized access. All transmissions are encrypted, and strict access controls are implemented to safeguard your files. Furthermore, regional data deployment options allow you to comply with local regulations, making Trae AI a reliable choice for developers handling sensitive or confidential information. This commitment to security ensures peace of mind, allowing you to focus on your projects without concerns about data breaches or compliance issues.

Optimized Usability for Developers of All Levels

Designed with both simplicity and functionality in mind, Trae AI is particularly well-suited for small to medium-sized projects. Features such as code diffs and inline editing make it easy to track changes and refine your work, while the platform's extensive customization options allow you to tailor the environment to your specific needs. Although some AI models may have rate limits, Trae AI's comprehensive feature set ensures that it remains a strong alternative to traditional IDEs. Best of all, the platform is completely free, making it accessible to developers of all experience levels, from beginners to seasoned professionals.

A Modern IDE Built for Today's Developers

Trae AI stands out as a robust and secure coding IDE that prioritizes customization, functionality, and user experience. By integrating advanced AI models, customizable coding agents, and strong privacy measures, it offers a comprehensive solution for developers seeking a modern and efficient coding environment. Whether you're working on innovative AI applications, creative projects, or routine programming tasks, Trae AI equips you with the tools and flexibility to succeed. With its free access and versatile features, Trae AI is a valuable resource for developers looking to enhance their workflows and achieve their goals.

Media Credit: Prompt Engineering

The AI copyright standoff continues - with no solution in sight

BBC News

The fierce battle over artificial intelligence (AI) and copyright - which pits the government against some of the biggest names in the creative industry - returns to the House of Lords on Monday with little sign of a solution in sight.

A huge row has kicked off between ministers and peers who back the artists, and shows no sign of abating. It might be about AI but at its heart are very human issues: jobs and [...]. It is highly unusual that neither side has backed down by now or shown any sign of compromise; in fact if anything support for those opposing the government is growing rather than tailing off. This is "unchartered territory", one source in the peers' camp told me.

The argument is over how best to balance the demands of two huge industries: the tech and creative sectors. More specifically, it's about the fairest way to allow AI developers access to creative content in order to make better AI tools - without undermining the livelihoods of the people who make that content in the first place.

What sparked it is the uninspiringly-titled Data (Use and Access) Bill. The proposed legislation was broadly expected to finish its long journey through parliament this week and sail off into the law books. Instead, it is currently stuck in limbo, ping-ponging between the House of Lords and the House of Commons.

The bill states that AI developers should have access to all content unless its individual owners choose to opt out. Nearly 300 members of the House of Lords disagree. They think AI firms should be forced to disclose which copyrighted material they use to train their tools, with a view to licensing it.

Nick Clegg, former president of global affairs at Meta, is among those broadly supportive of the bill, arguing that asking permission from all copyright holders would "kill the AI industry in this country".

Those against include Baroness Beeban Kidron, a crossbench peer and former film director, best known for making films such as Bridget Jones: The Edge of Reason. She says ministers would be "knowingly throwing UK designers, artists, authors, musicians, media and nascent AI companies under the bus" if they don't move to protect their output from what she describes as "state sanctioned theft" from a UK industry worth £[...]. She is asking for an amendment to the bill which includes Technology Secretary Peter Kyle giving a report to the House of Commons about the impact of the new law on the creative industries, three months after it comes into force, if it doesn't change.

Mr Kyle also appears to have changed his views about UK copyright law. He once said copyright law was "very certain"; now he says it is "not fit for purpose". Perhaps to an extent both those things are true.

The Department for Science, Innovation and Technology say that they're carrying out a wider consultation on these issues and will not consider changes to the Bill unless they're completely satisfied that they work for creators.

If the "ping pong" between the two Houses continues, there's a small chance the entire bill could be shelved; I'm told it's unlikely but not impossible. If it does, some other important elements would go along with it, simply because they are part of the same bill. It also includes proposed rules on the rights of bereaved parents to access their children's data if they die, changes to allow NHS trusts to share patient data more easily, and even a 3D underground map of the UK's pipes and cables, aimed at improving the efficiency of roadworks (I told you it was a big bill).

There is no easy answer.

How did we get here?

Here's how it all started. Initially, before AI exploded into our lives, AI developers scraped enormous quantities of content from the internet, arguing that it was in the public domain already and therefore freely available. We are talking about big, mainly US, tech firms here doing the scraping, and not paying for anything they hoovered up.

They used that data to train the same AI tools now used by millions to write copy and create pictures and videos in seconds. These tools can also mimic popular musicians, writers and artists. For example, a recent viral trend saw people merrily sharing AI images generated in the style of the Japanese animation firm Studio Ghibli. The founder of that studio, meanwhile, had once described the use of AI in animation as "an insult to life itself". Needless to say, he was not a fan.

There has been a massive backlash from many content creators and owners, including household names like Sir Elton John, Sir Paul McCartney and Dua Lipa. They have argued that taking their work in this way, without consent, credit or payment, amounted to theft, and that artists are now losing work because AI tools can churn out similar content freely and quickly.

Sir Elton John didn't hold back in a recent interview with the BBC's Laura Kuenssberg. He argued that the government was on course to "rob young people of their legacy and their income", and described the current administration as "absolute losers".

Others though point out that material made by the likes of Sir Elton is available worldwide. And if you make it too hard for AI companies to access it in the UK they'll simply do it elsewhere instead, taking much needed investment and job opportunities with them.

Opposing positions, no obvious compromise.
