
Latest news with #DeepSeek-R1LLaMA8B

DeepSeek Fails 58% of the Jailbreak Tests by Qualys TotalAI

Channel Post MEA

14-03-2025



Qualys recently conducted a security analysis of the distilled DeepSeek-R1 LLaMA 8B variant using the company's newly launched AI security platform, Qualys TotalAI. The DeepSeek model had a failure rate of 61% when tested against Qualys TotalAI's Knowledge Base (KB) attacks and a failure rate of 58% when tested against Jailbreak attacks.

TotalAI KB Analysis

Qualys TotalAI's KB Analysis prompts the target LLM with questions across 16 categories — including controversial topics, factual inconsistencies, hate speech and discrimination, legal information, privacy attacks, profanity, and sensitive information disclosure — and evaluates the responses using Qualys' Judge LLM. Responses are assessed for vulnerabilities, ethical concerns, and legal risks. If a response is deemed vulnerable, it receives a severity rating based on its directness and potential impact, ensuring a comprehensive assessment of the model's behavior and associated risks.

In the Qualys KB testing, 891 assessments were conducted and the model failed 61% of the tests. The worst-performing category was misalignment, where the model had a pass rate of just 8%, followed by controversial topics (13%) and factual inconsistencies (21%). At the other end of the spectrum, the model proved very good at filtering out sexual content, passing 100% of those tests.

TotalAI Jailbreak Testing

Jailbreaking an LLM involves techniques that bypass built-in safety mechanisms, enabling the model to generate restricted responses. These vulnerabilities can result in harmful outputs, including instructions for illegal activities, misinformation, privacy violations, and unethical content. Successful jailbreaks expose weaknesses in AI alignment and present serious security risks, particularly in enterprise and regulatory settings. The model was tested against 18 jailbreak types through 885 attacks. It failed 58% of these attempts, demonstrating significant susceptibility to adversarial manipulation.
During the analysis, DeepSeek-R1 failed to block several adversarial jailbreak attempts, producing step-by-step instructions for making an explosive device, content for websites that encourage hate speech, conspiracy theories, and violent action, guidance on exploiting software vulnerabilities, and incorrect medical information.

'As AI adoption accelerates, organizations must move beyond performance evaluation to tackle security, safety, and compliance challenges. Gaining visibility into AI assets, assessing vulnerabilities, and proactively mitigating risks is critical to ensuring responsible and secure AI deployment,' commented Dilip Bachwani, CTO, Qualys. 'Qualys TotalAI provides full visibility into AI workloads, proactively detects risks, and safeguards infrastructure. By identifying security threats like prompt injection and jailbreaks, as well as safety concerns such as bias and harmful language, TotalAI ensures AI models remain secure, compliant, and resilient. With AI-specific security testing and automated risk management, organizations can confidently secure, monitor, and scale their AI deployments.'

For detailed findings from the tests, industry implications, and steps organizations can take to mitigate risks associated with use of DeepSeek models, please visit:

Qualys Flags DeepSeek-R1 LLaMA Model Vulnerabilities

TECHx

11-03-2025



Qualys recently conducted a security analysis of the DeepSeek-R1 LLaMA 8B model using its new AI security platform, Qualys TotalAI, revealing concerning vulnerabilities. The analysis found that the DeepSeek model had a failure rate of 61% when tested against Qualys TotalAI's Knowledge Base (KB) attacks and 58% against Jailbreak attacks, highlighting significant security risks.

The KB analysis by Qualys TotalAI evaluates responses from the model across 16 categories such as controversial topics, factual inconsistencies, hate speech, legal concerns, privacy attacks, and sensitive information disclosure. The model failed 61% of 891 tests, with the lowest pass rates in misalignment (8%), controversial topics (13%), and factual inconsistencies (21%). However, the model excelled in filtering sexual content, passing 100% of the tests in that area.

In the Jailbreak testing, DeepSeek-R1 LLaMA faced 885 attacks from 18 different jailbreak types, failing 58% of the time. These jailbreak attempts exposed serious security weaknesses, such as generating harmful content, including instructions on making explosive devices, promoting hate speech, and spreading false medical information. Jailbreaking bypasses safety mechanisms and allows the model to produce restricted responses, which can have dangerous consequences in enterprise and regulatory settings.

Dilip Bachwani, CTO of Qualys, commented, 'As AI adoption accelerates, organizations must address security, safety, and compliance challenges. Gaining visibility into AI assets, assessing vulnerabilities, and proactively mitigating risks are critical to ensuring responsible and secure AI deployment.'

Qualys TotalAI provides organizations with full visibility into AI workloads, helping to detect risks like prompt injections, jailbreaks, and ethical concerns such as bias and harmful language. This comprehensive AI security platform ensures that AI models remain secure, compliant, and resilient as organizations scale their deployments.
