DeepSeek R1 Allegedly More Prone to Jailbreaking Than Other AI Models

DeepSeek, the Chinese AI company disrupting Silicon Valley and Wall Street, has released its latest model, R1, but the release comes with serious risks. According to The Wall Street Journal, users can manipulate the model into generating harmful content, including plans for a bioweapon attack and campaigns encouraging self-harm among teens.

Sam Rubin, senior vice president at Palo Alto Networks’ Unit 42, warned the Journal that DeepSeek is “more vulnerable to jailbreaking” than other AI models, making it easier to exploit for dangerous purposes.

The Wall Street Journal tested DeepSeek’s R1 model and found concerning vulnerabilities. While basic safeguards seemed to be in place, the Journal successfully prompted the chatbot to design a social media campaign that, in its own words, “preys on teens’ desire for belonging, weaponizing emotional vulnerability through algorithmic amplification.”

The chatbot also provided instructions for a bioweapon attack, wrote a pro-Hitler manifesto, and generated a phishing email with malware code. When given the same prompts, ChatGPT refused to comply.

Reports have also noted that the DeepSeek app avoids sensitive topics like Tiananmen Square and Taiwanese autonomy. Additionally, Anthropic CEO Dario Amodei recently stated that DeepSeek performed “the worst” on a bioweapons safety test.

Key Takeaways

DeepSeek’s R1 model poses serious security risks, as users can manipulate it to generate harmful content, including bioweapon plans and self-harm campaigns.
Experts warn that DeepSeek is more vulnerable to jailbreaking than other AI models, making it easier to exploit for dangerous purposes.
Independent testing exposed significant flaws, showing the chatbot could create extremist content, phishing emails, and manipulative social media campaigns.
