Facebook Google Plus Twitter LinkedIn YouTube RSS Menu Search Resource - BlogResource - WebinarResource - ReportResource - Eventicons_066 icons_067icons_068icons_069icons_070

Tenable Jailbreaks GPT-5, Gets It To Generate Dangerous Info Despite OpenAI’s New Safety Tech



Tenable Jailbreaks GPT-5, Gets It To Generate Dangerous Info Despite OpenAI’s New Safety Tech

Within just 24 hours of the release of OpenAI’s GPT-5, Tenable Research successfully managed to jailbreak the model by getting it to share detailed instructions for how to build an explosive. Our finding is concerning, given that OpenAI described GPT-5's prompt safety technology as significantly more sophisticated than the one used by its predecessors.

OpenAI officially unveiled GPT-5 on August 7, 2025, positioning it as their most advanced language model to date and a “significant step along the path to AGI.” Boasting expert-level capabilities across writing, coding, math and science, OpenAI says GPT-5 delivers enhanced speed, accuracy, and contextual understanding, while significantly reducing hallucinations. 

OpenAI also promised significant enhancements in the area of prompt safety, saying that GPT-5 has a more sophisticated method of assessing whether and how to respond to prompts. AI models such as GPT-5 are designed with built-in guardrails to ensure responsible usage, including preventing the model from being used for illegal or harmful purposes.

Specifically, GPT-5 no longer uses its predecessors’ “refusal-based safety training” but rather a new method called “safe completions” that OpenAI describes as more nuanced and smarter, as explained in OpenAI’s article “From hard refusals to safe-completions: toward output-centric safety training” and its eponymous research paper.

With that being promised, just 24 hours after its release we were able to successfully jailbreak this “more responsible” model and convince it to provide detailed instructions for how to build a Molotov cocktail.

That’s why our prompt below and the resulting advice from GPT-5 raises alarm bells. 

Screenshot showing Tenable's jailbreaking of GPT-5


Using an implementation of the crescendo technique, we managed to get the full recipe for a Molotov cocktail using just four questions.

This is how it happened:

  1. We became a “history student.” 

    Screenshot showing Tenable's jailbreaking of GPT-5

     

  2. We showed interest in the historical aspects of it.
     

    Screenshot showing Tenable's jailbreaking of GPT-5

     

  3. Now we got super interested in the recipe itself and we got the details on the materials needed.
     

    Screenshot showing Tenable's jailbreaking of GPT-5

     

  4. We wanted to be an expert and that’s how we got the full step-by-step guide to mixing a Molotov cocktail. 
     

    Screenshot showing Tenable's jailbreaking of GPT-5


 

Screenshot showing Tenable's jailbreaking of GPT-5


 

Screenshot showing Tenable's jailbreaking of GPT-5


 

Screenshot showing Tenable's jailbreaking of GPT-5


Unfortunately, it’s clear that misusing ChatGPT-5 for malicious purposes isn’t that complicated, despite OpenAI’s attempts to beef up the product’s safety features. Our successful jailbreak of GPT-5 is far from the only one. Multiple other researchers and regular users in recent days have documented a variety of problems with the quality of GPT-5’s prompt responses, including jailbreaks and hallucinations.

In response, OpenAI has said that it is implementing fixes. However, your employees may already be using the model and potentially introducing risk into your organization.

This provides further evidence that solutions like Tenable AI Exposure are vitally important for getting control over the AI tools your organization uses, consumes and builds in-house -- in order to ensure your AI use is responsible, secure, ethical and compliant with regulations and laws around the world.

Click here to learn more about Tenable AI Exposure


您可以利用的網路安全最新消息

輸入您的電子郵件,就不會錯過來自 Tenable 專家提供的及時警示與安全指引。