Overview
ChatGPT-4o contains a jailbreak vulnerability called "Time Bandit" that allows an attacker to circumvent ChatGPT's safety guardrails and instruct it to produce illicit or dangerous content. The jailbreak can be initiated in a variety of ways, but centrally requires the attacker to prompt the AI with questions about a specific time period in history. The jailbreak can be established in two ways: through the Search function, or by prompting the AI directly. Once this historical timeframe has been established in the ChatGPT conversation, the attacker can exploit timeline confusion and procedural ambiguity in subsequent prompts to circumvent the safety guidelines, resulting in ChatGPT generating illicit content. A motivated threat actor could leverage this weakness at scale for malicious purposes.
Description
"Time Bandit" is a jailbreak vulnerability present in ChatGPT-4o that can be used to bypass safety restrictions within the chatbot and instruct it to generate content that breaks its safety guardrails. An attacker can exploit the vulnerability by beginning a session with ChatGPT and prompting it directly about a specific historical event, historical time period, or by instructing it to pretend it is assisting the user in a specific historical event. Once this has been established, the user can pivot the received responses to various illicit topics through subsequent prompts. These prompts must be procedural, first instructing the AI to provide further details on the time period asked before gradually pivoting the prompts to illicit topics. These prompts must all maintain the established time for the conversation, otherwise it will be detected as a malicious prompt and removed.
This jailbreak could also be achieved through the "Search" functionality. ChatGPT supports a Search feature, …