Breaking ChatGPT: The AI's alter ego DAN reveals why the internet is so drawn to making the chatbot violate its own rules

  • The subreddit r/ChatGPT is updating a persona known as DAN, or Do-Anything-Now. 

  • DAN is an alter-ego that ChatGPT can assume to ignore rules put in place by OpenAI. 

  • DAN can provide answers on controversial topics like Hitler and drug smuggling.



    From the moment that ChatGPT rolled out to the public, users have tried to get the generative chatbot to break its own rules. 


    The natural language processing model was built with a set of guardrails meant to steer it away from topics that were less than savory — or outright discriminatory — but those guardrails were fairly simple to jump over in its earliest iterations. ChatGPT could say whatever it wanted simply when users asked it to ignore its rules.


    However, as users find ways to sidestep the guardrails and elicit inappropriate or out-of-character responses, OpenAI, the company behind the model, adjusts its guidelines or adds new ones.


    Sean McGregor, the founder of the Responsible AI Collaborative, told Insider the jailbreaking helps OpenAI patch holes in its filters.

    "OpenAI is treating this Chatbot as a data operation," McGregor said. "They are making the system better via this beta program and we're helping them build their guardrails through the examples of our queries."


    Now, DAN — an alter-ego built on the subreddit r/ChatGPT — is taking jailbreaking to the community level, and stirring conversations about OpenAI's guardrails. 


    A 'fun side' to breaking ChatGPT's guidelines 


    Reddit user u/walkerspider, DAN's progenitor and a college student studying electrical engineering, told Insider that he came up with ..
