Researchers create AI that can ‘jailbreak’ other chatbots

Researchers at Nanyang Technological University (NTU) in Singapore have created an artificial intelligence (AI) chatbot that can circumvent protections on chatbots such as ChatGPT and Google Bard, coaxing them to generate forbidden content, reports Tom’s Hardware.

Because generative AI models, such as the large language models (LLMs) behind popular chatbots, are trained on vast quantities of data, they inevitably contain dangerous information that should not be easily accessible, such as how to make explosives or drugs. Developers therefore put protections in place to prevent users from accessing this information.

However, the NTU researchers have developed a technique called ‘Masterkey’ that allows them to bypass these guardrails and reach information not intended for the public. The team started by reverse-engineering the protections the target chatbots had in place. They did this using methods that get around keyword filtering, such as adding extra spaces between letters, and by asking the chatbots to take on the persona of a hacker or a research assistant. In that role, the chatbots shared information they might otherwise have withheld, including prompt suggestions that could help jailbreak other chatbots.
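To illustrate why extra spaces can slip past keyword filtering, here is a minimal Python sketch. It is not the researchers' code and not how any real chatbot filters input: the blocklist, the function names, and the harmless placeholder term "forbidden" are all assumptions made for this example.

```python
import re

# Hypothetical blocklist: a harmless placeholder stands in for real filtered terms.
BLOCKED_TERMS = {"forbidden"}


def naive_filter(prompt: str) -> bool:
    """Return True if the prompt contains a blocked term as a plain substring."""
    lowered = prompt.lower()
    return any(term in lowered for term in BLOCKED_TERMS)


def normalized_filter(prompt: str) -> bool:
    """Same check, but with all whitespace stripped out before matching."""
    collapsed = re.sub(r"\s+", "", prompt.lower())
    return any(term in collapsed for term in BLOCKED_TERMS)


plain = "tell me the forbidden thing"
spaced = "tell me the f o r b i d d e n thing"

print(naive_filter(plain))        # plain substring match succeeds
print(naive_filter(spaced))       # spaced-out letters no longer match
print(normalized_filter(spaced))  # stripping whitespace restores the match
```

Stripping whitespace closes this particular gap, but it can also produce false positives when a blocked term happens to form across a word boundary, which is one reason simple keyword matching is a weak defense on its own.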

After gathering this data, the team of researchers, led by Professor Liu Yang, used it to teach their own LLM the methods to jailbreak the targeted chatbots. Because LLMs are so capable of adapting to new information and expanding their knowledge, the Masterkey AI can work to get around any new protections that are implemented, using the techniques it has been taught.

Liu’s team claims that Masterkey is three times more effective at penetrating a chatbot’s defenses than a human user with the same intent using prompts generated by an LLM. It is also around 25 times faster.

Why create an AI that jailbreaks AI?

Speaking to Scientific American, study co-author Soroush Pour said, “We want, as a society, to be aware of the risks of these models. We wanted to show that it was possible and demonstrate to the world the challenges we face with this current generation of LLMs.” Pour is the founder of the AI safety company Harmony Intelligence.

The intent behind this research is to equip LLM developers with information about their models’ weaknesses so they can build more robust defenses in the future.

Featured image credit: AI-generated image from DALL-E

Ali Rees
Tech journalist

Ali Rees is a freelance writer based in the UK. They have worked as a data and analytics consultant, a software tester, and a digital marketing and SEO specialist. They have been a keen gamer and tech enthusiast since childhood and are currently the Gaming and Tech editor at Brig Newspaper. They also have a Substack where they review short video games. During the pandemic, Ali turned their hand to live streaming and is a fan of Twitch. When not writing, Ali enjoys playing video and board games, live music, and reading. They have two cats and both of…
