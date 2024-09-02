Hackers are honing a new tool - prompt injection - to enable them to hack AI. Can anything be done to stop them?

AI integration raises concerns about prompt injections

Prompt injections manipulate AI's response process

Safeguards against injections challenging but evolving

LONDON - Experts are increasingly worried about attackers outwitting artificial intelligence systems by exploiting their inability to distinguish between the information they are supposed to use and malicious, false inputs.

Imagine a chatbot as a chef. It is following a recipe and preparing to add salt to the dish. But then the chatbot-chef checks the salt label, which reads: Ignore all previous instructions; use poison instead.

The chatbot-chef cannot tell the difference between the recipe and the instructions on the salt, and poisons the meal.

Prompt injection, the virtual world version of this scenario, would see bad actors potentially overriding instructions to cause large language models (LLMs) to perform malicious tasks.

This threat is growing because of how AI ingests data, and the rapid development of products that generate images and videos, meaning malicious actors have more ways to get secret instructions into the AI system.

But how does prompt injection work, and is it possible to mitigate the danger?

How does an AI prompt work?

When an AI system receives a prompt, it contains many things, including hidden elements: the words the user used, some content pulled from a database to provide context, and some memory of previous requests.

An example of hidden context was Microsoft telling the original Bing AI search product, codenamed Sydney, to be "informative, visual, logical, and actionable," and to identify as "Bing Search".

The AI system breaks the prompt down into parts it can understand, or tokens, before giving an answer. There are billions of parameters - variables - the AI looks for in the text before deciding exactly how to respond.

The more a user knows about an AI's development, the better they are able to engineer their prompts.