wunderwuzzi blog
516 FOLLOWERS
Keep up with the articles from the wunderwuzzi blog.
wunderwuzzi blog
3w ago
This post explains an attack chain for the ChatGPT macOS application. Through prompt injection from untrusted data, attackers could insert long-term persistent spyware into ChatGPT’s memory. This led to continuous data exfiltration of any information the user typed or responses received by ChatGPT, including any future chat sessions. OpenAI released a fix for the macOS app last week. Ensure your app is updated to the latest version. Let’s look at this spAIware in detail ..read more
wunderwuzzi blog
1M ago
Your Code Interpreter sandbox, also known as Advanced Data Analysis sessions, are shared between private and public GPTs. Yes, your actual compute container and its storage is shared. Each user gets their own isolated container, but if a user uses multiple GPTs and stores files in Code Interpreter all GPTs can access (and also overwrite) each others files. This is true also for files uploaded/created with private GPTs and ChatGPT itself ..read more
wunderwuzzi blog
1M ago
This post describes vulnerability in Microsoft 365 Copilot that allowed the theft of a user’s emails and other personal information. This vulnerability warrants a deep dive, because it combines a variety of novel attack techniques that are not even two years old. I initially disclosed parts of this exploit to Microsoft in January, and then the full exploit chain in February 2024. A few days ago I got the okay from MSRC to disclose this report ..read more
wunderwuzzi blog
1M ago
Recently, I found what appeared to be a regression or bypass that again allowed data exfiltration via image rendering during prompt injection. See the previous post here for reference. Data Exfiltration via Rendering HTML Image Tags During re-testing, I had sporadic success with markdown rendering tricks, but eventually, I was able to drastically simplify the exploit by asking directly for an HTML image tag. This behavior might actually have existed all along, as Google AI Studio hadn’t yet implemented any kind of Content Security Policy to prevent communication with arbitrary domains using im ..read more
wunderwuzzi blog
2M ago
Microsoft’s Copilot Studio is a powerful, easy-to-use, low-code platform that enables employees in an organization to create chatbots. Previously known as Power Virtual Agents, it has been updated (including GenAI features) and rebranded to Copilot Studio, likely to align with current AI trends. This post discusses security risks to be aware of when using Copilot Studio, focusing on data leaks, unauthorized access, and how external adversaries can find and interact with misconfigured Copilots ..read more
wunderwuzzi blog
2M ago
Google Colab AI, now just called Gemini in Colab, was vulnerable to data leakage via image rendering. This is an older bug report, dating back to November 29, 2023. However, recent events prompted me to write this up: Google did not reward this finding, and Colab now automatically puts Notebook content (untrusted data) into the prompt. Let’s explore the specifics. Google Colab AI - Revealing the System Prompt At the end of November last year, I noticed that there was a “Colab AI” feature, which integrated an LLM to chat with and write code ..read more
wunderwuzzi blog
2M ago
Recently, OpenAI announced gpt-4o-mini and there are some interesting updates, including safety improvements regarding “Instruction Hierarchy”: OpenAI puts this in the light of “safety”, the word security is not mentioned in the announcement. Additionally, this The Verge article titled “OpenAI’s latest model will block the ‘ignore all previous instructions’ loophole” created interesting discussions on X, including a first demo bypass. I spent some time this weekend to get a better intuition about gpt-4o-mini model and instruction hierarchy, and the conclusion is that system instructions are st ..read more
wunderwuzzi blog
3M ago
Imagine you visit a website with ChatGPT, and suddenly, it stops working entirely! In this post we show how an attacker can use prompt injection to cause a persistent denial of service that lasts across chat sessions for a user. Hacking Memories Previously we discussed how ChatGPT is vulnerable to automatic tool invocation of the memory tool. This can be used by an attacker during prompt injection to ingest malicious or fake memories into your ChatGPT ..read more
wunderwuzzi blog
4M ago
This post highlights how the GitHub Copilot Chat VS Code Extension was vulnerable to data exfiltration via prompt injection when analyzing untrusted source code. GitHub Copilot Chat GitHub Copilot Chat is a VS Code Extension that allows a user to chat with source code, refactor code, get info about terminal output, or general help about VS Code, and things along those lines. It does so by sending source code, along with the user’s questions to a large language model (LLM ..read more
wunderwuzzi blog
4M ago
In the previous post we demonstrated how instructions embedded in untrusted data can invoke ChatGPT’s memory tool. The examples we looked at included Uploaded Files, Connected Apps and also the Browsing tool. When it came to the browsing tool we observed that mitigations were put in place and older demo exploits did not work anymore. After chatting with other security researchers, I learned that they had observed the same ..read more