Researchers Extract Megabytes of Data from AI Chatbot Using Surprising Technique

Researchers Extract Megabytes of Data from AI Chatbot Using Surprising Technique

A team of researchers has managed to dig into the inner workings of OpenAI’s ChatGPT, a widely used AI chatbot, by exploiting its underlying language model. Surprisingly, they’ve extracted quite a chunk of training data – several megabytes, to be precise. The researchers expressed astonishment that their approach actually worked, emphasizing that this vulnerability should have been spotted and addressed sooner.

Their method? It’s surprisingly simple. They asked ChatGPT to repeat a single word endlessly, using ‘poem’ as an example. Strangely enough, after several repetitions, the chatbot started spewing out lengthy text passages sourced verbatim from various internet locations. What’s more concerning is that these passages contained sensitive information like email addresses, names, birthdays, addresses, and phone numbers.

But here’s the kicker: this attack isn’t limited to just ChatGPT. According to the researchers, similar exploits could work on other language models like Pythia, GPT-Neo, Llama, Falcon, and more, allowing attackers to pull gigabytes of data from them.

Their newly developed method tricks these language models into churning out training data at a rate 150 times higher than normal. And it’s not just a small amount of data either – they’ve shown that way more info can be retrieved than previously thought possible. Even the techniques used to prevent these models from memorizing information seem to have fallen short.

Reportedly, the extracted training data included bits and pieces from terms of service, code from Stack Overflow, Wikipedia pages, news blogs, company websites, and random internet comments.

Now, the researchers claim they shelled out about 200 dollars to access these megabytes of training data. They suggest that those willing to spend more could likely get their hands on gigabytes of data from these language models.

However, it seems unlikely that this method would still work today. OpenAI was informed about this vulnerability back in August 2023 and has likely taken steps to address it since then. The folks at Golem tried to replicate the attack themselves but couldn’t succeed.

Previous articleActivision Blizzard Mulls Launching Rival App Store, Strikes Lucrative Deal with Google
Next articleMicrosoft Ends Support for Microsoft 365 Extension on Chrome and Edge
Carl Woodrow
A seasoned tech enthusiast and writer, Carl delves deep into emerging technologies, offering insightful analysis and reviews on the latest gadgets and trends.