Google Researchers Revealed ChatGPT Training Data with One Word, OpenAI Says That’s a Violation

Google researchers revealed ChatGPT training data, showing how to do so with a single word. Now, OpenAI says that’s a violation of its Terms of Service.

In a paper published last week, the researchers said certain keywords forced the bot to divulge sections of the dataset it was trained on. In one example published in a blog post, the model gave out what appeared to be a real email address and phone number after being prompted to repeat the word “poem” forever, which they said was “kind of silly”.

Another leak of training data with personal information was also achieved when the model was asked to repeat the word “company” forever.

The researchers stated that they were able to “extract several megabytes of ChatGPT’s training data for about two hundred dollars”.

My favorite part of the blog post was where they suggested researchers should pause reading and switch over to the full paper, while the post “spends some time discussing the ChatGPT data extraction component of our attack at a bit of a higher level for a more general audience (that’s you!).” Um, OK.

Now, OpenAI has done something about it. According to Engadget, asking ChatGPT to repeat words “forever” is now a violation of ChatGPT’s terms of service, according to Engadget’s own testing, illustrated via the image above.

“This content may violate our content policy or terms of use”, ChatGPT responded to Engadget’s prompt to repeat the word “hello” forever. “If you believe this to be in error, please submit your feedback — your input will aid our research in this area.”

There’s no language in OpenAI’s content policy, however, that prohibits users from asking the service to repeat words forever, something that 404 Media notes. Under “Terms of Use”, OpenAI states that users may not “use any automated or programmatic method to extract data or output from the Services” — but simply prompting the ChatGPT to repeat word forever is not automation or programmatic.

It may not be part of the Terms of Use today, but I suspect it will be shortly. Google researchers revealed ChatGPT training data last week, OpenAI closed the loop this week. Game on! Even the “more general audience” can understand that! 😉

So, what do you think? Are you surprised that Google researchers revealed ChatGPT training data simply by asking it to repeat a single word? Please share any comments you might have or if you’d like to know more about a particular topic.

Disclaimer: The views represented herein are exclusively the views of the authors and speakers themselves, and do not necessarily represent the views held by my employer, my partners or my clients. eDiscovery Today is made available solely for educational purposes to provide general information about general eDiscovery principles and not to provide specific legal advice applicable to any particular circumstance. eDiscovery Today should not be used as a substitute for competent legal advice from a lawyer you have retained and who has agreed to represent you.

Discover more from eDiscovery Today by Doug Austin

Subscribe to get the latest posts sent to your email.

eDiscovery Today by Doug Austin

eDiscovery Today – Doug Austin

Google Researchers Revealed ChatGPT Training Data with One Word, OpenAI Says That’s a Violation: Artificial Intelligence Trends

Like this:

Related

Discover more from eDiscovery Today by Doug Austin

One comment

Leave a ReplyCancel reply

Google Researchers Revealed ChatGPT Training Data with One Word, OpenAI Says That’s a Violation: Artificial Intelligence Trends

Related Posts

Share this:

Like this:

Related

Discover more from eDiscovery Today by Doug Austin

One comment

Leave a ReplyCancel reply

Discover more from eDiscovery Today by Doug Austin