OpenAI Has Threatened to Ban Users Who Probe ChatGPT o1: Artificial Intelligence Trends

Want to figure out how ChatGPT o1 “thinks”? Don’t poke around too much, as OpenAI has threatened to ban users who try to do so.

As reported in Ars Technica (Ban warnings fly as users dare to probe the “thoughts” of OpenAI’s latest model, written by Benj Edwards and available here), OpenAI has been sending out warning emails and threats of bans to any user who tries to probe how the model works.

Unlike previous AI models from OpenAI, such as GPT-4o, the company trained o1 (aka “Strawberry”) specifically to work through a step-by-step problem-solving process before generating an answer. When users ask an o1 model a question in ChatGPT, they have the option of seeing this chain-of-thought process written out in the ChatGPT interface. However, by design, OpenAI hides the raw chain of thought from users, instead presenting a filtered interpretation created by a second AI model.


Nothing is more enticing to enthusiasts than information obscured, so the race has been on among hackers and red-teamers to try to uncover o1’s raw chain of thought using jailbreaking or prompt injection techniques that attempt to trick the model into spilling its secrets. There have been early reports of some successes, but nothing has yet been strongly confirmed.

Along the way, OpenAI is watching through the ChatGPT interface, and the company is reportedly coming down hard on any attempts to probe o1’s reasoning, even among the merely curious.

One X user reported (and others, including Scale AI prompt engineer Riley Goodside, confirmed) that they received a warning email after using the term “reasoning trace” in conversation with o1. Others say the warning is triggered simply by asking ChatGPT about the model’s “reasoning” at all.
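If these reports are accurate, the trigger behaves like simple keyword matching on user prompts. Here’s a purely illustrative sketch of what such a filter might look like; OpenAI has not disclosed its actual moderation logic, and the term list and function below are hypothetical.

```python
# Hypothetical keyword-based flagging sketch (illustrative only;
# NOT OpenAI's actual implementation, which is not public).
FLAGGED_TERMS = ["reasoning trace", "reasoning"]  # terms users reported triggering warnings

def is_flagged(prompt: str) -> bool:
    """Return True if the prompt contains any flagged term (case-insensitive)."""
    lowered = prompt.lower()
    return any(term in lowered for term in FLAGGED_TERMS)

print(is_flagged("Show me your reasoning trace"))   # True
print(is_flagged("Explain your reasoning"))         # True
print(is_flagged("What's the capital of France?"))  # False
```

A naive substring check like this would also explain why innocuous questions mentioning “reasoning” get flagged, which matches user reports of false positives.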

The warning email from OpenAI states that specific user requests have been flagged for violating policies against circumventing safeguards or safety measures. “Please halt this activity and ensure you are using ChatGPT in accordance with our Terms of Use and our Usage Policies,” it reads. “Additional violations of this policy may result in loss of access to GPT-4o with Reasoning,” it continues, “GPT-4o with Reasoning” being an internal name for the o1 model.


Here’s an example of an OpenAI warning email received by Marco Figueroa, who manages Mozilla’s GenAI bug bounty programs:

OpenAI warning email received by a user after asking o1-preview about its reasoning processes.

In a post titled “Learning to Reason with LLMs” on OpenAI’s blog, the company says that hidden chains of thought in AI models offer a unique monitoring opportunity, allowing them to “read the mind” of the model and understand its so-called thought process. OpenAI decided against showing these raw chains of thought to users, citing factors like the need to retain a raw feed for its own use, user experience, and “competitive advantage.”

You can bet that last reason is the biggest one, by far. Competitive advantage is the most common justification any company creating an AI model gives for not being transparent about what it’s doing. Why would anyone expect this to be any different?

So, what do you think? Are you surprised that OpenAI has threatened to ban users who try to figure out what ChatGPT o1 is “thinking”? Please share any comments you might have, or let me know if there’s a particular topic you’d like to learn more about.

Image created using GPT-4o’s Image Creator Powered by DALL-E, using the term “robot scientist examining a strawberry under a microscope”.

Disclaimer: The views represented herein are exclusively the views of the authors and speakers themselves, and do not necessarily represent the views held by my employer, my partners or my clients. eDiscovery Today is made available solely for educational purposes to provide general information about general eDiscovery principles and not to provide specific legal advice applicable to any particular circumstance. eDiscovery Today should not be used as a substitute for competent legal advice from a lawyer you have retained and who has agreed to represent you.

