Proposed Additional Search Terms Denied by Court

In Tremblay v. OpenAI, Inc., No. 23-cv-03223-AMO (RMI) (N.D. Cal. Feb. 27, 2025), California Magistrate Judge Robert M. Illman ruled on several discovery disputes, including denying plaintiffs’ proposed additional search terms after having twice before denied their request for input on the search process.

Case Discussion and Judge’s Ruling

This case involves copyright disputes between authors and publishers and OpenAI. Judge Illman noted that the Court had denied Plaintiffs’ suggestion that the requesting Party have input in determining the search terms on July 31, 2024, and denied it a second time on January 13, 2025 ([f]inding that Plaintiffs had failed to show why the production was insufficient).

Judge Illman stated: “It appears that Plaintiffs now seek to litigate that issue for a third time, and by doing so, it appears that the objective of the court’s initial ruling (i.e., to prevent delays and the stalling of the discovery process stemming from endless discovery disputes over methodology and search-term formulation) has been nevertheless frustrated. While OpenAI submits that Plaintiffs have identified no deficiencies in its productions… – Plaintiffs essentially respond to the effect that they ‘by definition know little about documents OpenAI hasn’t produced.’…Plaintiffs note that their review of ‘OpenAI’s productions and search terms, identified areas where the terms and connectors appeared too narrow or didn’t capture lingo used by OpenAI’s own employees, and proposed additional search terms.’…Plaintiffs add that their proposed eight search ‘strings’ are not burdensome because they would yield only 345,000 documents, which ‘[w]ith technology assistance, e-discovery vendor DISCO estimates a 345,000-document review takes 2-3 weeks, [from] upload through production.’”

Continuing, he added: “OpenAI responds by noting that Plaintiffs’ request now not only seeks the addition of ‘numerous torrent-related search terms, but also…hundreds of more terms packed into eight search strings, including nonsensically overbroad terms like (ChatGPT AND ‘be doing’), (memori* AND data), and (seed*) . . . [which] [b]ased on review time to date, Plaintiffs’ terms would add over 9,000 hours of attorney review time with no showing that OpenAI’s productions are deficient.’…As to proportionality, OpenAI submits that ‘Plaintiffs purport to propose only 8 search ‘strings,’ but those search strings consist of 362 search terms, hitting on an additional 345,000 documents beyond the over 640,000 documents that OpenAI has already agreed to review.’…OpenAI submits that requiring it to introduce hundreds of thousands of non-responsive documents into its review queue will only prolong fact discovery while failing to add to the substance of Plaintiffs’ case…The court agrees with OpenAI’s assessment here. Without repeating the details of the Parties’ disputes as to the numerous search terms included in Plaintiffs’ proposed search strings – of which, string numbers 1 and 4 were set forth by OpenAI as exemplars… – the court cannot avoid the conclusion that Plaintiffs are trying to relitigate an issue they have already lost (i.e., that they should be involved in the search term formulation process without making a clear showing of prejudice stemming from gaps or deficiencies in the search terms disclosed by the producing Party).” So, he denied plaintiffs’ proposed additional search terms.
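To put the proportionality numbers quoted above in perspective, here is a rough back-of-the-envelope sketch in Python. It uses only the figures cited in the ruling; the variable and function names are my own, and because the order says "over" 9,000 hours and "over" 640,000 documents, the results are approximations rather than exact values.

```python
# Figures quoted in the ruling. "Over" in the source means the hours and
# agreed-document counts are lower bounds, so treat results as rough.
ADDED_DOCS = 345_000      # additional documents hit by Plaintiffs' proposed strings
AGREED_DOCS = 640_000     # documents OpenAI had already agreed to review
ADDED_HOURS = 9_000       # OpenAI's estimate of added attorney review time
SEARCH_TERMS = 362        # terms packed into the proposed strings
SEARCH_STRINGS = 8        # number of proposed strings


def implied_docs_per_hour(docs: int, hours: int) -> float:
    """Review pace implied by OpenAI's hours estimate."""
    return docs / hours


def percent_increase(added: int, base: int) -> float:
    """Added review volume as a percentage of the already-agreed volume."""
    return 100 * added / base


if __name__ == "__main__":
    pace = implied_docs_per_hour(ADDED_DOCS, ADDED_HOURS)
    growth = percent_increase(ADDED_DOCS, AGREED_DOCS)
    total = ADDED_DOCS + AGREED_DOCS
    terms_per_string = SEARCH_TERMS / SEARCH_STRINGS

    print(f"Implied review pace: about {pace:.1f} documents per attorney hour")
    print(f"Increase in review volume: about {growth:.1f}%")
    print(f"Total documents to review: at least {total:,}")
    print(f"Average terms per string: about {terms_per_string:.2f}")
```

On these figures, OpenAI's estimate implies roughly 38 documents per attorney hour, and the proposed strings would have expanded the review population by a little more than half. That is a very different picture from the 2-3 week, technology-assisted timeline Plaintiffs attributed to their vendor, which helps explain why the two sides were so far apart on burden.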

OpenAI also sought discovery from individuals and entities working on behalf of the Plaintiffs (literary agents, ghostwriters, loan-out companies, assistants, production companies) and from Plaintiffs’ publishers.

Judge Illman ordered the parties to meet and confer regarding agents, saying: “The court finds that this dispute is representative of the Parties talking past one another – that it is the manifestation of poor meet-and-confer efforts – and, that the ‘dispute’ evades judicial resolution as it has been presented.” Regarding publishers, he denied the request as to documents beyond the scope of Plaintiffs’ rights under their publishing agreements but granted it as to any relevant material within the scope of those agreements. He suggested that OpenAI use Rule 45 subpoenas to obtain documents from publishers that Plaintiffs cannot obtain themselves.

Plaintiffs also sought the inclusion of file names in OpenAI’s privilege logs. Judge Illman denied the request, saying: “The court finds this sort of back-and-forth to constitute yet another manifestation of the Parties’ drumming up a discovery fight about a subject that should have been worked out by the Parties themselves without the need for court intervention. The dispute itself is insubstantial – the essence of which boils down to little more than petty finger-pointing about whose time was wasted by whom.”

OpenAI also sought discovery from Plaintiffs’ separate, purportedly similar, copyright infringement case against Meta Platforms (the Kadrey case). Judge Illman denied that request, stating: “Right off the bat, the undersigned struggles to comprehend the usefulness of OpenAI’s interest in seeking information from the Kadrey case as to whether or not Plaintiffs actually own the copyrighted works they have asserted in both cases. Unless there is a specific issue as to ownership (and none has been articulated in this letter brief by OpenAI) the undersigned sees no material advancement of OpenAI’s defense in this case by putting Plaintiffs to the burden of producing material from the Kadrey case that simply confirms what OpenAI already knows (or will know) to be true in this case – that is, that Plaintiffs are the owners of the works at issue in this case.”

So, what do you think? Should the Court have denied plaintiffs’ proposed additional search terms? Please share any comments you might have, or let me know if you’d like to know more about a particular topic.

Case opinion link courtesy of Minerva26, an Affinity partner of eDiscovery Today.

Disclaimer: The views represented herein are exclusively the views of the author, and do not necessarily represent the views held by my employer, my partners or my clients. eDiscovery Today is made available solely for educational purposes to provide general information about general eDiscovery principles and not to provide specific legal advice applicable to any particular circumstance. eDiscovery Today should not be used as a substitute for competent legal advice from a lawyer you have retained and who has agreed to represent you.


