What is the blank page problem in eDiscovery? And how can you solve the problem in your case? Here are some solutions to consider.
Back in the early 2000s, I remember when attorneys, investigators, and paralegals would cram into a room filled with boxes of discovery to do reviews. It was a painstaking linear process. With the evolution of eDiscovery tools, we now have all that evidence in digital form and can review it from the comfort of our computers. So, you get your litigation technologist to process and organize all your data, then put it in a fancy review platform... then what? Do you still do a linear review? It’s not so easy with the massive amounts of data we receive today. You might find yourself asking: Where do I begin? What do I even have? That is what we call the blank page problem.
The blank page problem in eDiscovery refers to the challenge of starting from scratch when reviewing vast amounts of documents, identifying key insights, or drafting legal documents like case narratives, investigative reports, or briefs. Given the massive volume of data in eDiscovery, legal teams often run into three obstacles. With thousands or even millions of documents, it can be overwhelming to determine the most relevant starting point for review or analysis. Teams may waste time sifting through irrelevant material instead of focusing on crucial information. And deadlines often make it hard to conduct deep manual reviews before drafting motions, summaries, or reports.
Many review platforms have built-in tools to help solve this issue. In this article, I will discuss different categories of TAR (Technology-Assisted Review) tools, among other technologies, so use it as a framework for what you can leverage. Before I start, I want to ensure you understand that these technologies are not the final solution to your litigation questions. These tools provide a catalyst for starting your review, discovering patterns, and quicker collaboration.
Clustering
Clustering helps solve the blank page problem by visually organizing large document sets, making it easier to identify key themes and start writing with a clear direction. Instead of leaving you to sift manually through thousands of documents, clustering groups similar documents based on content. This lets you see patterns and themes at a glance, giving you a structured starting point for writing summaries, case narratives, or reports. The cluster visualization helps you pinpoint the most relevant document groups, reducing information overload. If you are drafting a legal brief or case summary, you can focus on the most significant clusters instead of reviewing everything at once. By grouping related documents, clustering speeds up review, allowing you to extract key evidence and legal arguments faster.
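To make the idea concrete, here is a minimal sketch of content-based grouping. It is not how any particular review platform implements clustering; it assumes a simple word-overlap (Jaccard) similarity, a hypothetical threshold of 0.2, and made-up documents and IDs.

```python
import re

# Small illustrative stopword list (an assumption, not a standard list)
STOPWORDS = {"the", "a", "an", "of", "to", "and", "for", "in", "on", "is", "at"}

def tokens(text):
    """Return the set of lowercase words in text, minus stopwords."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}

def jaccard(a, b):
    """Share of words two documents have in common (0 = none, 1 = identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def cluster(docs, threshold=0.2):
    """Greedy single-pass grouping: a document joins the first cluster whose
    seed document is similar enough; otherwise it starts a new cluster."""
    clusters = []  # list of (seed_tokens, [doc_ids])
    for doc_id, text in docs.items():
        toks = tokens(text)
        for seed, members in clusters:
            if jaccard(seed, toks) >= threshold:
                members.append(doc_id)
                break
        else:
            clusters.append((toks, [doc_id]))
    return [members for _, members in clusters]

# Hypothetical documents
docs = {
    "DOC-001": "Invoice for the quarterly shipment payment is overdue",
    "DOC-002": "Reminder: payment on the overdue invoice for the shipment",
    "DOC-003": "Lunch menu for the team offsite on Friday",
    "DOC-004": "Team offsite agenda and lunch order for Friday",
}
groups = cluster(docs)
print(groups)
```

With these four sample documents, the invoice messages land in one group and the offsite messages in another, which is the kind of "theme at a glance" view described above.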
Data Visualizer
Data Visualizers help solve the blank page problem by transforming raw data into meaningful visual insights. This makes it easier to identify key themes and filter down to relevant documents. Your data is displayed in interactive charts, graphs, and timelines, which can streamline eDiscovery, investigations, and case strategy development.
Example Use Cases of a Data Visualizer:
- Communication & Timeline Analysis – This shows who communicated with whom and when, helping identify key custodians, conversation patterns, and spikes in activity that may be relevant to the case.
- Email Domains Analysis – A pie chart breaking down emails by domain is helpful in spotting key communicators early.
- Custodian Analysis – A bar chart displaying document counts per custodian, aiding in relevance assessment.
- File Type Breakdown – A visualization of different document types (PDFs, Word, Excel, etc.) in a dataset. If you know the bulk of the evidence is from a particular document type, then you can dive into that specific dataset first.
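The custodian, domain, and file type breakdowns above all come down to counting documents by a metadata field. The sketch below shows that counting step with a plain-text bar chart standing in for a platform's interactive charts. The records, field names, and addresses are hypothetical.

```python
from collections import Counter

# Hypothetical metadata records exported from a review platform
records = [
    {"custodian": "Smith", "sender": "smith@acme.example", "file_type": "Email"},
    {"custodian": "Smith", "sender": "smith@acme.example", "file_type": "Excel"},
    {"custodian": "Smith", "sender": "vendor@supplier.example", "file_type": "Email"},
    {"custodian": "Jones", "sender": "jones@acme.example", "file_type": "PDF"},
    {"custodian": "Jones", "sender": "vendor@supplier.example", "file_type": "Email"},
]

def count_by(records, key):
    """Count records by whatever value the key function extracts."""
    return Counter(key(r) for r in records)

by_custodian = count_by(records, lambda r: r["custodian"])
by_domain = count_by(records, lambda r: r["sender"].rsplit("@", 1)[-1].lower())
by_file_type = count_by(records, lambda r: r["file_type"])

def bar_chart(counts):
    """Render counts as a text bar chart, largest category first."""
    return "\n".join(
        f"{label:<18}{'#' * n} {n}" for label, n in counts.most_common()
    )

print(bar_chart(by_custodian))
print(bar_chart(by_domain))
print(bar_chart(by_file_type))
```

If one file type or custodian dominates the counts, that is a reasonable place to start reviewing.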
Predictive Coding
Predictive coding in eDiscovery refers to using machine learning algorithms to help review and classify large sets of ESI (electronically stored information). It is a type of TAR that helps streamline document review by training a system to recognize relevant and non-relevant documents based on human input. Predictive coding may not be an instant solution to the blank page problem, but after a couple of days of tagging documents, it can quickly boost the visibility of pertinent data.
How Predictive Coding Works
- Training Phase – Legal experts manually review a small set of documents, tagging them as relevant or non-relevant. You can create multiple models using various criteria, enabling you to focus on documents relevant to different aspects or issues of your case.
- Machine Learning – The predictive coding software analyzes these tagged documents and learns patterns.
- Prediction Phase – The system applies what it learned to the rest of the document set, prioritizing or categorizing them.
- Validation – Experts review a sample of the machine’s decisions to ensure accuracy.
- Refinement – The system is fine-tuned as needed through additional training rounds.
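The phases above can be sketched in a few lines. This is an illustration only: it swaps in a small naive Bayes word model, which is one simple classifier among many and not necessarily what any review platform uses. All documents, IDs, and tags are made up.

```python
import math
import re
from collections import Counter

def words(text):
    return re.findall(r"[a-z]+", text.lower())

def train(tagged_docs):
    """Training phase: count words per tag (True = relevant).
    Assumes at least one example of each tag."""
    word_counts = {True: Counter(), False: Counter()}
    doc_counts = Counter()
    for text, is_relevant in tagged_docs:
        word_counts[is_relevant].update(words(text))
        doc_counts[is_relevant] += 1
    return word_counts, doc_counts

def relevance_score(model, text):
    """Prediction phase: log-odds that a document is relevant.
    Positive leans relevant, negative leans non-relevant."""
    word_counts, doc_counts = model
    vocab = set(word_counts[True]) | set(word_counts[False])
    totals = {tag: sum(word_counts[tag].values()) for tag in (True, False)}
    score = math.log(doc_counts[True] / doc_counts[False])
    for w in words(text):
        if w in vocab:  # words never seen in training carry no signal
            p_rel = (word_counts[True][w] + 1) / (totals[True] + len(vocab))
            p_non = (word_counts[False][w] + 1) / (totals[False] + len(vocab))
            score += math.log(p_rel / p_non)
    return score

# Hypothetical reviewer-tagged training set
tagged = [
    ("Please backdate the contract before the audit", True),
    ("Delete the audit emails about the contract", True),
    ("Lunch order for the team meeting", False),
    ("Parking passes for the team are ready", False),
]
model = train(tagged)

# Prediction: rank unreviewed documents, most likely relevant first
unreviewed = {
    "DOC-101": "Need the contract changed before the audit starts",
    "DOC-102": "Team lunch is moved to Friday",
}
ranked = sorted(unreviewed, key=lambda d: relevance_score(model, unreviewed[d]), reverse=True)

# Validation: compare predictions with expert tags on a held-out sample
validation = [("Backdate the audit contract", True), ("Team parking order", False)]
agreement = sum(
    (relevance_score(model, text) > 0) == tag for text, tag in validation
) / len(validation)
print(ranked, agreement)
```

Refinement would mean adding newly tagged documents to `tagged` and calling `train` again. In practice the training and validation sets are far larger than this toy sample.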
Search Term Reports
Search Term Reports (STRs) are detailed reports that provide insights into the search terms used within an eDiscovery process. These reports help legal teams analyze the effectiveness of search queries and assess the relevance of search results. By analyzing hit counts and trends across different search terms, STRs help reveal key themes in the data. If initial search terms return too many or too few results, STRs give you the numbers you need to refine the keyword list. STRs help you quickly identify the documents most likely to be relevant, allowing you to focus on important evidence instead of aimlessly reviewing large datasets.
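As a rough illustration of what an STR tabulates, the sketch below counts, for each term, how many documents hit on it and how many hit on that term alone. The column names, documents, and terms are hypothetical, and real platforms support richer syntax (proximity, wildcards, families) than this whole-word match.

```python
import re

def search_term_report(docs, terms):
    """For each term, report documents with hits and documents hit by
    that term only (unique hits). Matching is whole-word, case-insensitive."""
    hits = {
        term: {
            doc_id
            for doc_id, text in docs.items()
            if re.search(rf"\b{re.escape(term)}\b", text, re.IGNORECASE)
        }
        for term in terms
    }
    report = []
    for term, doc_ids in hits.items():
        # Documents matched by any other term
        others = set().union(*(ids for t, ids in hits.items() if t != term))
        report.append({
            "term": term,
            "docs_with_hits": len(doc_ids),
            "unique_hits": len(doc_ids - others),
        })
    return report

# Hypothetical documents and search terms
docs = {
    "DOC-1": "The contract was signed before the audit.",
    "DOC-2": "Audit results are attached.",
    "DOC-3": "Please revise the contract terms.",
    "DOC-4": "See you at lunch.",
}
report = search_term_report(docs, ["contract", "audit", "refund"])
for row in report:
    print(row)
```

A term with zero hits (here, "refund") may need a synonym or variant, while a term that hits most of the dataset is probably too broad to be a useful starting point.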
Narrative-builder (Shared Legal Pad)
Tools like LexisNexis’ CaseMap and Everlaw’s Storybuilder can efficiently organize and visualize case details. They help legal professionals track facts, issues, key players, and evidence to build a stronger case strategy. Think of it as your legal pad and Excel fused into one, with collaboration functions added.
Key features toward solving the blank page problem:
- Fact Management – Organizes case facts with timestamps, sources, and relevance. These tools will let you link the actual document that supports the fact. Getting the pieces of the puzzle together helps to accelerate finding the answers to your litigation questions.
- Issue Linking – Connects facts to legal issues for quicker argument development.
- Collaboration – Allows teams to work together on case analysis. In my opinion, this may be the most important feature. If your team is sharing the same “legal pad,” that blank page fills up faster. Seeing what your team is finding in real time can also improve your understanding of the data.
- Timelines & Reports – Generates detailed reports and visual timelines based on the facts discovered for case progression.
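The features above can be pictured as a shared list of fact records. The sketch below is not the data model of CaseMap or Storybuilder; the `Fact` fields, document IDs, and sample facts are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from datetime import date

@dataclass
class Fact:
    when: date
    description: str
    source_doc: str                              # document that supports the fact
    issues: list = field(default_factory=list)   # legal issues the fact bears on
    added_by: str = ""                           # team member who logged it

# A shared "legal pad": facts logged by different team members, in any order
facts = [
    Fact(date(2021, 3, 9), "Payment wired to vendor", "DOC-0412", ["Damages"], "paralegal"),
    Fact(date(2021, 1, 15), "Contract signed", "DOC-0007", ["Breach"], "attorney"),
    Fact(date(2021, 2, 2), "First missed delivery", "DOC-0153", ["Breach", "Damages"], "investigator"),
]

def timeline(facts):
    """Timelines: facts in chronological order."""
    return sorted(facts, key=lambda f: f.when)

def facts_by_issue(facts):
    """Issue linking: supporting documents per issue, in date order."""
    grouped = defaultdict(list)
    for f in timeline(facts):
        for issue in f.issues:
            grouped[issue].append(f.source_doc)
    return dict(grouped)

for f in timeline(facts):
    print(f.when.isoformat(), f.description, f"[{f.source_doc}]")
print(facts_by_issue(facts))
```

Keeping the source document ID on every fact is what lets a timeline or issue report link straight back to the evidence.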
Conclusion
The blank page problem in eDiscovery can be overwhelming, but with the right tools and strategies, legal professionals can efficiently navigate vast datasets and extract meaningful insights. Technology-assisted review methods such as clustering, data visualization, predictive coding, and search term reports provide structured starting points, helping to streamline document review and case analysis. Additionally, tools like narrative-builders enhance collaboration, ensuring that teams can work together effectively to construct compelling case narratives.
While these technologies are not a one-size-fits-all solution, they serve as powerful catalysts in the legal discovery process, allowing teams to focus on critical information faster and confidently make informed decisions. By leveraging these technologies, legal professionals can transform the challenge of a blank page into an opportunity for more efficient and effective case preparation.
Disclaimer: The views represented herein are exclusively the views of the author, and do not necessarily represent the views held by my employer, my partners or my clients. eDiscovery Today is made available solely for educational purposes to provide general information about general eDiscovery principles and not to provide specific legal advice applicable to any particular circumstance. eDiscovery Today should not be used as a substitute for competent legal advice from a lawyer you have retained and who has agreed to represent you.
Another approach that I developed back in 2013 is “contextual diversity”. It’s the notion of having an algorithm that explicitly surfaces what you don’t know that you don’t know. Moreover, it surfaces the biggest pockets of those unknown unknowns in continuously iterative smaller and smaller radii, i.e. from the big to the small, relative to what you’ve encountered in the past.
Contextual = everything you’ve seen up to that point
Diversity = the biggest pockets of unknown unknowns, relative to that context
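One loose way to picture this idea is sketched below. It is not the commenter's algorithm, whose details are not given here. It swaps in a generic farthest-first selection over word-overlap distance: at each step it surfaces the unreviewed document least like anything already seen. The documents and IDs are made up.

```python
import re

def tokens(text):
    return set(re.findall(r"[a-z]+", text.lower()))

def distance(a, b):
    """Jaccard distance between token sets (0 = identical, 1 = nothing shared)."""
    union = a | b
    return 1 - len(a & b) / len(union) if union else 0.0

def next_most_unfamiliar(seen, unseen):
    """Return (doc_id, radius) for the unseen document farthest from
    everything already seen. Assumes seen is not empty."""
    seen_tokens = [tokens(text) for text in seen.values()]

    def radius(doc_id):
        toks = tokens(unseen[doc_id])
        return min(distance(toks, s) for s in seen_tokens)

    best = max(unseen, key=radius)
    return best, radius(best)

def review_order(seen, unseen):
    """Repeatedly surface the most unfamiliar document, then treat it as seen."""
    seen, unseen = dict(seen), dict(unseen)
    order = []
    while unseen:
        doc_id, r = next_most_unfamiliar(seen, unseen)
        order.append((doc_id, round(r, 2)))
        seen[doc_id] = unseen.pop(doc_id)
    return order

# Hypothetical documents
seen = {"DOC-1": "quarterly sales forecast for the region"}
unseen = {
    "DOC-2": "updated sales forecast for the region",
    "DOC-3": "wire transfer to offshore account",
}
order = review_order(seen, unseen)
print(order)
```

In this toy run the wire transfer message surfaces first because it shares nothing with what has been reviewed, and the radius shrinks on the next pick, echoing the "from the big to the small" description.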