"One Ring to Rule Them All?" E-Discovery Search Methodology in Patent Litigation in Light of Recent Model Orders and Case Law

Two Model Rules from the E-discovery-Kings under the sky:
Five or eight custodians for Tech-Lords in their courts of stone;
The vast production of metadata, perhaps doomed to die;
Five or ten search terms for the Dark Lord's e-mail on his dark throne
In the Land of Litigants where the patent Trolls lie.
But is there One Ring to rule them all? One Ring to find them?
One Ring to search them all and then produce and bind them,

In the Land of Litigants where patent cases lie?


"It's a dangerous business, Frodo, going out of your door . . .You step into the Road, and if you don't keep your feet, there is no knowing where you might be swept off to.”
              -- J.R.R. Tolkien, Lord of the Rings: The Fellowship of the Ring

Somewhere along the road of litigation and technology, e-discovery's All-Seeing Eye grew bigger than its stomach. Overall, only 0.0074% of documents requested and produced in litigation (fewer than 1 in 10,000) wind up on trial exhibit lists. Still fewer are actually used. For e-mail, hotly demanded in the hope of finding a smoking gun in informal and hastily sent communications, the proportion is even lower. This trend is especially concerning in intellectual property litigation -- patent cases in particular.

To combat this trend, two sets of courts -- let's call them the Fellowship of the E-Discovery Kings -- recently set out on journeys to narrow the range of the All-Seeing Eye in patent litigation, issuing similar and helpful Model Orders for e-discovery to curtail mass and unnecessary production. But whether there is really One Ring to Rule Them All when it comes to search methodologies -- one workable solution -- may not be as clear as the E-Discovery Kings propose.


"Advice is a dangerous gift, even from the wise to the wise, and all courses may run ill."
                  -- J.R.R. Tolkien, Lord of the Rings: The Fellowship of the Ring

First, in November 2011, the Advisory Council of the Federal Circuit promulgated a model rule for patent cases “to streamline e-discovery, particularly email production.” This Model Order's provisions (a) exclude e-mail from general production requests for ESI, requiring parties to serve requests seeking email production on specific issues; (b) limit to five both the number of custodians whose email must be searched and the number of terms that can be used in Boolean searches of a party’s electronic correspondence; (c) preclude production of all but limited metadata absent good cause; and (d) require that a party serving broader discovery bear all reasonable costs. Courts have already adopted at least parts of the Model Order, applying it not just to "patent troll" cases but to competitor-based patent litigation as well. See, e.g., DCG Sys., Inc. v. Checkpoint Techs., LLC, 2011 WL 5244356 (N.D. Cal. Nov. 2, 2011); Effectively Illuminated Pathways v. Aston Martin, No. 6:11-cv-00034 (E.D. Tex., Oct. 20, 2011).

Second, in February 2012 the Eastern District of Texas, one of the nation’s most popular patent litigation venues for plaintiffs (and the dwelling of trolls in particular), followed up with its own Model Order for e-discovery in patent cases. The Texas version differs from the Federal Circuit version in that it (a) permits requesting e-mail from eight custodians; (b) doubles the permissible search terms to ten; and (c) does not contain the Federal Circuit's flexibility for additional discovery or its cost-shifting provisions for adding more custodians or search terms, but does allow parties to move to expand discovery on a showing of less than "good cause." Other provisions include requiring the parties to exchange the identities of the fifteen most significant email custodians, allowing for targeted early discovery, and providing guidance on the format of ESI production -- for example, providing for production as TIFF images, governing when documents must be produced in searchable format, delineating the sources of data that must be preserved, and excusing parties from restoring back-ups and from preserving and collecting data from voicemails, PDAs and mobile phones, absent good cause.
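For readers who think in checklists, the default e-mail limits described above can be sketched as a small configuration check. This is only an illustration: the class and function names are hypothetical, the numbers are the custodian and search-term limits as summarized in this article, and neither Model Order prescribes any such tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EmailLimits:
    """Default e-mail discovery limits as summarized in the article (hypothetical class)."""
    max_custodians: int
    max_search_terms: int


# Figures taken from the article's summary of the two Model Orders.
FEDERAL_CIRCUIT = EmailLimits(max_custodians=5, max_search_terms=5)
EASTERN_DISTRICT_TEXAS = EmailLimits(max_custodians=8, max_search_terms=10)


def check_request(custodians, search_terms, limits):
    """Return a list of problems; an empty list means the request is within the limits."""
    problems = []
    if len(custodians) > limits.max_custodians:
        problems.append(
            f"{len(custodians)} custodians exceeds the limit of {limits.max_custodians}"
        )
    if len(search_terms) > limits.max_search_terms:
        problems.append(
            f"{len(search_terms)} search terms exceeds the limit of {limits.max_search_terms}"
        )
    return problems


# Hypothetical request: seven custodians and six search terms.
custodians = ["cust1", "cust2", "cust3", "cust4", "cust5", "cust6", "cust7"]
terms = ["term1", "term2", "term3", "term4", "term5", "term6"]

print(check_request(custodians, terms, FEDERAL_CIRCUIT))
print(check_request(custodians, terms, EASTERN_DISTRICT_TEXAS))
```

The same hypothetical request exceeds both Federal Circuit defaults but fits within the Texas defaults, which is the practical difference between the two Orders on this point.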


"Short cuts make long delays."
       -- J.R.R. Tolkien, Lord of the Rings: The Fellowship of the Ring

Together, the two Model Orders promise to be valuable resources, the brainchildren of some of the judiciary's most experienced patent jurists. However, a major presumption underlying both Model Orders is that keyword search terms are the One Ring to Rule Them All: the optimal and only search methodology. Nor do the Orders offer guidance on how the parties should forge those terms to make sure they return the bulk of relevant and responsive material. In truth, keyword searching may be well on the way to becoming the hard-copy paper document review of today's electronically favored process -- outmoded and outdated. As such, propounding keyword searching as the gold standard threatens to elevate process over substance, expediency over efficacy.

For example, the results returned from a simple keyword search can be over-inclusive. The terms may be so broad that they return a googol of "hits" to wade through, many of which have nothing to do with the core issues. The collection of terms can likewise be under-inclusive, returning very few hits due to a party's internal use of code words (e.g., "Operation Rivendell" for references to a particular patent or invention). Or perhaps the propounding party simply failed to guess, or the searching party failed to volunteer, the "precious" catchphrases that would uncover what a party conceals in its back e-pocket. Ralph Losey, in his popular e-Discovery Team blog, has even likened keyword searching to a child's game of Go Fish, in which both players try to guess the other side's cards while attempting to conceal their own. Finally, even obvious terms can be overlooked in simple keyword searches: in Wingnut Films v. Katja Motion Pictures Corp., No. 05-1516-RSWL, 2007 U.S. Dist. LEXIS 72953 (C.D. Cal. 2007), for example, a litigation surrounding the Lord of the Rings movies, the target of a discovery request was admonished for having failed to search its servers for the simple phrase "Lord of the Rings." And this was without a single-digit limit on search terms.
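The over- and under-inclusiveness problem is easy to see on a toy example. The sketch below is a minimal illustration, assuming a made-up four-document collection and a hypothetical helper named keyword_hits; it is not how any particular review platform implements search.

```python
import re

# Hypothetical mini-collection, invented purely for illustration.
DOCUMENTS = {
    "doc1": "Status update on Operation Rivendell: the prototype ships next week.",
    "doc2": "Please renew the patent leather shoe order for the holiday party.",
    "doc3": "Counsel says the '123 patent claims cover our prototype design.",
    "doc4": "Lunch menu for Friday is attached.",
}


def keyword_hits(documents, terms):
    """Return sorted ids of documents containing any term as a whole word or phrase."""
    patterns = [
        re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE) for term in terms
    ]
    return sorted(
        doc_id
        for doc_id, text in documents.items()
        if any(pattern.search(text) for pattern in patterns)
    )


# Searching only the obvious term sweeps in the shoe order (over-inclusive)
# and misses the code-named project update (under-inclusive).
obvious_only = keyword_hits(DOCUMENTS, ["patent"])

# Adding the internal code word recovers the missed document, but only if
# someone knows, or volunteers, that the code word exists.
with_code_word = keyword_hits(DOCUMENTS, ["patent", "Operation Rivendell"])

print(obvious_only)
print(with_code_word)
```

Note that the second search still returns the irrelevant shoe order: adding terms can cure under-inclusion without doing anything about over-inclusion.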


"Far below the deepest delving of the dwarves, the world is gnawed by nameless things." 
                  -- J.R.R. Tolkien, Lord of the Rings: The Two Towers

Another important shortcoming of keyword searching is that it is increasingly not the names and words (the sole focus of keyword searching) that dictate importance. The exact terminology used in an e-mail is becoming less meaningful, with the context -- the individual(s) who sent and received the message, the timing, and its location on the system -- mattering more and more. Keyword searches alone will fail as an All-Seeing Eye: they will miss many of these features.

What, then, is the One Ring to Rule Them All that should have been utilized? And is there even one? In parallel with our Fellowship's journey and the forging of their Model Orders in the hot fires of patent litigation, another power was on the rise: predictive coding. While such computer-assisted review tools (which allow for automation of a major proportion of document review, with less need for human management) have been around for several years, it was June 2011 when one of the larger e-discovery vendors was issued a patent on the process, thrusting it into even greater prominence. Unlike keyword search terms, predictive coding teaches computers to "predict" the relevant documents based not only on key terminology, but features like dates, names, broader phrases, and other items of context.  Moreover, it is estimated that by automating a significant amount of e-discovery review, predictive coding can save up to 70% of review costs.

While courts and attorneys have been slow to adopt this new technology, in the same month the Fellowship of the Eastern District of Texas forged its Model Order, Magistrate Judge Andrew Peck of the U.S. District Court for the Southern District of New York approved the use of predictive coding in Monique Da Silva Moore, et al. v. Publicis Groupe SA, et al., 2012 U.S. Dist. LEXIS 23350 (S.D.N.Y. Feb. 24, 2012), aff'd, 2012 U.S. Dist. LEXIS 58742 (S.D.N.Y. Apr. 26, 2012). While neither party in Da Silva Moore actually objected to the use of the technology -- the decision addressed implementation rather than use -- in the same week as the April affirmance a state court judge approved the use of predictive coding over one party's objection. Global Aerospace v. Landow Aviation, No. CL 61040 (Va. Cir. Ct. Apr. 23, 2012). Between Da Silva Moore and Global Aerospace, it is thus becoming clear that the judiciary will not hesitate to incorporate predictive coding into e-discovery where appropriate. In fact, Judge Peck noted that computer-assisted review "should be seriously considered for use in large-data-volume cases," and Judge Andrew Carter, who affirmed Da Silva Moore, acknowledged that manual review, upon which keyword searching relies, "is prone to human error and marred with inconsistencies from the various attorneys' determination of whether a document is responsive."


All that is gold does not glitter,
Not all those who wander are lost;
The old that is strong does not wither,
Deep roots are not reached by the frost. 

         --  J.R.R. Tolkien, Lord of the Rings: The Fellowship of the Ring

But despite the limits to keyword searches and the advantages of predictive coding, the latter may still not be an All-Seeing Eye, or the One Ring to Rule Them All. There are times in which the Old Ways of keyword searching may still be the best candidate. In patent troll cases, for example, where discovery tends to be disproportionately heavy on the accused infringer's side and lighter for the troll (who usually has no product or business other than patent licensing and enforcement to speak of), focused keyword searching may be sufficient. The same is true for other smaller patent and IP litigation disputes. Even then, however, the limitations of keyword searching make it prudent to use it only in conjunction with other search tools, and not as a stand-alone methodology. (Case in point: the arsenal of tools in predictive coding includes keyword searching). But for non-patent troll cases, and for more complex, competitor-based patent and other IP disputes (for example, trade secret misappropriation), predictive coding presents a new One Ring to Rule Them All, ensuring the capture of the greatest amount of relevant and responsive materials while still conserving costs.  At the very least, it presents a viable alternative to the Model Orders' presumption of keyword searching.

Unfortunately, at the time the Fellowship's Model Orders were written a year ago, predictive coding, like Strider, the Ranger of the North, had still not quite revealed itself as an Aragorn, a potential heir to the throne. If the Orders had come out today, perhaps they would have accounted for alternate methodologies. Today's Seventh Circuit E-Discovery Pilot Program, for example, is seriously considering the merits and pitfalls of various search methodologies, including predictive coding. Unfortunately, for now there remains a disconnect between the Model Orders and the growing acknowledgment of predictive coding among attorneys, their clients and the courts. Moreover, the best practices remain to be written, both for utilizing predictive coding and for choosing the best keyword searches under the Model Orders and otherwise. Thus, even if we could choose and carry One Ring to Rule Them All through the rocky terrain of patent litigation, the path on which to carry it remains unclear, much as it was for Frodo Baggins:

At last with an effort he spoke, and wondered to hear his own words, as if some other will was using his small voice. "I will take the Ring," he said, "though I do not know the way."

Four Lessons Counsel Can Learn from Da Silva Moore and Predictive Coding

There’s good news in the world of electronic discovery. This February in New York, Magistrate Judge Andrew Peck and counsel for the parties in Da Silva Moore v. Publicis Groupe gave us a magnificent e-discovery lesson and pushed open the door for the utilization of advanced search technologies -- namely predictive coding, an increasingly used methodology of computer-assisted review.

The Case

The plaintiff filed a Title VII class action gender discrimination claim against defendant Publicis Groupe, alleging she and other female employees at Publicis Groupe endured discriminatory terminations, demotions and job reassignments. The plaintiff (who had very little, if any, electronically stored information (ESI) of her own to produce) demanded that Publicis Groupe produce documents (including ESI) that related to whether Publicis Groupe:

  1. Compensated female employees less than comparably situated males through salaries, bonuses or perks.
  2. Precluded or delayed selection and promotion of females into higher-level jobs held by male employees.
  3. Disproportionately terminated or reassigned female employees when the company was reorganized in 2008.

Based on the records requested and the number of custodians, the parties anticipated the document pool would be around three million documents, which would likely have cost in excess of $1 million to review with traditional keyword search methods. Instead of going this route, the parties agreed to something bold: review the documents using what has come to be called predictive coding, a methodology of computer-assisted review. By using these methods, the parties hoped to reduce the number of manually reviewed documents from three million to 20,000.

The implementation of predictive coding is not simple. Fortunately, Da Silva Moore v. Publicis Groupe provides a lengthy guide on important topics such as methods to identify the initial seed set, iterative training rounds to refine the “predictive coding to assure reasonable recall” and methods of sampling to validate levels of confidence and confidence intervals.
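To give a feel for what "sampling to validate levels of confidence and confidence intervals" involves, the sketch below draws a simple random sample and computes a textbook normal-approximation interval for the share of responsive documents. It is an illustration under stated assumptions, not the protocol used in the case: the function names, the sample size of 1,000, and the count of 50 responsive documents are all hypothetical.

```python
import math
import random


def draw_sample(doc_ids, size, seed=0):
    """Draw a reproducible simple random sample of document ids."""
    rng = random.Random(seed)
    return rng.sample(doc_ids, size)


def estimate_proportion(sample_labels, z=1.96):
    """Estimate the share of True labels with a normal-approximation interval.

    z=1.96 corresponds to roughly 95% confidence. Returns (estimate, low, high).
    """
    n = len(sample_labels)
    p = sum(sample_labels) / n
    margin = z * math.sqrt(p * (1 - p) / n)
    return p, max(0.0, p - margin), min(1.0, p + margin)


# A pool of about three million documents, as the parties anticipated.
pool = range(3_000_000)
sample_ids = draw_sample(pool, 1000)

# Hypothetical coding results: reviewers find 50 of the 1,000 sampled
# documents responsive (invented numbers, not data from the case).
labels = [True] * 50 + [False] * 950

estimate, low, high = estimate_proportion(labels)
print(f"Estimated responsive share: {estimate:.1%} (about {low:.1%} to {high:.1%})")
```

The practical point is that a sample of modest size already pins the estimate down to within a point or two, which is why parties can negotiate over sample sizes and confidence levels rather than reviewing everything.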

What the Case Means for Your Business

Although discovery in Publicis Groupe is far from over, and the parties have each filed motions challenging portions of Judge Peck’s ruling, there are already lessons to be learned for how to effectively deploy computer-assisted review to reduce the cost of electronic discovery in your cases:

1. Have an expert who is knowledgeable about the review tool you intend to use.  Judge Peck turned to the parties’ technical experts to explain the effect of the review protocol on the validity of the ultimate production. Surely judges less familiar with the technology could benefit from hearing from an expert in the field. Since experts tend to disagree (as they did in Publicis Groupe), it is essential to provide expert testimony about the operation and testing of the search tool chosen for the case.

2. Be willing to accept that you will not receive every potentially relevant document. Judge Peck put it best when he reminded counsel, “By the time you go to trial, even with six plaintiffs, if you have more than 100 trial exhibits it will be a miracle.” He also explained that, “The idea is not to make this perfect, it’s not going to be perfect. The idea is to make it significantly better than the alternative (human review) without nearly as much cost.”

Consequently, you have to be willing to risk that a computer will miss more documents than the recent law grad you would normally pay to sift through each page. Keep in mind that human review and keyword search strings are far from perfect. Predictive coding, when properly applied, will likely enhance both recall and precision.

3. Cooperate with the opposition. The utilization of this technology requires engaged cooperation between the parties. Counsel must review and share the initial seed set with the opposition, and agree on statistical sampling techniques. Notwithstanding subsequent disputes, Da Silva Moore v. Publicis Groupe illustrates competent counsel working closely on e-discovery to meet the interests of both the plaintiffs and the defendant. Keep in mind that both sides agreed to utilize this advanced technology in this case. Of course, the devil is in the details, where reasonable litigants can disagree.

4. Understand the technology.  Even with technology experts at the ready, counsel were still necessary to advocate for their client’s interest in balancing the cost of discovery against the completeness of the final set of documents produced. Predictive coding is not right for all cases. It is not inexpensive—counsel must expend considerable up-front fees identifying the seed set and fine tuning the technology.
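The second lesson speaks of recall and precision. For counsel who have not met these terms, recall is the share of truly responsive documents that a review actually finds, and precision is the share of the documents it returns that are truly responsive. The sketch below computes both for two hypothetical reviews; the document ids and results are invented for illustration and say nothing about how any real method performs.

```python
def precision_recall(produced, truly_responsive):
    """Return (precision, recall) for a set of produced document ids."""
    produced = set(produced)
    truly_responsive = set(truly_responsive)
    true_positives = len(produced & truly_responsive)
    precision = true_positives / len(produced) if produced else 0.0
    recall = true_positives / len(truly_responsive) if truly_responsive else 0.0
    return precision, recall


# Hypothetical ground truth: documents 1 through 4 are responsive.
responsive = {1, 2, 3, 4}

# Hypothetical Review A returns six documents, two of them responsive.
review_a = {1, 2, 7, 8, 9, 10}

# Hypothetical Review B returns four documents, three of them responsive.
review_b = {1, 2, 3, 7}

for name, produced in (("Review A", review_a), ("Review B", review_b)):
    precision, recall = precision_recall(produced, responsive)
    print(f"{name}: precision {precision:.0%}, recall {recall:.0%}")
```

Neither hypothetical review finds every responsive document, which is the point of the second lesson: the question is whether a method does better than the alternative, not whether it is perfect.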

Touted as a practical, cost-saving and revolutionary solution, computer-assisted review is finally getting its chance to show what it’s worth. The private bar is watching anxiously to see whether it lives up to its billing.


This article was originally published in Inside Counsel.

E-Discovery: Cutting Costs with Predictive Coding

The cost of e-discovery is forcing good companies to settle bad cases—but not for long. If your litigation budget had ears, “predictive coding” would be music to them.

How it works

Predictive coding is a “technology-assisted classifying process” in which “a human reviewer codes documents the computer identifies (as responsive)—a tiny fraction of the entire collection. Then, using the results of the human review, the computer codes the remaining documents in the collection for responsiveness.” There are four phases to the predictive-coding process:

  • Phase 1: A senior lawyer chooses the responsive electronic documents based on his or her review of a sample of the electronic documents
  • Phase 2: Phase 1 is repeated with senior lawyers until the computer is sufficiently “trained” to apply their conclusions across a wide set of documents (or the whole document set)
  • Phase 3: The predictive coding software is deployed against the entire document set and will distinguish between relevant and non-relevant documents, or prioritize the documents on a scale of one to 100 (depending on the software you select)
  • Phase 4: The documents that are machine-coded as responsive are subjected to a final human quality review and produced to the opponent
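The four phases above can be sketched in a few lines of code. What follows is a deliberately tiny toy, assuming a simple word-weighting scheme chosen for readability; commercial predictive coding software uses far more sophisticated models, and every name, document, and threshold here is hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase the text and strip surrounding punctuation from each word."""
    words = (w.strip(".,:;!?'\"()").lower() for w in text.split())
    return [w for w in words if w]


def train(coded_docs):
    """Learn a weight per word from (text, is_responsive) pairs coded by lawyers."""
    responsive, non_responsive = Counter(), Counter()
    for text, is_responsive in coded_docs:
        target = responsive if is_responsive else non_responsive
        target.update(set(tokenize(text)))
    vocabulary = set(responsive) | set(non_responsive)
    # Positive weight: the word appeared more often in responsive documents.
    return {
        w: math.log((responsive[w] + 1) / (non_responsive[w] + 1)) for w in vocabulary
    }


def score(weights, text):
    """Score a document from 0 to 100; 50 means no evidence either way."""
    total = sum(weights.get(w, 0.0) for w in set(tokenize(text)))
    return 100 / (1 + math.exp(-total))


# Phase 1: a senior lawyer codes a small sample.
coded = [
    ("Rivendell prototype infringes the patent claims", True),
    ("Friday lunch menu attached", False),
]
# Phase 2: another round of coding further trains the model.
coded += [
    ("Licensing offer for the patent", True),
    ("Holiday party parking", False),
]
weights = train(coded)

# Phase 3: the trained model scores the whole (hypothetical) collection.
collection = {
    "a": "Draft patent claims for the prototype",
    "b": "Parking passes for Friday",
    "c": "Rivendell licensing strategy",
}
scores = {doc_id: score(weights, text) for doc_id, text in collection.items()}

# Phase 4: documents scored above a chosen cutoff go to final human review.
flagged = sorted(doc_id for doc_id, s in scores.items() if s > 50)
print(flagged)
```

Even this toy shows why the quality of the coding in Phases 1 and 2 matters so much: the model knows nothing except what the senior lawyers' decisions teach it, including any quirks in the words they happened to see.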

How it saves your company money

Using predictive coding software replaces the once overcrowded rooms of contract attorneys who pored over millions of records and billed by the hour. Rather than hiring 15 $80-per-hour reviewers working 40 hours per week for three weeks for a total review cost of $144,000, your company could conduct the same review with three senior lawyers at $600 per hour for eight hours each at a total labor cost of $14,400, saving $129,600 before accounting for the cost of the software. Furthermore, the empirical data on predictive coding confirms “the levels of performance achieved by ... technology-assisted processes exceed those that would have been achieved by ... the law students and lawyers employed by professional document-review companies -- had they conducted a manual review of the entire document collection.”
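Readers who want to rerun the labor comparison with their own staffing assumptions can do so with a short calculation. The sketch below recomputes the totals from the staffing inputs stated above; the function name is hypothetical, and, as in the text, the cost of the software itself is left out.

```python
def review_cost(reviewers, hourly_rate, hours_each):
    """Total labor cost in dollars for a review team."""
    return reviewers * hourly_rate * hours_each


# Manual review: 15 reviewers at $80 per hour, 40 hours per week for three weeks.
manual_cost = review_cost(reviewers=15, hourly_rate=80, hours_each=40 * 3)

# Predictive coding: three senior lawyers at $600 per hour, eight hours each.
senior_cost = review_cost(reviewers=3, hourly_rate=600, hours_each=8)

# Labor savings only; software licensing cost is not included.
labor_savings = manual_cost - senior_cost

print(f"Manual review labor:     ${manual_cost:,}")
print(f"Predictive coding labor: ${senior_cost:,}")
print(f"Labor savings:           ${labor_savings:,}")
```

Swapping in your own head counts, rates, and hours shows quickly how sensitive the savings are to the number of senior-lawyer hours needed to train the software.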

So, why isn’t anyone using predictive coding yet?

No one wants to be the guinea pig. To date, no court has evaluated (or endorsed) the use of predictive coding.

However, a forceful judicial “endorsement” has been asserted by Andrew Peck, U.S. Magistrate Judge for the Southern District of New York:

I know what you’re waiting for: You think one day a judge will deliver an opinion or a judgment which says in terms that a particular kind of technology is approved by the court. ... Perhaps you have a mental picture of the occasion: “It is the opinion of this court that the use of predictive coding is a proper and acceptable means of conducting searches under the Federal Rules of Civil Procedure. ...” Perhaps the judge will go on to praise the car which he or she drove to work, offer an endorsement of the floor polish used in the court, and give a quick puff, as it were, for his own favorite brand of cigarette. IT’S NOT GOING TO HAPPEN!

Take advantage of predictive coding now

Given that both counsel and clients risk hefty sanctions (including default or dismissal) if the predictive coding software fails to “predict” the relevance of an important document, it is wise to be cautious. Litigators, however, can and should take advantage of the cost-saving benefits of predictive coding now by involving the court and the opponent in the predictive-coding process.

1. Learn about predictive coding technology and select a vendor;

2. Seek the opponent’s agreement to use the technology after fully disclosing the risks (in writing);

3. If the opponent agrees, identify to the opponent the documents you have identified as relevant that will guide the software;

4. If the opposition does not agree, run a demonstration on a sample set to prove to the opposition the validity of the software and method;

5. If the opposition still does not agree, move the court to compel your opponent to pay for the cost of a manual review.

Predictive coding is far too enticing a cost-saving mechanism to remain in the shadows for very long. Use the above approach to introduce predictive coding into your cases, and your outside counsel will be able to get back to spending your litigation budget to win bad cases instead of settling them.


This article was originally published in Inside Counsel.