Crunching the Numbers for Predictive Coding
What the 2012 presidential election tells us about Predictive Coding
No matter whether your candidate won or lost, a fascinating story emerged in the wake of the 2012 presidential election. Stories continue to be written suggesting that the way candidates play the political game fundamentally changed this year (whether for better or worse remains to be seen). One very interesting article from Time Magazine, “Inside the Secret World of the Data Crunchers Who Helped Obama Win,” offers a glimpse of the data crunchers (or “quants”) who helped President Obama win another four years in office. For me, the most interesting aspect of the article was its tacit recognition that the days of pundits who could simply intuit the outcome of a political cycle have come to an end. Other stories reach the same conclusion. As a journalist from the LA Times observed, “statistical methods and advanced computing power” triumphed over “partisan bias and conventional wisdom.” And why shouldn’t they? In the Big Data era, why should we keep relying on outdated tools that help pundits guess how voters will behave?
In the legal space, the idea of tracking metrics to make better decisions has also been gaining acceptance. Last week, during a webinar for Inside Counsel, we discussed how effective project management, coupled with advanced review strategies such as predictive coding, could allow companies to dissect their legal spend and ultimately reduce outside counsel fees. During our discussion we stressed the idea of deconstructing litigation into its component pieces and tracking every metric throughout the legal process, so that cost projections can move away from a guesstimate of how expensive the case “feels” and toward a more refined approach: based on these data points and given these variables, the cost of this litigation should be X (within an appropriate margin of error).
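To make that idea concrete, here is a minimal sketch of what a component-based cost model might look like once those metrics have been tracked. Every rate and volume below is invented for illustration; the point is the structure, not the numbers.

```python
# Hypothetical illustration: projecting litigation cost from tracked
# component metrics rather than intuition. All figures are invented.

# Per-unit rates tracked across past matters (assumed values)
rates = {
    "collection_per_gb": 25.0,     # $ per GB collected
    "processing_per_gb": 100.0,    # $ per GB processed
    "review_per_doc": 1.50,        # $ per document reviewed
    "privilege_qc_per_doc": 0.40,  # $ per document QC'd for privilege
}

# Variables for the matter being estimated (assumed values)
matter = {"gb_collected": 80, "docs_for_review": 150_000}

estimate = (
    matter["gb_collected"] * (rates["collection_per_gb"] + rates["processing_per_gb"])
    + matter["docs_for_review"] * (rates["review_per_doc"] + rates["privilege_qc_per_doc"])
)

# A margin of error would come from the variance observed across past
# matters; here it is simply assumed to be +/- 15%.
margin = 0.15
print(f"Projected cost: ${estimate:,.0f} "
      f"(${estimate * (1 - margin):,.0f} to ${estimate * (1 + margin):,.0f})")
```

Because each component is estimated separately, the client can see exactly which line items drive the total, which is what makes the outsource/insource/automate decisions discussed below possible.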
As with the political pundits, clients should no longer ask their lawyers for cost projections only to receive an estimate divined from the lead lawyer’s intuition. Although this approach might yield a rough estimate of the total cost of the litigation, it does little to help the company determine the cost of each component of the work. Because it is an imprecise look at the costs of the litigation, it also denies the client the ability to decide which of those components could be performed more cheaply, faster, or more efficiently by outsourcing, insourcing, or automating them. In essence, the company is reduced to a low-information consumer and is apt to make a poor decision.
A similar problem plagues the emerging market around predictive coding and other varieties of technology-assisted review. The data gathered so far shows a clear return on investment, provided the case has a large enough volume of data. Based on those figures, many eDiscovery vendors, outside counsel, and even judges have drawn the faulty conclusion that predictive coding is only appropriate for “big cases.” The argument goes that the technology is too expensive to use in every case, and that traditional linear review workflows should therefore remain in place. Of course, when its proponents are pressed for data to support this position, there is none to offer. A discussion ensues about the lack of evidence behind the argument, and ultimately the case against widespread adoption of predictive coding rests more upon instinct and intuition than upon data. In this era of metric tracking and analysis, it is time for the intuitive argument to put up or shut up.
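To show the kind of analysis the “big cases only” crowd never offers, consider a back-of-the-envelope comparison of linear review against a predictive-coding workflow. Every number here is an assumption chosen purely for illustration, but the structure of the calculation is what matters: savings grow with volume, yet turn positive well below “big case” territory.

```python
# Hypothetical comparison of linear review vs. a predictive-coding
# workflow across case sizes. All rates are invented for illustration.

LINEAR_COST_PER_DOC = 1.50  # assumed attorney review cost per document
PC_COST_PER_DOC = 0.25      # assumed technology cost per document
PC_FIXED_COST = 5_000       # assumed fixed setup/training cost
REVIEW_FRACTION = 0.30      # assumed share of docs still needing eyes-on review

for docs in (10_000, 50_000, 250_000, 1_000_000):
    linear = docs * LINEAR_COST_PER_DOC
    predictive = (PC_FIXED_COST + docs * PC_COST_PER_DOC
                  + docs * REVIEW_FRACTION * LINEAR_COST_PER_DOC)
    savings = linear - predictive
    print(f"{docs:>9,} docs: linear ${linear:>12,.0f}  "
          f"predictive ${predictive:>12,.0f}  savings ${savings:>12,.0f}")
```

Under these particular assumptions the workflow breaks even at roughly 6,250 documents, hardly a “big case.” Plug in your own tracked rates and see where yours falls; that is the whole point of tracking them.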
The ROI of Predictive Coding for all cases: pay less and work more quickly
Predictive coding is one tool within a larger set of advanced review strategies. Recommind’s Axcelerate eDiscovery offering uses all of these strategies to allow the most knowledgeable attorneys to develop and assess the facts of a case more rapidly. The return on investment is still present in smaller cases; it is simply not as large as the return we find in larger cases. Predictive coding does not require large seed sets or large data populations to function appropriately; the efficiencies are just more pronounced in larger cases. Every review can benefit from some aspect of Axcelerate’s CORE technology, even if it is simply used to prioritize the review or as a quality control measure. Again, while the savings might not be as large in a “small” case, there will still be savings. After all, wouldn’t your company always prefer to pay less and work more quickly when it comes to document review?
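For readers unfamiliar with the mechanics, the generic idea behind predictive coding can be sketched in a few lines: train a classifier on a small attorney-reviewed seed set, then rank the unreviewed population by predicted relevance so the likely responsive documents surface first. This toy, built on generic open-source tools with made-up documents, is an illustration of the concept only, not Axcelerate’s CORE technology.

```python
# Minimal sketch of the generic predictive-coding idea: learn from a
# small attorney-labeled seed set, then prioritize the unreviewed
# population by predicted relevance. Illustrative only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Assumed inputs: a seed set labeled by senior attorneys (1 = responsive)
seed_docs = ["merger term sheet draft", "lunch order for friday",
             "board minutes re acquisition", "fantasy football picks"]
seed_labels = [1, 0, 1, 0]
unreviewed = ["revised acquisition timeline", "holiday party rsvp"]

# Turn text into features and fit a simple classifier on the seed set
vectorizer = TfidfVectorizer()
model = LogisticRegression().fit(vectorizer.fit_transform(seed_docs), seed_labels)

# Score the unreviewed documents and review the highest-scoring first
scores = model.predict_proba(vectorizer.transform(unreviewed))[:, 1]
for doc, score in sorted(zip(unreviewed, scores), key=lambda p: -p[1]):
    print(f"{score:.2f}  {doc}")
```

Note that even this tiny seed set is enough to produce a ranking, which is why the technique scales down to smaller matters: the model’s job is simply to put the most promising documents in front of reviewers first.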
The bottom line is that predictive coding is a powerful tool within a larger set of advanced review strategies, and those strategies should be utilized in cases of any size. The most obvious cost savings associated with predictive coding scale with the size of the case; the benefit of leveraging the technology to find documents that human reviewers might miss, however, applies in every instance. So, just as the political experts of the 20th century must give way to data scientists who draw their conclusions from data rather than finding data points to support their conclusions, so too must those in the eDiscovery ecosystem who insist, without any data to back it up, that predictive coding is only for “big” cases. And for those clients (and potential clients) who think that advanced review strategies, including predictive coding, are not worth the investment because the litigation does not seem that big, I challenge you to see what the data says and not be fooled by the person who is long on feeling but short on facts.