
Statistical Sampling in Document Review

Table of Contents
• Overview of sampling in e-Discovery
• Why sample?
• Types of sampling
• What is “statistically valid” sampling?
• Mathematical considerations before sampling
• How to sample in document review
• Judicial acceptability
• Potential pitfalls of sampling
• Resources

Overview of Sampling in e-Discovery
• What is statistical sampling?
  – A method of selecting a portion of a population, by means of mathematical calculations and probabilities, for the purpose of making scientifically and mathematically sound inferences regarding the characteristics of the entire population
• How is sampling being used in e-Discovery?
  – Collection
    • What to collect
  – Processing
    • Quality testing
  – Review
    • What to review
    • Quality of review decisions
    • Quality of keywords

Why sample?
• Sample when examining an object alters or destroys it.
• Sample when it is too time consuming and/or expensive to analyze the entire population.
• Sample when the human error rate associated with analyzing the entire population is too high.
• Human error rates run in the neighborhood of 3% for large, repetitive tasks.
• The sampling error rate can be controlled to very fine tolerances based on the sample size examined.

Types of Sampling
• Judgmental sampling
• Statistically valid
  – Simple random sampling
  – Stratified random sampling
  – Systematic sampling
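To make the three statistically valid methods concrete, here is a minimal Python sketch. The function names, the `seed` parameter, and the idea of strata keyed by a label such as custodian or file type are illustrative assumptions, not part of the original slides or of any review platform.

```python
import random


def simple_random_sample(doc_ids, n, seed=0):
    """Every document has the same chance of selection (no replacement)."""
    rng = random.Random(seed)
    return rng.sample(doc_ids, n)


def systematic_sample(doc_ids, n, seed=0):
    """Pick a random starting point, then take every k-th document."""
    rng = random.Random(seed)
    step = len(doc_ids) // n          # sampling interval k (assumes n <= len(doc_ids))
    start = rng.randrange(step)       # random offset inside the first interval
    return [doc_ids[start + i * step] for i in range(n)]


def stratified_sample(strata, fraction, seed=0):
    """Draw the same fraction independently from each stratum.

    `strata` maps a hypothetical label (custodian, file type, ...) to the
    list of document ids in that group.
    """
    rng = random.Random(seed)
    picked = {}
    for name, ids in strata.items():
        k = max(1, round(len(ids) * fraction))  # at least one document per stratum
        picked[name] = rng.sample(ids, k)
    return picked
```

Judgmental sampling has no counterpart here because the selection is made by a person rather than by a random mechanism, which is also why it is harder to defend.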

What is “statistically valid” Sampling?
“Laws” of Sampling
• The Law of Large Numbers says that the average of a sufficiently large random sample drawn from a large population is likely to be close to the mean of the entire population.
• The Poisson Distribution expresses the probability of a number of events occurring in a fixed time or space if these events occur with a known average rate and are independent of each other. We can use the Poisson distribution to tell us how big a sample size is required to satisfy the Law of Large Numbers.
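The Law of Large Numbers can be seen in a small simulation. The sketch below assumes a hypothetical population of 100,000 documents of which exactly 10% are responsive; the population, the sample sizes, and the seed are made up for illustration only.

```python
import random


def observed_rate(population, n, rng):
    """Fraction of responsive documents (coded 1) in a random sample of size n."""
    sample = rng.sample(population, n)
    return sum(sample) / n


def convergence_table(seed=2024):
    """Return (sample size, observed rate) pairs for growing sample sizes."""
    rng = random.Random(seed)
    # Hypothetical population: 10,000 responsive (1) and 90,000 non-responsive (0).
    population = [1] * 10_000 + [0] * 90_000
    return [(n, observed_rate(population, n, rng)) for n in (100, 1_000, 10_000, 50_000)]


if __name__ == "__main__":
    for n, rate in convergence_table():
        print(f"n={n:>6}  observed rate={rate:.4f}  (true rate 0.1000)")
```

As the sample grows, the observed rate tends to settle near the true 10% rate, which is the behavior the law describes.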

(Mathematical) Considerations before sampling
• Define the confidence level
  – The higher the confidence level the larger the sample size
• Acceptable error rate
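As a rough illustration of how confidence level and acceptable error rate drive sample size, the sketch below uses the common textbook formula for estimating a proportion (normal approximation with a finite population correction). This is a swapped-in technique, not the Poisson-based reasoning mentioned in the slides, and the function name, parameters, and the conservative default `p=0.5` are assumptions made for the example.

```python
import math
from statistics import NormalDist


def required_sample_size(confidence, margin, population_size=None, p=0.5):
    """Sample size needed to estimate a proportion within +/- margin.

    confidence      -- e.g. 0.95 for a 95% confidence level
    margin          -- acceptable error, e.g. 0.05 for +/- 5 percentage points
    population_size -- optional; applies the finite population correction
    p               -- assumed proportion; 0.5 is the most conservative choice
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided critical value
    n0 = (z ** 2) * p * (1 - p) / margin ** 2            # infinite-population size
    if population_size is not None:
        n0 = n0 / (1 + (n0 - 1) / population_size)       # finite population correction
    return math.ceil(n0)


if __name__ == "__main__":
    print(required_sample_size(0.95, 0.05))            # 385
    print(required_sample_size(0.95, 0.05, 100_000))   # 383
    print(required_sample_size(0.99, 0.05))            # larger than the 95% case
```

Raising the confidence level or tightening the margin increases the required sample, while the size of the population matters very little once it is large.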

How to sample in document review?
• Define the objective
  – Asking questions you can’t stand the answer to?
  – What will the result tell you?
  – What population to sample? (reviewed documents, not reviewed, etc.)
• Next step
  – Define the population to collect…
  – Redo the review…
  – Refine the keywords…
• Logistical and technical considerations
  – Manual process: no applications designed specifically to conduct sampling
  – Sample size
  – Generating the random sample
  – Documenting / capturing the decisions made during the review of the sample documents
  – Document the process
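Because the process is described as manual, one possible way to generate the random sample and document it at the same time is sketched below. Everything here is hypothetical: the function names, the `DOC-` style identifiers, the fields in the audit record, and the columns of the coding sheet are assumptions for illustration, not features of any review tool.

```python
import csv
import hashlib
import io
import random
from datetime import datetime, timezone


def draw_documented_sample(doc_ids, n, seed):
    """Draw a reproducible simple random sample and return it with an audit record."""
    ordered = sorted(doc_ids)            # fixed order, so the same seed gives the same sample
    rng = random.Random(seed)
    sample = rng.sample(ordered, n)
    # Fingerprint of the population, so the record shows exactly what was sampled from.
    population_hash = hashlib.sha256("\n".join(ordered).encode("utf-8")).hexdigest()
    record = {
        "drawn_at_utc": datetime.now(timezone.utc).isoformat(),
        "population_size": len(ordered),
        "population_sha256": population_hash,
        "sample_size": n,
        "seed": seed,
        "method": "simple random sample without replacement",
    }
    return sample, record


def write_coding_sheet(sample, out):
    """Write a blank sheet for capturing review decisions on the sampled documents."""
    writer = csv.writer(out)
    writer.writerow(["doc_id", "reviewer", "decision", "notes"])
    for doc_id in sample:
        writer.writerow([doc_id, "", "", ""])


if __name__ == "__main__":
    ids = [f"DOC-{i:06d}" for i in range(5000)]   # hypothetical document identifiers
    sample, record = draw_documented_sample(ids, 100, seed=42)
    sheet = io.StringIO()
    write_coding_sheet(sample, sheet)
    print(record)
```

Recording the seed and a fingerprint of the population lets someone else redraw the identical sample later, which speaks directly to the documentation and randomness concerns.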

Potential pitfalls of sampling
• Defending judgmental sampling
• Inadequate documentation of the process
• Not a truly random sample
• Sample size not accurate
• Sampled the inappropriate population for the circumstance
• Can’t stand the answer but now you have it

Judicial Acceptability
• A (judgmental) sampling of the case law:
  – “Sampling has long been considered an acceptable method of determining the characteristics of a large universe … such mathematical and statistical methods are well recognized by courts as reliable and acceptable in determining adjudicative facts.” See Rosado v. Wyman, 322 F. Supp. 1173, 1180 (E.D.N.Y.), aff’d 437 F.2d 619 (2d Cir. 1970), aff’d 402 U.S. 991, 91 S. Ct. 2169, 29 L. Ed. 2d 157 (1971)
  – In Re: Vioxx Product Liability Litigation; the U.S. Court of Appeals concluded that it is appropriate for the district court to implement a sampling protocol to review documents in a discovery dispute related to claims of privilege
  – Farmers Insurance Company, Inc. vs. Peterson, 81 P.3d 659, 2003 OK 99; “…discoverable information may be obtained by use of a statistical sampling method without the actual examination of each and every file…”
  – Cimino, et al. v. Raymark Industries, Inc. et al., 751 F. Supp. 649; 1990 U.S. Dist. LEXIS 15708; the USDC employed statistical sampling of a representative sample of the cases as a means to award damages in lieu of individual adjudication
  – Zurich American Insurance Co. v. Ace American Reinsurance Co., 2006 U.S. Dist. LEXIS 92958 (S.D.N.Y. Dec. 22, 2006); the court ordered the parties to devise a protocol for sampling data in a discovery dispute

Resources
• Manual for Complex Litigation, Federal Judicial Center
• Reference Manual on Scientific Evidence, Federal Judicial Center
• Sample Data as Evidence: Meeting the Requirements of Daubert and the Recently Amended Federal Rules of Evidence, Georgia State University Law Review, Volume 18, Number 3, Spring 2002
• American National Standards Institute and American Society for Quality, Standard Z1.4-2003, Sampling Procedures and Tables for Inspection by Attributes