- eDiscovery 101: Best Practices For Setting Up An eDiscovery Operation
- 1. What is eDiscovery?
- 2. The EDRM Framework
- 3. Information Governance
- 4. Discovering Data Sources
- 5. Retaining & Preserving data
- 6. Data Collection
- 7. Data Processing
- 8. Data Review & Analysis
- 9. Production & Presentation
- 10. Conclusion
eDiscovery 101: Best Practices For Setting Up An eDiscovery Operation
eDiscovery refers to discovery in legal proceedings such as litigation, government investigations, or Freedom of Information Act requests, where the information sought is in electronic format. eDiscovery tools can also be used to aid internal investigations and manage information governance.
1. What is eDiscovery, and why does it matter?
eDiscovery software automates and facilitates the identification, preservation, collection, processing, review, analysis, and production of digital data, supporting the discovery process in litigation or other investigative proceedings.
eDiscovery mainly concerns large investigations, involving a multitude of custodians, multiple reviewers, and large amounts of data. However, as data volumes continue to grow, smaller investigations increasingly fall into scope, as even a single reviewer or custodian may find themselves sifting through large amounts of data.
eDiscovery best practices for an optimized operation
eDiscovery involves a complex set of operations and requires specialized knowledge of legal procedures, information technology, and the flow of information within the company. By their very nature, procedures that require eDiscovery operations are disruptive, as courts and regulatory organizations often set tight deadlines for producing evidence.
For any organization aiming to set up or optimize an eDiscovery operation, having a clear path to success means making use of best practices and learnings from peers, in order to avoid having to reinvent an especially complicated wheel.
Taking advantage of our own experiences assisting in the establishment of many eDiscovery teams for a variety of organizations, we’ve put together this playbook. Our purpose is to present the best practices and takeaways from our experiences, in order to help organizations create a predictable, repeatable and accountable eDiscovery process.
We aim to help organizations avoid ad-hoc decision-making, and provide a clear path to achieving a productive, solution-based eDiscovery operation that helps team members understand their roles and responsibilities.
Figure 1: The Electronic Discovery Reference Model (EDRM)
2. The EDRM Framework
The reference process (see figure 1) is the Electronic Discovery Reference Model (EDRM), which has been widely adopted by the legal industry. It provides the framework for the eDiscovery cycle and consists of nine phases: four dealing with the governance, identification, preservation, and collection of data (also referred to as the "left side"), and five (referred to as the "right side") that deal with the processing, review, analysis, production, and presentation of data.
Retention: refers to the process utilized to manage documents from creation to destruction (also referred to as the record life cycle). Generally, the principal criterion that governs a record's life cycle is the purpose for which the record is created and utilized. Retention programs, also known as records management programs, most often direct that records be discarded once they are no longer used by an organization. The scope and nature of an organization's retention programs and schedules are based on its regulatory retention obligations, operational needs, resources, and risk tolerance. Retention periods must meet applicable minimum record retention standards.
Preservation: refers to the preservation of information and records specifically for purposes of litigation and other legal matters, such as governmental investigations and third-party subpoenas. Preservation obligations commence upon the occurrence of a Triggering Event and end when the relevant dispute or investigation is finally resolved.
Electronically Stored Information (ESI): refers to information created, manipulated, communicated, stored or utilized in digital form (i.e., on computer equipment, servers, hard drives, personal digital assistants, smartphone devices, back-up tapes, sensors, web-based storage, etc.). ESI includes, among other things, employee e-mail, home directories and chat logs. Other sources include group directories, payment systems, log files, and other production systems.
Litigation Hold (or Legal Hold): refers to the procedure utilized to preserve materials that are potentially relevant to a pending or threatened legal matter. The purpose of a Litigation Hold is two-fold: first, to identify material potentially relevant to a specific legal action; second, to direct that those in possession, custody or control of such material take appropriate preservation measures.
Triggering Event: refers to any act or occurrence upon which the company reasonably should anticipate that a legal action has been or will be brought in which the company is a party, target or material witness. Such act or occurrence can either be formal (e.g., service of process on the company) or informal (e.g., threats of litigation or the occurrence of certain corporate events). The occurrence of a Triggering Event triggers the company’s obligation to preserve all potentially relevant documents and ESI, beginning with the initiation of a “Litigation Hold”.
3. Square One: Information Governance
The mandate & responsibilities of an eDiscovery investigation
When an investigation requires eDiscovery, the investigating party must have a clear idea of the mandate this investigation carries, and what roles and responsibilities fall on the various people involved in or related to the investigation. For internal and regulatory investigations, a mandate first and foremost allows the investigation to collect, process, produce, and deliver data. This should include (but not be limited to) personal data. Without such a mandate, an investigation simply cannot perform its duties.
When dealing with an external regulator or requester, the investigative mandate ought to include a provision that allows the investigation to approve or decline requests to process employee data (provided reason exists to decline, obviously). The investigation should also be empowered to seek out tooling to better meet its requirements and, if necessary, obtain third-party assistance by outsourcing (some of) its workload.
At the same time, while granting a degree of independence to the investigation, the mandate does not relieve the organization as a whole of responsibility when it comes to privacy and data security requirements. That responsibility remains with the organization subject to the information request.
This means that the organization must continue to act as gatekeeper when it comes to employee (personal) data, and provide assistance to the investigation where needed, in the form of legal support and/or advice, tooling, data storage, as well as manpower (through staffing of the investigation). So although the investigation is given a degree of autonomy, it remains dependent on the organization for the supplies it needs to function.
Who does what during an eDiscovery investigation?
To keep Information Management at the forefront and maintain compliance throughout, investigative teams have a pre-made roster of positions to be filled. Every position includes a set of duties and responsibilities to ensure an investigation is able to perform its tasks. In larger investigations, certain roles can be shared by multiple people. When a role is shared, it does not mean the responsibility of the role itself is divided: each person fulfilling the role carries the same responsibility a single person would. In smaller investigations, individual investigators will instead occupy multiple roles simultaneously.
The Requestor is the entity (either a person or a department) that requests the investigation take place. As the instigating party, the requestor has to ensure legal approval for an eDiscovery investigation is in place and provide the initial set of persons and/or sources to be investigated. As the origin point for the investigation, it retains the responsibility for the data involved, meaning that the analysis of the reviewed data, as well as the scope and redaction of the production dataset happens under their authority. Whenever an investigation feels it must expand its scope, the decision to provide a go-ahead lies with the requestor as well.
The Case Manager is typically appointed by the requestor to be the primary point of contact of the investigation. Having received the initial set of persons and/or sources to be investigated from the requestor, the case manager identifies records, documents and data relevant to the investigation. The case manager also creates the case in the case management overview, tracks the progress of the investigation there, eventually closing and archiving the investigation once its purpose has been fulfilled. As the primary point of contact, the Case Manager reports on the progress of the investigation to the requestor, ensures the adherence to forensic standards within the investigation (on the requestor's behalf), and is in charge of requesting additional permissions for data collection and/or processing from the requestor.
The Data Officer is the Case Manager’s “right-hand man” in charge of data collection. The data officer manages the use of and access to the eDiscovery solution during the investigation. This means he or she is in charge of opening the case (or ‘matter’) in the solution and decides which members of the investigation are given access to the tooling. Considering datasets typically contain large amounts of personal data, managing access is vital for compliance with Data Privacy regulations.
The ESI product owner is typically the IT administrator in charge of the specific data source from which data needs to be collected. In the role of product owner for the Electronically Stored Information (ESI), this administrator executes litigation holds on the data he or she is responsible for, carries out the collection of data, and delivers that data to the investigation (or grants access, whichever is applicable).
Custodian Managers aren’t part of the investigation directly, but are responsible for the data held by their subordinates. These subordinates, who function as data custodians during an investigation, may be asked by investigators to provide data they hold. In effect, custodian managers act as an extra level of gatekeeping to ensure the data collected falls within the purview of the investigation, and that the collection of data from their department in the context of the investigation is justified.
The Custodian isn’t part of the investigation, but rather subject to it. Custodians may hold information outside the purview of ESI product owners, in which case they ought to retain and deliver that information to investigators. Depending on the size and scope of an investigation and the way information is distributed within the organization, the number of custodians can be any number of individuals.
The Reviewer is the person who ends up performing the actual review of the collected data. Reviewers can be internal (employees designated by the Case Manager) or, if review is outsourced, from third parties that provide expertise, typically law firms or legal services companies. Of course, a combination of internal and external reviews is also possible. Regardless of their origin, all reviewers perform their tasks according to the review plan set up by the Case Manager and Data Officer.
“While granting a degree of independence to the investigation, the investigative mandate does not relieve the organization of responsibility for privacy and data security requirements.”
4. Discovering Data Sources
The first order of business for an investigation is to identify potential sources of relevant information. Nowadays, most if not all data will be found in IT systems, but in order to issue an effective legal hold, identification of data subject to discovery should be as thorough and comprehensive as possible. Finding the exact sources typically means speaking to key players in order to find out what types of relevant records they may or may not be holding. IT staff and, if applicable, records management personnel should be consulted in order to establish storage locations, retention policies, data accessibility, and the availability of tools to assist in the identification process.
The process of mapping data sources and collecting data revolves around answering four major questions: What are the potential data sources? What are the appropriate methods for collecting relevant data? Should collection be done internally or be outsourced to a third party? Who should be involved in the execution of data collection? Once these questions have been answered, identification is complete, and the investigation can begin working with the data. From a case management point of view, executing a clear and defensible investigation is paramount, regardless of whether it concerns an internal investigation, an audit, legal issues, or regulatory requests.
Potential data sources for an eDiscovery investigation
Data may be stored at a number of sources including email, computers, mobile devices, databases, tape backups, and third-party sources such as cloud storage and cloud backup sites. The Case Manager should coordinate with the requestor, and make use of appropriate legal and IT resources to determine where relevant data may be located. If it is possible to inform and involve custodians, questionnaires can be sent to the custodians to confirm that all potential sources of data are identified.
In addition to the coordination with legal and IT, other interviews may be necessary to clear up any uncertainty. For defensibility, the most important thing is to ask custodians the same set of questions, in both the questionnaire and the interviews.
Aside from identifying custodians, learning who they are and what work they do, the interviews should identify where they store data, including equipment that houses Electronically Stored Information (ESI), cloud storage, and hosted applications. Furthermore, the interviews should seek to understand how custodians communicate with the company and with whom they communicate. The overall goal of these initial interviews is to establish the scope of the data map.
Appropriate data collection during an eDiscovery investigation
Prior to the start of collection, Case Managers should consider the trade-offs involved in different collection methods, which may include computer imaging, remote collections, and assisted or self-collection. Each method has its advantages and drawbacks regarding effort, cost, and completeness, but the key is that the process needs to be defensible. For example, computer imaging is more likely to be appropriate for cases involving suspected wrongdoing, whereas self-collection may suffice for regulatory demands.
Invariably, weighing the pros and cons of each method falls to the Case Manager, who can seek advice from IT and legal. Defensibility is still paramount, but consistently going above and beyond the requirements of a case while completely disregarding the cost aspect of the equation leads to an investigation that struggles to justify its cost. Keeping an eye on proportionality, ensuring the ends justify the means, helps keep costs under control while ensuring the defensibility of the results.
Internal Data Collection versus External Data Collection
Once the size, scope, and means of collection are clear, investigators may want to take a moment to consider their options. Will they rely on internal resources to perform the tasks with regards to collection, or will they instead rely on an external party?
It's a decision that can be based on a number of factors: internal resources need to be available, the right skillsets need to be present, and experience with such matters is very desirable. What's more, the data itself may pose challenges in terms of expected volume, geographic location of sources, and legal restrictions (e.g., privacy laws) that limit the movement of data. Another important consideration is the time and resource cost of the operation, especially since discovery typically comes with tight deadlines. Finally, it is vital to weigh the importance of the operation: if a vendor is to be used and a possibility of sanctions (or even criminal charges) exists, that has to factor into the choice of vendor.
Who is involved in the execution of Data Collection?
If it is decided not to outsource collection, it's key for the Case Manager to select employees who are up to the task of collecting materials internally. Employees involved in the collection activities should be versed in data-handling procedures and the case management tools, where the process of handling evidence needs to be tracked.
Employees who handle the ESI should update the Case Management tool at each and every transfer in order to create a proper chain of custody for the evidence. If these updates are not properly performed, it could result in data being rendered inadmissible in a court of law or other legal proceedings. Even if the case in question doesn’t involve the courts, a defensible collection methodology is vital to meet the needs of regulators.
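To make the chain-of-custody requirement concrete, here is a minimal, hypothetical sketch of how each transfer of an evidence item could be logged. The class and field names are ours, not those of any particular case management tool:

```python
import hashlib
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class CustodyEvent:
    """One hand-off of an evidence item between two parties."""
    item_id: str      # identifier of the evidence item
    from_party: str   # who released the item
    to_party: str     # who received it
    sha256: str       # hash of the item at the moment of transfer
    timestamp: str    # UTC time of the transfer

@dataclass
class CustodyLog:
    """Append-only record of transfers, forming the chain of custody."""
    events: list = field(default_factory=list)

    def record_transfer(self, item_id: str, from_party: str,
                        to_party: str, path: str) -> None:
        # Hash the file at transfer time so later tampering is detectable.
        with open(path, "rb") as f:
            digest = hashlib.sha256(f.read()).hexdigest()
        self.events.append(CustodyEvent(
            item_id, from_party, to_party, digest,
            datetime.now(timezone.utc).isoformat()))
```

Recording a hash alongside every transfer means a gap or mismatch in the log is immediately visible, which is exactly the property a court or regulator will look for.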
5. eDiscovery Best Practices: Retaining & Preserving Data
Generally, a data retention policy is mandatory for any organization that deals with data. More specifically, it is essential for companies that are subject to external oversight or are active in litigation-sensitive fields. Retention policies focus on managing (from creation to destruction) the records necessary for the organization to conduct business. It follows that retention policies and procedures are utilized to create, store, use and discard business records. Having such a policy (or set of policies) in place allows the preservation process, which kicks in once a company ‘reasonably expects’ litigation (or governmental/regulatory investigation), to start without issues regarding data availability.
Exempting data from the Data Retention Policy: the importance of Data Preservation
The need for data preservation arises from the anticipation of pending or possible litigation or governmental investigations. As such, although data preservation is related to general data retention policies, it has a few significant differences. Notably, a preservation period can arise at any time during a record's life cycle, unlike the retention process, which always starts at the creation of a record.
In a way, data preservation exempts the subject data from following the regular data retention process. Simply put, a record cannot be destroyed until the pending or threatened litigation (or a governmental investigation) is finally resolved and complete. Once completed, the preservation is lifted and the appropriate retention policies once again govern the lifecycle of the data within the company.
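The rule described above can be captured in a few lines of illustrative pseudologic. This is a hypothetical sketch (the function and parameter names are invented): a record becomes eligible for destruction only once its retention period has expired and no legal hold attached to it remains active.

```python
from datetime import date, timedelta

def may_destroy(created: date, retention_days: int,
                active_holds: set, record_holds: set) -> bool:
    """Retention governs destruction, but any active legal hold overrides it."""
    retention_expired = date.today() >= created + timedelta(days=retention_days)
    on_active_hold = bool(record_holds & active_holds)
    return retention_expired and not on_active_hold

# Retention expired long ago, but hold "matter-042" is still in force,
# so the record must be kept: prints False.
print(may_destroy(date(2010, 1, 1), 365, {"matter-042"}, {"matter-042"}))
```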
The types of records that ought to be preserved are established by the scope and nature of the related legal matter. This means the types of records to be preserved will be specific to each case.
Failure to properly preserve information and records may result in sanctions ranging from default judgements in civil cases, to monetary fines or even imprisonment in relation to governmental investigations. In short, data preservation is the process by which all relevant data is preserved for the duration of both the investigation and the proceedings that necessitate the investigation: retention policies, especially with regards to the destruction of records, do not (and cannot be allowed to) apply to any data subject to data preservation.
Legal Hold orders
As part of the investigation, the legal department can send so-called hold notices to custodians and/or ESI product owners believed to be sources of potentially relevant information. If a custodian then fails to preserve data subject to the hold notice, that can be one of a company’s biggest legal liabilities.
In some cases, companies may get a second chance to comply if they fail to produce. Repeated failures to preserve data, however, can and will lead to fines and court-ordered sanctions. Therefore, it is the legal department's duty to notify employees of their obligation to preserve data, and to ensure that custodians understand both their responsibilities and the possible consequences of failing to comply with the hold notice. This part of the preservation process repeats itself whenever a company becomes aware of possible litigation.
The Data Preservation Process
In the early stages of the preservation process, the data map created at the start of the investigation is used to determine which data should be subject to a litigation hold. Once that determination has been made, the legal department will issue the hold notices to the custodians. Hold notices go out to every identified custodian: the key individuals in the case who possess and control paper documents and ESI. The hold notice may also extend to people around these custodians, e.g., superiors or subordinates who may have received data from the custodian. If third parties control data that is to be preserved, for instance through cloud computing or the outsourcing of traditional business functions, these third parties should also be treated as custodians.
Every custodian should be notified to preserve data according to the following preconditions, which are to be communicated in the hold notice (a minimal notice template is sketched after this list):
- The time period which is subject to investigation;
- The categories of data to be preserved, possibly including email, electronic files, phone and text messages, data from structured databases such as sales and human resources, and computer backups. In practice, the hold notice will be sent out to make sure custodians preserve whichever type of information they hold and the investigation needs.
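As a minimal sketch of how such a notice might be assembled from the two preconditions above, consider the template below. The wording and field names are purely illustrative, not a legally vetted notice:

```python
HOLD_NOTICE = """\
LEGAL HOLD NOTICE

You have been identified as a custodian in a pending legal matter.
Effective immediately, you must preserve all potentially relevant data.

Period under investigation: {period_start} through {period_end}
Categories of data to preserve: {categories}

Do not delete, alter, or discard any material in these categories until
this hold is lifted in writing by the Legal department.
"""

def render_hold_notice(period_start: str, period_end: str, categories: list) -> str:
    # Fill in the two preconditions every hold notice must communicate.
    return HOLD_NOTICE.format(period_start=period_start,
                              period_end=period_end,
                              categories=", ".join(categories))

print(render_hold_notice("2020-01-01", "2021-06-30",
                         ["email", "chat logs", "HR database extracts"]))
```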
At the same time, the legal department will need to take the following steps to ensure the data that is part of the preservation process is no longer subject to the retention policies:
- Instruct the IT department (and, if applicable, the records management department) to suspend the destruction of paper documents and ESI;
- Suspend the automated deletion of data including email, server backups, and recycling of backup tapes;
- Copy data onto secure, centralized locations (data includes email boxes and computer files of custodians);
- Preserve communication from specified custodians;
- Maintain a master list of the preserved data.
Reassessing the Legal Hold
At times, reassessment of the Legal Hold needs to take place. This can be necessary when additional custodians are identified or when the scope of the discovery changes. Scope changes can have several causes:
- Court orders;
- Agreements amongst the parties;
- Changes to the scope of a government investigation;
- Changes to technology used.
While the preservation process is ongoing, the Case Manager and the legal department should keep in contact, so new hold notices can be issued if the legal hold needs to be expanded or lifted for certain data and/or custodians. The legal department will need to communicate the changes in question to the custodians and/or ESI product owners, and notify the Case Manager that the changes have been made. This process of reassessment repeats itself whenever changes are made to the scope of discovery.
Lifting the Legal Hold
If and when no further claims or lawsuits can be expected for the investigation, the Case Manager can decide to close the case. As part of the case closing, the legal hold is lifted, ending the data preservation process as a consequence.
Once data is no longer required to be preserved, the regular retention process picks back up where it left off and is once again in charge of governing the lifecycle of the ESI. That includes data that was set to be deleted or destroyed, or that has exceeded the standard retention period while subject to Legal Hold. In this case, provided the ESI is not subject to other ongoing litigation, the ESI can be deleted in accordance with standard retention procedure. Notifying the custodians and/or ESI product owners of the lifting of the Legal Hold should be the responsibility of the Legal department.
6. eDiscovery Best Practices: Data Collection
Having mapped the data, and ensured its availability through Legal Holds, the time has come to begin the actual collection of the data for review. In terms of the eDiscovery process, collection is the acquisition of potentially relevant ESI, as defined by the identification phase.
In the context of litigation, governmental inquiries, and internal investigations, this information needs to be collected, along with its related metadata, in such a way that the process is justified, proportionate, efficient, and targeted. As such, the collection process needs to be well-documented and defensible at all these levels. At the same time, as data is collected, its contents may provide feedback to the identification process itself, which may impact the scope of the overall process, leading to more (or sometimes less) collection than assumed at the outset.
Effective, qualitative data collection
Because of the requirements associated with data collection, the process can be time-consuming and disruptive. However, there’s little to no room for cutting corners: even after collection, it is important to check the quality of the collected data as it is delivered. These checks are part of the validation stage of the collection process.
A common approach to validate the integrity of the data is to apply hashing to the original and copied data. These results can be compared afterwards: if the identifiers of the original and copied data match, the pieces of data are considered identical because the odds of two non-identical pieces of data generating the same hash value are remote. If the data validation checks make it clear data is missing, incomplete, or incorrect, additional collection may be necessary.
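As a concrete illustration, here is a minimal sketch of that comparison using SHA-256; the algorithm choice and helper names are ours (collection tools commonly record MD5 or SHA-1 as well):

```python
import hashlib

def sha256_of(path: str) -> str:
    """Stream the file through SHA-256 so large evidence files fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_copy(original: str, copy: str) -> bool:
    """Matching hashes mean original and copy are, practically, identical."""
    return sha256_of(original) == sha256_of(copy)
```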
7. eDiscovery Best Practices: Data Processing
With the data properly mapped and now under control due to collection, it's time to begin processing it. During the processing phase, data is analyzed and prepared for review. This is where data is culled prior to review, in order to cut down on the amount of data the reviewers eventually must sift through. Unlike mapping and collection, processing isn't always an essential part of the investigation: sometimes, especially during law enforcement investigations, the requestor asks for the raw data sets to be delivered. Even when this is the case, the raw data should also be processed by the investigation, during the execution of a so-called shadow investigation, conducted on the basis of the knowledge the requestor has provided to the investigation.
What is data culling?
Data culling means narrowing the dataset, based on a certain set of criteria such as keywords and date ranges, before the document review. There are three common methods of data culling (a minimal culling sketch in code follows this list):
- Deduplication (identifying and separating duplicate documents and emails);
- DeNISTing (removing known system and application files by matching them against the NIST reference list);
- Search terms (identifying the appropriate terms to maximize the relevance of the search output).
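The sketch below illustrates two of these methods, deduplication and search-term filtering, together with a date-range cull; DeNISTing is omitted because it requires the NIST reference list. The document shape (`.text` and `.sent` attributes) is an assumption standing in for whatever a real processing engine produces:

```python
import hashlib
from datetime import date

def cull(documents, start: date, end: date, terms: list):
    """Keep one copy of each document, inside the date range, matching a term."""
    seen, survivors = set(), []
    for doc in documents:
        fingerprint = hashlib.sha256(doc.text.encode()).hexdigest()
        if fingerprint in seen:                           # deduplication
            continue
        seen.add(fingerprint)
        if not (start <= doc.sent <= end):                # date-range cull
            continue
        lowered = doc.text.lower()
        if not any(t.lower() in lowered for t in terms):  # search terms
            continue
        survivors.append(doc)
    return survivors
```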
Saving time, effort and resources by proper data culling
That isn't to say processing isn't useful. Unless the requestor explicitly requests (or requires) that data be unprocessed, processing it prior to handing it off to reviewers can save time, effort, and resources by reducing the raw amount of data to be reviewed. Tools need to be in place for data to be culled properly: during the processing stage, as in any part of an investigation, defensibility remains essential. Specialized eDiscovery tools are built for this purpose, and cull data in such a way that the processing itself is documented, allowing recipients to retrace the steps taken to cull data, so they can rest assured that the culling has not removed data that could be instrumental to the purposes of their request.
Processing isn't just culling, though. Extraction is also an important function. In the context of processing, extraction means creating (machine-)readable data out of compound or non-searchable objects. Compound objects include compressed files: imagine a ZIP file attached to a message in someone's inbox; it can contain multiple separate files that may hold important information. An eDiscovery tool can unpack such files and (we'll get to this later) check whether there's anything important in there.
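As a minimal sketch of compound-object extraction, using only Python's standard library (real processing engines handle many more container formats), the function below walks a ZIP file and descends into any ZIPs nested inside it:

```python
import io
import zipfile

def list_nested(data: bytes, prefix: str = "", max_depth: int = 5):
    """Yield the paths of all files inside a ZIP, descending into nested ZIPs."""
    if max_depth < 0:
        return
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        for info in zf.infolist():
            path = prefix + info.filename
            if info.filename.lower().endswith(".zip"):
                # Recurse into the inner archive instead of treating it as
                # an opaque blob -- evidence may hide several layers deep.
                yield from list_nested(zf.read(info), path + "/", max_depth - 1)
            else:
                yield path

# Usage: for name in list_nested(open("attachment.zip", "rb").read()): print(name)
```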
Investigating non-searchable objects
Non-searchable objects may sound like something that doesn’t come up frequently, but think about scanned receipts for example: those aren’t readily searchable. In addition to scanned documents, PDF files, bitmaps, video/audio files, etc. are all non-searchable by default, and need processing to be made readable by a search tool. eDiscovery tools may use automatic OCR (optical character recognition) and/or audio search to transcribe sound files. Through these methods, eDiscovery tools transform the non-searchable into a format both human and machine reviewers can work with.
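As a sketch of the OCR step, here is how a scanned page could be turned into indexable text using the open-source Tesseract engine via the pytesseract wrapper; this tooling choice is our assumption, as eDiscovery suites typically bundle their own OCR pipelines:

```python
# Requires the Tesseract engine plus the pytesseract and Pillow packages.
from PIL import Image
import pytesseract

def make_searchable(image_path: str) -> str:
    """Turn a scanned page (e.g., a receipt) into plain, indexable text."""
    return pytesseract.image_to_string(Image.open(image_path))

text = make_searchable("scanned_receipt.png")  # hypothetical input file
```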
Once extracted, the culling stage of the processing can begin. At the very least, tools should ensure the data is without double entries through what’s called deduplication. More advanced tools (such as ZyLAB ONE) enable users to cull data by using filters and queries, reducing the amount of data eligible for review.
8. eDiscovery Best Practices: Data Review & Analysis
Finally, after all that work to map, collect, and process the data involved in the investigation, it’s time to start doing what the investigation has been wanting to do since before data mapping was but a glimmer in the eye of the Case Manager: investigate. Reviewing and analyzing data are a package deal here: reviewing data simply means evaluating the data for relevance, while analyzing evaluates that relevant data for content and context.
For the purposes of this document, the focus will be on the review phase. Although analysis is an important part of the process as a whole, we’ll not discuss it in-depth here: it is exceedingly difficult to standardize weighing context and content in a dataset as that process is obviously very much defined by the content and context. There are a few notes however, and we’ll get to these towards the end of this section.
Preparing the data for review
Up to this point, most of the process as a whole has been about finding data and moving data from custodians to investigators. Now that the dataset has arrived, reviewers need to prepare for their task, which is to establish relevance of the data in their set. Regardless of the size of the review team, two things need to be established prior to starting the review: the review strategy and the review environment.
The review strategy means setting up (or putting in place) protocols that define how the review will be conducted, set a timetable, and establish the terminology to be used for tags, codes, and annotations. If the dataset contains some amount of foreign-language materials, the strategy should define whether these should be left as-is, machine translated, or (if possible) translated by humans. If tools permit, the usage of Technology Assisted Review (TAR) should be noted here as well. Finally, a protocol for handling sensitive, confidential, or privileged data should be put in place.
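One way to keep such a strategy unambiguous is to capture it as structured data rather than prose, so every reviewer applies the same tags and handling rules. The sketch below is hypothetical; all tag names and values are invented for illustration:

```python
# A review protocol captured as data; all names are illustrative.
REVIEW_PROTOCOL = {
    "deadline": "2021-09-30",
    "tags": {
        "RESP": "Responsive to the request",
        "NR":   "Not responsive",
        "PRIV": "Privileged -- route to legal, do not produce",
        "CONF": "Confidential -- redact before production",
    },
    "foreign_language": "machine-translate, then flag for human translation",
    "tar_enabled": True,
}
```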
Once the review strategy is finished, the review environment needs to be prepared. Reviewers need to receive user access rights to the tools they need to perform their duties, and be given instructions and (if needed) training to execute their part of the investigation.
Reviewing the dataset
With the dataset accounted for and the strategies and tools in place, the review can begin. While the data is weighed for relevance, detailed logs should be kept, so it can be included later during the presentation of the data as a technical report.
The low-tech way of reviewing is, as the name implies, low-tech. It mostly consists of reviewers sifting through data manually and weighing documents for relevance individually. Though it is most definitely the classic way of handling discovery, it is also by far the most time-consuming, labor intensive, and error prone.
Through no fault of their own, at the end of the day, reviewers are human. That means the droning, repetitive nature of reviewing will eventually get to them, which invariably leads to mistakes, inconsistencies, or concentration lapses. And since humans are only equipped with a single pair of hands and eyes, they're fairly limited in how much data they can get through. This means low-tech manual review tends to take a lot of time.
Conversely, the high-tech way of reviewing means using an eDiscovery solution to help mitigate the limitations of human reviewers. Modern eDiscovery software has advanced tools to quickly and automatically cull irrelevant data from a set using Technology Assisted Review (TAR), a process through which a reviewer 'trains' the solution, powered by Artificial Intelligence (AI), to tell the difference between relevant and irrelevant data. Once the AI understands the difference, it can classify documents based on input from reviewers, in order to expedite the organization and prioritization of the dataset. This classification may include broad topics pertaining to discovery responsiveness, privilege, and other designated issues.
TAR can dramatically cut down the time (and cost) of reviewing, as reviewers now only need to review a dataset pre-selected for relevance. Of course, the input process for the AI’s training set will be documented in order to preserve defensibility. It’s important to note that TAR doesn’t mean human reviewers are not involved at all; verification remains important – the A in TAR stands for Assisted, not Autonomous.
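To illustrate the core idea behind TAR, rather than any specific product's implementation, the sketch below trains a simple text classifier on a handful of reviewer-coded documents and uses it to prioritize the unreviewed backlog. The example documents, labels, and choice of scikit-learn are all ours:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Documents a human reviewer has already coded: 1 = relevant, 0 = not.
seed_docs   = ["quarterly payment to offshore account", "team lunch schedule",
               "invoice approval outside policy", "office plant watering rota"]
seed_labels = [1, 0, 1, 0]

model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(seed_docs, seed_labels)

# Score the unreviewed pile so likely-relevant documents surface first.
backlog = ["wire transfer approval chain", "birthday cake signup sheet"]
for doc, p in zip(backlog, model.predict_proba(backlog)[:, 1]):
    print(f"{p:.2f}  {doc}")
```

In practice the training set is orders of magnitude larger and the model is retrained as reviewers keep coding, but the feedback loop is the same.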
Regardless of which method of review is used, the end of the review phase yields a culled dataset comprising only relevant material.
Analyzing the dataset
Whether or not review is the final step depends on the context of the investigation. If it concerns an external request, the review dataset moves directly to the presentation phase. For internal investigations, the review dataset will need to be analyzed as well. If the information request originates externally, the analysis will be performed by the requestor. No matter who does the analysis, this part of the process may create a feedback loop: if analysis shows that the dataset is missing information, the review process starts up again to provide it. If it turns out the missing information is not present in the review dataset, the loop-back can go all the way back to the identification phase.
As mentioned earlier, analysis is the practice of evaluating the review dataset for content and context. The goal is to find key patterns, topics, people and conversations. Essentially, if review answers the question ‘where is the relevant information?’, analysis answers ‘what is the relevant information?’. We won’t delve too deep into analysis here, but suffice to say that modern end-to-end eDiscovery solutions offer a wide range of visualization options and analytical tools to identify and show the connections and content in a dataset.
9. eDiscovery Best Practices: Production & Presentation
The final phase of the investigation combines two phases of the EDRM model. For the purposes of this document, we will assume the investigation has occurred at the behest of an external party, which means we pick up after review has concluded, and focus on producing the review dataset for analysis by the requestor.
In terms of this phase, production is the act of preparing the review dataset for handover to the requestor. Presentation goes a bit further than that, and occurs after the analysis phase: it is the act of presenting the outcomes of the fact-finding operations on the review dataset. As such, we’ll mostly focus on production in this section, since presentation is out of scope.
Completing the investigation: producing eDiscovery results
Once the review dataset is believed by investigators to be complete, the documents will need to be prepared for the requestor to analyze. The most important decision at this stage is determining the production format. A few options are available:
- Native format: files are produced as they were originally – the disadvantage to native production is that it doesn’t easily allow for redaction.
- Image formats: files are rendered into an image format such as PDF or TIFF. This is the most common output format used in an eDiscovery solution, as it allows for the redaction of information and the stamping of Bates numbers and other information right on the document. Image files can also be opened on any system without access to the original application used to create them.
- Metadata or Load Files: these are typically output during production to provide the metadata of the documents in the production dataset, and often the tagging work product created within the eDiscovery solution. These files can take a variety of industry-standard formats and allow the information, along with the native or image files, to be imported into other eDiscovery systems (a minimal load-file writer is sketched after this list).
- Extracted Text: the full text of the documents, separated from their original format.
- Printed: files are printed out and handed over.
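As referenced in the load-file item above, here is a minimal, hypothetical load-file writer. Real load files follow industry formats (such as Concordance DAT) and carry far more metadata fields; the column set and Bates prefix here are invented:

```python
import csv
import hashlib

def write_load_file(produced_paths: list, out_path: str,
                    bates_prefix: str = "ABC", start: int = 1) -> None:
    """Write a bare-bones CSV load file: Bates number, source path, SHA-256."""
    with open(out_path, "w", newline="") as out:
        writer = csv.writer(out)
        writer.writerow(["BATES", "SOURCE_FILE", "SHA256"])
        for i, path in enumerate(produced_paths, start=start):
            bates = f"{bates_prefix}{i:07d}"   # e.g., ABC0000001
            with open(path, "rb") as f:
                digest = hashlib.sha256(f.read()).hexdigest()
            writer.writerow([bates, path, digest])
```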
Once produced, the production dataset should be delivered to the requestor. The primary concern here should be the security of the dataset in transit. It’d be a shame to compromise the contents of the dataset within arm’s reach of the finish line. Ideally, make use of secure file sharing services or physical data carriers to perform the transfer. Encrypting the files prior to transfer is also very much preferred to further ensure security. The best and preferred way is, of course, delivery through the same eDiscovery platform used for the investigation. This can be done simply by giving (limited) access to the requestor, so they can download a final production.
Completing the investigation: presenting eDiscovery results
For presentation purposes, the requestor may still have need of the Case Manager's assistance. Most commonly, requestors may need a copy of the technical report of the dataset they received. Depending on the type of investigation, the technical report may include the following:
- The data map used for identification of the ESI (this data map can also provide event tracking for the process, from data identification to upload into the eDiscovery software);
- A report of data that could not be collected (and the reasons why);
- An event log of actions performed (successful and/or unsuccessful) during processing;
- A log of the search queries applied to cull the dataset;
- A copy of the review strategy document;
- A log of the data produced and formats used;
- Hashes of the produced data.
Once the production data and technical report have been handed over, the case can be closed. The data within the eDiscovery software can be archived according to the case retention period set in the retention policies.
“Specialized eDiscovery tools cull data in such a way that the processing itself is documented, allowing recipients to retrace the steps taken.”
10. Conclusion
For any type of investigation, defensibility is the most important thing, bar none. An investigation could come up with incredible, groundbreaking results, but without a defensible process those results mean very little. For data investigations, defensibility means documenting every step taken and every choice made. Essentially, when the materials are handed off to a requestor, that requestor ought to be able to retrace the steps of the investigation to understand how the dataset they ended up with came about.
In an ideal world, where everyone always gets along and the money trees bloom year-round, trust is implicit. We, however, do not live in that world. Therefore, it is vital that investigators provide receipts for their choices. This is why solutions for eDiscovery are of key importance to defensibility when it comes to ESI investigations: they provide a means of weighing and culling data without bias. eDiscovery software can provide technical event logs of the process: what queries were used, what filters were applied, etc. A requestor who wants to understand how a dataset came to be can read these technical logs and be sure that the logs are accurate.
The importance of defensibility in eDiscovery investigations
Throughout the process of an investigation, this defensibility should be at the forefront of everyone's mind. An indefensible investigation is essentially useless, and just serves to waste the time and resources of everyone involved. If nothing else, eDiscovery software offers a degree of note-taking and record-keeping during the investigative process that cannot be questioned. At the end of the day, manual record-keeping is a flawed process, no matter how capable the people keeping the records are: mistakes can be made knowingly or unknowingly. It doesn't matter why, or even whether, a mistake was made: the fact that it is possible in the first place can be enough cause to second-guess the investigative process.
With data volumes exploding across the board, making use of eDiscovery software to conduct data investigations is a no-brainer. If the sheer amount of data to be reviewed doesn't make that clear, the number of people and actions needed to review it should. Utilizing an end-to-end solution to conduct the investigation in its entirety allows investigators to use state-of-the-art tools both for their efforts in gathering and culling data and for keeping records along the way. ZyLAB ONE is one such solution, and we'd love to show you around.