Managing the Challenges of Change – The Weight of Multiple Email Archives Put to Rest

By Geoffrey Sherman, RVM Chief Technology Officer

In May 2018, I was featured in an article on RVM’s blog titled “How to Manage the Challenges of Change in Technology.” In that article, we walked through the business drivers, general concerns with change, and criteria for success. We also highlighted how RVM can assist with legacy system migrations to reduce the burden on already busy in-house resources.

As an update, I wanted to share a recent project where RVM was asked to migrate multiple legacy email archive platforms.

Cloud-Hosted Archive System

For the migration of a cloud-hosted archive system, our client required each custodian for each matter to be exported in PST format. This is an arduous process that requires attention to detail and strong quality assurance to complete successfully. To accomplish this, RVM created an intuitive folder structure that mirrored the matters and their respective custodians. In addition, RVM used a combination of PowerShell scripts and a PST split tool to size the exported files appropriately and to validate their counts. Completing this migration phase allowed our client to reduce its licensing expenses and decommission the legacy system.
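As a rough illustration of the count validation described above, the sketch below compares expected and exported message counts for each matter and custodian. The project itself used PowerShell scripts; Python is used here only for illustration, and the function names, folder layout, and file naming are hypothetical rather than a description of RVM's actual tooling.

```python
from pathlib import PurePosixPath


def export_path(root, matter, custodian, part):
    """Build a hypothetical path of the form root/matter/custodian/custodian_NNN.pst."""
    return PurePosixPath(root) / matter / custodian / f"{custodian}_{part:03d}.pst"


def validate_export_counts(expected, exported):
    """Compare message counts keyed by (matter, custodian).

    Returns a dict mapping each mismatched key to (expected_count, exported_count).
    A key missing from either side is treated as a count of zero.
    """
    mismatches = {}
    for key in set(expected) | set(exported):
        want = expected.get(key, 0)
        got = exported.get(key, 0)
        if want != got:
            mismatches[key] = (want, got)
    return mismatches
```

An empty result from validate_export_counts would suggest that every custodian's export matches the source counts; any entry it returns points to a specific matter and custodian to re-check.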

On-Premises Archive System

For the migration of an on-premises system, our client also required a full export of relevant messages. This system held a massive volume of messages, nearly 60 million, across archive and journal email stores. Due to the system’s age and overhead, exporting items from the user interface became problematic. The system was overwhelmed by the volume of exports, which ultimately caused instability for end users and hindered their ability to search and retrieve emails. In addition, initial estimates indicated that exports would take well over one year to complete and would require extensive manual interaction with the system, clicking through each export.

To combat the challenges faced, RVM utilized the expertise of our in-house developers and the archive platform’s high-speed API. The developers created a robust utility to directly export messages, eliminating the prior bottlenecks and application latency. The utility was able to identify the distinct list of custodians available, target those still on legal hold, and export messages in batches.
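The general shape of such a utility can be sketched as follows. This is a minimal illustration only: the record layout, the function names, and the export_batch callback (which stands in for a call to the archive platform's API) are all assumptions, since the actual API is not described here.

```python
from itertools import islice


def custodians_on_hold(custodians):
    """Return the sorted, distinct names of custodians still on legal hold.

    Each record is assumed to be a dict with 'name' and 'on_legal_hold' keys.
    """
    return sorted({c["name"] for c in custodians if c["on_legal_hold"]})


def batched(items, size):
    """Yield successive lists of at most `size` items."""
    it = iter(items)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def export_custodian(custodian, message_ids, export_batch, batch_size=500):
    """Export one custodian's messages in batches and return the number exported.

    `export_batch(custodian, ids)` is a hypothetical callback wrapping the
    archive API; it is injected so the batching logic can be tested alone.
    """
    exported = 0
    for batch in batched(message_ids, batch_size):
        export_batch(custodian, batch)
        exported += len(batch)
    return exported
```

Keeping the batch size as a parameter makes it easy to tune the load placed on the legacy system during the export.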

Once the utility was stable, RVM added threading to the solution, which increased throughput to upwards of 3 million messages daily and allowed us to meet our goal of exporting nearly 60 million messages in total. All this was accomplished in far less time, and with a substantially greater level of accuracy, than would have been possible using the application directly as an administrator. The RVM team also used this as an opportunity to validate counts, estimate duplicate data using a custom hash, and re-hydrate corrupt messages with a successfully stored version.
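To make the threading and duplicate-estimation ideas concrete, here is a small sketch. The fields that feed the hash are an assumption (the composition of the custom hash is not described), and export_one is a hypothetical per-custodian export function.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def message_fingerprint(msg):
    """Hash a normalized subset of message fields (an assumed field choice)."""
    parts = [
        msg["sender"].strip().lower(),
        ";".join(sorted(r.strip().lower() for r in msg["recipients"])),
        msg["subject"].strip(),
        msg["sent_utc"],
        msg["body"].strip(),
    ]
    # A unit-separator character keeps adjacent fields from running together.
    return hashlib.sha256("\x1f".join(parts).encode("utf-8")).hexdigest()


def estimate_duplicates(messages):
    """Count messages whose fingerprint has already been seen."""
    seen = set()
    duplicates = 0
    for msg in messages:
        fingerprint = message_fingerprint(msg)
        if fingerprint in seen:
            duplicates += 1
        else:
            seen.add(fingerprint)
    return duplicates


def export_all(custodians, export_one, max_workers=8):
    """Run export_one(custodian) on a thread pool; return custodian -> count."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        counts = list(pool.map(export_one, custodians))  # map preserves order
    return dict(zip(custodians, counts))
```

Threads suit this kind of work because each export spends most of its time waiting on the archive system rather than computing; the worker count would need tuning against whatever load the legacy platform can tolerate.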


The challenges presented by decommissioning or migrating legacy systems can create a real roadblock for organizations. When selecting a partner for such endeavors, ensure that the partner has a clear proposed plan, tremendous determination, and a track record of adapting when traditional approaches are not successful.

When Did eDiscovery Grow Up

By Sean King, COO, RVM Enterprises

I recently held a roundtable conversation with my team at RVM to see what was on their minds. During the discussion, one of them asked a question that I hadn’t really thought about before: When did eDiscovery cease to be “new and shiny” and start to become an established field?

I have been a proud member of the eDiscovery industry for nearly 20 years, having worked in prestigious litigation support and legal technology departments at AM Law 100 law firms and at legal technology service providers that, like RVM, provide direct service to law firms and corporate clients. I grew up in this industry. I started as a kid out of college who knew something about computers and helped lawyers do what they needed to do with computers. I have watched this industry blossom into something that deals with cybersecurity, artificial intelligence, and blockchain. eDiscovery has changed so much, and yet that change has been practically seamless.

So much of our day-to-day lives has changed, and quickly. When I think back on what I was doing 20 years ago, I remember using DVDs to watch movies and CDs in my car for music. I had a BlackBerry as a phone, spent over an hour signing the mortgage papers for my first house by hand, and shopped at brick-and-mortar locations. The landscape for those businesses has changed significantly. And, while some worried that the profound changes ushered in by Apple and Amazon would harm the existing industries, I could argue that they are even stronger and more successful because they have evolved to keep up with those leaders.

The story of eDiscovery is the story of evolution. It is the story of a kid growing from childhood to an adult. eDiscovery, metaphorically, changed from a kid to a teenager to an adult. Twenty years ago, it needed a lot of hand holding. The parents — lawyers, if you will — were afraid to mess it up, lest it turn bad. And, they went through a lot of growing pains such as FRCP changes, sanctions, case precedent, and technology changes. But the parents have become a lot hipper to the technology; they rely on the books and precedent that are written, and they are more comfortable with the constant changes.

To continue this metaphor, one might even say that eDiscovery is now married. For so long, we treated eDiscovery as unique, a thing to be carefully handled, and treated differently from all other discovery. Today, it is the most common form of discovery and is merged with other things like cybersecurity, artificial intelligence, cloud-based storage, BYOD policies, and other mature technology enhancements to our everyday business lives. Like many of us who are now adults, there is no unique distinction. We are all the same. We all work, provide for our families, and do the things we need to as adults. Once we became adults, we sort of blended in with everyone else.

So, where is this growth taking eDiscovery? We at RVM continue to see that maturation process at warp speed. Our clients are not inexperienced and technologically incapable lawyers at law firms. Instead, they are increasingly sophisticated corporate general counsels who deal with eDiscovery as part of their everyday business. eDiscovery is not seen as novel or niche. It has become mainstream. And, as many have written before, it is time to drop the term eDiscovery and call it what it is: discovery.

It’s Time to Take Action Against IP Theft

Recently, Tesla CEO Elon Musk was forced to admit that his company was the victim of sabotage by one of its own employees. That employee, frustrated over recently being passed over for promotion, applied damaging code to the company’s manufacturing system and shared large amounts of sensitive data with third parties.

Given the company’s desperate need to make progress following a string of negative announcements, the timing couldn’t have been worse.

Tesla’s situation, though perhaps one of the highest profile cases, is not new or unheard of. Companies quietly monitor their workflows and processes for any signs of IP theft or sabotage by disgruntled or even misinformed employees. Very often, it’s simply a case of those employees taking the work product that they created, believing that they have ownership. In other cases, an employee may copy large contact lists hoping to maintain and divert relationships to a new employer.

Whatever the theft, and whatever the motivation behind it, this particular crime is common and can cause a company not only financial loss, but the potential for serious reputation damage and even litigation.

Roughly 50 percent of employees will take work product when they leave a company, and close to 40 percent will attempt to leverage that work product on behalf of their new employer.

But what can we do about it?

Most companies leverage commonplace strategies, such as blocking employees from using online storage sites such as Dropbox, or disabling USB ports so that files cannot be moved to USB storage devices. The fact is that these methods are only a minor stumbling block for an employee intent on taking work product.

In the past, to determine whether information was stolen, companies needed to do forensics work, costing a lot of money, time, and resources.  It is hard to measure an ROI for a process like this because you cannot assess the value of an event that may have been prevented, and you cannot assume the result before you commit the resources.  Many companies struggle to see the value in building processes that protect their IP in the face of committing resources to R&D, service line launches, shareholder rewards, or employee benefits.

Understanding this challenge and leveraging its forensics expertise, RVM created a tool – Tracer – to analyze computers and identify activities that might be affiliated with potential IP theft. It is designed to look for user behaviors (online and offline) that may indicate an employee’s ill intentions. The tool can sweep through the user’s actions looking for files and actions and can draw attention to troubling patterns to guide an employer’s decisions.

But technology alone may not be enough to overcome the problem. Leveraging experts who can properly assess the problem and collaborate with a company to right-size the solution is a powerful next step. The best way for companies to protect their IP is to ask the hard questions regarding its value and be prepared to take action.

Tesla is a strong company with a stable revenue stream, and will likely weather this storm. Other companies may not be so fortunate.

The Necessary Evil of Search Terms

By A.J. Strollo

“Having lawyers or judges guess as to the effectiveness of certain
search terms is ‘truly to go where angels fear to tread.’”
Magistrate Judge Facciola,
United States v. O’Keefe, 537 F. Supp. 2d 14, 24 (D.D.C. 2008)

This statement was made 10 years ago, and the wisdom – particularly when looking at the complexities relating to term syntax and what exists within data sets – has only become more prescient. Search terms can seem fraught, if not outright risky. So why do we continue to rely on them?

Despite the concerns surrounding keywords, and even after all the recent technological gains, they remain the most common way to cull data for potential review and production. The reason for this is likely that they are familiar, and as we all know, the legal community can be slow to move away from the tried and true, particularly when the alternatives involve relinquishing control to machines.

It’s relatively easy to generate a proposed list of terms, run them against the data, and determine how many documents the terms capture. But knowing whether the terms actually capture information of interest is a different story. Along those lines, Magistrate Judge Facciola noted that whether the terms “will yield the information sought is a complicated question involving the interplay, at least, of the sciences of computer technology, statistics and linguistics.”  Id.

Facciola may have said this because of the way lawyers often use the search results without substantive analysis. A common practice when running terms is to look at the volume of data that is returned, rather than the quality or effectiveness of the search. So, if the data returned is significantly higher than expected, the lawyer may narrow the terms arbitrarily with the goal of reaching the “right” number of documents. How they determined what is “right” can be a mystery. These adjustments may yield fewer results, but also risk eliminating necessary ones. While that’s not to say that this practice is haphazard, it does lack defensibility, especially if parties are locked in a contentious battle over the scope of discovery.

For me, and I think Facciola would agree, a better focus than volume is the effectiveness of the terms, measured not solely by the number of documents returned but by the richness, or “relevancy rate,” of the potential review population.

So how do we make keywords and search terms more effective and assuage the “fears of the angels?”

A big step is to perform substantive analysis of any search terms rather than the commonly used guess and check method. When the starting point is a list of proposed terms from opposing counsel with an uncertain level of effectiveness, we must assess and refine those terms to increase the likelihood of capturing the most relevant documents. Borrowing concepts from basic statistical analysis, the process for vetting terms and suggested revisions can be based on results of a sample review.  Terms are modified by targeting common false positive hits — hits on the term but not for the intended target — identified within the non-relevant documents from the sample.
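The sample-based vetting described above can be sketched in a few lines. This is an illustrative example only: the 95 percent Wilson score interval is one common choice for estimating a rate from a sample, not necessarily the method used in any particular matter, and the data layout is hypothetical.

```python
import math
from collections import Counter


def relevancy_rate(relevant, sample_size, z=1.96):
    """Return (rate, low, high): the sample relevancy rate and a Wilson
    score interval (z=1.96 corresponds to roughly 95 percent confidence)."""
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    p = relevant / sample_size
    denom = 1 + z * z / sample_size
    center = (p + z * z / (2 * sample_size)) / denom
    half = z * math.sqrt(
        p * (1 - p) / sample_size + z * z / (4 * sample_size ** 2)
    ) / denom
    return p, center - half, center + half


def false_positive_hits(sample):
    """Tally which terms hit on non-relevant sample documents.

    `sample` is a list of (hit_terms, is_relevant) pairs as coded by reviewers.
    """
    counts = Counter()
    for hit_terms, is_relevant in sample:
        if not is_relevant:
            counts.update(set(hit_terms))
    return counts
```

Terms with the highest false-positive tallies are the natural first candidates for revision, and the interval gives a sense of how much the sample's relevancy rate might differ from the rate in the full population.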

Imagine a fact pattern where the relevant discussions involve Jacob Francis and his interactions with a specific contract. Initial searches for Jacob OR Francis in documents that also contain the contract title or number would yield a substantial volume of documents based on the commonality of Jacob’s name.  It’s easy to label this as a bad term, but a lawyer’s analysis is helped much more by understanding why it is bad and how to make it better. Attorneys can do this by looking at the documents, which reveals that there are others at the company with Jacob or Francis in their names (e.g., Jacob Smith or John Francis), thus opening the door to an array of potential term revisions to minimize the number of documents returned. This is a good start, but the analysis does not end there.

Next, it is important to check actual document hits to ensure they are consistent with any assumptions. To do that, an attorney should draw and review an additional sample from the documents that were removed from the review population to ensure the new terms are not missing potentially relevant content. Digging into these, the attorney may find that Jacob Francis had a nickname, “Jake,” which would not be captured after narrowing the terms from Jacob OR Francis to Jacob w/2 Francis. Continued analysis may also uncover references to the contract negotiation as “Project Apple” instead of the contract title or number.

Using this knowledge and adding or modifying the search to include “Project Apple” and “Jake” addresses these missing documents, avoiding potentially serious omissions. Additional considerations might include running “Project Apple” as a conceptual search rather than as a strict keyword, seeking documents that are similar in meaning but that do not necessarily share the same set of terms.
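The effect of each revision in this example can be illustrated with a toy matcher. This sketch assumes that "w/2" means the two words appear within two word positions of each other in either order; real review platforms define proximity operators in their own ways, and the tokenizer here is deliberately simplistic.

```python
import re


def tokens(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def within(text, a, b, n):
    """True if words a and b occur within n word positions of each other."""
    toks = tokens(text)
    pos_a = [i for i, t in enumerate(toks) if t == a]
    pos_b = [i for i, t in enumerate(toks) if t == b]
    return any(abs(i - j) <= n for i in pos_a for j in pos_b)


def contains_phrase(text, phrase):
    """True if the words of `phrase` appear consecutively in `text`."""
    toks, wanted = tokens(text), tokens(phrase)
    return any(
        toks[i:i + len(wanted)] == wanted
        for i in range(len(toks) - len(wanted) + 1)
    )


def original_term(text):
    """Jacob OR Francis"""
    toks = tokens(text)
    return "jacob" in toks or "francis" in toks


def narrowed_term(text):
    """Jacob w/2 Francis"""
    return within(text, "jacob", "francis", 2)


def revised_term(text):
    """(Jacob OR Jake) w/2 Francis, OR the phrase "Project Apple" """
    return (
        within(text, "jacob", "francis", 2)
        or within(text, "jake", "francis", 2)
        or contains_phrase(text, "Project Apple")
    )
```

Under these assumptions, a message mentioning only John Francis hits the original term but not the narrowed or revised ones, while messages mentioning "Jake Francis" or "Project Apple" are missed by the narrowed term and recovered by the revised one.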

The payoff of all this work is a more focused set of documents for review, reducing associated costs, and concentrating the review team’s time working on documents in need of review. Considering the alternative of reviewing countless volumes of data unnecessarily, or worse, discarding valuable documents, it’s clear that using keyword searches – effectively – is not only necessary, but beneficial.

It Pays to Use Formal Discovery

Preparing for litigation comes with a mountain of expenses and challenges, many of which are attributable to discovery. And as data volumes grow, so too do those discovery costs. Unfortunately, eDiscovery is often misunderstood by clients and perceived as more complicated than it needs to be.

In an effort to contain the rising tide of costs and perceived complexity, some litigants are undertaking “informal discovery” — a process that on its face seems like a cost-effective and ideal option. It allows for the exchange of key documents without the burden of production format, custodian tracking or consideration for defensibility. In a common scenario the client will comb through their own inbox and send the relevant emails to counsel.

Sounds like a good deal, right?

“Clients don’t like the idea of paying money for things that they believe they can do themselves,” says Greg Cancilla, Director of Forensics at RVM Enterprises. “Collecting data can seem more like a job for an intern than an eDiscovery and legal forensics firm.”

Although it might seem like a cost-effective approach, parties that engage this way may be in for trouble.

The Trouble with Informal Discovery

Common Missteps in Informal Discovery
Self-selection of relevant documents
Self-collection of ESI
Emailing documents to counsel as attachments
Copying and pasting files to external media or an FTP site
Producing ESI by a) printing to hard copy or b) converting the files to .pdf
Bates numbering documents individually

A major concern with informal discovery is the risk exposure regarding authentication of evidence and the potential extra time and costs one might incur to correct the collection of data.  While eDiscovery providers have developed systems and technologies that enable them to work quickly and efficiently in an appropriate review environment, an informal approach does not offer those advantages. eDiscovery providers take the appropriate time and use the correct processes to collect data so that it can be done once, efficiently, and defensibly. With informal discovery, if further searches are warranted, the entire process may need to be repeated, adding undesirable costs and time.

Another issue is the likelihood of altering metadata. By using the “copy and paste” (or “foldering”) approach to data collection, you run the risk of modifying key dates such as last opened and last modified. This can make authentication problematic and makes it harder to sort and de-dupe files that have been modified, again adding to cost.
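A small experiment shows how easily an ordinary copy can change a file's dates. The sketch below uses Python's standard library only; which timestamps change in practice depends on the operating system and the copy tool, so this illustrates the risk rather than any forensic collection method.

```python
import os
import shutil
import tempfile
from pathlib import Path

PAST = 1_500_000_000  # a fixed timestamp in the past (seconds since the epoch)


def demo_copy_timestamps():
    """Return (source, plain_copy, preserved_copy) last-modified times."""
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "memo.txt"
        src.write_text("draft")
        os.utime(src, (PAST, PAST))  # pretend the file was last modified long ago

        plain = Path(tmp) / "plain_copy.txt"
        preserved = Path(tmp) / "preserved_copy.txt"
        shutil.copy(src, plain)       # copies content and permissions only
        shutil.copy2(src, preserved)  # also attempts to preserve timestamps

        return (
            int(src.stat().st_mtime),
            int(plain.stat().st_mtime),
            int(preserved.stat().st_mtime),
        )
```

In this experiment the plain copy receives a new last-modified time, while the copy made with shutil.copy2 keeps the original one. The contents are identical in both cases, but only one copy still reflects when the document was actually last changed.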

The most important shortcoming of the informal method is the unnecessary risk of misstating the scope of the production of electronically stored information (ESI); see Applied Underwriters, Inc. v. American Employer Group. In some circumstances, courts have held that self-identification and collection may not even be defensible.

According to Cancilla, “Self-collection puts all the responsibility on the custodian to determine what ESI is relevant. Foldering in particular can be troubling, as even well-intentioned clients may simply not realize that certain sources, a sent mail box for example, need to be included in the folder to be produced.”  In today’s age of electronic information, it is important to note that relevant information is not just the substance of the document, but also the metadata — or surrounding information — of the document.  FRCP Rule 34(b)(2)(E) advises that a party must produce documents “as they are kept in the usual course of business” or must “organize and label them to correspond to the categories in the request.”  “Informal Discovery” adversely impacts that instruction.

Changes on the Horizon

Two proposed amendments to Federal Rule of Evidence 902 are set to take effect on December 1, 2017, and will significantly affect the collection of ESI and its admissibility. In addition to providing a structure for standardizing ESI collection, these amendments, 902(13) and 902(14), demand a stricter, more organized method of collection that is outside the scope of informal eDiscovery. Where the current version of Rule 902 allows for self-authentication of certain types of documents, the new additions allow for authentication of electronic evidence by a written certification from a “qualified person” that complies with the certification requirements of Rule 902(11) or (12).

“The new rules are changing everything,” continues Cancilla. “It doesn’t make any attempt to disincentivize self-collecting, but by making ESI gained through formal discovery ‘self-authenticating,’ the advantages are well worth any cost to work with the professionals.”

The new rules cover records that can be authenticated using a document’s hash values, which are assumed to be unique. For purposes of authentication, hash values are the backbone of the proof that Rule 902 requires, but not the only allowable method. As the Advisory Committee on Evidence notes, “[t]he rule is flexible enough to allow certifications through processes other than comparison of hash value, including by other reliable means of identification provided by future technology.”
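As a simple illustration of hash-based identification, the sketch below records a SHA-256 value for every file in a collection folder and later checks whether any file has changed. The choice of SHA-256, the manifest layout, and the function names are assumptions made for the example; they are not requirements of the rule or a description of any specific forensic tool.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 hash."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): sha256_of_file(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def verify(root, manifest):
    """Return the sorted relative paths that are missing or no longer match."""
    root = Path(root)
    problems = []
    for rel_path, expected in manifest.items():
        target = root / rel_path
        if not target.is_file() or sha256_of_file(target) != expected:
            problems.append(rel_path)
    return sorted(problems)
```

A manifest built at collection time and verified again before production gives a simple, repeatable way to show that the files in hand are the files that were collected.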

As December draws closer, parties must consider the implications of these rule changes and how they may affect authentication in upcoming trials. If they wish to take advantage of the new rules they must be prepared to track digital fingerprints on any new collection. If they don’t, they stand to spend more time and money authenticating their documents, including having their own in-house IT and network administration staff called to testify.

Says Cancilla, “Using the informal method of discovery is like driving with too little insurance: you’ll save money for a while, but if anything bad happens, you could wind up paying for it. Companies should remember that a well-documented and formalized data collection process is a small investment relative to the overall eDiscovery spend, but can significantly affect accuracy and defensibility.”


Greg Cancilla, EnCE, ACE is a Certified Computer Forensic Engineer and the Director of Forensics at RVM. He has performed countless digital forensics investigations since entering the field in 2003. Additionally, Greg has offered testimony in numerous cases, including presenting a key piece of evidence in Ronald Luri v. Republic Services, Inc., et al., which rendered the largest verdict in the State of Ohio’s history.

4 Questions About Media Preservation & Restoration

Last month RVM announced that it had acquired The Oliver Group – experts in collecting and preserving data stored on tape and other offline media.

Anyone involved in litigation discovery and collection understands the critical nature of electronic data. Long gone are the days of collecting paper from centralized file cabinets. Instead, companies are challenged with collecting data from multiple sources, such as email, hard drives, file shares, cell phones, social media, and older media including backup tapes. At times these backup tapes and offline media, often found tucked away in storage closets, can be the most burdensome and expensive to collect and process.

Unfortunately, with so much attention on cloud solutions, SAN and NAS storage, and other easy data storage options, there just aren’t as many companies capable of handling this kind of data properly. However, collecting and authenticating offline files in a defensible manner is just as critical as it is for their online brethren.

That’s where RVM and The Oliver Group come in. To learn more about the acquisition and why it’s such a game changer, we spoke to Chief Operating Officer Sean King.


What was the impetus for this acquisition?

RVM was interested in meeting its clients’ needs by becoming a one-stop shop for forensics, media restoration, and eDiscovery services. We saw The Oliver Group and the services they offer as a great partner that is very well-known and respected among clients and competition in the industry.

How will media preservation/restoration be folded into the work that RVM currently does?

Media preservation and restoration are a natural extension of RVM’s services. We’ve actually had a long working relationship with The Oliver Group, so we’re familiar with them and how they work. Our goal has always been to manage our clients’ eDiscovery needs – from data collection through document review and production – and this acquisition makes that possible in such a way that clients benefit with a streamlined process and lower costs.

With so many tech and media companies in the market what makes The Oliver Group’s work special?

The Oliver Group is one of the select few companies that understands media and its application to legal discovery requirements. RVM and The Oliver Group are both focused intently on the defensibility of the data that are collected – that means having policies that indicate compliance with a comprehensive audit trail and chain of custody. We have to be able to track the movement, access, and location of the data in question throughout the life of the evidence, and that is not something that a commercial media firm is equipped to do.

What do you see as the long-term future of media preservation/restoration?

There will always be a need for media preservation and restoration as companies respond to disaster recovery, compliance requirements, and litigation needs. Data management continues to evolve, and being able to support all storage media is a requirement for service providers looking to offer clients a cost-effective and defensible offering.


RVM to Participate in the 2017 NAMWOLF Annual Meeting & Law Firm Expo in September

In alignment with our dual passions for education and diversity, RVM is pleased to be a sponsor and CLE presenter at the 2017 NAMWOLF Annual Meeting & Law Firm Expo September 17-20, 2017 in New York City.  The National Association of Minority & Women Owned Law Firms, founded in 2001, is committed to promoting diversity in the legal profession by fostering successful relationships among preeminent minority and women owned law firms and private/public entities.  The event is full of networking opportunities and continuing legal education sessions on provocative and poignant topics that face legal professionals today.

RVM’s Manager of Education & Development, Talia Page, will be moderating an esteemed and diverse panel of attorneys from around the country in a session entitled “OMG, There’s Evidence in My Pocket!? How the Proliferation & Accessibility of Data Affects Discovery, & What You Need to Know about the New Federal Rules” on Monday September 17th from 1:45pm-2:45pm. In this session, we will address hot topics and trends in eDiscovery using recent case law to work through some of the challenging issues litigators face in the digital age in light of the new Federal Rules. For more information on NAMWOLF and other CLE opportunities, visit www.namwolf.org.