Why Hash Values Are Crucial in Evidence Collection & Digital Forensics

Traditionally, proving the authenticity of a piece of digital evidence could be tricky, especially if opposing counsel was determined to keep it out of evidence. Legal teams would have no other option than to spend significant time and resources on providing a sponsoring witness who could testify to the authenticity. 

Thanks to Federal Rules of Evidence 902(13) and (14), which were added by amendment in 2017, witness testimony is often no longer necessary. Electronically stored information (ESI), such as social media posts and comments, cellphone images, text messages, and website content, can now be submitted as self-authenticating evidence when accompanied by a proper certification, which means submission can be greatly streamlined.

But what does this look like in practical terms? For the most part, it comes down to making use of hashing algorithms when collecting and authenticating evidence. To better understand this, we need to take a closer look at the amendments themselves. FRE 902(13) and (14) state:

(13) Certified Records Generated by an Electronic Process or System. A record generated by an electronic process or system that produces an accurate result, as shown by a certification of a qualified person that complies with the certification requirements of Rule 902(11) or (12). The proponent must also meet the notice requirements of Rule 902(11).

(14) Certified Data Copied from an Electronic Device, Storage Medium, or File. Data copied from an electronic device, storage medium, or file, if authenticated by a process of digital identification, as shown by a certification of a qualified person that complies with the certification requirements of Rule 902(11) or (12). The proponent also must meet the notice requirements of Rule 902(11).

While the amendments themselves don’t mention any specific ‘electronic process or system that produces an accurate result,’ references to hash values are made in accompanying comments provided by the Standing Committee on Federal Rules. These notes read: 

Today, data copied from electronic devices, storage media, and electronic files are ordinarily authenticated by "hash value." A hash value is a number that is often represented as a sequence of characters and is produced by an algorithm based upon the digital contents of a drive, medium, or file. If the hash values for the original and copy are different, then the copy is not identical to the original. If the hash values for the original and copy are the same, it is highly improbable that the original and copy are not identical. Thus, identical hash values for the original and copy reliably attest to the fact that they are exact duplicates.

What Is a Hash Value?

Like the Standing Committee on Federal Rules, the Cybersecurity and Infrastructure Security Agency (CISA) describes a hash value, which is the output of a hash function, as:

A fixed-length string of numbers and letters generated from a mathematical algorithm and an arbitrarily sized file such as an email, document, picture, or other type of data. This generated string is unique to the file being hashed and is a one-way function—a computed hash cannot be reversed to find other files that may generate the same hash value. Some of the more popular hashing algorithms in use today are Secure Hash Algorithm-1 (SHA-1), the Secure Hashing Algorithm-2 family (SHA-2 and SHA-256), and Message Digest 5 (MD5).

In simple terms, a hash value is a fixed-length string of characters that an algorithm computes from the contents of a particular file. If the file is altered in any way and you recalculate the value, the resulting hash will, for all practical purposes, be different. In other words, with a sound algorithm it is practically impossible to change the file without also changing its hash value. So if two copies of a file have the same hash value, you can be confident, to a very high degree of probability, that they are identical.
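
To make this concrete, here is a minimal Python sketch that hashes a sample file, an exact copy, and a copy with one character changed. It uses SHA-256 from the standard `hashlib` module; the function names (`sha256_of_file`, `demo_copy_comparison`) and the sample file contents are illustrative assumptions, not part of any particular forensic tool.

```python
import hashlib
import os
import shutil
import tempfile


def sha256_of_file(path, chunk_size=65536):
    """Return the SHA-256 hex digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Reading in chunks keeps memory use flat even for very large files.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def demo_copy_comparison():
    """Hash an original, an exact copy, and an altered copy of a sample file."""
    with tempfile.TemporaryDirectory() as workdir:
        original = os.path.join(workdir, "original.txt")
        exact_copy = os.path.join(workdir, "copy.txt")
        altered = os.path.join(workdir, "altered.txt")

        # Hypothetical sample content standing in for a collected file.
        with open(original, "wb") as handle:
            handle.write(b"Sample evidence content\n")
        shutil.copyfile(original, exact_copy)
        with open(altered, "wb") as handle:
            handle.write(b"Sample evidence content.\n")  # one extra character

        return (
            sha256_of_file(original),
            sha256_of_file(exact_copy),
            sha256_of_file(altered),
        )


if __name__ == "__main__":
    original_hash, copy_hash, altered_hash = demo_copy_comparison()
    print("original:", original_hash)
    print("copy matches original:   ", copy_hash == original_hash)
    print("altered matches original:", altered_hash == original_hash)
```

The exact copy produces the same digest as the original, while the altered file does not, which is the comparison the Committee note describes.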

A hash value can reliably support authenticity thanks to four particular characteristics:

  • It is deterministic, meaning that a specific input (or file) will always deliver the same hash value (number string). This means that it is easy to verify the authenticity of a file. If two people independently (and correctly) check the hash value of a file using the same algorithm, they will always get the same answer.
  • The odds of “collisions” are low. This means that the chances of two different inputs (files) coincidentally having the exact same hash value are incredibly small, to the point of being practically non-existent. Older algorithms such as MD5 and SHA-1 are known to be vulnerable to deliberately engineered collisions, which is one reason SHA-256 is commonly preferred.
  • A hash can be calculated quickly. Generating a hash value is quick and easy (provided you have the right tool). Processing time grows with the size of the file, but the procedure is the same for a large file as for a small one, and the resulting hash is always the same length.
  • Any change to the input will change the output. Even the smallest change to the input file will result in a change to the resulting hash value. This means that it is practically infeasible to alter a file without changing the associated hash value, which makes it very easy to prove (or disprove) the authenticity of a piece of digital evidence.
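
The characteristics above can be observed directly with a short Python sketch. It relies only on the standard `hashlib` module; the helper names (`hex_digest`, `differing_hex_chars`) and the sample messages are assumptions made for illustration.

```python
import hashlib


def hex_digest(algorithm, data):
    """Hash `data` (bytes) with the named hashlib algorithm and return hex."""
    return hashlib.new(algorithm, data).hexdigest()


def differing_hex_chars(first, second):
    """Count the positions at which two equal-length hex strings differ."""
    return sum(1 for a, b in zip(first, second) if a != b)


message = b"The quick brown fox jumps over the lazy dog"
tweaked = b"The quick brown fox jumps over the lazy cog"  # one letter changed

# Deterministic: hashing the same input twice gives the same value.
first_run = hex_digest("sha256", message)
second_run = hex_digest("sha256", message)

# Fixed length: the digest size depends on the algorithm, not the input size.
digest_lengths = {
    name: len(hex_digest(name, message)) for name in ("md5", "sha1", "sha256")
}
short_len = len(hex_digest("sha256", b"x"))
long_len = len(hex_digest("sha256", b"x" * 1_000_000))

# Sensitive to change: a one-letter edit yields a very different digest.
tweaked_run = hex_digest("sha256", tweaked)
changed_positions = differing_hex_chars(first_run, tweaked_run)

if __name__ == "__main__":
    print("same input, same hash:", first_run == second_run)
    print("hex digest lengths:", digest_lengths)
    print("1 byte vs 1 MB digest length:", short_len, long_len)
    print("hex characters changed by a one-letter edit:", changed_positions)
```

Note that collision resistance cannot be demonstrated by a quick script like this one; it is a property established by cryptanalysis of the algorithm.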

The below video from the Computerphile YouTube channel offers a great explanation of how hashing and hash values are used in the realm of digital signatures and data authentication.

Using Hash Values to Authenticate Evidence

As is hopefully clear from the information above, a hash value acts as a digital fingerprint (sometimes loosely called a digital signature) that helps authenticate evidence. As long as a piece of evidence was correctly collected and processed, any other party who independently computes the hash with the same algorithm will get the same number string.

In other words, if a person uses a tool (like this one) to authenticate a piece of evidence with a hashing algorithm during collection, anyone using the same algorithm to authenticate it at a later stage will see that exact same resulting hash value—and any change to the data will result in the hash value changing.
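
The collect-then-verify workflow described above might be sketched as follows. This is an illustrative assumption rather than a description of how any specific collection product works, and it is not a substitute for the certification that FRE 902(13) and (14) require. The record fields and function names (`make_collection_record`, `verify_against_record`) are hypothetical.

```python
import hashlib
import json
from datetime import datetime, timezone


def make_collection_record(item_name, content, algorithm="sha256"):
    """Build a record of the hash computed when `content` (bytes) is collected."""
    digest = hashlib.new(algorithm, content).hexdigest()
    return {
        "item": item_name,            # hypothetical label for the collected item
        "algorithm": algorithm,       # stored so a later check uses the same one
        "hash": digest,
        "collected_at": datetime.now(timezone.utc).isoformat(),
    }


def verify_against_record(record, content):
    """Recompute the hash with the recorded algorithm and compare the values."""
    recomputed = hashlib.new(record["algorithm"], content).hexdigest()
    return recomputed == record["hash"]


if __name__ == "__main__":
    captured = b"<html><body>Example post text</body></html>"
    record = make_collection_record("example-post.html", captured)
    print(json.dumps(record, indent=2))

    print(verify_against_record(record, captured))         # True
    print(verify_against_record(record, captured + b" "))  # False
```

Storing the algorithm name alongside the hash matters because the later check only reproduces the original value if it uses the same algorithm.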

This is why hash values are so crucial to making successful use of FRE 902(13) and (14): they provide strong, easily verifiable proof that evidence has not been altered since it was collected. Of course, hashing has to be done correctly; otherwise, opposing counsel will be quick to question authenticity. Because of this, you probably don’t want to collect and authenticate evidence yourself using the simple tool linked to above. Instead, you want to make use of only the most reliable methods and tools.

The good news is that excellent DIY tools exist to help you generate defensible self-authenticating digital evidence. At Pagefreezer, we offer solutions for collecting and authenticating digital evidence (website, social media, team collaboration, and mobile text). Our solutions allow organizations to easily collect and authenticate both their own online data, and evidence from third parties.

We’ve also published a detailed reference guide that explains exactly how self-authenticating evidence can be generated under FRE 902(13) and (14). The guide distills hundreds of pages of source documents into a practical summary of how to generate self-authenticating evidence that will stand up in court. You can download this free paper below.

Download Authenticating Digital Evidence Under FRE 902(13) and (14): Using Digital Signatures (Hash Values) and Metadata to Create Self-Authenticating Digital Evidence.

Peter Callaghan
Peter Callaghan is the Chief Revenue Officer at Pagefreezer. He has a very successful record in the tech industry, bringing significant market share increases and exponential revenue growth to the companies he has served. Peter has a passion for building high-performance sales and marketing teams, developing value-based go-to-market strategies, and creating effective brand strategies.