What is WARC and Why is it Important?

A Brief History of The Internet Archive

For over 20 years, the Internet Archive has built a library of websites and other cultural artifacts preserved in digital form, made accessible through the Wayback Machine. Over that period, it has collected more than 279 billion web pages.

Preserving this information and making it accessible to the public is what the Internet Archive is best known for, but it is also known for creating the Web ARChive (WARC) file format. WARC is a file format for the long-term preservation of digital data. It stores web pages and other digital resources, including images and metadata, in their original form, as they were delivered at capture time.
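To make the format more concrete, here is a minimal Python sketch of a single WARC "response" record assembled by hand with only the standard library. The URL, the sample HTTP response, and the helper name build_warc_response are hypothetical and chosen purely for illustration. In practice, WARC files are normally written by a crawler or a dedicated WARC library rather than by code like this.

```python
import uuid
from datetime import datetime, timezone

CRLF = b"\r\n"


def build_warc_response(target_uri: str, http_message: bytes) -> bytes:
    """Wrap a raw HTTP response in one uncompressed WARC/1.1 'response' record.

    Hypothetical helper for illustration only.
    """
    headers = [
        ("WARC-Type", "response"),
        ("WARC-Record-ID", f"<urn:uuid:{uuid.uuid4()}>"),
        ("WARC-Date", datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")),
        ("WARC-Target-URI", target_uri),
        ("Content-Type", "application/http;msgtype=response"),
        # Content-Length counts the bytes of the content block that follows.
        ("Content-Length", str(len(http_message))),
    ]
    head = b"WARC/1.1" + CRLF
    for name, value in headers:
        head += f"{name}: {value}".encode("utf-8") + CRLF
    # A blank line ends the WARC header; two CRLFs close the record.
    return head + CRLF + http_message + CRLF + CRLF


# Hypothetical captured HTTP response: status line, headers, blank line, body.
http_message = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html; charset=utf-8\r\n"
    b"\r\n"
    b"<html><body>Hello, archive</body></html>"
)

record = build_warc_response("https://example.com/", http_message)
print(record.decode("utf-8"))
```

The point of the sketch is that the captured HTTP response, headers included, is stored byte for byte inside the record, with a small block of WARC metadata in front of it describing when and from where it was captured.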

WARC Files - The Standard for Long-Term Preservation

The WARC format eventually became an international standard (ISO 28500:2017) for digital asset archival. Since then, WARC has been adopted by many software vendors, libraries, and government agencies across the globe as the standard for archiving digital records, specifically web pages or full websites.

Governments have also embraced this standard. In the United States, the National Archives and Records Administration (NARA), the nation’s recordkeeper, and the Library of Congress adopted WARC as the only acceptable file format for the long-term preservation of website and social media records, according to Bulletin 2014-04, "Format Guidance for the Transfer of Permanent Electronic Records".

With WARC as the standard, the ability to create and present WARC files has become an expectation and a need.

WARC for Social Media

PageFreezer has been offering WARC exports for websites for many years now, but providing WARC formats for social media records was a completely new concept.

PageFreezer is proud to announce that it is the first vendor to release WARC exports for social media data. Customers can now export single social media posts, complete social media timelines or selections of social media records in WARC with a single click from the PageFreezer dashboard.

WARC for Digital Forensic Investigations

But WARC is more than a convenient feature for government agencies that need to comply with FOIA and Open Records laws. The standard is also relevant for corporations seeking social media evidence for eDiscovery purposes. A WARC export of your social media records contains the actual social media message together with all the metadata provided via the social media API, the HTTP header metadata, and all the digital resources used in the message, such as video, audio, and images. That combination makes it a valuable source for digital forensics investigations and legal authentication.
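As a rough illustration of how that metadata can be examined, the following Python sketch splits one uncompressed WARC record into its WARC headers, its HTTP headers, and its payload, then computes a SHA-256 digest of the payload. The sample record, the URL, and the function name parse_warc_record are hypothetical, and the parsing is deliberately simplified (a single record, no compression, no multi-line headers). It is not a description of how PageFreezer builds or verifies its exports.

```python
import hashlib


def parse_warc_record(record: bytes):
    """Split one uncompressed WARC record into its parts (simplified sketch)."""
    # The WARC header block ends at the first blank line.
    warc_head, _, rest = record.partition(b"\r\n\r\n")
    lines = warc_head.decode("utf-8").split("\r\n")
    version = lines[0]
    warc_headers = dict(line.split(": ", 1) for line in lines[1:])

    # Content-Length says how many bytes of content block follow the header.
    block = rest[: int(warc_headers["Content-Length"])]

    # For a 'response' record the block is a raw HTTP response.
    http_head, _, payload = block.partition(b"\r\n\r\n")
    http_lines = http_head.decode("iso-8859-1").split("\r\n")
    status_line = http_lines[0]
    http_headers = dict(line.split(": ", 1) for line in http_lines[1:])
    return version, warc_headers, status_line, http_headers, payload


# Hypothetical record: a JSON API response for a single social media post.
body = b'{"id": "1", "text": "Hello"}'
http_block = b"HTTP/1.1 200 OK\r\nContent-Type: application/json\r\n\r\n" + body
header_lines = [
    b"WARC/1.1",
    b"WARC-Type: response",
    b"WARC-Target-URI: https://example.com/posts/1",
    b"WARC-Date: 2020-01-01T00:00:00Z",
    b"WARC-Record-ID: <urn:uuid:00000000-0000-0000-0000-000000000001>",
    b"Content-Type: application/http;msgtype=response",
    b"Content-Length: %d" % len(http_block),
]
sample = b"\r\n".join(header_lines) + b"\r\n\r\n" + http_block + b"\r\n\r\n"

version, warc_headers, status_line, http_headers, payload = parse_warc_record(sample)
digest = hashlib.sha256(payload).hexdigest()

print(warc_headers["WARC-Target-URI"], warc_headers["WARC-Date"])
print(status_line, http_headers["Content-Type"])
print("sha256:", digest)
```

Because the payload is stored unmodified, a reviewer can recompute a digest like this at any later date and compare it with one recorded at capture time.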

By taking advantage of PageFreezer’s new WARC export capability, your social media archives will now automatically comply with NARA’s record-keeping guidelines.

If you want to learn more about how WARC can help your organization comply with Open Records regulations or support eDiscovery, contact us now.
