The Art of Archiving Complex Modern Websites

You’ve probably heard of web crawlers (sometimes also called spiders or spiderbots). Much like real spiders scurrying around, these bots systematically scour the internet for website content. They discover and visit sites by following links in order to copy pages and process them.  
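To make the link-following idea concrete, here is a minimal sketch of a breadth-first crawler. It is an illustration only, not PageFreezer's crawler: the `crawl` function, the `LinkExtractor` class, and the in-memory `SITE` dictionary are all hypothetical names, and the "site" is held in memory so the example runs without network access.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch, max_pages=100):
    """Breadth-first crawl; returns {url: html} for every page reached.

    `fetch` is any callable that maps a URL to its HTML, or None if the
    page is unavailable (an assumption made to keep the sketch offline).
    """
    seen = {start_url}
    queue = deque([start_url])
    pages = {}
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        pages[url] = html  # "copy" the page
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop #fragments before queueing.
            absolute, _fragment = urldefrag(urljoin(url, href))
            if absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return pages


# Hypothetical three-page site used in place of real HTTP requests.
SITE = {
    "https://example.com/": '<a href="/about">About</a> <a href="/contact#form">Contact</a>',
    "https://example.com/about": '<a href="/">Home</a>',
    "https://example.com/contact": "<p>No links here</p>",
}

pages = crawl("https://example.com/", SITE.get)
print(sorted(pages))
```

A production crawler would also need to respect robots.txt, limit itself to the target domain, and throttle its requests; those concerns are left out here for brevity.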

The most common use case for this is internet search. Search engines like Google use bots to index sites for their results pages. But there are many other reasons to crawl pages as well. At PageFreezer, crawling technology is used to archive customers’ websites. We regularly crawl sites to ensure that any change to any page is recorded and saved. Customers can then review these chronological versions of their websites through our dashboard, which allows them to replay any version as if it were live and to instantly identify changes to a page.
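As a rough illustration of how changes between two archived versions of a page might be identified, the sketch below compares two snapshots line by line using Python's standard `difflib` module. The function names and the sample snapshots are hypothetical, and this is not a description of how the PageFreezer dashboard works internally.

```python
import difflib
import hashlib


def content_hash(text):
    """Fingerprint a snapshot so unchanged pages can be skipped cheaply."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def changed_lines(old, new):
    """Return the removed ('- ') and added ('+ ') lines between snapshots."""
    if content_hash(old) == content_hash(new):
        return []
    diff = difflib.ndiff(old.splitlines(), new.splitlines())
    # ndiff also emits unchanged lines and '? ' hint lines; keep only edits.
    return [line for line in diff if line.startswith(("- ", "+ "))]


# Hypothetical snapshots of the same page captured on two different days.
monday = "Rates\nSavings: 1.5%\nContact us"
tuesday = "Rates\nSavings: 2.0%\nContact us"

changes = changed_lines(monday, tuesday)
for line in changes:
    print(line)
```

Hashing first is a simple design choice: most pages do not change between crawls, so a cheap equality check avoids running a full diff on every page.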

Why Organizations Archive Their Websites   

Why is all this necessary? For many of our enterprise customers, it comes down to regulatory compliance. With industry regulators demanding the collection and preservation of all online communications, having an archiving solution in place is an absolute requirement. Failure to preserve records in appropriate, replayable, and verified formats can result in significant penalties.      

Lawsuits, false claims, and eDiscovery responses are also common reasons. If someone claims that a statement appeared on a company website when it did not, the organization has to be ready to disprove the claim. Doing so requires evidentiary-quality records of website content. PageFreezer provides evidentiary-quality records by delivering archive data that is timestamped and digitally signed, so authenticity is guaranteed.
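The general idea of a timestamped, signed record can be sketched as follows. This is a simplified stand-in, not PageFreezer's method: it uses an HMAC with a shared secret from Python's standard library, whereas a true digital signature would use an asymmetric key pair and, typically, a trusted timestamping service. The `seal` and `verify` functions, the record fields, and the key are all hypothetical.

```python
import hashlib
import hmac
import json
from datetime import datetime, timezone

# Assumption: a demo key. A real system would use a protected private key.
SECRET_KEY = b"demo-key-not-for-production"


def _sign(record):
    """HMAC over a canonical JSON encoding of the record's fields."""
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    return hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()


def seal(url, content, captured_at):
    """Build a record holding the capture time, content hash, and signature."""
    record = {
        "url": url,
        "captured_at": captured_at.isoformat(),
        "sha256": hashlib.sha256(content).hexdigest(),
    }
    record["signature"] = _sign(record)
    return record


def verify(record, content):
    """Check that neither the content nor the record has been altered."""
    unsigned = {k: v for k, v in record.items() if k != "signature"}
    if unsigned["sha256"] != hashlib.sha256(content).hexdigest():
        return False
    return hmac.compare_digest(_sign(unsigned), record["signature"])


page = b"<h1>Savings: 1.5%</h1>"
record = seal("https://example.com/rates", page, datetime.now(timezone.utc))

print(verify(record, page))                       # original content
print(verify(record, b"<h1>Savings: 9.9%</h1>"))  # tampered content
```

Because the signature covers the URL, the timestamp, and the content hash together, changing any one of them after the fact causes verification to fail.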

 

The Challenges of Archiving Modern Web Pages

While the idea of crawling websites is nothing new, the challenges of crawling many modern pages for archiving purposes are significant. Organizations like banks, for instance, have incredibly complex websites that consist of dynamic content, password-protected pages, personalized pages, and form flows. Capturing all of this accurately purely through crawling (without the need for complicated scripting or engineering) isn’t easy.

In fact, it’s a challenge that many archiving companies struggle with. A leading global financial services firm recently signed a contract with PageFreezer after concluding that we had the only web crawling technology capable of accurately capturing its highly personalized dynamic website for eDiscovery and regulatory compliance purposes.

PageFreezer conducted an extensive two-year R&D program that resulted in the third generation of our dynamic web capture technology. Thanks to this latest crawler, PageFreezer can capture webpages generated client-side by JavaScript/Ajax frameworks, including Ajax-loaded content. It can also collect multiple steps in web form flows and capture webpage content that is displayed only after a user event (for example, when a section of a webpage loads additional content using Ajax after a user clicks).

George van Rooyen
George van Rooyen is a Content Marketing Specialist at Pagefreezer.
