As with other social media platforms, like Facebook and Instagram, compliance and legal professionals often need to archive a Twitter account for official use.
In the public sector, this is usually to satisfy FOIA and Open Records recordkeeping requirements, while in the private sector, it is generally in preparation for a regulatory audit or legal matter.
In a recent example, Tesla CEO Elon Musk’s Twitter account became central to a legal matter when one of his tweets sent Tesla shares down 13% and prompted a group of shareholders to file a lawsuit against the company.
Tesla stock price is too high imo
— Elon Musk (@elonmusk) May 1, 2020
In cases like the one above, it becomes necessary to collect a Twitter account as potential evidence, but what is the best way to do this? An obvious solution is simply to take screenshots.
At first glance, screenshots of relevant tweets can seem like a good solution, but manually capturing static images introduces certain issues:
For the reasons listed above, static images are not the ideal way of collecting and preserving tweets—at least, not when it comes to a legal matter or regulatory audit. A better alternative is to download an archive of the account directly from Twitter.
Like Facebook and Instagram, Twitter allows users to download account data from the platform, which effectively creates an archive of the information.
To download your data on a desktop, click “More” in the left-hand menu on the site and select “Settings and privacy.” From there, select “Your Twitter data” under “Data and permissions.”
Next, you can click a button to request your archive.
When you return to the same page after you’ve received the notification email, you will be able to download the data directly to your computer. Twitter will also tell you when the data was generated and what the estimated download size is. The download is a ZIP file that will need to be extracted.
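If you would rather script the extraction step than do it by hand, the following is a minimal Python sketch using only the standard library. The function name `extract_archive` and the file names in the usage note are hypothetical; the actual name of the ZIP file Twitter gives you will differ.

```python
import zipfile
from pathlib import Path


def extract_archive(zip_path, dest_dir):
    """Extract the archive ZIP and return the sorted top-level entry names."""
    dest = Path(dest_dir)
    # Create the destination folder if it does not exist yet.
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as zf:
        zf.extractall(dest)
    # List what ended up directly inside the destination folder.
    return sorted(p.name for p in dest.iterdir())
```

For example, `extract_archive("twitter-archive.zip", "twitter-archive")` would unpack the download into a `twitter-archive` folder and return the names of the files and folders at its top level. Keeping the original ZIP untouched alongside the extracted copy is a sensible precaution.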
Once you’ve extracted the ZIP file, you’ll be left with the following:
The first and most obvious file to pay attention to is “Your archive.html”. As the file extension suggests, it opens in your web browser.
As Twitter itself states: “The easiest way to navigate your archive is to open the HTML renderer in a desktop web browser by double clicking the ‘Your archive’ file included in the main folder once the archive is unzipped.”
What you’re provided with, essentially, is a local version of your online account. Clicking on “Tweets” provides you with a user experience that’s very similar to that of the normal Twitter user interface.
Unfortunately, as user-friendly as this view of the archive is, it has several severe limitations:
To find something that could potentially be used for compliance and litigation purposes, you have to dive into the JSON files in the downloaded “data” folder. And as is so often the case with these kinds of archive downloads, the content has been broken apart and stripped of context. Here is what a portion of that folder looks like:
All the images from tweets can be found in the “tweet_media” folder, while the tweets themselves can be found in the “tweets.js” file. When you view this .js file through a JSON viewer, you see this:
All tweets from the account have been strung together and stripped of their context, including their associated media. On the positive side, “tweets.js” is at least a single file that could be authenticated with a hash value, but the content is difficult to understand at a glance and hardly the sort of thing you’d want to present in court.
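To make the two points above concrete, here is a minimal Python sketch that computes a SHA-256 hash of “tweets.js” and loads its contents for inspection. It rests on assumptions that are not confirmed by the archive documentation: that the file is a JavaScript assignment (something like `window.YTD.tweets.part0 = [...]`) wrapping a JSON array, and that each entry nests the tweet under a `tweet` key. The function names are hypothetical, and the layout may differ in your download.

```python
import hashlib
import json
from pathlib import Path


def sha256_of_file(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def load_tweets(path):
    """Parse tweets.js into a list of dictionaries.

    Assumption: the file starts with a JavaScript assignment prefix
    followed by a JSON array, so everything before the first "[" is skipped.
    """
    text = Path(path).read_text(encoding="utf-8")
    start = text.index("[")
    payload = text[start:].rstrip().rstrip(";")
    records = json.loads(payload)
    # Assumption: each record wraps the tweet under a "tweet" key;
    # fall back to the record itself if that key is absent.
    return [record.get("tweet", record) for record in records]
```

Recording the output of `sha256_of_file("data/tweets.js")` at collection time lets you show later that the file has not changed. The list returned by `load_tweets` can then be searched or exported, although the result is still raw data rather than a presentation-ready record.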
Like screenshots, Twitter’s built-in archive download is not an ideal solution either, as it doesn’t provide you with the sort of authenticated copy of evidence that could be submitted to an auditor or a court.
When it comes to collecting and preserving data for legal and compliance reasons, a much better alternative is to make use of an automated archiving solution that archives Twitter content in real time. With real-time social media capture, you can: