Wednesday, December 12, 2018

"5 Tools to Help You Search the Archived Internet"

Sometimes I forget and have to shout "Siva, what's that archive site..." and if he deigns to respond, being a smartass (CalTech postdoc) it will be something like "Yes Sahib, how may I serve" or "Yes Sahib, my feet are like wings" (M*A*S*H episode) even though I've stopped reciting Kipling out loud and ...where was I?

From Tech.co:
The archived internet deserves more recognition. Online security has been a hot button topic in the tech community recently, with data scandals and privacy policy updates constantly driving the conversation. But, keeping the internet a stable and reliable network isn’t all about data security – it’s also about data preservation.

Anything that’s low tech is dismissed as “from the stone age,” but stone is by far the most stable way to record information. Not only will the hard drives and networked routers of today never last a thousand years, but plenty of information online won’t even last the decade. As local newspapers or long-in-the-tooth startups go under, they all leave dead links scattered across the internet, constantly replaced with fresh links that will themselves eventually die.

Wow, sorry, didn’t mean to get too dark there. My point is, memories that you might want to keep are increasingly likely to exist only on the internet — rambling G-Chat conversations with your best friend, say, or your first WordPress blog. If you want to preserve, protect, or search through your online footprint, read on to learn which five online tools can best help you comb through the archived internet.
Archive.is
What It Does
This is the quickest and easiest way to grab a free, high-quality record of an existing webpage.
“This can be useful if you want to take a ‘snapshot’ of a page which could change soon: price list, job offer, real estate listing, or drunk blog post,” the site explains.
You can search through the Archive.is site for previously archived webpages, if you’re interested in tracking a specific Twitter account or tech company. There’s even a draggable bookmarklet that you can add to your bookmarks bar to archive future webpages with a single click.
How You Can Use It
Go to the Archive.is site, paste the URL of your webpage into the bar at the top, and click on the “save the page” button.

Given the social fallout that can come from a single bad tweet, this site can be a useful way to grab verifiable, photoshop-proof evidence of a tweet or post that will likely be deleted soon. The saved webpage that results won’t have any active elements or scripts (no popups or paywalls, in other words), but should look more or less the same, even down to the same clickable hyperlinks that the original page boasted.

Lumen
What It Does
Data loss on the internet isn’t always due to the natural process of link rot, as servers or domains become permanently unavailable. One major cause is legal demands for content removal. While the content removed due to takedowns can’t itself be archived, the legal complaints themselves can be.

Lumen is an online database of takedown notifications. It’s a project from the Berkman Klein Center for Internet & Society at Harvard University, designed to collect digital content removal requests.
“Our goals are to educate the public, to facilitate research about the different kinds of complaints and requests for removal–both legitimate and questionable–that are being sent to Internet publishers and service providers, and to provide as much transparency as possible about the “ecology” of such notices, in terms of who is sending them and why, and to what effect,” the website explains.
How You Can Use It
Type any search term into the site and you’ll likely pull up thousands of results. Use the advanced search functions, and you’ll be able to narrow down the DMCA requests by topic, sender, recipient, tags, country, language, action taken, and date.

The search results page includes an easily scanned list of takedown requests, including details such as who submitted them, on behalf of whom, and to whom (the latter is almost always Google). You might want to use this database if you’re interested in why a seemingly innocent post in your search results or a favorite YouTube video has suddenly disappeared due to a content claim. You’ll get all the information you need to follow up on the takedown with the company that submitted the request in the first place.

Lumen has a feature that allows you to construct a DMCA counter notice, if you’re the one who has been hit with a takedown that you want to contest. You can also report your own takedown notification through the contact information available on the site....
...MORE

Archive.is, that's the one. I can usually remember the Wayback Machine (follow link) thanks to Sherman and Mr. Peabody but for some reason Archive.is just doesn't stick.