
On-demand web page archiving

Yesterday I wrote a post at Youthpad on how popular websites looked in the old days. The Internet Archive’s Wayback Machine is an excellent resource for that particular purpose, but its job is to keep snapshots of the Web as it grows; it is not primarily an on-demand archival service. Snapshots are made available six months after they are ‘crawled’, i.e., recorded by the Internet Archive’s automated scripts. What if you need to create an on-demand snapshot of how a particular web page looks right now? Fortunately, there are a few web services that come to the rescue.

The first web service is called Iterasi. (A cool-sounding name, and unique enough – a lot like ‘Google’.) Iterasi creates an exact copy of the web page you are viewing, including text, images, stylesheets, JavaScript elements, and so on, so you end up with a working copy of any page you come across. We often underestimate the fluid nature of websites – a link that is valid today may be dead tomorrow. For instance, I had to link to the IIT JEE rules for a blog post I did a few months ago. If I link to the page right now, the link no doubt points to the correct page; however, if someone visits the same post a year down the line and clicks through to the IIT JEE site, the link may no longer be valid. By storing a copy on Iterasi, you can circumvent this potential problem. It’s not necessary to Iterasi-ize everything you link to, just the few important pages.

You will need to create a free account on Iterasi to start archiving (you’ll have to hunt around a bit for the free account sign-up link). Once you have done that, you can set each copy of a web page as public or private. You also get a short URL to the copy. To make archiving easier, there is a bookmarklet (works in any browser; just drag and drop the link onto your bookmarks toolbar) and a Firefox plugin.

The other archival or recording requirement you might have is to take a screenshot of a web page: just an image, not a ‘working’ copy as in Iterasi. With the usual method of pressing the ‘Print Screen’ key and pasting into an image editing application, or with a standalone screenshot application, what you often get is a screenshot of just the visible portion of the web page. Aviary.com – an online image editing suite (the mind boggles at the various online image-editing utilities they have on offer) – has a free feature that lets you take a screenshot of the entire web page, not just the visible area. Say you want to take a screenshot of gyaan.in: enter aviary.com/http://gyaan.in in the address bar and it will take a screenshot which you can save to your PC. In general, when you are on a web page you want to capture, type aviary.com/ in front of the URL and press Enter. Aviary.com also offers a bookmarklet that you can drag to your bookmarks list, and a Firefox plugin that gives you the option to capture the whole page, the visible area only, or a selected region of the page. You can then save the image to your desktop or edit it online on Aviary.com (provided you have signed up for a free account).
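
If you want to prepare capture links for several pages at once, the prefix trick is easy to script. Here is a minimal Python sketch that only builds the address string following the aviary.com/http://gyaan.in pattern described above. The function name `aviary_capture_url` and its `service_host` parameter are my own inventions for illustration, not part of any Aviary API, and the sketch assumes the service still accepts URLs in this form.

```python
def aviary_capture_url(page_url: str, service_host: str = "aviary.com") -> str:
    """Build a capture address by putting the service host in front of the page URL.

    Hypothetical helper: it only assembles the string, it does not contact the service.
    """
    page_url = page_url.strip()
    if not page_url:
        raise ValueError("page_url must not be empty")
    # The example in the post keeps the scheme (http://), so add one if it is missing.
    if "://" not in page_url:
        page_url = "http://" + page_url
    return f"{service_host}/{page_url}"


if __name__ == "__main__":
    # Prints the address you would type into the browser for gyaan.in.
    print(aviary_capture_url("http://gyaan.in"))
```

Paste the printed address into your browser's address bar, exactly as you would when typing the prefix by hand.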

PS – BTW, folks in Delhi can catch up with me tomorrow at OSSCamp Delhi at NSIT Dwarka from 10am to 5pm (drop in any time you want). I’ll be giving a talk on Creative Commons licenses and conducting a quiz on open source. There are goodies to win from Adobe, Mozilla, and OSSCamp branded T-shirts.
