It's always fun to write about research that you can actually try out for yourself.
Try this: Take a photo and upload it to Facebook, then after a day or so, note what the URL to the picture is (the actual photo, not the page on which the photo resides), and then delete it. Come back a month later and see if the link works. Chances are: It will.
Facebook isn't alone here. Researchers at Cambridge University (so you know this is legit, people!) have found that nearly half of the social networking sites they tested don't immediately delete pictures when a user asks for them to be removed. In general, photo-centric websites like Flickr were found to be better at quickly removing deleted photos upon request.
Why do "deleted" photos stick around so long? The problem relates to the way data is stored on large websites: While your personal computer only keeps one copy of a file, large-scale services like Facebook rely on what are called content delivery networks to manage data and distribution. It's a complex system wherein data is copied to multiple intermediate devices, usually to speed up access to files when millions of people are trying to access the service simultaneously. (Yahoo! Tech is served by dozens of servers, for example.) But because changes aren't reflected across the CDN immediately, ghost copies of files tend to linger for days or weeks.
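To make the ghost-copy effect concrete, here is a minimal sketch of an origin server with one caching edge node in front of it. It is a toy model, not how Facebook or any real CDN is implemented: the `Origin` and `EdgeCache` classes, the `demo` function, and the one-week time-to-live are all assumptions made up for illustration.

```python
class Origin:
    """Hypothetical origin store holding the authoritative copy of each file."""

    def __init__(self):
        self.files = {}

    def get(self, key):
        return self.files.get(key)


class EdgeCache:
    """Hypothetical CDN edge node that keeps origin responses for `ttl` seconds."""

    def __init__(self, origin, ttl):
        self.origin = origin
        self.ttl = ttl
        self.cache = {}  # key -> (data, expiry time in seconds)

    def get(self, key, now):
        entry = self.cache.get(key)
        if entry is not None and now < entry[1]:
            # Still fresh: serve the edge copy without asking the origin.
            return entry[0]
        # Expired or never cached: go back to the origin.
        data = self.origin.get(key)
        if data is None:
            self.cache.pop(key, None)
            return None
        self.cache[key] = (data, now + self.ttl)
        return data


def demo():
    day = 24 * 3600
    origin = Origin()
    origin.files["photo.jpg"] = b"pixels"
    edge = EdgeCache(origin, ttl=7 * day)  # one-week TTL is an assumed value

    edge.get("photo.jpg", now=0)    # first request fills the edge cache
    del origin.files["photo.jpg"]   # the user "deletes" the photo at the origin

    after_one_day = edge.get("photo.jpg", now=1 * day)
    after_eight_days = edge.get("photo.jpg", now=8 * day)
    return after_one_day, after_eight_days
```

In this sketch the photo is still served a day after deletion because the edge never rechecks the origin while its copy is fresh. It disappears only once the cached entry expires, which is roughly the lingering behavior described above.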
The Wayback Machine is a 150-billion-page web archive with a front end that serves it through the archive.org website.
Today the new machine came to life, so if you are using the service, you are using a 20' by 8' by 8' "machine" that sits in Santa Clara, courtesy of Sun Microsystems. It serves about 500 queries per second from the approximately 4.5 petabytes (4.5 million gigabytes) of archived web data. We think of the cluster of computers and the Modular Datacenter as a single machine because it acts like one and looks like one. If that is true, then it might be one of the largest computers currently in operation.
Also, we can do fun stats. We now know that the web weighs 26,500 pounds, the average web page weighs 80 micrograms, and each query takes 160 joules.
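For anyone who wants to check the arithmetic, here is a quick sketch. It assumes the 26,500 pounds is spread evenly over the 150 billion archived pages, and that the 160 joules per query applies at the quoted rate of 500 queries per second. The function names are made up for illustration.

```python
POUNDS_TO_KG = 0.45359237  # exact definition of the international pound


def micrograms_per_page(total_pounds, pages):
    """Weight per page, assuming the total is spread evenly across all pages."""
    grams = total_pounds * POUNDS_TO_KG * 1000
    return grams / pages * 1e6


def power_kilowatts(joules_per_query, queries_per_second):
    """Implied power draw: energy per query times queries per second."""
    return joules_per_query * queries_per_second / 1000


page_micrograms = micrograms_per_page(26_500, 150e9)
power_kw = power_kilowatts(160, 500)

print(f"{page_micrograms:.1f} micrograms per page")
print(f"{power_kw:.0f} kW implied power draw")
```

The per-page figure comes out at roughly 80 micrograms, matching the number above. Under the same assumptions, the per-query energy implies a draw of about 80 kilowatts.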
On another note, we got a nice letter from the last living director of the Rocky and Bullwinkle Show, Gerard Baldwin, because he read about the "fantastic project". Our Wayback Machine is a tribute to their more cleverly named "WABAC Machine", which in turn was a reference to the Univac. Sherman and Peabody live on.