Data warehousing has become a sort of personal quest of mine. I’ve been using computers since I was four and in those few decades I’ve managed to create, acquire, and lose countless precious bits of data. It’s disheartening to have lost some projects I created in my pre-teen years that I know I poured my soul into at the time. And all because of a bad floppy.
What’s worse is that I lost more of this precious data in the past few weeks while upgrading my fileserver. It’s hard to say just how much I’ve lost. It’s hard to notice when a rogue file or two goes missing over the years, but it happens all the time. On Linux with some historical backups a quick diff or rsync can find problems like that. Cygwin and Windows don’t really work quite as nicely.
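For what it’s worth, the same kind of check can be roughed out in Python, which runs on Windows as well as Linux. This is only a minimal sketch: the function names are made up for illustration, and it compares file names only, not contents, so it would catch a file that vanished but not one that got corrupted.

```python
import os

def relative_files(root):
    """Return the set of file paths under root, relative to root."""
    found = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            found.add(os.path.relpath(full, root))
    return found

def missing_from_live(backup_root, live_root):
    """List files that exist in the backup but are absent from the live tree."""
    return sorted(relative_files(backup_root) - relative_files(live_root))
```

Pointing `missing_from_live` at an old backup folder and the current fileserver share (whatever those paths happen to be) gives a sorted list of everything that has quietly disappeared since the backup was made.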
So currently I have been sorting through nine-year-old backup CDs looking for things that have gone missing. I had been thinking it was silly to keep hanging on to a binder full of ancient CDs, but I proved myself wrong today. In fact I found quite a few nice little surprises that I had thought lost forever.
Windows XP is a terrible tool for archiving or backing up data. Vista is not much better. I really wish this thing could be Linux. Heck, I wouldn’t even mind dual booting if I thought Linux could be trusted with live NTFS partitions. I don’t mind Ubuntu hacking its way through one of my NTFS USB drives, but my lovely RAID5 array is another story.
NTFS support on Linux is getting better though. It’s a lot less of a struggle to get it to write to NTFS, anyway.
Computers are good at remembering things for us, from letters we sent to ex-landlords to the music we used to listen to and pictures of people we used to work with. My computer remembers my past a lot better than I do. I try to focus on making new memories rather than spending time living in the past. It’s possible that this choice makes it more difficult for me to retrieve older memories. If you spend enough time dwelling on the past it’ll seem much clearer. I think that in a sense having an “off-mind” memory storage device is quite beneficial. If only they were reliable on the cheap.
Amazon S3 is not cheap, but it does protect my memories from fire and thieves or any other sort of disaster that might befall my house. It’s money well spent, I think. But like burning CDs or using backup tapes, it takes quite a bit of thought and energy. Archiving anything requires a lot of patience and technical knowledge. Your average computer user simply doesn’t back up their data properly.
Of course, now that the web has become very cost effective, one can upload their pictures to Flickr or Snapfish or any number of gallery products out there. You can probably RAR up your text documents and shove them onto RapidShare or something. People tend to back things up to the web this way without even realizing it, and this is a trend that should definitely continue. Archiving data really should be this natural.
For me the problem is sheer quantity. Home videos, huge print-quality photo files, and CD and game backups start taking up gigs of space. I have 3.5TB of storage in my file server, and the more I have the more I tend to fill. The RAID5 array has redundancy but nothing a power surge or virus can’t wreck in an instant. So I have backup DVDs. It can take days to compress and burn just a few dozen gigs’ worth of data. So I back up more regularly to portable USB drives. But this is problematic since Windows XP doesn’t have any good backup utilities to properly handle this.
The real reason I lost data in the move is that in order to back up 2TB of data you need 2TB of backup storage. I had 1.25TB. I backed up the most important things, or so I thought, and assumed the rest would be safe on the RAID5 array. After the crash I made sure to buy a 1TB drive for one of my external enclosures, and I’ll be buying more of those. I really need the ability to keep a full backup somewhere.
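The arithmetic is simple enough that a script could have warned me before I started. Here is a rough sketch, assuming Python is available on the machine; the function names are hypothetical, and it only adds up file sizes and compares them with the free space the operating system reports for the backup drive.

```python
import os
import shutil

def tree_size(root):
    """Total size in bytes of all regular files under root (symlinks skipped)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                total += os.path.getsize(path)
    return total

def shortfall(required_bytes, free_bytes):
    """Bytes of backup space still needed; 0 if everything fits."""
    return max(0, required_bytes - free_bytes)

def backup_shortfall(source_root, target_root):
    """Compare the size of source_root with the free space on target_root's drive."""
    free = shutil.disk_usage(target_root).free
    return shortfall(tree_size(source_root), free)
```

With my numbers, `shortfall` for 2TB of data against 1.25TB of backup space comes out to 0.75TB, which is exactly the portion I left to chance on the array.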
And since I wouldn’t want to squeeze even 1TB into Amazon S3, I need to consider getting some sort of fireproof safe for the backup drives. Heck, I’ve even considered hiding a hard drive somewhere in one of my cars. At least if my house burns down or is robbed I’ll probably have the car with me. I just don’t know how a hard drive will survive in the car with all the moisture and vibration. I regret throwing out all those silica gel packets now.
The fun comes from my old Atari 800XL system. I’m still working on preserving the files I generated when I was eight. It takes some special hardware to get the data onto a PC, but I have managed to get some of it uploaded. It’s another time-consuming process, but one I hope to get back into soon. Those old 5.25-inch floppies rot pretty darn quick.