Wednesday, November 17, 2010

Imaging Linux

So, I've been raving about this Backup and Recovery book by Curtis Preston to anyone who will listen (and can still find some joy in revisiting old concepts). What I like most about this book is that it offers a lot of common-sense, nuts-and-bolts advice about backups: the why, the how, the what, and, most importantly, valuable tips on how to design your own backup strategy. As I mentioned in a previous post, there's a whole section on bare metal recoveries and the various options for Windows and Linux. For Windows it pretty much boils down to doing some kind of alt-boot imaging, either whole disk or partition by partition, but with Linux you can also create live images, provided you're not using LVM, software RAID, or extended partitions. Luckily for me, my environment doesn't use any of that (except for one server).

The steps, in a nutshell, are to use dd to back up your drive, and then use a live CD like Knoppix to put the image back in place. My setup is pretty simple, with only 3 partitions: /, /boot, and swap. I made a share on my SBS 2003 box and used dd to back up the partitions and the MBR. I also made a copy of fstab for reference. Then I trashed my drive by using dd to zero out the MBR and the first 1GB of each partition.
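
A rough sketch of that backup, trash, and restore cycle, not the exact commands used here. To keep it safe to run, a scratch file stands in for the real disk (something like /dev/sda) and a temp directory stands in for the mounted share. Every path and file name below is made up for illustration.

```shell
#!/bin/sh
# Sketch only: disk.img is a scratch file standing in for a real disk such as
# /dev/sda, and backup/ stands in for the mounted network share.
work=$(mktemp -d)
disk="$work/disk.img"
backup="$work/backup"
mkdir -p "$backup"

# Build a 4 MiB fake "disk" full of random data.
dd if=/dev/urandom of="$disk" bs=1M count=4 2>/dev/null

# Back up the MBR (the first 512-byte sector, which holds the boot code and
# the partition table), then take a full image.
dd if="$disk" of="$backup/mbr.img" bs=512 count=1 2>/dev/null
dd if="$disk" of="$backup/disk-full.img" bs=1M 2>/dev/null
mbr_size=$(wc -c < "$backup/mbr.img")

# Trash it: zero out the MBR. conv=notrunc stops dd from truncating the
# scratch file to 512 bytes (it only matters for regular files, not devices).
dd if=/dev/zero of="$disk" bs=512 count=1 conv=notrunc 2>/dev/null
if cmp -s "$disk" "$backup/disk-full.img"; then trashed=no; else trashed=yes; fi

# Restore the MBR from the backup, then compare against the full image.
dd if="$backup/mbr.img" of="$disk" bs=512 count=1 conv=notrunc 2>/dev/null
if cmp -s "$disk" "$backup/disk-full.img"; then restored=yes; else restored=no; fi

echo "mbr bytes: $mbr_size, trashed: $trashed, restored: $restored"
rm -rf "$work"
```

On real hardware the same pattern would run against the device and each partition in turn, from a live environment, with the input and output paths triple-checked first. dd will happily overwrite the wrong disk.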

First I tried the restore by booting off of my Knoppix bootable thumb drive. I rebooted once I'd gotten everything back in place and...what's this? Huh. Interesting little error: ALERT! /dev/disk/by-uuid/ #### does not exist. Dropping to shell.

Ummm, okay. Hmmm. So I do some digging and get a lot of good information. The thing about plans not going right is you inevitably end up learning something from it, which is great. Those learned things are often the kind of thing you don't get from everyday use either. I can take that attitude, you see, because this is all test and not production. Phew. Score one for the test environment team.

Anywho, my Googling leads me down the path of /dev/disk/by-uuid and checking fstab, and in the end I think my error was that I had used a USB drive for the recovery. The OS recognized the USB drive as /dev/sda, so the actual drives in the server showed up as /dev/sdb. That was my thinking at least, so the fix would be to simply change the UUIDs in fstab to whatever /dev/disk/by-uuid was seeing. Or so I thought.
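
The comparison described above (the UUIDs fstab asks for versus the UUIDs the system actually sees) can be sketched as a small shell function. This is a minimal illustration under assumptions: check_fstab_uuids is a hypothetical helper name, and it only looks at plain, unquoted UUID=... entries in the first field of each fstab line.

```shell
#!/bin/sh
# check_fstab_uuids FSTAB BYUUID_DIR   (hypothetical helper, sketch only)
# Prints each UUID referenced in FSTAB and whether a matching entry exists in
# BYUUID_DIR. Returns 1 if any UUID is missing, 0 otherwise.
check_fstab_uuids() {
    fstab=$1
    byuuid=$2
    missing=0
    # Comment lines start with '#', so their first field never matches ^UUID=.
    for uuid in $(awk '$1 ~ /^UUID=/ { sub(/^UUID=/, "", $1); print $1 }' "$fstab"); do
        if [ -e "$byuuid/$uuid" ]; then
            echo "ok       $uuid"
        else
            echo "MISSING  $uuid"
            missing=1
        fi
    done
    return "$missing"
}

# Example call on a running system:
#   check_fstab_uuids /etc/fstab /dev/disk/by-uuid
```

From a live CD, the fstab to check is the one on the restored root partition wherever it is mounted, not the live environment's own /etc/fstab. Running blkid as root is another way to list the UUIDs the kernel currently sees.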

I made the change and rebooted. Same error. What happened? Turns out I got turned upside down and was comparing content in /dev/ instead of in /media/sdb1/dev/... This had to have happened because of using a USB drive, so let me go ahead and recreate this test scenario but this time I'm going to use a CD. The instructions I was following never mentioned a thumb drive anyway. I was being cute; clearly it was not the time or place to be cute.

I won't bore you with the long tale of my CD-burning woes, but let's just say that several discs later I came to find that Knoppix doesn't support SATA drives, so it would not successfully boot the CD. Next step: external CD drive. Holy cow, I'm going through a lot for this lab. With the external drive I am finally able to boot into Knoppix, so I try my experiment again. Guess what? Same error. Weird upon weird. I checked everything: fstab, grub.cfg, /dev/disk/by-uuid. The "missing" device is very much there in all places. There is no problem here. I tried changing fstab to reference the actual device instead of the UUID. Still no joy.

At this point I am ready to declare the experiment a complete failure and move on to Plan B: virtualization. That's where we're looking into heading anyway, but virtualization is not fast and easy to implement if you're not willing to pay the bucks...and we're not. I wanted to get something in place fairly quickly because it makes me nervous to not have some kind of quick recovery solution in place for production servers, especially web servers providing content to clients. By now, though, I may very well be putting more time into this than it's worth. It's a tough call because I hate to admit defeat, especially for something that was supposed to be so simple, and every time I try it again I say to myself, "Alright, if it doesn't work this time I'm done." Sounds like a bad relationship. :)

I think if I can't get it off the ground by the end of the week I will officially call it quitsters on this little project.
