'"/" not a directory'

Eeek.

Wednesday morning, for no good reason, CVS at work was failing with a missing symbol in libkrb5. A little experimentation revealed that the problem was not my machine but the server, Lovel. This is a RH9 box, so I checked the package versions, thinking a broken upgrade had happened overnight, but they were fine. At this point I scratched my head and issued some rpm -V commands on the relevant packages (the only good thing about RPM is that it stores per-file checksums), and discovered that krb5-libs didn't match its checksums. Hmm. Re-installed the package and tried CVS: problem solved. Or so I thought...

Worried about random libraries changing, I checked the logs for anything interesting. No errors, no strange logins, nothing. Just in case, I started a read-only fsck on that partition to check the drive is okay... and the kernel spewed errors. fsck couldn't find a valid superblock. 'ls' wouldn't work as /bin didn't exist any more, and echo * also produced yet more kernel errors.

Rebooted very quickly, wondering what would happen. The kernel couldn't find the partition (it's mounted by label instead of device name), so I made a Tom's Bootdisk (after throwing away five floppies trying) and had a poke. Superblock definitely gone, but a backup block was fine. However, this is when I got the message "/ is not a directory. Fix?". Erm. Eek. Spent the next minute holding down y before aborting and re-running as fsck -y. When it eventually finished (*** FILE SYSTEM WAS MODIFIED *** -- surely not) I had hundreds of megabytes in /lost+found, and nothing else. I tried to see what was there, and recovered some key directories, but sadly /bin and /lib were reduced to numbered files.

I'll skip the pain of getting a CD-ROM hooked up, and the case fan which then decided to give up and make an awful noise, and discovering that the drive has bad blocks after all and rushing out to get a new one, and shoddy CD-Rs which don't write correctly at 48x. Happily the Red Hat re-install went well: it detected the RAID1 array on the other disks and configured it correctly for me. Once it had rebooted I essentially had to set the hostname, do some NFS mounts and configure the NFS export, and we were back in business. The machine needs more work, and I really should put the case back together, but at least CVS is back and the repository is fine. After all of this stress a quick blast of Enemy Territory was much appreciated...

07:40 Friday, 06 Feb 2004 [computers]