Hello,

I want to share a success story and document the steps I took (in essence, I followed Jan's recommendations and read some man pages).

I am running a two-server setup with all volumes replicated. One of the servers got a corrupted volume and refused to start, complaining about an assertion error when running the salvager on one of the volumes. [Think of an irreparable fsck problem on a central NFS server? ;-] Well, my other server stayed online, so the system kept running.

The steps to revive the crashing server:

Get the info about the volume:

  [any of the servers]# grep <volname> /vice/db/VRList

You get

  <volname> <groupid> <replica_num> <volid_serv1> <volid_serv2> 0 0 0 0 0 0 <VSG>

Now you have to find which <volid> you are interested in:

  [any of the servers]# grep <host_with_dead_server> /vice/db/servers

You get

  <hostname> <serverid>

The <serverid> is a small number and it matches the beginning of the <volid>; e.g. my dead server has number 2 and the matching volid was 200004d.

Let us get more information about the volume (and destroy it, as it is broken):

  [the host with the dead server]# grep rvm_ <where-you-have-it>/server.conf
  rvm_log="<LOG>"
  rvm_data="<DATA>"
  rvm_data_length="<LENGTH>"

  [the host with the dead server]# norton -mapprivate <LOG> <DATA> <LENGTH>

(norton is a lot faster with -mapprivate than without it)

  norton> show volume <volid>
      Id: 0x<volid>  Name: <name>.<digit>  Parent: 0x200004d
      GoupId: 0x<groupid>  Partition: <partition>

As I had to remove the broken volume:

  norton> delete volume <volid>
  norton> Ctrl/D

Now start the server, watch "tail -f /vice/srv/SrvLog", and see the volume being destroyed instead of the server crashing.

When the server is up, the moment of truth has come. Run (substituting the values from above):

  [the host with the now-alive server, missing one volume]# \
      volutil create_rep <partition> <name>.<digit> 0x<groupid> 0x<volid>

Now go to a well-connected client and do "ls -alR" on the corresponding mountpoint.
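As an aside on the lookup step above (finding which <volid> in the VRList line belongs to the dead server), here is a minimal shell sketch. The function name pick_volid is hypothetical, and it assumes that the server id sits in the top byte of the 32-bit volume id. That assumption is consistent with the example (server 2, volid 200004d) but is not stated above, so check it against your own VRList before relying on it. The sample group id and VSG value in the usage comment are made up for illustration.

```shell
#!/bin/sh
# pick_volid VRLIST_LINE SERVER_ID
# Prints the replica volume id (hex, as written in VRList) that lives on
# the server with the given numeric id, or returns 1 if none matches.
# ASSUMPTION: the top byte of the 32-bit volid is the server id.
pick_volid() {
    line=$1
    want=$2
    # Split the VRList line into fields:
    #   <volname> <groupid> <replica_num> <volid_serv1> <volid_serv2> ...
    set -- $line
    count=$3
    shift 3
    i=0
    # Only look at the first <replica_num> volids; the rest is padding.
    while [ "$i" -lt "$count" ] && [ "$#" -gt 0 ]; do
        volid=$1
        # Parse the volid as hex and keep only the top byte.
        if [ "$(( 0x$volid >> 24 ))" -eq "$want" ]; then
            printf '%s\n' "$volid"
            return 0
        fi
        shift
        i=$((i + 1))
    done
    return 1
}

# Usage (hypothetical sample line; only 200004d comes from the text above):
#   pick_volid "myvol 7f000010 2 100004c 200004d 0 0 0 0 0 0 E0000107" 2
```

Comparing the parsed top byte, rather than matching the leading characters as text, avoids confusing server 1 with a server whose id merely begins with the digit 1 in hex.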
You can watch the resolution happen by looking at the results of

  volutil -h <hostname-of-serverN> info <volid_servN> | grep diskused

for the servers concerned (use the output of "grep <volname> /vice/db/VRList" above).

Enjoy Coda!
--
Ivan

Received on 2003-03-02 04:36:13