Coda File System

Re: Recovering from lost RVM log partition..

From: <jaharkes_at_cs.cmu.edu>
Date: Mon, 15 Mar 1999 20:55:09 -0500
troybenj_at_scl.ameslab.gov said:
| I have two servers set up to evaluate coda, and for performance
| reasons, I put the RVM log partition on a separate disk (along with
| some extra swap partitions). This disk, being rather old, then
| proceeded to crash about two weeks ago. 
|
| Because of coda's replication (and limited use of the machines), I
| didn't notice the server (and disk) had died for a couple of days.
|
| I've made dump files of the replicated volumes I want to keep from
| the remaining working server. My question is: Do I have to
| re-initialize RVM on both the servers, or can I get by with only the
| server which had the disk crash? 

Hmm, interesting.

<dream mode>
I do not know that much about RVM, but it might even be possible to get
away without reinitializing anything. Just recreate the RVM log, and if
all transactions were flushed from the log before the disk crashed, there
should be no problem.
</dream mode>

In any case, because of the replication, you only need to reinitialize
the crashed server and then recreate the volumes that existed on it
before the crash/reinitialization.

This does require some info, which you can get from /vice/vol/VolumeList
on the crashed server and /vice/vol/VRList on the SCM. The old contents
of these files are lost once volumes are created, so make backup copies
first.
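
A minimal sketch of that backup step, as a small shell function. The
function name and the destination directory are hypothetical; only the
two /vice/vol paths come from the text above.

```shell
# Sketch: copy list files into a backup directory before recreating volumes.
# backup_lists and the destination directory are illustrative names only.
backup_lists() {
    dest=$1
    shift
    mkdir -p "$dest" || return 1
    for f in "$@"; do
        if [ -f "$f" ]; then
            # -p keeps the original timestamps, handy when comparing later.
            cp -p "$f" "$dest/" || return 1
        else
            echo "warning: $f not found, skipped" >&2
        fi
    done
}
```

On the crashed server this might be called as
`backup_lists /root/coda-lists /vice/vol/VolumeList', and on the SCM as
`backup_lists /root/coda-lists /vice/vol/VRList'.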

The command to recreate the volumes is:
  volutil create_rep <partition> <volumename> <grpid> <rw-volid>

volumename is the `regular' name with a .number extension. (see VolumeList)
grpid is the replicated volume id, something like 7F000xxx. (see VRList)
rw-volid is the `local' volume id. (see VolumeList)

For instance:
Wvmm:s.info.0 Ic9000085 Hc9 P/vicepa m0 M0 U1739 Wc9000085 C35.....
               ^^^^^^^^ rw-volid

The easiest way is to prepare a script to run once the server has come
up, e.g.:

  volutil create_rep /vicepa vmm:s.info.0 7F000001 c9000085
  volutil create_rep /vicepa vmm:user.0 7F000002 c9000086
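
Lines like the two above could probably be generated from saved copies
of VolumeList and VRList rather than typed by hand. The following is only
a sketch under assumptions: the VolumeList field prefixes (W = name,
I = rw-volid, P = partition) are taken from the sample line shown earlier,
but the VRList layout (`<name> <grpid> <count> <volid> ...') is assumed,
so check it against your own file. The function name gen_create_rep is
made up. It only prints the commands; review them before running any.

```shell
# Sketch: print one "volutil create_rep" line per W line in a saved VolumeList.
# ASSUMPTION: each VRList line is "<name> <grpid> <count> <volid> [<volid> ...]".
# VolumeList fields are prefixed as in the sample: W=name, I=rw-volid, P=partition.
gen_create_rep() {
    volumelist=$1
    vrlist=$2
    awk '
        # First file (VRList): map every non-zero id from field 4 on to its group id.
        NR == FNR {
            for (i = 4; i <= NF; i++)
                if ($i != "0") grp[tolower($i)] = $2
            next
        }
        # Second file (VolumeList): handle lines that start with W, as in the sample.
        /^W/ {
            name = substr($1, 2); id = ""; part = ""
            for (i = 2; i <= NF; i++) {
                if (id == "" && $i ~ /^I/) id = substr($i, 2)
                else if (part == "" && $i ~ /^P/) part = substr($i, 2)
            }
            if (id != "" && part != "" && (tolower(id) in grp))
                printf "volutil create_rep %s %s %s %s\n", part, name, grp[tolower(id)], id
            else
                printf "# no VRList entry found for %s (%s)\n", name, id
        }
    ' "$vrlist" "$volumelist"
}
```

Something like `gen_create_rep VolumeList.bak VRList.bak > recreate.sh'
would then give a script to inspect and run on the reinitialized server.
Volumes without a matching VRList entry come out as comment lines, so
they are easy to spot.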

After all the volumes have been recreated, the new server will get the
data from the surviving server through resolution. You can trigger this
by running `ls -lR /coda' on a client. The `volmunge' script is useful
here because it only walks within a single volume. Don't forget to run
`cfs cs <reinited-server> <survivor>' to make sure the client is talking
to both servers.

Jan
Received on 1999-03-15 20:56:40