Coda File System

Re: coda server replicant peer removal (and caveats)

From: Jan Harkes <jaharkes_at_cs.cmu.edu>
Date: Fri, 18 Jun 2010 16:07:18 -0400
On Wed, Jun 09, 2010 at 11:01:25PM -0700, don wrote:
> Ok, so I've created my replicant peer:
> 
> *) updated vice/db/servers
> *) run vice/db/volutil createrep
> *) updated vice/db/VRList (using volutil createrep output)
> *) run vice/db/volutil makevrdb
> *) run vice/db/bldvrdb.sh (new host)

If you went from a single replica to a doubly replicated group, then the
original replica doesn't have an active resolution log. There is no need
to resolve when it is the only replica, and there is no mechanism to
reclaim log entries when version differences between replicas can never
be detected.

So things will seem to work fine until the client detects a version
difference and triggers server-server resolution. If the conflict is
non-trivial we fall back on log-based directory resolution, and without
the resolution logs the servers give up and declare it an unrepairable
conflict. As an end-user you would mostly notice that you have to
manually repair more server-server conflicts than usual.

To enable the log, run
     volutil setlogparm <replica id> reson 4 logsize 16384

The logsize should ideally be set the same on all replicas. It is
normally initialized to 4096, which in my experience tends to be on the
small side. Not sure if it really has to be a power of two, but that is
what I've been using.
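
As a rough sketch of keeping the logsize identical everywhere, the snippet
below prints (it does not run) one setlogparm line per replica. The helper
name print_enable_log and the replica ids are made up for illustration; only
the volutil line itself comes from the command above.

```shell
# Dry run: print one 'volutil setlogparm' line per replica id, all with
# the same logsize. print_enable_log is a hypothetical helper and it
# executes nothing; review the output before running any of it.
print_enable_log() {
    logsize=$1
    shift
    for replica in "$@"; do
        echo "volutil setlogparm $replica reson 4 logsize $logsize"
    done
}

# The replica ids below are placeholders, not real volume ids.
print_enable_log 16384 0x1000001 0x2000001
```

Printing first makes it easy to check that every replica gets the same
logsize before anything is changed on a server.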

Now when you drop back down to a single replica, the log should be
disabled with 'volutil setlogparm <replica id> reson 0'. If the log is
kept around, it will slowly accumulate entries that will never get
resolved, and when it overflows the set size the server halts on an
assertion.
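
The rule of thumb (log on with more than one replica, log off with exactly
one) could be sketched like this. The helper log_setting_for and the replica
id are hypothetical, and the 16384 default is just the value I mentioned
above, not something the server requires.

```shell
# Print the setlogparm line suited to the size of the replication group.
# log_setting_for is a hypothetical helper; it only echoes a command.
# $1 = replica id, $2 = number of replicas, $3 = logsize (optional).
log_setting_for() {
    replica=$1
    count=$2
    logsize=${3:-16384}
    if [ "$count" -gt 1 ]; then
        # Replicated: the resolution log is needed for directory resolution.
        echo "volutil setlogparm $replica reson 4 logsize $logsize"
    else
        # Single replica: disable the log so it cannot fill up.
        echo "volutil setlogparm $replica reson 0"
    fi
}

# Placeholder replica id, group shrunk back to one replica.
log_setting_for 0x1000001 1
```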

> And, of course, bounced all the coda server services to get the
> updated vice/db/servers file.
> 
> Now, I want to remove the replicant entirely from coda.  How do I do
> this?
> 
> Obviously I'll need to update the vice/db/servers & vice/db/VRList,
> but what other commands ought I to run to completely remove the
> server from volume replication?

Well, for both growing and shrinking the volume replication group size
we're missing the parts that notify clients to recache the necessary
replication information. So all clients will have to be reinitialized to
forget about the removed replica.

Once the replica is removed from the VRList and 'volutil makevrdb' is
run, the server should no longer be referenced. This data is
automatically propagated to the other servers so a server restart is not
necessary. At this point clients can be reinitialized and will learn
about the volume's only location on whatever replica is remaining.

To remove the information about volumes that are/used to be on the
server, you would have to remove /vice/vol/remote/servername.list and
then run bldvldb.sh with the name of a remaining server, or remove the
server's name from /vice/db/servers and run bldvldb.sh without an
argument. You don't really have to restart any of the servers. They'll
know about the old server, but since nothing references it this doesn't
matter. Once they get restarted for some other reason they'll pick up
the fact that it is gone.
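
Put together, the server-side part of the removal might look roughly like the
dry run below, using the first of the two bldvldb.sh variants. The helper
print_removal_steps and both server names are invented for illustration, and
the commands are printed exactly as written in this mail, so check them
against your own installation (arguments, paths) before running anything.

```shell
# Dry run: print the server-side steps for dropping a replica's server.
# print_removal_steps is a hypothetical helper; it only echoes text.
# $1 = server being removed, $2 = a server that remains.
print_removal_steps() {
    removed=$1
    remaining=$2
    echo "# edit /vice/db/VRList: drop the replica hosted on $removed"
    echo "volutil makevrdb"
    echo "rm /vice/vol/remote/$removed.list"
    echo "bldvldb.sh $remaining"
    echo "# then reinitialize every client so it forgets $removed"
}

# Both host names are placeholders.
print_removal_steps oldserver keptserver
```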

Jan
Received on 2010-06-18 16:07:39