Coda File System

Re: coda as a high availability solution

From: Jan Harkes <jaharkes_at_cs.cmu.edu>
Date: Thu, 19 Jul 2001 15:44:47 -0400
On Thu, Jul 19, 2001 at 08:59:30PM +0200, Cain Ransbottyn wrote:
> > A larger cache also has disadvantages, longer startup time, and more
> > data to revalidate after a disconnection. Typically I try to get the
> > cache-size between 1x and 2x the 'active working set'. But caches larger
> > than about 200MB are sure to tickle a few problems as some cache-wide
> > operations don't scale nicely.
> 
> Thanks for all the threads today... most of my question were answered... now
> correct me if i'm wrong but this is my new idea :
> 
> The 'web clients' will mount the coda filesystem on the primary coda server.
> What happens if the primary server fails ? Is it possible to configure your

Coda has no 'primary' or 'secondary' servers. It uses a modified
write-all, read-one replication strategy. In effect all servers are
peers of each other.
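
If it helps, a rough sketch of the idea in Python (just a toy model,
not Coda's actual code; the class and function names are made up):

    class Replica:
        def __init__(self, name):
            self.name = name
            self.up = True
            self.store = {}              # object path -> data

    def write_all(replicas, path, data):
        """Send the update to every replica that is currently reachable."""
        reached = [r for r in replicas if r.up]
        for r in reached:
            r.store[path] = data
        return reached

    def read_one(replicas, path):
        """Fetch the data from any single reachable replica."""
        for r in replicas:
            if r.up:
                return r.store.get(path)
        raise TimeoutError("no replica in the VSG is reachable")

    # Three peer servers in a volume storage group.
    vsg = [Replica("server-a"), Replica("server-b"), Replica("server-c")]
    write_all(vsg, "/coda/www/index.html", b"v1")
    vsg[1].up = False                              # one server fails
    write_all(vsg, "/coda/www/index.html", b"v2")  # the others still take writes
    print(read_one(vsg, "/coda/www/index.html"))   # -> b'v2'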

When one of the servers in the volume storage group fails, the clients
continue to write to the remaining servers (the available volume storage
group). Clients use a two-phase commit, so after the second phase the
remaining servers know that some server has missed updates, and they
record the changes in a persistent log.
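
Again only as a toy illustration (invented names, not the real
COP1/COP2 server-server messages): the available servers end up
logging the operation whenever some member of the VSG missed it.

    def make_server():
        return {'store': {}, 'log': []}

    def update_with_2pc(vsg, path, data):
        """vsg: server name -> state dict, or None if unreachable."""
        available = {name: s for name, s in vsg.items() if s is not None}

        # Phase 1: ship the data to every available server.
        for s in available.values():
            s['store'][path] = data

        # Phase 2: distribute the participant list; a server that sees a
        # VSG member missing records the change for later resolution.
        missed = set(vsg) - set(available)
        if missed:
            for s in available.values():
                s['log'].append({'path': path, 'missed': sorted(missed)})

    vsg = {'server-a': make_server(),
           'server-b': make_server(),
           'server-c': None}                     # this one is down
    update_with_2pc(vsg, '/coda/www/index.html', b'new page')
    print(vsg['server-a']['log'])
    # -> [{'path': '/coda/www/index.html', 'missed': ['server-c']}]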

Once the dead server comes back, the version vectors in its replies
will differ from those of the servers that remained available. Any
client that observes these VV differences triggers resolution for that
object on all servers, which then start exchanging version vectors and
logs until they agree on who has the latest version, etc.
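
A version vector here is basically a per-server update counter for the
object. A rough sketch of the comparison (simplified; the real
ViceVersionVector also carries a store identifier):

    def compare_vv(a, b):
        """Return 'equal', 'a-newer', 'b-newer' or 'conflict'."""
        servers = set(a) | set(b)
        a_ahead = any(a.get(s, 0) > b.get(s, 0) for s in servers)
        b_ahead = any(b.get(s, 0) > a.get(s, 0) for s in servers)
        if a_ahead and b_ahead:
            return 'conflict'      # concurrent updates on both sides
        if a_ahead:
            return 'a-newer'       # b missed updates, can be brought up to date
        if b_ahead:
            return 'b-newer'
        return 'equal'

    # The recovered server replies with an older vector, so the client
    # notices the difference and triggers resolution.
    print(compare_vv({'server-a': 4, 'server-b': 4, 'server-c': 4},
                     {'server-a': 3, 'server-b': 3, 'server-c': 3}))
    # -> 'a-newer'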

This is a pretty resilient replication strategy, but as it is
optimistic replication it is sensitive to conflicts. Sometimes the
servers cannot agree on the latest version, or cannot find a common
point in their logs, and declare defeat; at that point it is up to the
user to intervene and repair the conflict by deciding which of the
servers is right. The fact that in some cases a user has to make such a
decision is, in my opinion, one reason why Coda is not really suited
for high-availability deployments.
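
To make the conflict case concrete with the compare_vv sketch above
(the counters are hypothetical):

    # Both sides of a partition updated the object independently.
    print(compare_vv({'server-a': 4, 'server-b': 3, 'server-c': 3},
                     {'server-a': 3, 'server-b': 4, 'server-c': 3}))
    # -> 'conflict': neither vector dominates, so automatic resolution
    #    gives up and the object is left for a user to repair.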

> client that it will go to the second coda server and will keep doing his
> 'production' ? What if the second server fails (network outage,...) is it
> possible that the coda cache has 'everything'? Let's say the fileserver has

Basically, a client that can cache everything only talks to the
servers to fetch files that have been updated there. When an object is
updated, the servers send callbacks to all clients that might have a
stale copy cached.
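
A toy model of the callback mechanism (invented names, not the actual
venus/vice interfaces):

    class Server:
        def __init__(self):
            self.data = {}
            self.promises = {}                  # path -> set of clients

        def fetch(self, client, path):
            # Hand out the data plus a promise to notify this client.
            self.promises.setdefault(path, set()).add(client)
            return self.data.get(path)

        def store(self, client, path, new_data):
            self.data[path] = new_data
            # Break the callback for everyone else with a possibly stale copy.
            for other in self.promises.get(path, set()) - {client}:
                other.break_callback(path)
            self.promises[path] = {client}

    class Client:
        def __init__(self, name):
            self.name = name
            self.cache = {}

        def break_callback(self, path):
            self.cache.pop(path, None)          # drop the stale cache entry

    server = Server()
    a, b = Client("venus-a"), Client("venus-b")
    server.store(a, "/coda/www/index.html", b"v1")
    b.cache["/coda/www/index.html"] = server.fetch(b, "/coda/www/index.html")
    server.store(a, "/coda/www/index.html", b"v2")  # b's copy is now stale
    print("/coda/www/index.html" in b.cache)        # -> False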

If all servers are unreachable, the client will continue as it did
before, but as it's now disconnected it will return ETIMEDOUT for every
file request that is not locally cached. Write operations are logged and
sent back to the servers when they become reachable again. Here again is
a potential for conflicts where a user has to intervene and decide what
to keep. The client will also revalidate all cached objects by
comparing version vectors; when the VVs are equal, the callback
promises for these files are restored.
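
Roughly, in the same toy style (simplified; the real client keeps its
modification log in recoverable memory and reintegrates it when the
servers come back):

    import errno

    class CacheManager:
        def __init__(self):
            self.cache = {}        # path -> data
            self.log = []          # operations recorded while disconnected
            self.connected = False

        def read(self, path):
            if path in self.cache:
                return self.cache[path]
            if not self.connected:
                # Cache miss with no servers reachable.
                raise OSError(errno.ETIMEDOUT, "not cached and disconnected", path)
            # connected case (fetch from a server) omitted

        def write(self, path, data):
            self.cache[path] = data
            if not self.connected:
                self.log.append(('store', path, data))

        def reconnect(self, replay):
            # Send the logged operations back to the servers; any that
            # clash with server-side updates become conflicts to repair.
            for op in self.log:
                replay(op)
            self.log.clear()
            self.connected = True

    venus = CacheManager()
    venus.write('/coda/mail/inbox', b'cached mail')   # logged for reintegration
    print(venus.read('/coda/mail/inbox'))             # served from the local cache
    # venus.read('/coda/mail/archive')  # would raise OSError with ETIMEDOUT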

> 60 gig.. 20 gig for web and 10 gig for mail... 30 gig free of space... is it
> possible that the 'web client' has a 'cache' of 20 gig so if the server
> fails, he can rely on his local cache... and when the server will be online
> again he can sync the data again... This way the local clients always have a
> working cache and a exact mirror copy of the production environment ?

Yes. That's the idea, although a 20GB local client cache is currently a
bit idealistic.

> Am I dreaming here or is this setup possible? Does somebody else knows a
> better failover/high availability solutions for us ?

Shared SCSI, RAID over a network block device, GFS.
Check www.linux-ha.org for more references.

Jan
Received on 2001-07-19 15:45:21