On Sat, Sep 06, 2003 at 03:52:03PM +0530, Mahesh wrote:
> Hi all,
> I need a distributed set up where we have many machines,each
> containing their respective coda server and coda client with
> replication. Each machin can be added and removed from the
> distributed setup without modifications to any machines.

In a way that is a tough question, but I'll start with some answers.

First of all, you need to have several machines responsible for your 'realm'. Once this is set up correctly, all clients will be able to connect even when either server A or B is offline. This is normally done with IN SRV DNS records. Let's assume you have 2 machines, A.localrealm and B.localrealm. The DNS configuration would look something like

    _codasrv._udp.localrealm  IN SRV 10 0 2432 A.localrealm.
                              IN SRV 10 0 2432 B.localrealm.

Alternatively (if you can't add DNS records, or your DNS servers don't support IN SRV type records, which is my situation here at CMU), you have to specify this information in /etc/coda/realms on each client.

/etc/coda/realms:
    localrealm A.localrealm B.localrealm

Second, all of the servers responsible for the realm that we specified this way should have a replica of the rootvolume. There can be no more than 8 replicas, but 2 or 3 is typically enough. I actually don't have anything higher than triple replication.

Finally, to make conflict resolution more reliable, servers keep a log of operations that haven't been committed by all other servers in a replicated group. These logs have a finite size (afaik somewhere between 4000 and 8000 operations) and a server doesn't like to run out of resolution log space. As a result, any machine can only be taken offline for a limited period. The resolution log size can be enlarged by the administrator (volutil setlogparms), but that really only postpones the time to failure by a bit.

If all other volumes within your realm have at least 2 replicas, then any one server can be taken offline for some period of time.
If each volume has three replicas, then two servers can be taken offline. Of course, as long as you make sure that at least one server for every replicated volume is available, you can bring even more servers offline at the same time.

Jan

Received on 2003-09-07 18:32:30
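The rule of thumb in the message (a volume stays reachable as long as at least one of its replica servers is up) can be checked mechanically before taking machines down. The following is only a rough sketch in Python, not part of any Coda tool: the function name, the volume names, and the third server C.localrealm are made up for illustration. It also ignores the resolution log limit described above, so it says nothing about how long a server may safely stay offline.

```python
# Hypothetical illustration: names below are invented, this is not a Coda utility.
MAX_REPLICAS = 8  # the message states a volume can have no more than 8 replicas


def unavailable_volumes(replicas, offline):
    """Return the volumes that have no replica on a server that is still up.

    replicas: dict mapping volume name -> list of servers holding a replica
    offline:  iterable of server names that are currently down
    """
    offline = set(offline)
    lost = []
    for volume, servers in replicas.items():
        if len(servers) > MAX_REPLICAS:
            raise ValueError(f"{volume}: more than {MAX_REPLICAS} replicas")
        # A volume is unreachable only when every one of its replicas is down.
        if all(server in offline for server in servers):
            lost.append(volume)
    return sorted(lost)


# Assumed layout: a triply replicated rootvolume and a doubly replicated volume.
replicas = {
    "rootvolume": ["A.localrealm", "B.localrealm", "C.localrealm"],
    "users": ["A.localrealm", "B.localrealm"],
}

print(unavailable_volumes(replicas, {"A.localrealm"}))
print(unavailable_volumes(replicas, {"A.localrealm", "B.localrealm"}))
```

With this assumed layout, taking A offline leaves every volume reachable, while taking both A and B offline loses the doubly replicated volume but not the rootvolume.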