(Illustration by Gaich Muramatsu)
On Mon, Apr 12, 2010 at 04:01:45AM -0700, root wrote:
> >>This is unclear to me. I do see that my coser based install of
> >>coda/vice server does, in fact, have both a vice/db/scm and a
> >>vice/hostname, and they match.
> >
> >They match on the scm machine and do not match on all other ones.
>
> Ok, so we would need to update "scm" [on all servers] to be the new
> SCM, and it would need to match that shown in /vice/hostname of the
> new SCM,

This file is only used when the update daemons are started (or restarted): on the SCM machine this starts the update server daemon, while on all other machines it starts an update client that connects to the SCM and receives any updates made to files in /vice/db.

It is a pretty brain-dead master-slave replication mechanism for things like the list of volumes each server is supposed to be exporting and the user/group databases: data that is mostly read-only, server admin stuff, but where we don't want conflicting updates between different sites. Essentially all of the data replicated by the update daemons could be stored in an LDAP database.

If the SCM goes down (disappears), the worst that happens is that users cannot change their passwords and we cannot add new volumes to the system. Because each server has a full copy of all files in /vice/db, it is possible to move the R/W responsibility to any other server by updating the /vice/db/scm file and restarting the update daemons on all servers.

None of the file data that is stored by Coda clients is replicated by this mechanism. That is propagated between Coda servers through a conflict resolution process. The current strategy is that a client writes a new version of a file to a single server and then notifies all replicas for that volume that there is a conflict; all server replicas then check each other's versions and fetch the latest one.
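The write/notify/fetch cycle described above can be sketched as a toy model (a simplification in Python; the class and function names are illustrative, not Coda's actual API, and real resolution compares version vectors rather than a single counter):

```python
# Toy model of the described strategy: a client writes a new version of
# a file to a single server, the replicas are told there is a conflict,
# they compare versions, and the stale ones fetch the latest copy.

class Replica:
    def __init__(self, name):
        self.name = name
        self.version = 0      # simplified stand-in for a version vector
        self.data = b""

def client_write(replicas, target, data):
    """Client writes to one server, then notifies all replicas."""
    target.version += 1
    target.data = data
    resolve(replicas)         # "there is a conflict" notification

def resolve(replicas):
    """Each replica checks the others' versions and fetches the latest."""
    newest = max(replicas, key=lambda r: r.version)
    for r in replicas:
        if r.version < newest.version:
            r.version = newest.version
            r.data = newest.data

replicas = [Replica("vice1"), Replica("vice2"), Replica("vice3")]
client_write(replicas, replicas[0], b"new contents")
# after resolution, all replicas agree on the newest version
```

This also shows why a crashed server is a problem: it simply is not in the list participating in `resolve`, so it stays behind until it is noticed later.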
If a server misses an update, this will be discovered the next time a client fetches the file's attributes and notices a version difference between replicas.

To assist in conflict resolution there is a per-volume update log that is used when directories on more than one server were updated before resolution (effectively, the versions branched). This log is normally truncated whenever we have successfully resolved all known replicas. But when a server has crashed it will not participate in conflict resolution, so although the available replicas can be in sync, the log cannot be truncated.

This log is stored in RVM, and its size can be changed using the 'volutil setlogsize' command on the server. It defaults to something like 8000 log entries, but this can run out pretty quickly on a busy volume. Oh, and the implementation uses one entry as a synchronization point for each directory, so a volume with a lot of directories keeps a lot of these entries busy; as such, filling up this log is often one of the first scalability problems people hit.

> Where are the modification logs so we could monitor them?

They are stored inside of the server's RVM data segment. The only way I know of checking the size is to run 'volutil info <volumeid> | head -3' (although volutil info's help says it can handle volume names, that hasn't worked for me in the past; you need the numeric volume id).

Jan
Received on 2010-04-16 10:04:15
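As a back-of-the-envelope illustration of that last point (the ~8000-entry default and the one-entry-per-directory synchronization point come from the description above; the helper itself is hypothetical, not a Coda tool):

```python
def free_log_entries(log_size, n_directories):
    """Hypothetical estimate: each directory in the volume pins one
    resolution-log entry as its synchronization point, so only the
    remainder is available for recording branched updates."""
    return max(0, log_size - n_directories)

# With the default of roughly 8000 entries, a volume holding 7500
# directories leaves only about 500 entries for actual update records,
# which a busy volume can exhaust quickly.
print(free_log_entries(8000, 7500))
```

So on directory-heavy volumes it is worth raising the limit with 'volutil setlogsize' before resolution starts failing.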