On Tue, May 03, 2005 at 08:38:36AM -0600, Patrick Walsh wrote:
> What would you suggest would be the best way to detect when a conflict
> occurs so that an administrator can be notified? Is there a particular
> message we can monitor one of the logs for? Or perhaps a cron job with
> a find command similar to this:
>
>     find . -type l -a -not \( -xtype f -o -xtype d \)

I typically use

    find . -lname '@*'

The problem is really that local-global conflicts only appear on the
client that failed to reintegrate. Server-server conflicts are seen first
by the client that noticed the version-vector differences when it called
getattr.

> Or perhaps monitoring the /usr/coda/spool directory? How is this
> managed in other places?

Not sure how others are doing it, but for most conflicts it is typically
a user that alerts me. I don't get all that many conflicts on our
'backend' servers because things like the hypermail mailing list archives
are actually built on the local disk and rsync'd over to /coda. If the
rsync gets stuck it only affects the client that is writing the update,
so users just don't see the new mails.

> Also, the repair utility doesn't seem to have a way to list what
> objects are in conflict -- you have to already know the full path to
> them. Are there any undocumented commands or shortcuts for using this
> utility?

One shortcut, in combination with the previous find, is

    find . -lname '@*' -exec repair {} /tmp/fix -owner 7768 -mode 755 \;

However, this only works reliably for directory conflicts. If there are
any file conflicts, this would overwrite them with the contents of the
fix file that was written by the previous directory repair.

> Would it help my situation if there was a minimum for the RTT estimate
> in the case where the estimate is near zero? That would make it so the
> server can take a moment to flush a file without the client write
> disconnecting.

There is a minimum RTT value which I think is 300ms.
That should be pretty conservative, especially since even a 10baseT
network tends to have <10ms RTTs. I think this value was picked because
it was 50% more than a typical roundtrip on a ppp link.

> > At the same time, the poor server is still
> > stuck waiting for the disk, and can't even dash off a quick ack telling
> > the client that it did get the request and is working on it.
>
> Are there any plans to make the server multi-threaded to avoid these
> sorts of bottle-necks?

There is a version of LWP that runs on top of pthreads. If that is used
when building RVM, it runs the flush/truncate daemon thread fully
concurrently. But the RPC2 socket listener still runs as a non-concurrent
thread. I have used a venus built this way for a bit when I was trying to
catch some memory leaks with valgrind. But overall it isn't totally
reliable and it is a bit slower.

It is also not really possible to go completely multi-threaded: a lot of
the code expects that threads are cooperative and that concurrency is
limited to places where we explicitly yield control.

Jan

Received on 2005-05-03 21:24:49
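
For the cron job idea raised at the top of this thread, a minimal sketch
of a wrapper around the `find . -lname '@*'` check might look like the
following. The function names `list_conflicts` and `report_conflicts` are
hypothetical, and the only assumption about Coda is the one the find
command already makes: an object in conflict shows up as a symlink whose
target starts with '@'. Since local-global conflicts only appear on the
client that failed to reintegrate, something like this would have to run
on each client rather than in one central place.

```shell
#!/bin/sh
# Hypothetical helper functions, not part of Coda.
# Assumption (taken from the find command in this thread): an object in
# conflict appears as a symlink whose target begins with '@'.
# Note: -lname is a GNU find extension.

list_conflicts() {
    # $1: directory to scan, for example a volume mounted under /coda
    find "$1" -lname '@*' -print 2>/dev/null
}

report_conflicts() {
    # Print a short report and return 1 when conflicts are found,
    # return 0 and print nothing otherwise.
    conflicts=$(list_conflicts "$1")
    if [ -n "$conflicts" ]; then
        printf 'Coda conflicts under %s:\n%s\n' "$1" "$conflicts"
        return 1
    fi
    return 0
}
```

A small script that sources these functions and calls `report_conflicts`
on a directory of your choosing could be run from cron. Because it prints
nothing when no conflicts are found, cron's usual behaviour of mailing any
job output to the crontab's MAILTO address would only notify the
administrator when there is something to look at.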