On Sat, Jul 05, 2003 at 05:17:58PM +0200, Peter Sch?ller wrote:
> While testing disconnected operation I have run into a new problem. When
> issuing a reconnect followed by a checkservers after having made
> modifications in disconnected mode, venus often ("often"? see below)
> fails with errors such as:
>
> 15:27:55 to /usr/coda/spool/1000/coda.root@_coda.tar
> 15:27:55 and /usr/coda/spool/1000/coda.root@_coda.cml
> 15:27:55 Reintegrate: coda.root, 100/544 records, result = Unknown error: 198
...
> This happens on both my Linux 2.4.20 system with Coda 5.3.20, and on my
> FreeBSD 4.8 STABLE (cvsup:ed today and built today) with the same version of

Yeah, that is fixed with the 6.0 servers. It only happens when you have
singly replicated volumes, and all of our volumes happen to have 2 or 3
replicas, so I never hit the problem.

Basically, the version vector test is too restrictive, so the
reintegration aborts with what it believes are conflicting updates (in
fact, they are the updates it reintegrated from the same client moments
ago).

> Coda. On Linux it's not even possible to restart Venus because /coda is never
> unmounted, so I have to reboot each time. On FreeBSD I can just restart

We don't automatically unmount on Linux because that fails pretty much
all the time (i.e. as long as any process has its cwd in Coda or an open
file reference). But after killing venus you can always try to unmount
it by hand. If that fails, search with lsof for processes that hold a
reference to /coda (lsof | grep /coda), kill those, and retry the
umount.

> 15:38:21 volume coda.root has unrepaired local subtree(s), skip checkpointing
> CML!
> 15:43:22 Reintegrate coda.root pending tokens for uid = 1000

Correct, a reintegration conflict was detected and has not yet been
resolved by the user. And since the client was restarted, you obviously
don't have tokens, which explains the second message: we can't
reintegrate until the user has obtained authentication tokens.

> Note that when it breaks it will flush a few hundred changes successfully
> (looking at the venus log and the output of cfs listvol /coda), but then get
> stuck somewhere with the above errors.

That sounds like the singly replicated volume reintegration problem. The
first reintegration succeeds, but the second one fails because the
directory version vectors on the server are not what the client expected
(due to the modifications made by the first reintegration).

Jan

Received on 2003-07-05 15:32:08
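To make the failure mode above concrete, here is a rough toy model in Python. It is only a sketch and not Coda's actual code: the class ToyServer, the function replay, and the strict equality check are all assumptions made up for illustration. It shows how a first reintegration chunk can succeed while the second is rejected, simply because the first one changed the version vector the client still expects. It does not try to model why volumes with 2 or 3 replicas avoid the problem.

```python
# Toy model only: NOT Coda's real data structures or server logic.
# ToyServer and replay are hypothetical names used for illustration.

class ToyServer:
    """Tracks the version vector of one directory on a single replica."""

    def __init__(self):
        self.vv = [1]  # single replica, so a one-slot version vector

    def reintegrate_strict(self, expected_vv):
        """Accept a chunk only if the client's expected vector matches exactly."""
        if expected_vv != self.vv:
            return "conflict"
        # Applying the chunk modifies the directory, so its version advances.
        self.vv = [v + 1 for v in self.vv]
        return "ok"


def replay(server, chunks):
    """Send each chunk with the vector the client cached before the first one."""
    cached_vv = list(server.vv)  # snapshot taken once and never refreshed
    return [server.reintegrate_strict(cached_vv) for _ in range(chunks)]


results = replay(ToyServer(), chunks=2)
print(results)  # ['ok', 'conflict']
```

In this sketch the "conflict" on the second chunk is caused entirely by the client's own earlier update, which matches the symptom of a few hundred records flushing successfully before the reintegration gets stuck.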