(Illustration by Gaich Muramatsu)
On Fri, 2003-01-10 at 09:21, ext Jan Harkes wrote: > > In any case 'errorcode 198' is EINCOMPATIBLE. It is returned when we're > trying to write (store) data to a file that has been modified, i.e. the > store-id of the original copy on the client doesn't match the store-id > of the file on the server. > > Your server log should show a message similar to, > CheckStoreSemantics: (0x7f00002a.0x76a.0x44d0), VCP error (198) 15:43:08 Callback failed RPC2_DEAD (F) for ws 192.103.16.64:32772 15:43:08 Unbinding RPC2 connection 1000633518 15:43:08 Unbinding RPC2 connection 758732922 15:44:50 client_GetVenusId: got new host 192.103.16.64:32809 15:44:50 Building callback conn. 15:44:50 No idle WriteBack conns, building new one 15:44:50 Writeback message to 192.103.16.64 port 32809 on conn 188968d3 succeeded 15:44:50 RevokeWBPermit on conn 188968d3 returned 0 15:50:08 SmonDaemon timer expired 15:50:08 Entered CheckRVMResStat 15:50:08 Starting SmonDaemon timer 15:59:23 Callback failed RPC2_DEAD (F) for ws 192.103.16.64:32809 15:59:23 Unbinding RPC2 connection 89471738 15:59:23 Unbinding RPC2 connection 274774048 16:02:30 client_GetVenusId: got new host 192.103.16.64:32769 16:02:30 Building callback conn. 16:02:30 No idle WriteBack conns, building new one 16:02:30 Writeback message to 192.103.16.64 port 32769 on conn 88d471b succeeded 16:02:30 RevokeWBPermit on conn 88d471b returned 0 16:09:16 client_GetVenusId: got new host 192.103.16.64:32770 16:20:06 RevokeWBPermit on conn 88d471b returned 0 16:20:06 CheckCreateSemantics: (1000002.3.42), VCP error (198) 16:20:06 Entering VFlushVnode for vnode 0x3 16:20:06 Entering VFlushVnode for vnode 0x5 16:20:06 Entering VFlushVnode for vnode 0x7 16:20:06 Entering VFlushVnode for vnode 0x5a 16:20:06 Entering VFlushVnode for vnode 0x5c 16:20:06 Entering VFlushVnode for vnode 0x5e 16:20:06 Entering VFlushVnode for vnode 0x60 and then today: 11:38:23 S_VolSetLogParms: res flag on volume 0x1000002 set to 4 (resolution enabled) 11:38:23 S_VolSetLogParms: volume 1000002 log parms set 11:39:02 Building callback conn. 11:39:02 No idle WriteBack conns, building new one 11:39:02 Writeback message to 192.103.16.171 port 32769 on conn 17cc0aae succeeded 11:39:07 RevokeWBPermit on conn 17cc0aae returned 0 11:39:07 ValidateVolumes: 0x7f000001 failed! 11:42:18 RevokeWBPermit on conn 17cc0aae returned 0 11:50:23 SmonDaemon timer expired 11:50:23 Entered CheckRVMResStat 11:50:23 Starting SmonDaemon timer I just noticed that the clock on my server is a day ahead. Not sure how that happened. Will Coda barf if I move it back a day? > > Looking at the server code, I just noticed that when resolution is > disabled, the code uses a stronger test that would validate the store-id > of directories. And we've been turning resolution off on newly created > volumes that have only a single replica because of some other problem. > It could be that this has made reintegration more susceptible to > failures. This is just one theory, but > 'volutil setlogparms <volume replica id> reson 4' > will turn resolution back on. I tried this (see above), but didn't solve the problem. Whether or not it would have stopped it from occurring in the first place, can't say. > > Another interesting fact is that the first entry in your CML is a store. > Perhaps the client got disconnected during the connected store attempt, > and this is essentially a replay of an already committed operation. I > believe that I once noticed that operations were given a new store-id > when the code falls back to logging the operation in the CML after a > failed connected store operation. > Well, in mine, the first item in the CML is a Create, so assuming we're hitting the same bug, that's a red herring. If you want to log in and look at it, I can give you access to the server -- there's nothing secret on it (yet). I'd have to think about it before I gave you access to the laptop, though. --RodReceived on 2003-01-10 15:03:15