On Wed, Jul 28, 2004 at 04:58:13PM -0500, Troy Benjegerdes wrote:
> > Hrrm, I seem to have hit a somewhat serious bug..
> >
> > After a couple combinations of 'cfs er', 'cfs fl', 'cfs flushvolume',
> > etc, and a venus restart, I'm getting
> >
> > 16:39:16 Local inconsistent object at
> > /coda/hozed.org/user/hozer/.gnupg/gpg.confed.org:2,S, please check!
> >
> > messages from venus.
> >
> > There was never a 'gpg.confed.org:2,S' file.. this looks like part of a
> > maildir filename got appended onto the filename of the bogus conflicting
> > object.
> >
> > Do we have any testcases for resolution and conflicts that can exercise
> > all the code paths? Are there any coda testcases I can run at all?
> >
> > I just tried a "cfs purgeml", and killed venus...

The problem with reintegration conflicts...

What happens is that when reintegration fails, the locally cached objects are all copied into a 'local fake volume'. This is done so that the object 'foo/local' doesn't collide with the object 'foo/global'. However, the cleanup action was never really worked out correctly, or possibly got lost over time.

When repair succeeds, all the objects in the local fake volume are simply discarded and we refetch the correct data from the servers. When repair fails, they simply stay around in the fake volume in the hope that a later repair session will fix things up and we can discard them.

However, when venus is restarted, the salvager tries to move the objects back into their original place so that we can retry the reintegration. If the reintegration succeeds we're done, and if it fails we automatically end up in the previous repair state. But sometimes the linking doesn't really work out right and we end up messing up the CML, the volume, or the objects that had a conflict.

Whenever I try to fix something in the code related to local repair expansion, something else seems to break. Probably because I don't really understand everything it is trying to do.

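To make the sequence above easier to follow, here is a toy model of the object lifecycle as a small state table. This is only a sketch of the behaviour as described in this mail, not code from venus; the state and event names are made up for illustration, and the real code tracks far more (the CML, the volume, directory links).

```python
from enum import Enum, auto


class State(Enum):
    CACHED = auto()        # locally modified object in its original place
    FAKE_VOLUME = auto()   # parked in the 'local fake volume', awaiting repair
    REINTEGRATED = auto()  # changes accepted by the servers
    DISCARDED = auto()     # local copy dropped, server data refetched


class Event(Enum):
    REINTEGRATION_OK = auto()
    REINTEGRATION_FAILED = auto()
    REPAIR_OK = auto()
    REPAIR_FAILED = auto()
    VENUS_RESTART = auto()


# Hypothetical transition table, one entry per case described in the text.
TRANSITIONS = {
    (State.CACHED, Event.REINTEGRATION_OK): State.REINTEGRATED,
    (State.CACHED, Event.REINTEGRATION_FAILED): State.FAKE_VOLUME,
    (State.FAKE_VOLUME, Event.REPAIR_OK): State.DISCARDED,
    (State.FAKE_VOLUME, Event.REPAIR_FAILED): State.FAKE_VOLUME,
    # On restart the salvager moves objects back so reintegration can be retried.
    (State.FAKE_VOLUME, Event.VENUS_RESTART): State.CACHED,
}


def step(state, event):
    """Return the next state; events that do not apply leave the state unchanged."""
    return TRANSITIONS.get((state, event), state)


def run(events, state=State.CACHED):
    """Apply a sequence of events, starting from a locally modified cached object."""
    for event in events:
        state = step(state, event)
    return state
```

For example, a failed reintegration followed by a venus restart and a second failed reintegration ends up back in the fake volume, which is the "previous repair state" mentioned above. The bug discussed here sits in the restart transition, where the move back does not always leave things consistent.
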
I did start on a redesign/rewrite that avoids moving the objects into a local fake volume. Ideally we want to see something that is similar to server-server repair expansion, but with an added view on the locally cached copy. We cannot use the existing server-server expansion code because the expanded object/directory has the same file identifier as the local copy, which gives the same name collision problem as we have with local vs. global.

So I create a single conflict object that is patched into the tree where the real conflict appeared. That way I managed to make the expansion work, but the collapsing (taking the special conflict marker out) isn't really working yet.

Hopefully at some point we'll be able to remove most of the current local-repair special cases and have a single well-tested code path that is used by all types of (conflict) expansion.

Jan

Received on 2004-07-30 12:05:51