On Mon, Apr 19, 2004 at 04:08:52PM +0200, Johannes Martin wrote:
> I've got a problem reintegrating my laptop after it was disconnected. The
> laptop had been suspended for a few hours and files had been modified on
> the server. The laptop was then woken up but was still in disconnected
> mode as venus hadn't realized yet that the network connection was up.

Just over the past 2 weeks I found a bunch of reintegration/repair related
problems. It looks like they have been lurking for about half a year (or
more) and started with a local variable in a loop that obscured an
identically named one at the scope of a function, and a missed pointer
dereference when moving the conflicting CML entries to a special 'local
repair volume'.

> Now there are a few different scenarios:
> - sometimes, venus falls asleep as soon as I clog:
> 07:25:04 fatal error -- cmlent::thread: can't find (5086c288.7f000001.1.1)

Fixed this one. Because of the bad dereference, CML entries were not
correctly renamed, and the symptom was that we couldn't find the repair fid.
The bad fix was to map them back to the original file identifier. However,
repair related CML entries are created in several different ways, and so
the universal 'map to global fid' really doesn't work.

> - when I tried just now, I was able to clog, but repair failed:
> repair > beginrepair
> Pathname of object in conflict? []: jmartin
> No such replica vid=0xffffffff
> Could not allocate replica list
> beginrepair failed.
> cfs beginrepair had actually been executed and ls -l jmartin showed the
> following:
> total 4
> lrw-r--r-- 1 root nogroup 43 Apr 19 16:01 global -> \@7f000001.00000001.00000001\@notamusica.com
> drwxrwxrwx 28 root nogroup 4096 Apr 18 20:38 local/

Hmm, I guess that volume is replicated across multiple servers, because it
looks like there is a server-server conflict which originally caused the
reintegration to fail.
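As an aside on the first bug mentioned above (a loop-local variable hiding an
identically named one at function scope), the general pattern looks roughly
like the sketch below. This is not the actual venus code; the function names
and the "first negative entry" task are hypothetical and only illustrate how
such a mistake compiles cleanly and silently returns a stale value.

```c
#include <stddef.h>

/* Hypothetical example: return the index of the first negative entry
 * in v[0..n-1], or -1 if there is none. */

/* Buggy version: the declaration inside the loop shadows the outer 'found'. */
int first_negative_buggy(const int *v, size_t n)
{
    int found = -1;                 /* function-scope result */
    for (size_t i = 0; i < n; i++) {
        if (v[i] < 0) {
            int found = (int)i;     /* new variable, hides the outer one */
            (void)found;            /* silence unused-variable warnings */
            break;
        }
    }
    return found;                   /* outer variable was never assigned */
}

/* Fixed version: assign to the function-scope variable instead. */
int first_negative_fixed(const int *v, size_t n)
{
    int found = -1;
    for (size_t i = 0; i < n; i++) {
        if (v[i] < 0) {
            found = (int)i;
            break;
        }
    }
    return found;
}
```

Compilers do not flag this by default; with GCC or Clang the -Wshadow option
warns about the inner declaration.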
> Any hints on how to repair this problem (if it helps, I don't need any of
> the data that is locally cached).

With the existing 6.0.5 code this probably can't be fixed. It does recover a
bit when restarted, but not enough to reliably repair the conflict. If you
had cared about the local data, a snapshot that contains any new files
should be in /usr/coda/spool/<userid>/<volume>.tar (actually, it is probably
in /var/lib/coda/spool on Debian).

The easiest way out is to reinitialize the client: kill venus, unmount /coda
and restart venus with the -init flag,

    killall -9 venus
    umount /coda
    venus -init &

I was hoping to find at least one more (very slow) memory leak in the
servers before making a new release, but if I haven't found that by
Wednesday I'll probably start building whatever is currently in CVS as a new
release.

Jan

Received on 2004-04-19 14:00:17