Jan Harkes wrote:

>>03:55:48 starting VDB scan
>>03:55:48 Fatal Signal (11); pid 6701 becoming a zombie...
>>03:55:48 You may use gdb to attach to 6701
>>
>>
>Between 'starting VDB scan' and the next message 'N volume replicas' is
>only a little bit of code. We iterate over the list of all known volumes
>and reset the non-persistent data. The only way that this could crash is
>if the linked list is somehow messed up. I don't know how your client
>got into that state since RVM should guarantee that any updates to this
>list are either atomically committed or aborted.
>
>So I have a pretty good idea where it crashed, but no idea how it
>managed to crash there.
>
>

Maybe it is caused by my manual editing of these files [1] to correct
the wrongly detected machine names. Probably I should remove everything
else and install again. I've got a lot of such experience anyway.

My general impression is that coda sometimes works and sometimes does
not, given nearly the same installation procedure on a couple of testing
machines. Maybe because the scripts didn't shut down the processes
correctly. Maybe because of the "strange" network configurations I have.

But still, sometimes I have no way to discover where the problems are.
A process can freeze without my knowing what it is waiting for [2]. And
in most cases, the messages logged are simply not enough to track down
what is configured wrong.

Thanks for your help anyway.

[1] - hostname
    - db/scm/
    - db/servers
    - db/vicetab
    - vol/remote/*
    - vol/BigVolumeList

[2] sltam_at_beta:/coda$ date; ls -l; date
    Thu Feb 10 06:23:35 HKT 2005
    total 9
    dr-xr-xr-x   2 root guest 2048 Dec 25 02:57 ./
    drwxr-xr-x  25 root root  4096 Feb  5 19:12 ../
    lrw-r--r--   1 root guest    9 Feb 10 03:29 delta.mydomain.com -> #@delta.mydomain.com
    Thu Feb 10 06:24:26 HKT 2005

Regards,
Alan

Received on 2005-02-09 17:32:55
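
The two `date` calls in footnote [2] bracket an `ls -l` that took about 51 seconds (06:23:35 to 06:24:26). A rough way to make that kind of stall easier to spot is to run the command under a time limit and record how long it took. The following is only a minimal sketch in Python, not anything that ships with Coda; the helper name `timed_command` and the timeout values are assumptions made for illustration.

```python
import subprocess
import time


def timed_command(argv, timeout_s=10.0):
    """Run argv and return (elapsed_seconds, stdout).

    stdout is None when the command did not finish within timeout_s.
    Hypothetical helper for illustration only.
    """
    start = time.monotonic()
    try:
        result = subprocess.run(
            argv,
            capture_output=True,  # collect stdout and stderr
            text=True,            # decode output to str
            timeout=timeout_s,    # the child is killed once this expires
        )
        output = result.stdout
    except subprocess.TimeoutExpired:
        # The command was still running at the deadline.
        output = None
    return time.monotonic() - start, output
```

For example, `timed_command(["ls", "-l", "/coda"], timeout_s=60.0)` returns the elapsed time plus either the listing or `None` if the listing had not finished after a minute. This only shows that a command stalled, not what it was waiting for, and a process blocked inside the kernel may not respond to the kill right away, in which case the call itself can take longer than the timeout.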