On Thu, Feb 10, 2005 at 04:01:20AM +0800, Alan Tam wrote:
> root_at_delta:/var/coda/db# gdb
> Reading symbols from /usr/sbin/venus...(no debugging symbols found)...done.

And I thought I had disabled stripping of the binaries; clearly not.

> (gdb) bt
> #0 0x40146b68 in sigsuspend () from /lib/tls/libc.so.6
> #1 0x080adddb in ?? ()
> #2 0xbffff5dc in ?? ()
> #3 0x00001a2d in ?? ()
> #4 0x00001a2d in ?? ()

This backtrace looks pretty useless. Most of the addresses shown are not in any executable code segment, and the ones that look valid don't seem to map to anything useful (I tried to use addr2line on a non-stripped version of the binary).

It looks like the list of volumes is somehow corrupt.

> 03:55:48 starting VDB scan
> 03:55:48 Fatal Signal (11); pid 6701 becoming a zombie...
> 03:55:48 You may use gdb to attach to 6701

Between 'starting VDB scan' and the next message, 'N volume replicas', there is only a little bit of code. We iterate over the list of all known volumes and reset the non-persistent data. The only way this could crash is if the linked list is somehow messed up.

I don't know how your client got into that state, since RVM should guarantee that any updates to this list are either atomically committed or aborted.

So I have a pretty good idea where it crashed, but no idea how it managed to crash there.

Jan

Received on 2005-02-09 16:24:09