Coda File System

Re: Venus Segfault

From: Jan Harkes <jaharkes_at_cs.cmu.edu>
Date: Wed, 9 Feb 2005 16:23:00 -0500

On Thu, Feb 10, 2005 at 04:01:20AM +0800, Alan Tam wrote:
> root_at_delta:/var/coda/db# gdb
> Reading symbols from /usr/sbin/venus...(no debugging symbols found)...done.

And I thought I had disabled stripping of the binaries. Clearly not.

> (gdb) bt
> #0  0x40146b68 in sigsuspend () from /lib/tls/libc.so.6
> #1  0x080adddb in ?? ()
> #2  0xbffff5dc in ?? ()
> #3  0x00001a2d in ?? ()
> #4  0x00001a2d in ?? ()

This backtrace looks pretty useless. Most of the addresses shown are not
in any executable code segment, and the ones that look valid don't seem
to map to anything useful (I tried to use addr2line on a non-stripped
version of the binary).

It looks like the list of volumes is somehow corrupt.

> 03:55:48 starting VDB scan
> 03:55:48 Fatal Signal (11); pid 6701 becoming a zombie...
> 03:55:48 You may use gdb to attach to 6701

Between 'starting VDB scan' and the next message, 'N volume replicas', there
is only a little bit of code. We iterate over the list of all known volumes
and reset the non-persistent data. The only way that this could crash is
if the linked list is somehow messed up. I don't know how your client
got into that state since RVM should guarantee that any updates to this
list are either atomically committed or aborted.
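
To make the failure mode concrete, here is a rough sketch of that kind of
loop. This is not the actual Venus code: the Volume struct, its fields, and
the function names are made up for illustration. It only shows why a bad
link in the list is about the only thing that can go wrong in such a scan.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a cached volume entry; NOT the real Venus type.
struct Volume {
    unsigned long id;   // persistent: in the real client this would live in RVM
    bool online;        // non-persistent: rebuilt at every startup
    int active_users;   // non-persistent
    Volume *next;       // link to the next known volume, nullptr at the end
};

// Walk the list, reset the non-persistent fields of every entry, and
// return the number of entries visited (the 'N' in 'N volume replicas').
std::size_t reset_transient_state(Volume *head) {
    std::size_t count = 0;
    for (Volume *v = head; v != nullptr; v = v->next) {
        // If a 'next' pointer was corrupted, v is garbage here and these
        // writes are where a SIGSEGV would be raised.
        v->online = false;
        v->active_users = 0;
        ++count;
    }
    return count;
}

// Small self-check on a well-formed three-entry list.
std::size_t self_check() {
    Volume c{3, true, 2, nullptr};
    Volume b{2, true, 1, &c};
    Volume a{1, true, 5, &b};
    std::size_t n = reset_transient_state(&a);
    assert(n == 3 && !a.online && !b.online && c.active_users == 0);
    return n;
}
```

With a well-formed list the loop cannot fault: it only touches entries
reachable through valid links and stops at the terminating null pointer.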

So I have a pretty good idea where it crashed, but no idea how it
managed to crash there.

Jan
Received on 2005-02-09 16:24:09