On Tue, 15 May 2001, Jan Harkes wrote:

> On Mon, May 14, 2001 at 10:18:31PM -0500, Ryan M. Lefever wrote:
> > Hi,
> >
> > I am trying to fix some RPC2 problems that I have when using volutil.
> >
> > When I do a "volutil setdebug", the following happens no matter whether I
> > do it locally or remotely, or to the SCM or a non-SCM. Also, a
> > /vice/srv/CRASH file is created.
> >
> > --
> > [root_at_nsx srv]# startserver -d 1000
> > [root_at_nsx srv]# volutil setdebug 100
> > V_BindToServer: binding to host nsx.crhc.uiuc.edu
> > VolSetDebug failed with RPC2_DEAD (F)
> ...
> > --
> >
> > The SrvErr file reads:
> >
> > --
> > could not open key 2 file: No such file or directory
> > Assertion failed: 0, file "srv.cc", line 336
> > EXITING! Bye!
> > --
>
> This is a generic assertion point where we always end up when a SIGSEGV
> is received. If you create the file /vice/srv/ZOMBIFY, the server should
> end up in an infinite loop at this point. Then you can easily attach gdb
> and get a stacktrace.
>
> # gdb /usr/sbin/codasrv `pidof codasrv`
> (gdb) bt
>
> The trace will be a bit funny, because the actual point where the
> segfault was triggered won't show up. The stack is clobbered by the
> signal handler. However, the function that called the function that
> crashed will show up and from the line number it is possible to figure
> out at least which function had a problem.
>
> It will probably be something like,
>
> #1 coda_assert function where we are waiting
> #2 sigsegv handler
> #3 ???
> #4 function before the segv was received.
>    x/x/volutil/vol_setdebug.cc:666
>

I tried this method and got the following:

--
(gdb) bt
#0 0x40184c61 in __libc_nanosleep () from /lib/libc.so.6
#1 0x40184bed in __sleep (seconds=1)
   at ../sysdeps/unix/sysv/linux/sleep.c:82
#2 0x80c34a7 in coda_assert (pred=0x80c48e7 "0", file=0x80c48e0 "srv.cc",
   line=336) at coda_assert.c:45
#3 0x804be04 in zombie (sig=11) at srv.cc:336
#4 0x40111c68 in __restore ()
   at ../sysdeps/unix/sysv/linux/i386/sigaction.c:127
#5 0x40138986 in _IO_vfprintf (s=0x401e1ce0,
   format=0x4005bed5 "[%s]%s: \"%s\", line %d: ", ap=0x151a0f00)
   at vfprintf.c:1029
#6 0x40141047 in fprintf (stream=0x401e1ce0,
   format=0x4005bed5 "[%s]%s: \"%s\", line %d: ") at fprintf.c:32
#7 0x40047200 in RPC2_SendResponse (ConnHandle=505527757, Reply=0x8165ad0)
   at rpc2a.c:154
#8 0x8084958 in volUtil_ExecuteRequest (_cid=505527757, _reqbuffer=0x0,
   _bd=0x0) at volutil.server.c:1808
#9 0x8065ccc in VolUtilLWP (myindex=0xbffff8d0) at volutil.cc:135
#10 0x400829be in Create_Process_Part2 () at lwp.c:795
--

> The other (and perhaps easier) way to debug this is by running codasrv
> under the control of gdb at the time the segfault happens. That way the
> stacktrace shows up a lot nicer.
>
> # gdb /usr/sbin/codasrv `pidof codasrv`
> (gdb) continue
> /* trigger the volutil setdebug crash */
> SEGV received
> (gdb) bt
> #1 culprit function
>    file.cc:line

I tried this method, and the backtrace gave the following:

--
Program received signal SIGSEGV, Segmentation fault.
0x401e1d88 in main_arena () from /lib/libc.so.6
(gdb) bt
#0 0x401e1d88 in main_arena () from /lib/libc.so.6
#1 0x3f3e002b in ?? ()
#2 0x40138986 in _IO_vfprintf (s=0x401e1ce0,
   format=0x4005bed5 "[%s]%s: \"%s\", line %d: ", ap=0x151a0f00)
   at vfprintf.c:1029
#3 0x40141047 in fprintf (stream=0x401e1ce0,
   format=0x4005bed5 "[%s]%s: \"%s\", line %d: ") at fprintf.c:32
#4 0x40047200 in RPC2_SendResponse (ConnHandle=1052104457, Reply=0x8165ad0)
   at rpc2a.c:154
#5 0x8084958 in volUtil_ExecuteRequest (_cid=1052104457, _reqbuffer=0x0,
   _bd=0x0) at volutil.server.c:1808
#6 0x8065ccc in VolUtilLWP (myindex=0xbffff8d0) at volutil.cc:135
#7 0x400829be in Create_Process_Part2 () at lwp.c:795
--

Since I didn't write any of the Coda code, it's kind of hard for me to
debug. Jan, does this help you any?

Thanks,
Ryan

Received on 2001-05-15 20:13:06
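
The first backtrace above mixes frames that belong to the signal handling (coda_assert, zombie, __restore) with the call chain that was running when the SIGSEGV arrived. As a rough reading aid, here is a minimal Python sketch that parses `bt` text and lists only the frames below the handler. The names `Frame`, `parse_backtrace`, `frames_below_handler` and `SAMPLE` are made up for this example and are not part of Coda or gdb. The handler name `zombie` is taken from the first trace, and the parser assumes every frame starts at the beginning of a line, as gdb prints it.

```python
import re
from typing import List, NamedTuple, Optional


class Frame(NamedTuple):
    num: int
    func: str
    file: Optional[str]   # None when gdb only reports "from <library>"
    line: Optional[int]


# Start of a frame, e.g. "#3 0x804be04 in zombie (sig=11) ...".
# The address part is optional because gdb omits it for some frames.
FRAME_START = re.compile(r"(?m)^#(\d+)\s+(?:0x[0-9a-fA-F]+\s+in\s+)?(\S+)\s*\(")
# Trailing source location, e.g. "... at srv.cc:336".
LOCATION = re.compile(r"\sat\s+(\S+):(\d+)\s*$")

# First backtrace from this message (hypothetical variable name).
SAMPLE = r"""
#0 0x40184c61 in __libc_nanosleep () from /lib/libc.so.6
#1 0x40184bed in __sleep (seconds=1)
   at ../sysdeps/unix/sysv/linux/sleep.c:82
#2 0x80c34a7 in coda_assert (pred=0x80c48e7 "0", file=0x80c48e0 "srv.cc",
   line=336) at coda_assert.c:45
#3 0x804be04 in zombie (sig=11) at srv.cc:336
#4 0x40111c68 in __restore ()
   at ../sysdeps/unix/sysv/linux/i386/sigaction.c:127
#5 0x40138986 in _IO_vfprintf (s=0x401e1ce0,
   format=0x4005bed5 "[%s]%s: \"%s\", line %d: ", ap=0x151a0f00)
   at vfprintf.c:1029
#6 0x40141047 in fprintf (stream=0x401e1ce0,
   format=0x4005bed5 "[%s]%s: \"%s\", line %d: ") at fprintf.c:32
#7 0x40047200 in RPC2_SendResponse (ConnHandle=505527757, Reply=0x8165ad0)
   at rpc2a.c:154
#8 0x8084958 in volUtil_ExecuteRequest (_cid=505527757, _reqbuffer=0x0,
   _bd=0x0) at volutil.server.c:1808
#9 0x8065ccc in VolUtilLWP (myindex=0xbffff8d0) at volutil.cc:135
#10 0x400829be in Create_Process_Part2 () at lwp.c:795
"""


def parse_backtrace(text: str) -> List[Frame]:
    """Turn the text printed by gdb's `bt` into a list of Frame tuples."""
    starts = list(FRAME_START.finditer(text))
    frames = []
    for i, m in enumerate(starts):
        end = starts[i + 1].start() if i + 1 < len(starts) else len(text)
        # gdb wraps long frames; join the continuation lines back together.
        chunk = " ".join(text[m.start():end].split())
        loc = LOCATION.search(chunk)
        frames.append(Frame(
            num=int(m.group(1)),
            func=m.group(2),
            file=loc.group(1) if loc else None,
            line=int(loc.group(2)) if loc else None,
        ))
    return frames


def frames_below_handler(frames: List[Frame], handler: str = "zombie") -> List[Frame]:
    """Return the frames that were already on the stack when `handler` ran."""
    for i, frame in enumerate(frames):
        if frame.func == handler:
            return frames[i + 1:]
    return []


if __name__ == "__main__":
    for f in frames_below_handler(parse_backtrace(SAMPLE)):
        where = f"{f.file}:{f.line}" if f.file else "(no source location)"
        print(f"#{f.num} {f.func} {where}")
```

Applied to the first trace, the frames after `zombie` run through `__restore`, `_IO_vfprintf` and `fprintf` to `RPC2_SendResponse` at rpc2a.c:154, which is the same chain the second method shows directly.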