Coda File System

Re: codasrv crash on netbsd/sparc64 3.0

From: Sean Caron <caron.sean_at_gmail.com>
Date: Tue, 25 Apr 2006 10:43:49 -0400
Hi Greg,

Please see embedded comments.

On 4/25/06, Greg Troxel <gdt_at_ir.bbn.com> wrote:
>
> Hmm.  I would think, given that you're running a 32-bit kernel
> (presumably you are running the SUN4U kernel from NetBSD/sparc and
> NetBSD/sparc userland) that you essentially see the same behavior as
> on NetBSD/sparc.  I've run venus on that platform, but not codasrv.



Yes, this is the NetBSD/sparc 3.0 userland sets, plus the GENERIC_SUN4U
kernel from NetBSD/sparc, which I patched to fix DEFPA FDDI on SPARC and
rebuilt.


> I would guess you installed most dependencies from pkgsrc, and then
> compiled lwp/rpc2/rvm/coda.  Is that right?



I actually installed everything manually -- I downloaded source packages
of GNU make 3.80, autoconf 2.59, automake 1.9.6, libtool 1.5.8,
readline 5.1, m4 1.4. I did the usual ./configure; make; make install
routine for well-behaved programs. Everything built fine with no errors,
and I've used the tools to build other software packages, e.g. SSH, BIND,
and Apache.

I then compiled lwp 2.1, rvm 1.11, and rpc2 1.28 from source as well,
using the same ./configure; make; make install methodology -- no
show-stopping errors, no weird flags passed to configure.


> It could be that your problem is not because of using sparc.



I hope so! I run into some weird snags with my SPARC machines that x86
users probably don't have to deal with, so for it not to be an
architecture-specific issue for once would be refreshing. :)

> In gdb, after attaching, do "bt" to get a stack backtrace.  Then do
> "up" to move to where the signal was, and there "i frame" and "list".



Done! Here's the data:

(gdb) bt
#0  0x403bc3a0 in sleep () from /usr/local/lib/libc.so.12
#1  0x0008f2a8 in coda_assert (
    pred=0xf9e <Error reading address 0xf9e: Invalid argument>,
    file=0x94518 "srv.cc", line=302) at coda_assert.c:46
#2  0x00013c64 in zombie(int) (sig=3998) at srv.cc:302
#3  <signal handler called>

(gdb) up
#1  0x0008f2a8 in coda_assert (
    pred=0xf9e <Error reading address 0xf9e: Invalid argument>,
    file=0x94518 "srv.cc", line=302) at coda_assert.c:46
46                   sleep(1);
Current language:  auto; currently c

(gdb) i frame
Stack level 1, frame at 0xffffc250:
 pc = 0x8f2a8 in coda_assert (coda_assert.c:46); saved pc 0x13c64
 called by frame at 0xffffc2f8, caller of frame at 0xffffc1e8
 source language c.
 Arglist at 0xffffc250, args:
    pred=0xf9e <Error reading address 0xf9e: Invalid argument>,
    file=0x94518 "srv.cc", line=302
 Locals at 0xffffc250, Previous frame's sp in sp

(gdb) list
41          case CODA_ASSERT_SLEEP:
42              fprintf(stderr, "Sleeping forever.  You may use gdb to attach to process %d.",
43                      (int)getpid());
44              fflush(stderr);
45              for (;;)
46                   sleep(1);
47              break;
48
49          case CODA_ASSERT_EXIT:
50              fprintf(stderr, "EXITING! Bye!\n");
(gdb)


> I have made some fixes in NetBSD current for coda kernel support that
> have not yet been applied to 3.0.  But that affects system stability
> with venus, not running codasrv.


This is where I was *really* expecting problems to occur -- on the client
end rather than the server end. I built new kernels with the coda driver
configured in and they boot up okay, but I was expecting finicky
behaviour once I actually started to use the coda client. file-system
CODA is not enabled by default in the /sparc and /sparc64 kernel
configuration files in NetBSD, so I assumed it was perhaps "a little
sketchy" and expected some strange behaviour.

Unfortunately I wasn't even able to get that far, since I have no servers
up and running to test against! Arg... If I can manage to get codasrv
starting up on my servers, I'll keep those client patches in the back of
my mind :)


> I use the following for building from CVS.  I use gmake, but I don't
> remember why make didn't work.  That's probably not your problem.
>
> #!/bin/sh
> for i in lwp rpc2 rvm coda; do
>   echo "BUILDING $i" && \
>   (cd $i && \
>   ./bootstrap.sh && \
>   rm -f config.cache && \
>   LDFLAGS="-L/usr/pkg/lib -R/usr/pkg/lib" CPPFLAGS="-I/usr/pkg/include" ./configure --prefix=/usr/local/coda && \
>   gmake -k clean &&
>   gmake -k &&
>   gmake install)
> done
> (cd coda && gmake -k client-install server-install)
> exit 0



I never ran into any make errors using gmake. I think using BSD make
instead gives "missing separator" errors or some other syntax-y thing.

I just ran ln -s /usr/lib /usr/local/lib; ln -s /usr/include /usr/local/include
to put any silliness about lost libs or includes to rest. Besides, I
didn't get any of the usual errors associated with bad libs or headers,
such as compilation errors or a successful build followed by undefined
symbols at run time...

> --
>         Greg Troxel <gdt_at_ir.bbn.com>
>

Thanks for the response thus far.

Regards, Sean
scaron_at_umich.edu
Received on 2006-04-25 10:45:46