Coda File System

Coda Stress Test

From: Casey Helfrich <helfrich_at_intel-research.net>
Date: Thu, 18 Apr 2002 14:48:43 -0400
Hi Jan, 

Mike and I have been stress testing our Coda servers, and we think we
have a reproducible error (it has happened three times in a row).

We have set up a singly replicated server that is normally very stable.
Mike wrote a script that generates 4GB of data (all zeros right now),
split up into 125KB files.  The script then tries to dump it all into
Coda.  Right now I have a 6GB hard limit on the /vicepa partition,
and cmon shows that the server had used only 57% of the available space
before it crashed.
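
For anyone who wants to try a similar load, here is a minimal sketch of
such a generator. It is not Mike's actual script, which is not included
in this message. The function names, the file naming scheme, and the
reading of "125KB" as 125 * 1024 bytes are all assumptions, and the
target directory would have to be pointed at a mounted Coda volume.

```python
import os
import tempfile


def expected_file_count(total_bytes, file_bytes):
    """Number of files needed to hold total_bytes (ceiling division)."""
    return -(-total_bytes // file_bytes)


def write_zero_files(target_dir, total_bytes, file_bytes):
    """Write zero-filled files of file_bytes each until total_bytes
    have been written. The last file may be shorter. Returns the
    number of files created. File names are hypothetical."""
    os.makedirs(target_dir, exist_ok=True)
    chunk = bytes(file_bytes)  # a buffer of file_bytes zero bytes
    written = 0
    count = 0
    while written < total_bytes:
        size = min(file_bytes, total_bytes - written)
        path = os.path.join(target_dir, "zero_%06d.dat" % count)
        with open(path, "wb") as f:
            f.write(chunk[:size])
        written += size
        count += 1
    return count


if __name__ == "__main__":
    # Small dry run in a temporary directory; for a real test, replace
    # the directory with a path inside the Coda volume and use
    # 4 * 1024**3 and 125 * 1024 as the sizes (both are assumptions).
    with tempfile.TemporaryDirectory() as d:
        print(write_zero_files(d, 10 * 1024, 4 * 1024))
```

Under those assumed sizes, a full run creates a little over 33,000
files, so the load is as much about the number of create operations as
about the volume of data.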

I went down and checked out the server and saw that the codasrv process
had vanished.  Upon examining the SrvLog I found that the server
"crashed" around 1:07PM.  

Below is what I hope will be some useful information to you.  As you can
see, SrvErr is showing a failed assertion in rvmlib.c, and the file
containing this message was created at 1:07PM, as you can see from the
ls -la. So I'm pretty sure that this error is what caused the codasrv
process to disappear.  
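
If this needs to be checked repeatedly across test runs, a small helper
along these lines could pull the assertion details out of a SrvErr file.
This is only a sketch: the regular expression is based solely on the one
message quoted below, and the function name is hypothetical.

```python
import re

# Pattern modelled on the single SrvErr message quoted in this mail;
# other assertion formats may exist and would not match.
ASSERT_RE = re.compile(
    r'Assertion failed: (?P<expr>.*?), file\s+"(?P<file>[^"]+)", '
    r'line (?P<line>\d+)'
)


def parse_assertion(text):
    """Return (expression, source_file, line_number) for the first
    assertion failure found in text, or None if there is none."""
    m = ASSERT_RE.search(text)
    if m is None:
        return None
    return m.group("expr"), m.group("file"), int(m.group("line"))


if __name__ == "__main__":
    # Path taken from the listing in this mail.
    with open("/vice/srv/SrvErr.prev") as f:
        print(parse_assertion(f.read()))
```

The `\s+` after "file" lets the pattern match whether or not the message
has been wrapped across two lines.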

If I just run "startserver &", the server starts again and the clients
reconnect.  Is it possible that Mike's script is filling the RVM so
quickly that the server has trouble doing an rvm_realloc() or some
equivalent?

Let me know if there is any other information I can collect.

-Casey

-----Original Message-----
From: root [mailto:root_at_avalon.XXXXXXXXXXX.net] 

[root_at_avalon root]# cat /vice/srv/SrvErr.prev
RVMLIB_ASSERT: error in rvmlib_malloc

Assertion failed: 0, file "/usr/src/redhat/BUILD/coda-5.3.19/coda-src/util/rvmlib.c", line 211
EXITING! Bye!
[root_at_avalon root]# ls -la /vice/srv/
total 8456
drwxr-xr-x    2 root     root         4096 Apr 18 14:18 .
drwxr-xr-x    9 root     root         4096 Apr 18 14:21 ..
-rw-r--r--    1 root     root            4 Apr 18 14:18 pid
-rw-r--r--    1 root     root            0 Apr 18 14:18 SrvErr
-rw-r--r--    1 root     root          148 Apr 18 13:07 SrvErr.prev
-rw-r--r--    1 root     root         3599 Apr 18 14:21 SrvLog
-rw-r--r--    1 root     root      4055622 Apr 18 14:18 SrvLog-1
-rw-r--r--    1 root     root         3291 Apr 18 12:07 SrvLog-2
-rw-r--r--    1 root     root         3782 Apr 18 11:16 SrvLog-3
-rw-r--r--    1 root     root          340 Apr 18 11:06 SrvLog-4
-rw-r--r--    1 root     root         3045 Apr 18 11:03 SrvLog-5
-rw-r--r--    1 root     root         4114 Apr 18 11:01 SrvLog-6
-rw-r--r--    1 root     root      4507868 Apr 18 10:41 SrvLog-7
[root_at_avalon root]#
Received on 2002-04-18 14:59:55