Hi Jan,

Mike and I have been stress testing our Coda servers, and we think we have a reproducible error (it has happened 3 times in a row). We have set up a singly replicated server that is normally very stable. Mike wrote a script that generates 4GB of data (all zeros right now), split up into 125 KB files. The script then tries to dump it all into Coda. Right now I have a 6GB hard limit on the /vicepa partition, and cmon shows that the server had used only 57% of the available space before it crashed.

I went down and checked out the server and saw that the codasrv process had vanished. Upon examining the SrvLog I found that the server "crashed" around 1:07PM.

Below is what I hope will be some useful information to you. As you can see, SrvErr is showing a failed assertion in rvmlib.c, and the file containing this message was created at 1:07PM, as you can see from the ls -la. So I'm pretty sure that this error is what caused the codasrv process to disappear. If I just run:

    startserver &

the server starts again and the clients reconnect.

Is it possible that he is managing to fill the RVM so quickly that it had trouble doing an rvm_realloc() or some equivalent?

Let me know if there is any other information I can collect.

-Casey

-----Original Message-----
From: root [mailto:root_at_avalon.XXXXXXXXXXX.net]

[root_at_avalon root]# cat /vice/srv/SrvErr.prev
RVMLIB_ASSERT: error in rvmlib_malloc
Assertion failed: 0, file "/usr/src/redhat/BUILD/coda-5.3.19/coda-src/util/rvmlib.c", line 211
EXITING! Bye!
[root_at_avalon root]# ls -la /vice/srv/
total 8456
drwxr-xr-x 2 root root    4096 Apr 18 14:18 .
drwxr-xr-x 9 root root    4096 Apr 18 14:21 ..
-rw-r--r-- 1 root root       4 Apr 18 14:18 pid
-rw-r--r-- 1 root root       0 Apr 18 14:18 SrvErr
-rw-r--r-- 1 root root     148 Apr 18 13:07 SrvErr.prev
-rw-r--r-- 1 root root    3599 Apr 18 14:21 SrvLog
-rw-r--r-- 1 root root 4055622 Apr 18 14:18 SrvLog-1
-rw-r--r-- 1 root root    3291 Apr 18 12:07 SrvLog-2
-rw-r--r-- 1 root root    3782 Apr 18 11:16 SrvLog-3
-rw-r--r-- 1 root root     340 Apr 18 11:06 SrvLog-4
-rw-r--r-- 1 root root    3045 Apr 18 11:03 SrvLog-5
-rw-r--r-- 1 root root    4114 Apr 18 11:01 SrvLog-6
-rw-r--r-- 1 root root 4507868 Apr 18 10:41 SrvLog-7
[root_at_avalon root]#

Received on 2002-04-18 14:59:55
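For anyone who wants to reproduce the load described in the message, here is a rough sketch of a generator of that kind. It is not Mike's actual script, which was not posted. The function name `write_zero_files`, the `zero_NNNNNN.dat` file naming, and the reading of "125 KB" as 125 * 1024 bytes are all assumptions made for illustration. The demo at the bottom writes only ten files into a temporary directory; pointing it at a directory inside a Coda volume with a 4GB total would be the caller's choice.

```python
import os
import tempfile


def write_zero_files(target_dir, total_bytes, file_bytes):
    """Fill target_dir with zero-filled files of file_bytes each.

    Writes total_bytes // file_bytes files (any remainder is dropped)
    and returns the number of files written. All names here are
    hypothetical; this only approximates the test described above.
    """
    os.makedirs(target_dir, exist_ok=True)
    chunk = bytes(file_bytes)  # a buffer of file_bytes zero bytes
    count = total_bytes // file_bytes
    for i in range(count):
        path = os.path.join(target_dir, f"zero_{i:06d}.dat")
        with open(path, "wb") as f:
            f.write(chunk)
    return count


if __name__ == "__main__":
    # Small demo: ten files of 125 KB in a throwaway directory.
    file_size = 125 * 1024  # assumption: "125 KB" means 125 * 1024 bytes
    with tempfile.TemporaryDirectory() as demo_dir:
        written = write_zero_files(demo_dir, 10 * file_size, file_size)
        print(written, "files written to", demo_dir)
```

Writing many small files rather than one large one matters here, since each new file adds metadata on the server side; that is presumably why the failure shows up in rvmlib_malloc well before /vicepa fills.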