Coda File System

"Connection timed out" and "mount system call failed"

From: Jeremy Malcolm <Jeremy_at_Malcolm.wattle.id.au>
Date: Sun, 19 Aug 2001 21:18:08 +0800
I have been trying for two solid days (plus on and off before that) to
get Coda to work, without success, even though I have read the
documentation and am using the latest version.  I have checked the
mailing list archives for similar problems, but none of the responses
there shed any light.

I am simply trying to set up replicated volumes, but either Venus won't
start at all or, when it does, anything I try to do with the mounted
volumes fails with "Connection timed out" errors.  cmon shows both
servers running fine when the errors occur.  On the rare occasions when
I can get Venus to run at all, this is what happens:

[root@servalan /coda]# ls
documents  precedents  programs  templates
[root@servalan /coda]# cfs cs
Contacting servers .....
All servers up
[root@servalan /coda]# cfs lv documents
  Status of volume 0x7f000001 (2130706433) named "ilaw"
  Volume type is ReadWrite
  Connection State is Connected
  Minimum quota is 0, maximum quota is unlimited
  Current blocks used are 6
  The partition has 3879868 blocks available out of 3879944
  Write-back is disabled
[root@servalan /coda]# date
Sun Aug 19 20:08:58 WST 2001
[root@servalan /coda]# ls documents
ls: documents: Connection timed out
[root@servalan /coda]# cfs lv documents
documents: Connection timed out
[root@servalan /coda]# ls
ls: documents: Input/output error
precedents  programs  templates
[root@servalan /coda]# date
Sun Aug 19 20:09:38 WST 2001
[root@servalan /coda]# ls
ls: documents: Input/output error
precedents  programs  templates
[root@servalan /coda]# date
Sun Aug 19 20:10:25 WST 2001
[root@servalan /coda]# ls
documents  precedents  programs  templates
[root@servalan /coda]# date
Sun Aug 19 20:11:10 WST 2001

[root@servalan /]# umount /coda
umount: /coda: device is busy
[root@servalan /]# umount /coda
umount: /coda: device is busy
[root@servalan /]# umount /coda
[root@servalan /]# ls /coda
NOT_REALLY_CODA
[root@servalan /]# /etc/rc.d/init.d/auth2.init start
Starting auth2: /usr/sbin/auth2 done.
[root@servalan /]# /etc/rc.d/init.d/update.init start
Starting coda update servers: rpc2portmap updatesrv updateclnt done.
[root@servalan /]# /etc/rc.d/init.d/codasrv.init start
Starting codasrv: codasrv.
[root@servalan /]# /etc/rc.d/init.d/venus.init start
Starting venus: done.
[root@servalan /]#
Date: Sun 08/19/2001

20:54:25 /usr/coda/LOG size is 5193216 bytes
20:54:25 /usr/coda/DATA size is 20767868 bytes
20:54:25 Loading RVM data
20:54:27 Last init was Sun Aug 19 17:20:19 2001
20:54:27 Last shutdown was clean
20:54:27 starting VDB scan
20:54:27        5 volume replicas
20:54:27        5 replicated volumes
20:54:27        0 CML entries allocated
20:54:27        0 CML entries on free-list
20:54:27 starting FSDB scan (8333, 200000) (25, 75, 4)
20:54:27        4 cache files in table (0 blocks)
20:54:27        8329 cache files on free-list
20:54:27 starting HDB scan
20:54:27        0 hdb entries in table
20:54:27        0 hdb entries on free-list
20:54:27 Getting Root Volume information...
20:54:27 Venus starting...
20:54:27 CHILD: mount system call failed. Killing parent.
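
Since the session above needed three umount attempts before /coda was
released, one thing worth ruling out before starting Venus is a stale
mount.  The following is a minimal sketch, assuming a Linux system that
exposes /proc/mounts; the helper names are hypothetical and not part of
Coda.  It checks only this one possible cause and does not by itself
explain the "mount system call failed" message.

```python
import os


def mounted_filesystems(mounts_text):
    """Return {mount_point: fs_type} parsed from /proc/mounts-style text."""
    result = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            # Fields are: device, mount point, filesystem type, options, ...
            result[fields[1]] = fields[2]
    return result


def is_mounted(mount_point, mounts_text):
    """True if mount_point appears as a mount point in mounts_text."""
    return mount_point in mounted_filesystems(mounts_text)


if __name__ == "__main__":
    # Assumption: /proc/mounts exists (Linux); skip quietly otherwise.
    if os.path.exists("/proc/mounts"):
        with open("/proc/mounts") as f:
            text = f.read()
        if is_mounted("/coda", text):
            print("/coda is still mounted; unmount it before starting Venus")
        else:
            print("/coda is not mounted")
```

Running this between the last umount and the venus.init start shows
whether anything is still mounted on /coda at that moment.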

I can see no reason why Venus should not be able to mount /coda.  The
logs show nothing apart from the same error as above.  Earlier, between
20:08:58 and 20:11:10 (the session shown above), the Venus log shows
just this:

[ H(06) : 0030 : 20:09:05 ] HDBDaemon just woke up

[ H(06) : 0031 : 20:09:05 ] HDBDaemon about to sleep on hdbdaemon_sync

[ W(33) : 0000 : 20:09:15 ] *** Long Running (Multi)Fetch: code = -2001, elapsed = 15032.0 ***

[ T(01) : 1092 : 20:09:36 ] BeginRvmFlush (1, 38956, T)
[ T(01) : 1092 : 20:09:37 ] EndRvmFlush

[ T(01) : 1101 : 20:10:52 ] BeginRvmFlush (1, 33388, T)
[ T(01) : 1101 : 20:10:52 ] EndRvmFlush

[ V(04) : 1124 : 20:11:08 ] WAITING(SRVRQ):
[ V(04) : 1124 : 20:11:08 ] WAIT OVER, elapsed = 2.1
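
Pulling out the Venus log entries for a given time window, as done by
hand above, can be sketched as follows.  The line format is inferred
only from the excerpt above; the function names, and treating the first
bracketed field as a thread label, are assumptions.  Lines that do not
start with a bracketed header (blank lines, or the wrapped tail of a
long message) are skipped.

```python
import re
from datetime import time

# Matches lines such as "[ W(33) : 0000 : 20:09:15 ] message".
# Layout inferred from the log excerpt; field meanings are assumptions.
LINE_RE = re.compile(
    r"^\[\s*(\w)\((\d+)\)\s*:\s*(\d+)\s*:\s*"
    r"(\d{2}):(\d{2}):(\d{2})\s*\]\s?(.*)$"
)


def parse_line(line):
    """Return (timestamp, thread_label, message), or None if no header."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    letter, ident, _field, hh, mm, ss, msg = m.groups()
    return time(int(hh), int(mm), int(ss)), f"{letter}({ident})", msg


def entries_between(lines, start, end):
    """Yield parsed entries whose timestamp t satisfies start <= t <= end."""
    for line in lines:
        parsed = parse_line(line.rstrip("\n"))
        if parsed is not None and start <= parsed[0] <= end:
            yield parsed


if __name__ == "__main__":
    sample = [
        "[ H(06) : 0030 : 20:09:05 ] HDBDaemon just woke up",
        "[ T(01) : 1092 : 20:09:36 ] BeginRvmFlush (1, 38956, T)",
        "[ V(04) : 1124 : 20:11:08 ] WAIT OVER, elapsed = 2.1",
    ]
    for stamp, thread, msg in entries_between(sample, time(20, 8, 58), time(20, 11, 10)):
        print(stamp, thread, msg)
```

Timestamps in the log carry no date, so this sketch only makes sense
for a window that falls within a single day.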

The Vice log shows nothing until I shut down the server, at which point
a whole lot of buffered output is written to the log file all at once.
Even then, there is nothing particularly relevant for that period,
only:

20:08:09 RevokeWBPermit on conn 496a850 returned 0
20:11:02 Building callback conn.
20:11:02 RevokeWBPermit on conn 496a850 returned 0
20:11:08 RevokeWBPermit on conn 496a850 returned 0
20:11:08 RevokeWBPermit on conn 496a850 returned 0
20:11:08 RevokeWBPermit on conn 496a850 returned 0
20:11:08 RevokeWBPermit on conn 496a850 returned 0

Any tips?  Thanks very much in advance, as I am pulling my hair out
here.

--
JEREMY MALCOLM <Jeremy@Malcolm.wattle.id.au> http://malcolm.wattle.id.au
Providing online networks of Australian lawyers (http://www.ilaw.com.au)
and Linux experts (http://www.linuxconsultants.com.au) for instant help!
Disclaimer: http://www.terminus.net.au/disclaimer.html. GPG key: finger.
Received on 2001-08-19 09:18:52