Coda File System

Venus dies on startup on RH6.2 client, but runs on server

From: Dr. Douglas C. MacKenzie <doug_at_mobile-intelligence.com>
Date: Wed, 14 Jun 2000 09:38:28 -0400
The port mapper seems to be running.  Actually, two are.

[doug_at_Oak ~]$ ps auxw | grep port
bin        326  0.0  0.0  1212    0 ?        SW   13:15   0:00 [portmap]
root       548  0.0  0.4  2048  304 ?        S    13:15   0:00 sendmail:
accepting connections on port 25
xfs        615  0.0  1.8  2768 1156 ?        S    13:15   0:01 xfs -droppriv
-daemon -port -1
root       658  0.0  0.1  1380  104 ?        S    13:15   0:00 [rpc2portmap]
doug      1349  0.0  0.6  1172  404 pts/1    D    20:30   0:00 grep port
[doug_at_Oak ~]$

I've attached as much log data as I could find.
After the failed connect, the server seems to still be alive.
I can log into venus running on the server and edit files in the coda
directory.

    Doug

Ashwin Krishna Patil wrote:

> Hey, what did /vice/srv/SrvLog say?  Sometimes the server quietly dies but
> the process is still visible.  Definitely check out the /vice/srv/SrvLog
> and /vice/srv/SrvErr on the server side AFTER your failed connect.  Also,
> make sure that the portmapper is running on the server.
>
> --------------------------------------------------
> Ashwin Patil
> ph. (734) 213-2003
> email: apatil_at_eecs.umich.edu
> University of Michigan at Ann Arbor
>
> On Tue, 13 Jun 2000, Dr. Douglas C. MacKenzie wrote:
>
> > Thanks for taking an interest.  I haven't gotten much farther.
> > I did run the bldvldb.sh and makevrdb commands to build the volume
> > databases on the server.  That seemed to clear up a couple messages,
> > but I still can't get the client to successfully start on the other
> > machine.
> > The client running on the server is happy and mounts /coda fine.
> > Both are pentium machines with recent RH6.2 clean loads.
> >
> >     Doug MacKenzie
> >  doug_at_mobile-intelligence.com
> >
> >
> > The server log /vice/srv/SrvLog has this info:
> >
> > 3:53:30 File Server started Tue Jun 13 13:53:30 2000
> >
> > 13:54:06 client_GetVenusId: got new host 127.0.0.1:2430
> > 13:54:06 Building callback conn.
> > 13:54:06 No idle WriteBack conns, building new one
> > 13:54:06 Writeback message to 127.0.0.1 port 2430 on conn 7a47eea
> > succeeded
> > 13:54:06 RevokeWBPermit on conn 7a47eea returned 0
> > 13:54:25 client_GetVenusId: got new host 192.168.1.4:2430
> > 13:54:25 Building callback conn.
> > 13:54:25 No idle WriteBack conns, building new one
> > 13:54:25 Writeback message to 192.168.1.4 port 2430 on conn 2f911989
> > succeeded
> > 13:58:12 VRDB created, 1 entries
> > 14:02:15 Callback failed RPC2_DEAD (F) for ws 192.168.1.4, port 2430
> > 14:02:15 Unbinding RPC2 connection 450252675
> > 14:35:17 Building callback conn.
> > 14:35:17 RevokeWBPermit on conn 2f911989 returned -2016
> > 14:35:17 No idle WriteBack conns, building new one
> > 14:35:17 Writeback message to 192.168.1.4 port 2430 on conn 3f0a69e3
> > succeeded
> > 14:42:32 Callback failed RPC2_DEAD (F) for ws 192.168.1.4, port 2430
> > 14:42:32 Unbinding RPC2 connection 40275237
> >
> >
> >
> > The client log /usr/coda/etc/LOG has this info:
> >
> > [ X(00) : 0000 : 13:33:45 ] Coda Venus, version 5.3.7
> > [ X(00) : 0000 : 13:33:45 ] Logfile initialized with LogLevel = 0 at Tue
> > Jun 13
> > 13:33:45
> > 2000
> >
> > [ X(00) : 0000 : 13:33:45 ] E StatsInit()
> > [ X(00) : 0000 : 13:33:45 ] L StatsInit()
> > [ X(00) : 0000 : 13:33:45 ] RecovDataSizes: Log = 0x47f63, Data =
> > 0x11fd8c
> > [ X(00) : 0000 : 13:33:45 ] Recov_AllocateVM: allocated 1000 bytes at
> > 20000000
> > [ X(00) : 0000 : 13:33:45 ] Recov_AllocateVM: allocated 11a000 bytes at
> > 21000000
> > [ X(00) : 0000 : 13:33:45 ] BeginRvmTruncate (1, 4312, I)
> > [ X(00) : 0000 : 13:33:45 ] cache::EndRvmTruncate
> > [ X(00) : 0000 : 13:33:45 ] Recov_DeallocateVM: deallocated 1000 bytes
> > at 20000000
> >
> > [ X(00) : 0000 : 13:33:45 ] Recov_DeallocateVM: deallocated 11a000 bytes
> > at
> > 21000000
> > [ X(00) : 0000 : 13:33:45 ] BeginRvmFlush (1, 1332, F)
> > [ X(00) : 0000 : 13:33:45 ] EndRvmFlush
> > [ X(00) : 0000 : 13:33:45 ] BeginRvmTruncate (2, 1492, F)
> > [ X(00) : 0000 : 13:33:45 ] cache::EndRvmTruncate
> > [ X(00) : 0000 : 13:33:45 ] BeginRvmFlush (1, 5516, F)
> > [ X(00) : 0000 : 13:33:45 ] EndRvmFlush
> > [ X(00) : 0000 : 13:33:45 ] BeginRvmTruncate (3, 5676, F)
> > [ X(00) : 0000 : 13:33:45 ] cache::EndRvmTruncate
> > [ X(00) : 0000 : 13:33:45 ] BeginRvmFlush (1, 223616, F)
> > [ X(00) : 0000 : 13:33:45 ] EndRvmFlush
> > [ X(00) : 0000 : 13:33:45 ] BeginRvmTruncate (420, 223776, F)
> > [ X(00) : 0000 : 13:33:45 ] cache::EndRvmTruncate
> > [ X(00) : 0000 : 13:33:45 ] BeginRvmFlush (1, 16808, F)
> > [ X(00) : 0000 : 13:33:45 ] EndRvmFlush
> > [ X(00) : 0000 : 13:33:45 ] BeginRvmTruncate (1, 16968, F)
> > [ X(00) : 0000 : 13:33:45 ] cache::EndRvmTruncate
> > [ X(00) : 0000 : 13:33:45 ] E adviceserver::adviceserver: AdviceServer
> >
> > [ A(18) : 0000 : 13:33:45 ] adviceserver::main()
> >
> > [ H(07) : 0000 : 13:33:45 ] HDBDaemon about to sleep on hdbdaemon_sync
> >
> > [ T(01) : 0000 : 13:33:51 ] BeginRvmFlush (1, 2352, T)
> > [ T(01) : 0000 : 13:33:51 ] EndRvmFlush
> >
> > [ D(21) : 0000 : 13:33:51 ] WAITING(SRVRQ):
> >
> > [ W(20) : 0000 : 13:34:01 ] NotifyUserOfServerDownEvent
> > [ W(20) : 0000 : 13:34:01 ] DiscoRef= 1
> >
> > [ D(21) : 0000 : 13:34:01 ] WAIT OVER, elapsed = 10023.3
> >
> >
> >
> >
> > Jan Harkes wrote:
> >
> > > On Tue, Jun 13, 2000 at 10:17:35AM -0500, Dr. Douglas C. MacKenzie
> > wrote:
> > > > I have successfully installed a Coda server and client running on
> > RedHat
> > > > 6.2. When I tried to start venus on a second machine as just a
> > client
> > > > I get the following errors in the /var/log/messages
> > > ...
> > > > Jun 13 10:07:25 maple kernel: Failure of coda_cnode_make for root:
> > error -110
> > >
> > > ETIMEDOUT, the client was unable to get the attributes of the root
> > > directory for the root volume on the server. /usr/coda/venus.log on
> > the
> > > client and /vice/srv/SrvLog on the server might have more information
> > > about why we failed to fetch it.
> > >
> > > > -----------
> > > > Also, the file  /usr/sbin/venus-setup from the rpm file
> > > > coda-debug-client-5.3.7-1.i386.rpm
> > > > references a utility called codaconfedit
> > > > which is not in any of the rpms that I can find.
> > >
> > > It should be in /usr/sbin, but you are right, it isn't there. I'll
> > make
> > > sure it ends up in the right place for 5.3.8.
> > >
> > > Jan
> >
> >
> >


---- client's console messages -----

08:17:27 /usr/coda/LOG setup for size 0x47f63
08:17:27 /usr/coda/DATA initialized at size 0x11fd8c
08:17:27 brain-wiping recoverable store
08:17:27 loading recoverable store
08:17:27 starting VSGDB scan
08:17:27 	0 vsg entries in table
08:17:27 	0 vsg entries on free-list
08:17:27 starting VDB scan
08:17:27 	1 vol entries in table (0 MLEs)
08:17:27 	0 vol entries on free-list (0 MLEs)
08:17:28 starting FSDB scan (416, 10000) (25, 75, 4)
08:17:28 	0 cache files in table (0 blocks)
08:17:28 	416 cache files on free-list
08:17:28 starting HDB scan
08:17:28 	0 hdb entries in table
08:17:28 	0 hdb entries on free-list
08:17:28 Kernel version ioctl failed (Inappropriate ioctl for device)!
08:17:28 Initial LRDB allocation
08:17:28 Getting Root Volume information...
08:17:29 Venus starting...

[1]+  Killed                  /usr/sbin/venus -init
[root_at_maple etc]# 


---------- Client's messages in /var/log/messages --------------------------------
---------- Is the "coda_downcall: opcode 30, no sb!" message an error? -----------

Jun 14 08:17:27 maple kernel: Coda Kernel/Venus communications (module), v5.0-pre1, braam_at_cs.cmu.edu.
Jun 14 08:17:27 maple kernel: coda_psdev_write: downcall, no SB!
Jun 14 08:17:29 maple kernel: coda_read_super: rootfid is (0x7f000000,0x1,0x1)
Jun 14 08:17:44 maple kernel: coda_downcall: opcode 30, no sb!
Jun 14 08:17:45 maple kernel: Failure of coda_cnode_make for root: error -110

---------- Client's messages in /usr/coda/etc/venus.log ---------------

[ X(00) : 0000 : 08:17:27 ] Coda Venus, version 5.3.7
[ X(00) : 0000 : 08:17:27 ] Logfile initialized with LogLevel = 0 at Wed Jun 14 08:17:27 2000

[ X(00) : 0000 : 08:17:27 ] E StatsInit()
[ X(00) : 0000 : 08:17:27 ] L StatsInit()
[ X(00) : 0000 : 08:17:27 ] RecovDataSizes: Log = 0x47f63, Data = 0x11fd8c
[ X(00) : 0000 : 08:17:27 ] Recov_AllocateVM: allocated 1000 bytes at 20000000
[ X(00) : 0000 : 08:17:27 ] Recov_AllocateVM: allocated 11a000 bytes at 21000000
[ X(00) : 0000 : 08:17:27 ] BeginRvmTruncate (1, 4312, I)
[ X(00) : 0000 : 08:17:27 ] cache::EndRvmTruncate
[ X(00) : 0000 : 08:17:27 ] Recov_DeallocateVM: deallocated 1000 bytes at 20000000
[ X(00) : 0000 : 08:17:27 ] Recov_DeallocateVM: deallocated 11a000 bytes at 21000000
[ X(00) : 0000 : 08:17:27 ] BeginRvmFlush (1, 1332, F)
[ X(00) : 0000 : 08:17:27 ] EndRvmFlush
[ X(00) : 0000 : 08:17:27 ] BeginRvmTruncate (2, 1492, F)
[ X(00) : 0000 : 08:17:27 ] cache::EndRvmTruncate
[ X(00) : 0000 : 08:17:27 ] BeginRvmFlush (1, 5516, F)
[ X(00) : 0000 : 08:17:27 ] EndRvmFlush
[ X(00) : 0000 : 08:17:27 ] BeginRvmTruncate (3, 5676, F)
[ X(00) : 0000 : 08:17:27 ] cache::EndRvmTruncate
[ X(00) : 0000 : 08:17:28 ] BeginRvmFlush (1, 223616, F)
[ X(00) : 0000 : 08:17:28 ] EndRvmFlush
[ X(00) : 0000 : 08:17:28 ] BeginRvmTruncate (420, 223776, F)
[ X(00) : 0000 : 08:17:28 ] cache::EndRvmTruncate
[ X(00) : 0000 : 08:17:28 ] BeginRvmFlush (1, 16808, F)
[ X(00) : 0000 : 08:17:28 ] EndRvmFlush
[ X(00) : 0000 : 08:17:28 ] BeginRvmTruncate (1, 16968, F)
[ X(00) : 0000 : 08:17:28 ] cache::EndRvmTruncate
[ X(00) : 0000 : 08:17:28 ] E adviceserver::adviceserver: AdviceServer

[ A(18) : 0000 : 08:17:28 ] adviceserver::main()

[ H(07) : 0000 : 08:17:28 ] HDBDaemon about to sleep on hdbdaemon_sync

[ T(01) : 0000 : 08:17:34 ] BeginRvmFlush (1, 2352, T)
[ T(01) : 0000 : 08:17:34 ] EndRvmFlush

[ D(21) : 0000 : 08:17:34 ] WAITING(SRVRQ):

[ W(20) : 0000 : 08:17:44 ] NotifyUserOfServerDownEvent
[ W(20) : 0000 : 08:17:44 ] DiscoRef= 1

[ D(21) : 0000 : 08:17:44 ] WAIT OVER, elapsed = 10016.7


----------- The RPC port mapper is running on the client ---------
----------- Hmm, but not the RPC2 port mapper???         ---------

[root_at_maple etc]# ps axuw | grep -i port
bin        332  0.0  0.1  1212  428 ?        S    08:11   0:00 portmap
root       541  0.0  0.4  2048 1056 ?        S    08:11   0:00 sendmail: accepting connections on port 25
xfs        595  0.0  0.5  2264 1372 ?        S    08:11   0:00 xfs -droppriv -daemon -port -1




----------- The Server's messages from /vice/srv/SrvLog ----------
----------- /vice/srv/SrvErr is empty ----------


09:12:27 New SrvLog started at Wed Jun 14 09:12:27 2000

09:12:27 Resource limit on data size are set to 2147483647

09:12:27 Server etext 0x80c0c76, edata 0x80ed5b0
09:12:27 RvmType is Rvm
09:12:27 Main process doing a LWP_Init()
09:12:27 Main thread just did a RVM_SET_THREAD_DATA

09:12:27 Setting Rvm Truncate threshhold to 5.

Partition /vicepa: inodes in use: 1, total: 1048576.
09:12:40 Partition /vicepa: 280633K available (minfree=5%), 280622K free.
09:12:40 The server (pid 661) can be controlled using volutil commands
09:12:40 "volutil -help" will give you a list of these commands
09:12:40 If desperate,
		"kill -SIGWINCH 661" will increase debugging level
09:12:40 	"kill -SIGUSR2 661" will set debugging level to zero
09:12:40 	"kill -9 661" will kill a runaway server
09:12:40 Vice file system salvager, version 3.0.
09:12:40 SanityCheckFreeLists: Checking RVM Vnode Free lists.
09:12:40 DestroyBadVolumes: Checking for destroyed volumes.
09:12:40 Salvaging file system partition /vicepa
09:12:40 Force salvage of all volumes on this partition
09:12:40 Scanning inodes in directory /vicepa...
09:12:40 Entering DCC(0x1000001)
09:12:40 DCC: Salvaging Logs for volume 0x1000001

09:12:40 done:	2 files/dirs,	3 blocks
09:12:40 SalvageFileSys completed on /vicepa
09:12:40 VAttachVolumeById: vol 1000001 (coda_root.0) attached and online
09:12:40 Attached 1 volumes; 0 volumes not attached
lqman: Creating LockQueue Manager.....LockQueue Manager starting .....
09:12:40 LockQueue Manager just did a rvmlib_set_thread_data()

done
09:12:40 CallBackCheckLWP just did a rvmlib_set_thread_data()

09:12:40 CheckLWP just did a rvmlib_set_thread_data()

09:12:40 ServerLWP 0 just did a rvmlib_set_thread_data()

09:12:40 ServerLWP 1 just did a rvmlib_set_thread_data()

09:12:40 ServerLWP 2 just did a rvmlib_set_thread_data()

09:12:40 ServerLWP 3 just did a rvmlib_set_thread_data()

09:12:40 ServerLWP 4 just did a rvmlib_set_thread_data()

09:12:40 ServerLWP 5 just did a rvmlib_set_thread_data()

09:12:40 ResLWP-0 just did a rvmlib_set_thread_data()

09:12:40 ResLWP-1 just did a rvmlib_set_thread_data()

09:12:41 VolUtilLWP 0 just did a rvmlib_set_thread_data()

09:12:41 VolUtilLWP 1 just did a rvmlib_set_thread_data()

09:12:41 Starting SmonDaemon timer
09:12:41 File Server started Wed Jun 14 09:12:41 2000

09:12:42 client_GetVenusId: got new host 127.0.0.1:2430
09:12:42 Building callback conn.
09:12:42 No idle WriteBack conns, building new one
09:12:42 Writeback message to 127.0.0.1 port 2430 on conn 3a416ecc succeeded
09:12:42 RevokeWBPermit on conn 3a416ecc returned 0

>>> Here is where I started the remote venus client, the clocks are skewed <<<<

09:18:59 client_GetVenusId: got new host 192.168.1.4:2430
09:19:00 Building callback conn.
09:19:00 No idle WriteBack conns, building new one
09:19:00 Writeback message to 192.168.1.4 port 2430 on conn b036baf succeeded

------------------------------------------------------------------------------
--- The server's /var/log/messages file.  Is the "coda_psdev_write: downcall, no SB!"
--- message an error?  Also, I notice that the client does not start auth2.init in
--- the boot process, if that matters. ----------------------------------------

un 14 09:12:26 Oak auth2.init: Starting auth2:
Jun 14 09:12:26 Oak auth2.init: /usr/sbin/auth2 done.
Jun 14 09:12:26 Oak rc: Starting auth2.init succeeded
Jun 14 09:12:26 Oak update.init: Starting coda update servers:
Jun 14 09:12:26 Oak update.init: rpc2portmap
Jun 14 09:12:26 Oak update.init: updatesrv
Jun 14 09:12:26 Oak update.init: updateclnt done.
Jun 14 09:12:26 Oak rc: Starting update.init succeeded
Jun 14 09:12:26 Oak codasrv.init: Starting codasrv:
Jun 14 09:12:26 Oak codasrv.init: codasrv.
Jun 14 09:12:27 Oak rc: Starting codasrv.init succeeded
Jun 14 09:12:27 Oak venus.init: Starting venus:
Jun 14 09:12:27 Oak venus.init: done.
Jun 14 09:12:28 Oak rc: Starting venus.init succeeded
Jun 14 09:12:28 Oak linuxconf: Linuxconf final setup
Jun 14 09:12:28 Oak kernel: Coda Kernel/Venus communications (module), v5.0-pre1, braam_at_cs.cmu.edu.
Jun 14 09:12:28 Oak kernel: coda_psdev_write: downcall, no SB!
Jun 14 09:12:36 Oak rc: Starting linuxconf succeeded
Jun 14 09:12:42 Oak kernel: coda_psdev_write: downcall, no SB!
Jun 14 09:12:42 Oak kernel: coda_psdev_write: downcall, no SB!
Jun 14 09:12:42 Oak kernel: coda_read_super: rootfid is (0x7f000000,0x1,0x1)
Jun 14 09:12:42 Oak kernel: coda_read_super: rootinode is 1025 dev 3
Jun 14 09:14:19 Oak PAM_pwdb[707]: (login) session opened for user doug by LOGIN(uid=0)
Jun 14 09:14:42 Oak gnome-name-server[785]: starting
Jun 14 09:14:42 Oak gnome-name-server[789]: starting
Jun 14 09:14:43 Oak gnome-name-server[785]: name server starting
Jun 14 09:14:43 Oak gnome-name-server[789]: name server was running on display, exiting
Jun 14 09:15:15 Oak PAM_pwdb[813]: (su) session opened for user root by doug(uid=500)

>> Here is where I started the remote venus client <<<
>> There were no messages in the Server's /var/log/messages file <<<
>> after I started the client. <<<
Received on 2000-06-14 09:43:46