Coda File System

Venus mount problems

From: Dennis Nasarov <dennis.nasarov_at_gmail.com>
Date: Tue, 22 Feb 2011 14:26:24 +0100
Hello,

I'm running Coda 6.9.4 on FreeBSD 8.1 boxes. After reboot, venus does not work, and is showing this:

[root_at_n2 /coda]# ls -al
total 5
dr-xr-xr-x   1 root  nobody  2048 Feb 22 21:03 .
drwxr-xr-x  24 root  wheel    512 Feb 20 21:51 ..

ls: ./cloud.kg: Invalid argument
lrw-r--r--   1 root  nobody    29 Feb 22 21:10 cloud.kg

I've tried venus -init from different clients, and still have same problem.

With volumes all seems to be ok (n1.cloud.kg is scm, n2.cloud.kg is replica)

[root_at_n2 /coda]# getvolinfo n1.cloud.kg /
RPC2 connection to n1.cloud.kg:2432 successful.
Returned volume information for /
	VolumeId 7f000000
	Replicated volume (type 3)

	Type0 id 0
	Type1 id 0
	Type2 id 0
	Type3 id 7f000000
	Type4 id 0

	ServerCount 2
	Replica0 id 01000002, Server0 195.38.166.141
	Replica1 id 02000001, Server1 195.38.166.142
	Replica2 id 00000000, Server2 0.0.0.0
	Replica3 id 00000000, Server3 0.0.0.0
	Replica4 id 00000000, Server4 0.0.0.0
	Replica5 id 00000000, Server5 0.0.0.0
	Replica6 id 00000000, Server6 0.0.0.0
	Replica7 id 00000000, Server7 0.0.0.0

	VSGAddr 0


V_BindToServer: binding to host n1.cloud.kg
P/vicepa Hn1.cloud.kg T8e8ba80 F833e6e0
W/.0 I1000002 H1 P/vicepa m0 M0 U9da39 W1000002 C4d6223a3 D4d6223a3 B0 A147556
GetVolumeList finished successfully

V_BindToServer: binding to host n2.cloud.kg
P/vicepa Hn2.cloud.kg T8e8ba80 F83d9b50
W/.1 I2000001 H2 P/vicepa m0 M0 U95fc8 W2000001 C4d618e3a D4d618e3a B0 A525cf5
GetVolumeList finished successfully

When starting coda on scm i see following in SrvLog

Date: Tue 02/22/2011

20:58:17 Coda Vice, version 6.9.4	log started at Tue Feb 22 20:58:17 2011

20:58:17 RvmType is Rvm
20:58:17 Main process doing a LWP_Init()
20:58:17 Main thread just did a RVM_SET_THREAD_DATA

20:58:17 Setting Rvm Truncate threshhold to 5.

Partition /vicepa: inodes in use: 45049, total: 16777216.
20:58:18 Partition /vicepa: 149469824K available (minfree=7%), 137619216K free.
20:58:18 The server (pid 1391) can be controlled using volutil commands
20:58:18 "volutil -help" will give you a list of these commands
20:58:18 If desperate,
		"kill -SIGWINCH 1391" will increase debugging level
20:58:18 	"kill -SIGUSR2 1391" will set debugging level to zero
20:58:18 	"kill -9 1391" will kill a runaway server
20:58:18 Vice file system salvager, version 3.0.
20:58:18 SanityCheckFreeLists: Checking RVM Vnode Free lists.
20:58:18 DestroyBadVolumes: Checking for destroyed volumes.
20:58:18 Salvaging file system partition /vicepa
20:58:18 Force salvage of all volumes on this partition
20:58:18 Scanning inodes in directory /vicepa...
20:58:18 Entering DCC(0x1000002)
20:58:19 DCC: Salvaging Logs for volume 0x1000002
20:58:19 recov_vol_log::SalvageLog: bitmaps are not equal

20:58:19 Log rec at index 52 is unreachable
    **Server: c326a68d StoreId: 50204e56.1602f 
    Directory(00000111.0000eeec)
    Opcode: ResolveNULL_OP 
    index is 52, sequence number 252, var length is 0
!!!!!!!! vle == NULL pointer !!!!!!
    ** End of Record **

20:58:19 Log rec at index 54 is unreachable
    **Server: c326a68d StoreId: 50204e56.14812 
    Directory(000000fd.0000e127)
    Opcode: ResolveNULL_OP 
    index is 54, sequence number 254, var length is 0
!!!!!!!! vle == NULL pointer !!!!!!
    ** End of Record **

20:58:19 Log rec at index 55 is unreachable
    **Server: c326a68d StoreId: 50204e56.1481c 
    Directory(00000105.0000e129)
    Opcode: ResolveNULL_OP 
    index is 55, sequence number 255, var length is 0
!!!!!!!! vle == NULL pointer !!!!!!
    ** End of Record **

20:58:19 Log rec at index 56 is unreachable
    **Server: c326a68d StoreId: 50204e56.14e42 
    Directory(00000135.0000eef5)
    Opcode: ResolveNULL_OP 
    index is 56, sequence number 256, var length is 0
!!!!!!!! vle == NULL pointer !!!!!!
    ** End of Record **

20:58:19 done:	49658 files/dirs,	645689 blocks
20:58:19 SalvageFileSys completed on /vicepa
20:58:19 VAttachVolumeById: vol 1000002 (/.0) attached and online
20:58:19 Attached 1 volumes; 0 volumes not attached
lqman: Creating LockQueue Manager.....LockQueue Manager starting .....
20:58:19 LockQueue Manager just did a rvmlib_set_thread_data()

done
20:58:19 CallBackCheckLWP just did a rvmlib_set_thread_data()

20:58:19 Starting AuthLWP-0
20:58:19 Starting AuthLWP-1
20:58:19 Starting AuthLWP-2
20:58:19 Starting AuthLWP-3
20:58:19 Starting AuthLWP-4
20:58:19 ServerLWP 0 just did a rvmlib_set_thread_data()
20:58:19 ServerLWP 1 just did a rvmlib_set_thread_data()
20:58:19 ServerLWP 2 just did a rvmlib_set_thread_data()
20:58:19 ServerLWP 3 just did a rvmlib_set_thread_data()
20:58:19 ServerLWP 4 just did a rvmlib_set_thread_data()
20:58:19 ServerLWP 5 just did a rvmlib_set_thread_data()
20:58:19 ServerLWP 6 just did a rvmlib_set_thread_data()
20:58:19 ServerLWP 7 just did a rvmlib_set_thread_data()
20:58:19 ServerLWP 8 just did a rvmlib_set_thread_data()
20:58:19 ServerLWP 9 just did a rvmlib_set_thread_data()
20:58:19 ResLWP-0 just did a rvmlib_set_thread_data()

20:58:19 ResLWP-1 just did a rvmlib_set_thread_data()

20:58:19 VolUtilLWP 0 just did a rvmlib_set_thread_data()

20:58:19 VolUtilLWP 1 just did a rvmlib_set_thread_data()

20:58:19 Starting SmonDaemon timer
20:58:19 File Server started Tue Feb 22 20:58:19 2011

Also cfs lv looks good
[root_at_n1 /usr/coda/etc]# cfs lv /coda/cloud.kg
  Status of volume 7f000000 (2130706432) named "/"
  Volume type is ReadWrite
  Connection State is Reachable
  Reintegration age: 0 sec, time 15.000 sec
  Minimum quota is 0, maximum quota is unlimited
  Current blocks used are 645689


And in venus.log i see following:
[ X(00) : 0000 : 21:17:41 ] Coda Venus, version 6.9.4
[ X(00) : 0000 : 21:17:41 ] Logfile initialized with LogLevel = 0 at Tue Feb 22 21:17:41 2011

[ X(00) : 0000 : 21:17:41 ] E StatsInit()
[ X(00) : 0000 : 21:17:41 ] L StatsInit()
[ X(00) : 0000 : 21:17:41 ] RecovDataSizes: Log = 0x2947c4, Data = 0xa51f10
[ X(00) : 0000 : 21:17:41 ] BeginRvmFlush (1, 9544, F)
[ X(00) : 0000 : 21:17:41 ] EndRvmFlush
[ X(00) : 0000 : 21:17:41 ] BeginRvmTruncate (7, 9704, F)
[ X(00) : 0000 : 21:17:41 ] EndRvmTruncate
[ X(00) : 0000 : 21:17:41 ] BeginRvmFlush (1, 2033612, F)
[ X(00) : 0000 : 21:17:41 ] EndRvmFlush
[ X(00) : 0000 : 21:17:41 ] BeginRvmTruncate (4170, 2033772, F)
[ X(00) : 0000 : 21:17:41 ] EndRvmTruncate
[ X(00) : 0000 : 21:17:41 ] BeginRvmFlush (1, 16808, F)
[ X(00) : 0000 : 21:17:41 ] EndRvmFlush
[ X(00) : 0000 : 21:17:41 ] BeginRvmTruncate (1, 16968, F)
[ X(00) : 0000 : 21:17:41 ] EndRvmTruncate

[ H(07) : 0000 : 21:17:41 ] HDBDaemon about to sleep on hdbdaemon_sync

[ T(01) : 0001 : 21:17:51 ] BeginRvmFlush (1, 16720, T)
[ T(01) : 0001 : 21:17:51 ] EndRvmFlush

[ W(13) : 0000 : 21:17:55 ] fsdb::Get: transforming 1.ff000001.1.1 (1.ff000001.1.1) into fake mtpt with Fakeify()
[ W(13) : 0000 : 21:17:55 ] Allowing access to stale status! (key = <1.ff000001.1.1>)
[ W(13) : 0000 : 21:17:55 ] Allowing access to stale status! (key = <1.ff000001.1.1>)
[ W(13) : 0000 : 21:17:55 ] fsdb::Get: transforming cloud.kg (1.ff000001.fffffffc.2) into fake mtpt with Fakeify()
[ W(13) : 0000 : 21:17:55 ] Allowing access to stale status! (key = <1.ff000001.fffffffc.2>)
[ W(13) : 0000 : 21:17:55 ] userent::Connect: ViceGetAttrPlusSHA(n1.cloud.kg)
[ W(13) : 0000 : 21:17:55 ] userent::Connect: ViceGetAttrPlusSHA() -> 22
[ W(13) : 0000 : 21:17:55 ] userent::Connect: VGAPlusSHA_Supported -> 1
[ W(13) : 0000 : 21:17:55 ] userent::Connect: ViceGetAttrPlusSHA(n2.cloud.kg)
[ W(13) : 0000 : 21:17:55 ] userent::Connect: ViceGetAttrPlusSHA() -> 22
[ W(13) : 0000 : 21:17:55 ] userent::Connect: VGAPlusSHA_Supported -> 1
[ W(13) : 0000 : 21:17:55 ] userent::Connect: ViceGetAttrPlusSHA(n1.cloud.kg)
[ W(13) : 0000 : 21:17:55 ] userent::Connect: ViceGetAttrPlusSHA() -> 22
[ W(13) : 0000 : 21:17:55 ] userent::Connect: VGAPlusSHA_Supported -> 1
[ W(13) : 0000 : 21:17:55 ] userent::Connect: ViceGetAttrPlusSHA(n2.cloud.kg)
[ W(13) : 0000 : 21:17:55 ] userent::Connect: ViceGetAttrPlusSHA() -> 22
[ W(13) : 0000 : 21:17:55 ] userent::Connect: VGAPlusSHA_Supported -> 1
[ W(13) : 0000 : 21:17:55 ] fsdb::Get: ??? (2.7f000000.1.1) in server/server conflict

[ T(01) : 0002 : 21:18:00 ] BeginRvmFlush (1, 6168, T)
[ T(01) : 0002 : 21:18:00 ] EndRvmFlush

[ W(13) : 0000 : 21:18:33 ] Allowing access to stale status! (key = <1.ff000001.1.1>)
[ W(13) : 0000 : 21:18:34 ] Allowing access to stale status! (key = <1.ff000001.1.1>)
[ W(13) : 0000 : 21:18:34 ] Allowing access to stale status! (key = <1.ff000001.1.1>)
[ W(13) : 0000 : 21:18:34 ] Cachefile::SetLength 1024
[ W(13) : 0000 : 21:18:34 ] Allowing access to stale status! (key = <1.ff000001.1.1>)
[ W(13) : 0000 : 21:18:34 ] Allowing access to stale status! (key = <1.ff000001.1.1>)
[ W(13) : 0000 : 21:18:34 ] Allowing access to stale status! (key = <1.ff000001.1.1>)
[ W(13) : 0000 : 21:18:34 ] fsdb::Get: EINCONS after GetAttr
[ W(13) : 0000 : 21:18:34 ] FidToNodeid: called for volume root (2.7f000000)!!!
[ W(13) : 0000 : 21:18:34 ] Allowing access to stale status! (key = <1.ff000001.1.1>)
[ W(13) : 0000 : 21:18:34 ] Allowing access to stale status! (key = <1.ff000001.1.1>)
[ W(13) : 0000 : 21:18:34 ] Allowing access to stale status! (key = <1.ff000001.1.1>)
[ W(13) : 0000 : 21:18:34 ] Allowing access to stale status! (key = <1.ff000001.1.1>)
[ W(13) : 0000 : 21:18:34 ] Allowing access to stale status! (key = <1.ff000001.1.1>)
[ W(13) : 0000 : 21:18:34 ] Allowing access to stale status! (key = <1.ff000001.1.1>)
[ W(13) : 0000 : 21:18:37 ] Allowing access to stale status! (key = <1.ff000001.1.1>)
[ W(13) : 0000 : 21:18:37 ] fsdb::Get: EINCONS after GetAttr
[ W(13) : 0000 : 21:18:37 ] FidToNodeid: called for volume root (2.7f000000)!!!
[ W(13) : 0000 : 21:18:37 ] Allowing access to stale status! (key = <1.ff000001.1.1>)
[ W(13) : 0000 : 21:18:37 ] Allowing access to stale status! (key = <1.ff000001.1.1>)


So the question is what problem i have and how to fix it?

Thank you.
Received on 2011-02-22 08:50:18