Hello again,

I have some questions about server crash/rebuild. I have been asking myself these questions from the very beginning and have not found a clear answer yet:

1. Coda backup coordinator

Do we have to use the Coda backup coordinator machine, or will any other backup system do? We could, for example, restore the entire disk and mount it on /vicepa afterwards. Would that create too many inconsistencies?

2. Non-SCM crash/rebuild

The partition of one of the non-SCM servers, which I had on a removable hard drive, crashed because of some tests I did. It was a reiserfs partition. Anyway, I reinstalled the non-SCM server, and it managed to get the /vice/db files.

Question: how do I get the volumes to populate the non-SCM server's /vicepa directory? I typed "bld...sh non-scm server" on the SCM server; it said success, bla bla, but this still doesn't work. Actually, when I log in with the client on the SCM only, I get the same messages. I get the names of the directories under /coda, but when I do a cd I get an error message.

So how do I add a new non-SCM server to a Coda cell and populate it with the existing volumes? Should I modify /vice/db/VSGDB and type "bldvldb.sh new-non-scm" again? But how will the other non-SCM servers know about it?

If I try to connect the client, I get this error message:

cd /coda/pub
17:26:38 MaxRetries exceeded...returning EWOULDBLOCK
17:26:45 MaxRetries exceeded...returning EWOULDBLOCK
bash: cd: /coda/pub: Resource not available

or something like that.

On the non-SCM server I have in /vice/srv/SrvLog:

15:18:08 New SrvLog started at Mon Jul 24 15:18:08 2000
15:18:08 Resource limit on data size are set to 2147483647
15:18:08 Server etext 0x80c83c4, edata 0x8100008
15:18:08 RvmType is Rvm
15:18:08 Main process doing a LWP_Init()
15:18:08 Main thread just did a RVM_SET_THREAD_DATA
15:18:08 Setting Rvm Truncate threshhold to 5.
Partition /vicepa: inodes in use: 0, total: 16777216.
15:18:20 Partition /vicepa: 238255K available (minfree=5%), 238253K free.
15:18:20 The server (pid 704) can be controlled using volutil commands
15:18:20 "volutil -help" will give you a list of these commands
15:18:20 If desperate, "kill -SIGWINCH 704" will increase debugging level
15:18:20 "kill -SIGUSR2 704" will set debugging level to zero
15:18:20 "kill -9 704" will kill a runaway server
15:18:20 Vice file system salvager, version 3.0.
15:18:21 SanityCheckFreeLists: Checking RVM Vnode Free lists.
15:18:21 DestroyBadVolumes: Checking for destroyed volumes.
15:18:21 Salvaging file system partition /vicepa
15:18:21 Force salvage of all volumes on this partition
15:18:21 Scanning inodes in directory /vicepa...
15:18:21 SalvageFileSys completed on /vicepa
15:18:21 Attached 0 volumes; 0 volumes not attached
lqman: Creating LockQueue Manager.....LockQueue Manager starting .....
15:18:21 LockQueue Manager just did a rvmlib_set_thread_data()
done
15:18:21 CallBackCheckLWP just did a rvmlib_set_thread_data()
15:18:21 CheckLWP just did a rvmlib_set_thread_data()
15:18:21 ServerLWP 0 just did a rvmlib_set_thread_data()
15:18:21 ServerLWP 1 just did a rvmlib_set_thread_data()
15:18:21 ServerLWP 2 just did a rvmlib_set_thread_data()
15:18:21 ServerLWP 3 just did a rvmlib_set_thread_data()
15:18:21 ServerLWP 4 just did a rvmlib_set_thread_data()
15:18:21 ServerLWP 5 just did a rvmlib_set_thread_data()
15:18:21 ResLWP-0 just did a rvmlib_set_thread_data()
15:18:21 ResLWP-1 just did a rvmlib_set_thread_data()
15:18:21 VolUtilLWP 0 just did a rvmlib_set_thread_data()
15:18:21 VolUtilLWP 1 just did a rvmlib_set_thread_data()
15:18:21 Starting SmonDaemon timer
15:18:21 File Server started Mon Jul 24 15:18:21 2000
15:21:34 client_GetVenusId: got new host 192.168.1.214:2430
15:21:34 Building callback conn.
15:21:34 No idle WriteBack conns, building new one
15:21:34 Writeback message to 192.168.1.214 port 2430 on conn 159e92a0 succeeded
15:21:35 client_GetVenusId: got new host 192.168.1.27:2430
15:21:35 Building callback conn.
15:21:35 No idle WriteBack conns, building new one
15:21:35 Writeback message to 192.168.1.27 port 2430 on conn 17ccb899 succeeded
15:21:35 GetVolObj: VGetVolume(2000007) error 103
15:21:35 GrabFsObj, GetVolObj error Volume not online
15:21:35 GetVolObj: VGetVolume(2000007) error 103
15:21:35 GrabFsObj, GetVolObj error Volume not online
15:21:35 GetVolObj: VGetVolume(2000007) error 103
15:21:35 RS_LockAndFetch: Error 103 during GetVolObj for (0x2000007.0x1.0x1)
15:21:35 GetVolObj: VGetVolume(2000007) error 103
15:21:35 GrabFsObj, GetVolObj error Volume not online
15:21:35 GetVolObj: VGetVolume(2000007) error 103
15:21:35 RS_LockAndFetch: Error 103 during GetVolObj for (0x2000007.0x1.0x1)
15:21:35 RevokeWBPermit on conn 17ccb899 returned 0
15:21:35 GetVolObj: VGetVolume(2000001) error 103
15:21:35 GrabFsObj, GetVolObj error Volume not online
15:21:35 GetVolObj: VGetVolume(2000001) error 103
15:21:35 GrabFsObj, GetVolObj error Volume not online
15:21:35 GetVolObj: VGetVolume(2000001) error 103
15:21:35 RS_LockAndFetch: Error 103 during GetVolObj for (0x2000001.0x1.0x1)
15:21:35 GetVolObj: VGetVolume(2000001) error 103
15:21:35 GrabFsObj, GetVolObj error Volume not online
15:21:35 GetVolObj: VGetVolume(2000001) error 103
15:21:35 RS_LockAndFetch: Error 103 during GetVolObj for (0x2000001.0x1.0x1)
15:21:35 RevokeWBPermit on conn 17ccb899 returned 0
15:21:35 GetVolObj: VGetVolume(2000004) error 103
15:21:35 GrabFsObj, GetVolObj error Volume not online
15:21:35 GetVolObj: VGetVolume(2000004) error 103
15:21:35 RS_LockAndFetch: Error 103 during GetVolObj for (0x2000004.0x1.0x1)
15:21:35 GetVolObj: VGetVolume(2000004) error 103
15:31:36 Unbinding RPC2 connection 126131479
15:31:36 Unbinding RPC2 connection 109566470
15:31:36 Unbinding RPC2 connection 968266099
15:31:36 Unbinding RPC2 connection 62628995
15:31:36 Unbinding RPC2 connection 261667013
16:18:51 SmonDaemon timer expired
16:18:51 Entered CheckRVMResStat
16:18:51 Starting SmonDaemon timer
17:19:21 SmonDaemon timer expired
17:19:21 Entered CheckRVMResStat
17:19:21 Starting SmonDaemon timer
17:23:56 client_GetVenusId: got new host "a client ip address":2430

On the SCM I get:

15:19:07 New SrvLog started at Mon Jul 24 15:19:07 2000
15:19:07 Resource limit on data size are set to 2147483647
15:19:07 Server etext 0x80c83c4, edata 0x8100008
15:19:07 RvmType is Rvm
15:19:07 Main process doing a LWP_Init()
15:19:07 Main thread just did a RVM_SET_THREAD_DATA
15:19:07 Setting Rvm Truncate threshhold to 5.
Partition /vicepa: inodes in use: 167, total: 16777216.
15:19:11 Partition /vicepa: 255887K available (minfree=0%), 220169K free.
15:19:11 The server (pid 679) can be controlled using volutil commands
15:19:11 "volutil -help" will give you a list of these commands
15:19:11 If desperate, "kill -SIGWINCH 679" will increase debugging level
15:19:11 "kill -SIGUSR2 679" will set debugging level to zero
15:19:11 "kill -9 679" will kill a runaway server
15:19:11 Vice file system salvager, version 3.0.
15:19:11 SanityCheckFreeLists: Checking RVM Vnode Free lists.
15:19:11 DestroyBadVolumes: Checking for destroyed volumes.
15:19:11 Salvaging file system partition /vicepa
15:19:11 Force salvage of all volumes on this partition
15:19:11 Scanning inodes in directory /vicepa...
15:19:11 SFS: There are some volumes without any inodes in them
15:19:11 Entering DCC(0x1000001)
15:19:11 DCC: Salvaging Logs for volume 0x1000001
15:19:11 done: 5 files/dirs, 6 blocks
15:19:11 Entering DCC(0x1000002)
15:19:11 DCC: Salvaging Logs for volume 0x1000002
15:19:11 done: 22 files/dirs, 7421 blocks
15:19:11 Entering DCC(0x1000003)
15:19:11 DCC: Salvaging Logs for volume 0x1000003
15:19:11 done: 104 files/dirs, 1238 blocks
15:19:11 Entering DCC(0x1000004)
15:19:11 DCC: Salvaging Logs for volume 0x1000004
15:19:11 done: 8 files/dirs, 70 blocks
15:19:11 Entering DCC(0x1000005)
15:19:11 DCC: Salvaging Logs for volume 0x1000005
15:19:11 done: 3 files/dirs, 4 blocks
15:19:11 SFS:No Inode summary for volume 0x1000006; skipping full salvage
15:19:11 SalvageFileSys: Therefore only resetting inUse flag
15:19:11 SFS:No Inode summary for volume 0x1000007; skipping full salvage
15:19:11 SalvageFileSys: Therefore only resetting inUse flag
15:19:11 Entering DCC(0x1000008)
15:19:11 DCC: Salvaging Logs for volume 0x1000008
15:19:11 done: 3 files/dirs, 4 blocks
15:19:11 Entering DCC(0x1000009)
15:19:11 DCC: Salvaging Logs for volume 0x1000009
15:19:11 done: 18 files/dirs, 56676 blocks
15:19:11 Entering DCC(0x100000a)
15:19:11 DCC: Salvaging Logs for volume 0x100000a
15:19:11 done: 16 files/dirs, 44359 blocks
15:19:11 SalvageFileSys completed on /vicepa
15:19:11 VAttachVolumeById: vol 1000001 (coda_root.0) attached and online
15:19:11 VAttachVolumeById: vol 1000002 (coda.0) attached and online
15:19:11 VAttachVolumeById: vol 1000003 (doc.0) attached and online
15:19:11 VAttachVolumeById: vol 1000004 (pub.0) attached and online
15:19:11 VAttachVolumeById: vol 1000005 (users.0) attached and online
15:19:11 VAttachVolumeById: vol 1000006 (mandrake.0) attached and online
15:19:11 VAttachVolumeById: vol 1000007 (a.0) attached and online
15:19:11 VAttachVolumeById: vol 1000008 (mp3.0) attached and online
15:19:11 VAttachVolumeById: vol 1000009 (cubana.0) attached and online
15:19:11 VAttachVolumeById: vol 100000a (Golden_Gate.0) attached and online
15:19:11 Attached 10 volumes; 0 volumes not attached
lqman: Creating LockQueue Manager.....LockQueue Manager starting .....
15:19:11 LockQueue Manager just did a rvmlib_set_thread_data()
done
15:19:11 CallBackCheckLWP just did a rvmlib_set_thread_data()
15:19:11 CheckLWP just did a rvmlib_set_thread_data()
15:19:11 ServerLWP 0 just did a rvmlib_set_thread_data()
15:19:11 ServerLWP 1 just did a rvmlib_set_thread_data()
15:19:11 ServerLWP 2 just did a rvmlib_set_thread_data()
15:19:11 ServerLWP 3 just did a rvmlib_set_thread_data()
15:19:11 ServerLWP 4 just did a rvmlib_set_thread_data()
15:19:11 ServerLWP 5 just did a rvmlib_set_thread_data()
15:19:11 ResLWP-0 just did a rvmlib_set_thread_data()
15:19:11 ResLWP-1 just did a rvmlib_set_thread_data()
15:19:11 VolUtilLWP 0 just did a rvmlib_set_thread_data()
15:19:11 VolUtilLWP 1 just did a rvmlib_set_thread_data()
15:19:11 Starting SmonDaemon timer
15:19:11 File Server started Mon Jul 24 15:19:11 2000
15:22:48 client_GetVenusId: got new host 192.168.1.27:2430
15:31:56 Unbinding RPC2 connection 77674659
15:31:56 Unbinding RPC2 connection 607742238
15:31:56 Unbinding RPC2 connection 193043662
15:31:56 Unbinding RPC2 connection 146736843
15:31:56 Unbinding RPC2 connection 438581500
15:31:56 Unbinding RPC2 connection 909797812
15:31:56 Unbinding RPC2 connection 713149311
15:31:56 Unbinding RPC2 connection 495551406
16:19:41 SmonDaemon timer expired
16:19:41 Entered CheckRVMResStat
16:19:41 Starting SmonDaemon timer
17:19:41 SmonDaemon timer expired
17:19:41 Entered CheckRVMResStat
17:19:41 Starting SmonDaemon timer
17:25:21 client_GetVenusId: got new host 192.168.1.27:2430 (the client IP address)
17:25:21 Building callback conn.
17:25:21 No idle WriteBack conns, building new one
17:25:21 Writeback message to 192.168.1.27 port 2430 on conn 22c5d1a7 succeeded
17:25:21 RevokeWBPermit on conn 22c5d1a7 returned 0
17:25:24 RevokeWBPermit on conn 22c5d1a7 returned 0
17:25:24 CheckRetCodes: server 192.168.1.64 returned error 103 (a non scm IP address)
17:25:24 ViceResolve:Couldnt lock volume 7f000007 at all accessible servers
17:25:24 CheckRetCodes: server 192.168.1.64 returned error 103
17:25:24 RevokeWBPermit on conn 22c5d1a7 returned 0
17:25:24 ViceResolve:Couldnt lock volume 7f000007 at all accessible servers
17:25:24 RevokeWBPermit on conn 22c5d1a7 returned 0
17:25:24 CheckRetCodes: server 192.168.1.64 returned error 103
17:25:24 ViceResolve:Couldnt lock volume 7f000001 at all accessible servers
17:25:24 CheckRetCodes: server 192.168.1.64 returned error 103
17:25:24 ViceResolve:Couldnt lock volume 7f000001 at all accessible servers
17:25:24 RevokeWBPermit on conn 22c5d1a7 returned 0
17:25:25 CheckRetCodes: server 192.168.1.64 returned error 103
17:25:25 ViceResolve:Couldnt lock volume 7f000004 at all accessible servers
17:25:25 CheckRetCodes: server 192.168.1.64 returned error 103
17:25:25 ViceResolve:Couldnt lock volume 7f000004 at all accessible servers
17:25:25 RevokeWBPermit on conn 22c5d1a7 returned 0
17:25:25 CheckRetCodes: server 192.168.1.64 returned error 103
17:25:25 ViceResolve:Couldnt lock volume 7f000003 at all accessible servers
17:25:25 CheckRetCodes: server 192.168.1.64 returned error 103
17:25:25 ViceResolve:Couldnt lock volume 7f000003 at all accessible servers
17:25:27 CheckRetCodes: server 192.168.1.64 returned error 103
17:25:27 ViceResolve:Couldnt lock volume 7f000003 at all accessible servers
17:25:27 CheckRetCodes: server 192.168.1.64 returned error 103
17:25:27 ViceResolve:Couldnt lock volume 7f000003 at all accessible servers
17:25:28 CheckRetCodes: server 192.168.1.64 returned error 103
17:25:28 ViceResolve:Couldnt lock volume 7f000003 at all accessible servers
17:25:29 CheckRetCodes: server 192.168.1.64 returned error 103

3. SCM crash/rebuild

Now, if the SCM crashes or if I turn it off, will the cell still work, possibly after I restart the clients?

Sincerely,
--
Florin
florin@mandrakesoft.com
http://www.linux-mandrake.com

Received on 2000-07-24 11:34:04