Coda File System

Server Crash Post Mortem

From: Steffen Neumann <sneumann_at_TechFak.Uni-Bielefeld.DE>
Date: Mon, 1 Sep 2003 12:22:18 +0200 (MET DST)
Hi,

we had a server crash this morning, and I am trying 
to send some information to understand what happened...

Here is an index to the logs below:

	SrvErr
	- Very few meaningful looking messages

	SrvLog
	- 11:28:35 Installing Coda-6.0.x, Server Start 08/06/2003
	- 00:00:06 Statistics from midnight today 09/01/2003
	- 10:15:45 random snippet with warnings that sound chinese to me
	- 11:44:22 (?) User reporting Problems...
	- 11:55:14 Server Shutdown

Hope that helps a bit,
I can dig out more of the logs if needed.

So long,
Steffen

-----------------------------------------------------------------------

	aipc1:/vice/srv # cat -n SrvErr-1 | uniq --skip-chars=9
	     1  No waiters, dropped incoming sftp packet
	    88  Xmit_Packet socket 5: Invalid argument
	    93  No waiters, dropped incoming sftp packet
	   300  No waiters, dropped incoming sftp packe
	                                              ^^^
	                                              Dead here, really!

-----------------------------------------------------------------------

	Date: Wed 08/06/2003
	11:28:35 New SrvLog started at Wed Aug  6 11:28:35 2003
	11:28:35 Resource limit on data size are set to -1
[...]
	Date: Mon 09/01/2003
        00:00:06 Total operations for File Server = 7864320 : time = Mon Sep  1 00:00:06 2003
        00:00:06 Vice was last started at Wed Aug  6 11:37:38 2003
        00:00:06 NewConnectFS 17836
        00:00:06 DisconnectFS 16873
        00:00:06 GetAttr 118944
        00:00:06 GetAcl 6
        00:00:06 Fetch 567573
        00:00:06 SetAttr 14213
        00:00:06 SetAcl 3
        00:00:06 Store 109517
        00:00:06 ValidateAttrs 665127
        00:00:06 Remove 36428
        00:00:06 Create 40922
        00:00:06 Rename 4027
        00:00:06 SymLink 1232
        00:00:06 Link 0
        00:00:06 MakeDir 1151
        00:00:06 RemoveDir 911
        00:00:06 GetRootVolume 8
        00:00:06 SetRootVolume 0
        00:00:06 GetVolumeStatus 7
        00:00:06 SetVolumeStatus 0
        00:00:06 GetTime 18846
        00:00:06 GetStatistics 2
        00:00:06 GetVolumeInfo 7482
        00:00:06 AllocFids 1399
        00:00:06 COP2 21445
        00:00:06 Resolve 1
        00:00:06 Repair 0
        00:00:06 SetVV 0
        00:00:06 Reintegrate 43
        00:00:06 OpenReintHandle 1
        00:00:06 QueryReintHandle 0
        00:00:06 SendReintFragment 2
        00:00:06 CloseReintHandle 1
        00:00:06 GetVolVS 69230
        00:00:06 ValidateVols 166
        00:00:06 GetWBPermit 0
        00:00:06 TossWBPermit 0
        00:00:06 RejectWBPermit 0
        00:00:06 GetAttrPlusSHA 5973826
        00:00:06 ValidateAttrsPlusSHA 177074
        00:00:06 Total FetchDatas = 567572, bytes transfered = -1296394282, transfer rate = -173942 bps
        00:00:06 Fetched files <1k = 95127; <8k = 395007; <64k = 58533; <512k = 14398; >512k = 4507.
        00:00:06 Total StoreDatas = 109516, bytes transfered = 1792858781, transfer rate = 1677136 bps
        00:00:06 Stored files <1k = 87669; <8k = 10137; <64k = 6282; <512k = 2945; >512k = 2483.
        00:00:06 Large vnode cache, 500 entries, 1151 allocs, 8526519 gets (704546 reads), 157020 writes
        00:00:06 Small vnode cache,500 entries, 42192 allocs, 8068229 gets (7393303 reads), 364357 writes
        00:00:06 Volume header cache, 50 entries, 17506430 gets, 8477 replacements
        00:00:06 Partition /vicepa: 4787924K available (minfree=5%), 120656K free.
        00:00:06 Partition /vicepb: 4787924K available (minfree=5%), 1366368K free.
        00:00:06 Partition /vicepc: 8480296K available (minfree=5%), 5731480K free.
        00:00:06 Partition /vicepd: 4787924K available (minfree=5%), 4787904K free.
        Dirstats: get 0, put 0, flush 0
        00:00:06 RPC Total bytes:     sent = 1036058497, received = 3325407192
        00:00:06        bytes sent: rpc = 1375888564, multirpc = 0, sftp = -339830067, sftp multicasted = 0
        00:00:06        bytes received: rpc = 1017151112, multirpc = 0, sftp = -1986711216, sftp multicasted = 0
        00:00:06 RPC Total packets:   sent = 24713300, received = 16521269
        00:00:06        packets sent: rpc = 8335056, multirpc = 0, sftp = 16378244, sftp multicasted = 0
        00:00:06        packets received: rpc = 8286592, multirpc = 0, sftp = 8234677, sftp multicasted = 0
        00:00:06 RPC Packets retried = 385, Invalid packets received = 54471, Busies sent = 49039
        00:00:06 RPC Requests 7918404, Good Requests 7864348, Replies 367723, Busies 50
        00:00:06 RPC Counters: CCount 487; Unbinds 17705; FConns 17705; AConns 18192; GCConns 0
        00:00:06 RPC Creation counts: Conn 512; SL 100; PB Small 105, Med 48, Large 15; SS 3
        00:00:06 RPC2 In Use: Conn 487; SS 3
        00:00:06 RPC2 PB: InUse 13, Hold 10, Freeze 0, SFree 82, MFree 48, LFree 15
        00:00:06 RPC2 HW:  Freeze 0, Hold 18
        00:00:06 SFTP:  datas 15596644, datar 5977489, acks 758698, ackr 2256821, retries 1543, duplicates 2
        00:00:06 SFTP:  timeouts 614, windowfulls 161, bogus 367, didpiggy 177274
        00:00:06 Total CB entries= 565829, blocks = 595; and total file entries = 274099, blocks = 1269
        00:00:06 There are currently 421 connections in use
        00:00:06 There are 26 workstations and 26 are active (req in < 15 mins)
        00:00:06 Ether Total bytes: sent = 0, received = 0
        00:00:06 Ether Packets:     sent = 0, received = 0, errors = 0
        00:00:06 Printing RVM statistics
        
        Transaction statistics               current cumulative
        
          Started:                                 1   24768835
          Aborted:                                 0         38
          Committed, flush:                        1   21485446
          Committed, no_flush:                     0    3283351
          Total committed:                         1   24768797
          Committed, but not flushed:              0
          Unflushed transactions length:           0
          Not committed:                           0
          Split by log wrap:                       0       1064
          Delayed by truncation:                   0          0
        
        Log function statistics              current cumulative
        
          rvm_flush calls:                         0        107
          Internal flushes, incl. commit:          1   21481499
          Total flushes:                           1   21481606
          Average flush time (msec):               0         11
          Last flush time (msec):                 21
        
          rvm_truncate calls:                                 2
          Automatic truncations:                          25965
          Forced synch. truncations:                          0
          Recovery truncations:                              99
          Total truncations:                              26066
          Average truncation time (sec):                      0
          Last truncation time (sec):                         1
          Last tree build time (sec):                         0
          Last tree apply time (sec):                         0
        
          Transaction records:                     1   24769861
          Bookeeping records:                      1      26067
          Log wrap-arounds:                        0       1275
          Total records:                           2   24797203
        
          Log used:                                5%        14%
          Log written (bytes):                  1624 33447890944.000
        
        Timing Histograms
        
          Current Flush Timings (msec):
              <=25    <=50   <=100   <=250   <=500  <=1000  <=2500  <=5000 <=10000 > 10000
                 1       0       1       0       0       0       0       0       0       0
        
          Cummulative Flush Timings (msec):
              <=25    <=50   <=100   <=250   <=500  <=1000  <=2500  <=5000 <=10000 > 10000
           20726504  591918  137221   19237    4907    1070     321     256      83       4
        
        
          Truncation Timings for Tree Build (sec):
             <=1  <=10 <=100 <=500 > 500
           26058    31     0     0     0
        
          Truncation Timings for Tree Apply (sec):
             <=1  <=10 <=100 <=500 > 500
           24788  1266    34     1     0
        
          Total Truncation Timings (sec):
             <=1  <=10 <=100 <=500 > 500
           24000  2049    39     1     0
        
        
        Transaction Optimization Statistics          current cumulative
        
          Ranges eliminated
           Range coalesce:                                 0          0
           Trans coalesce:                                 0          0
            Totals:                                        0          0
          Avg. number eliminated per transaction
           Range coalesce:                                 0          0
           Trans coalesce:                                 0          0
            Totals:                                        0          0
          Range length eliminated
           Range coalesce:                                 0          0
           Trans coalesce:                                 0          0
            Totals:                                        0          0
          Log savings
           Range coalesce:                                 0          0
           Trans coalesce:                                 0          0
            Totals:                                        0          0
          Transactions coalesced per flush:                0          0
        
        
        Tranasction Modification Range Distributions
        
          Current Range Lengths (bytes):
                   <=0         <=4         <=8        <=16        <=32        <=64       <=128
                     0           0           0           0           0           0           0
        
                 <=256       <=512      <=1024      <=2048      <=4096      > 4096
                     0           0           2           0           0           0
        
          Cumulative Range Lengths (bytes):
                   <=0         <=4         <=8        <=16        <=32        <=64       <=128
                     1    33434461    28330072    15163871     2512492    11399700    15000723
        
                 <=256       <=512      <=1024      <=2048      <=4096      > 4096
                183373     5584553     9651710     5524435         152        5420
        
        
          Current Range Lengths Eliminated (bytes):
           Range coalesce:
                   <=0         <=4         <=8        <=16        <=32        <=64       <=128
                     2           0           0           0           0           0           0
        
                 <=256       <=512      <=1024      <=2048      <=4096      > 4096
                     0           0           0           0           0           0
           Trans coalesce:
                   <=0         <=4         <=8        <=16        <=32        <=64       <=128
                     2           0           0           0           0           0           0
        
                 <=256       <=512      <=1024      <=2048      <=4096      > 4096
                     0           0           0           0           0           0
            Totals:
                   <=0         <=4         <=8        <=16        <=32        <=64       <=128
                     4           0           0           0           0           0           0
        
                 <=256       <=512      <=1024      <=2048      <=4096      > 4096
                     0           0           0           0           0           0
        
          Cumulative Range Lengths Eliminated (bytes):
           Range coalesce:
                   <=0         <=4         <=8        <=16        <=32        <=64       <=128
              24764766           0           0           0           0           0           0
        
                 <=256       <=512      <=1024      <=2048      <=4096      > 4096
                     0           0           0           0           0           0
           Trans coalesce:
                   <=0         <=4         <=8        <=16        <=32        <=64       <=128
              24764766           0           0           0           0           0           0
        
                 <=256       <=512      <=1024      <=2048      <=4096      > 4096
                     0           0           0           0           0           0
            Totals:
                   <=0         <=4         <=8        <=16        <=32        <=64       <=128
              49529532           0           0           0           0           0           0
        
                 <=256       <=512      <=1024      <=2048      <=4096      > 4096
                     0           0           0           0           0           0
        
          Current Number of Eliminated Ranges per Transaction
           Range coalesce:
                   <=0         <=5        <=10        <=50       <=100       > 100
                     2           0           0           0           0           0
           Trans coalesce:
                   <=0         <=5        <=10        <=50       <=100       > 100
                     2           0           0           0           0           0
            Totals:
                   <=0         <=5        <=10        <=50       <=100       > 100
                     4           0           0           0           0           0
        
          Cummulative Number of Eliminated Ranges per Transaction
           Range coalesce:
                   <=0         <=5        <=10        <=50       <=100       > 100
              24764766           0           0           0           0           0
           Trans coalesce:
                   <=0         <=5        <=10        <=50       <=100       > 100
              24764766           0           0           0           0           0
            Totals:
                   <=0         <=5        <=10        <=50       <=100       > 100
              49529532           0           0           0           0           0
        
          Cummulative Number of Transactions Coalesced per flush Cycle
                   <=0         <=5        <=10        <=50       <=100       > 100
              24764768           0           0           0           0           0
        
        00:00:06 Printing RDS statistics
        
        Number of
         Free bytes:     d9d21a0
         Alloced bytes:  562db00
         Mallocs:        4196937
         Frees:          3883372
         Preallocs:      0
         Hits:           3731507
         Misses:         459851
         Large Hits:     2
         Large Misses:   5577
         Coalesces:      675852
         Merges          142373
         Not Merged:     1672900565
         Times the Large List pointer has changed: 0
        00:00:06 done

[...]
        10:15:45 ****** WARNING entry at 0x81279b8 already has deqing set!
        
        10:15:46 CheckRemoveSemantics: 1000004.5.3, VCP error (198)
        10:15:46 Entering VFlushVnode for vnode 0x5
        10:15:46 Entering VFlushVnode for vnode 0xe0de
        10:15:46 CheckCreateSemantics: (1000004.5.3), VCP error (198)
        10:15:46 Entering VFlushVnode for vnode 0x5
        10:15:46 Entering VFlushVnode for vnode 0xe0a2
        10:15:46 CheckRemoveSemantics: 1000004.5.3, VCP error (198)
        10:15:46 Entering VFlushVnode for vnode 0x5
        10:15:46 Entering VFlushVnode for vnode 0xe2d2
        10:15:46 CheckCreateSemantics: (1000004.5.3), VCP error (198)
        10:15:46 Entering VFlushVnode for vnode 0x5
        10:15:46 Entering VFlushVnode for vnode 0xe254
        10:15:47 CheckCreateSemantics: (1000004.5.3), VCP error (198)
        10:15:47 Entering VFlushVnode for vnode 0x5
        10:15:47 Entering VFlushVnode for vnode 0xe388
        10:15:48 CheckCreateSemantics: (1000004.5.3), VCP error (198)
        10:15:48 Entering VFlushVnode for vnode 0x5
        10:15:48 Entering VFlushVnode for vnode 0xe256
        10:16:00 SmonDaemon timer expired
        10:16:00 Entered CheckRVMResStat
        10:16:00 Starting SmonDaemon timer
        10:16:00 VGetVnode: vnode 1000004.dda8 is not allocated
        10:16:00 VGetVnode: vnode 1000004.ddee is not allocated
        10:16:00 VGetVnode: vnode 1000004.dfe2 is not allocated
        10:16:00 VGetVnode: vnode 1000004.e0a0 is not allocated
        10:16:00 VGetVnode: vnode 1000004.e0de is not allocated
        10:16:00 VGetVnode: vnode 1000004.e11a is not allocated
        10:16:00 VGetVnode: vnode 1000004.e252 is not allocated
        10:16:00 VGetVnode: vnode 1000004.e2d2 is not allocated
        10:16:00 VGetVnode: vnode 1000004.e386 is not allocated
        10:30:46 CheckRemoveSemantics: 1000004.5.3, VCP error (198)
        10:30:46 Entering VFlushVnode for vnode 0x5
        10:30:46 Entering VFlushVnode for vnode 0xe20e
        10:30:46 CheckCreateSemantics: (1000004.5.3), VCP error (198)
        10:30:46 Entering VFlushVnode for vnode 0x5
        10:30:46 Entering VFlushVnode for vnode 0xdeb2
        10:35:28 VGetVnode: vnode 1000004.deb0 is not allocated
        10:35:28 VGetVnode: vnode 1000004.df22 is not allocated
        10:35:28 VGetVnode: vnode 1000004.e20e is not allocated
        10:35:28 VGetVnode: vnode 1000004.e30a is not allocated
        10:35:29 VGetVnode: vnode 1000004.e20e is not allocated
        10:45:45 CheckCreateSemantics: (1000004.5.3), VCP error (198)
        10:45:45 Entering VFlushVnode for vnode 0x5
        10:45:45 Entering VFlushVnode for vnode 0xe350
        10:45:45 CheckRemoveSemantics: 1000004.5.3, VCP error (198)
        10:45:45 Entering VFlushVnode for vnode 0x5
        10:45:45 Entering VFlushVnode for vnode 0xe024
        10:45:46 CheckRemoveSemantics: 1000004.5.3, VCP error (198)
        10:45:46 Entering VFlushVnode for vnode 0x5
        10:45:46 Entering VFlushVnode for vnode 0xe3be
        10:45:46 CheckRemoveSemantics: 1000004.5.3, VCP error (198)
        10:45:46 Entering VFlushVnode for vnode 0x5
        10:45:46 Entering VFlushVnode for vnode 0xe160
        10:45:46 CheckRemoveSemantics: 1000004.5.3, VCP error (198)
        10:45:46 Entering VFlushVnode for vnode 0x5
        10:45:46 Entering VFlushVnode for vnode 0xdfa6
        10:45:47 CheckRemoveSemantics: 1000004.5.3, VCP error (198)
        10:45:47 Entering VFlushVnode for vnode 0x5
        10:45:47 Entering VFlushVnode for vnode 0xe024
        10:45:47 CheckCreateSemantics: (1000004.5.3), VCP error (198)
        10:45:47 Entering VFlushVnode for vnode 0x5
        10:45:47 Entering VFlushVnode for vnode 0xdeea
        10:45:47 CheckCreateSemantics: (1000004.5.3), VCP error (198)
        10:45:47 Entering VFlushVnode for vnode 0x5
        10:45:47 Entering VFlushVnode for vnode 0xdf68
        10:45:47 CheckRemoveSemantics: 1000004.5.3, VCP error (198)
        10:45:47 Entering VFlushVnode for vnode 0x5
        10:45:47 Entering VFlushVnode for vnode 0xdfa6
        10:45:48 CheckCreateSemantics: (1000004.5.3), VCP error (198)
        10:45:48 Entering VFlushVnode for vnode 0x5
        10:45:48 Entering VFlushVnode for vnode 0xe162
[...]
        
        11:44:22 Callback failed RPC2_DEAD (F) for ws 129.70.138.29:32769
        11:44:22 Unbinding RPC2 connection 1851
        11:44:22 Unbinding RPC2 connection 9080
        11:44:22 Unbinding RPC2 connection 2703
        11:44:22 Unbinding RPC2 connection 565
        11:44:22 Unbinding RPC2 connection 14447
        
[...]
        11:55:14 VShutdown:  shutting down on-line volumes...
        11:55:14 VShutdown: Taking volume coda.root.0(0x1000001) offline...
        11:55:14 ... Done
        11:55:14 VShutdown: Taking volume coda.homes.sneumann.0(0x1000002) offline...
        11:55:14 ... Done
        11:55:14 VShutdown: Taking volume coda.homes.floemker.0(0x1000003) offline...
        11:55:14 ... Done
        11:55:14 VShutdown: Taking volume coda.vol.ai.0(0x1000004) offline...
        11:56:25 Callback failed RPC2_NAKED (F) for ws 129.70.139.62:2430
        11:56:25 Unbinding RPC2 connection 11526
[...]
        11:56:25 Unbinding RPC2 connection 4950
        
        
        
        
        
        
        
        
        
        
        
        
        
        
        
        
        
        
        
        
        
        
        
Received on 2003-09-01 06:26:08