Coda File System

Coda spinning

From: <shivers_at_ccs.neu.edu>
Date: Tue, 15 May 2007 08:44:19 -0400 (EDT)
   From: Jan Harkes <jaharkes_at_cs.cmu.edu>

   > Yeah, the venus-at-100%-of-cpu thing is pretty common right after
   > I get back on the net; it usually lasts for about 10-15 minutes.
   > During this time, by the way, codacon is pretty calm -- it's not
   > blasting out "validate" messages or anything.

   Interesting, in that case the 100% cpu usage probably doesn't have
   anything to do with reintegration. I guess it is the demotion of all
   cached objects as a result of the server/volume state change(s).

   I guess that code path may be missing a yield in the outer loop. This
   wouldn't fix the CPU usage, but make the system a little more responsive
   again. A better fix may be to use some sort of an epoch/event counter
   when the volume state changes and use that to detect which objects need
   to be revalidated. Not sure if such a solution would merge well into the
   existing revalidation mechanism.
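
To make the epoch/event counter idea above concrete, here is a minimal sketch. It is not venus code: `Volume`, `CachedObject`, and every method name are hypothetical, chosen only for illustration. The point is that a state change becomes a single counter increment, and each cached object is recognized as stale lazily, when it is next looked at, instead of being demoted in one long loop.

```python
from dataclasses import dataclass


@dataclass
class Volume:
    """Hypothetical stand-in for a volume; not an actual venus class."""
    epoch: int = 0  # bumped on every server/volume state change

    def state_changed(self) -> None:
        # O(1): no walk over the cached objects is needed here.
        self.epoch += 1


@dataclass
class CachedObject:
    """Hypothetical stand-in for a cached file-system object."""
    volume: Volume
    validated_epoch: int = -1  # volume epoch at the last successful validation

    def needs_revalidation(self) -> bool:
        # Stale if the volume has changed state since we last validated.
        return self.validated_epoch != self.volume.epoch

    def mark_validated(self) -> None:
        self.validated_epoch = self.volume.epoch


# Usage: validate three objects, then simulate one volume state change.
vol = Volume()
objs = [CachedObject(vol) for _ in range(3)]
for obj in objs:
    obj.mark_validated()
fresh_before = [obj.needs_revalidation() for obj in objs]

vol.state_changed()
stale_after = [obj.needs_revalidation() for obj in objs]
```

Whether a check like this would fit into the existing revalidation mechanism is exactly the open question raised above; the sketch only shows the bookkeeping.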

It is spinning right now -- it is morning and I opened up my laptop after
a night of suspension. Here's codacon's current output:

    Probe ( 23:39:29 )
    BackProbe lambda.csail.mit.edu ( 23:39:29 )
    Probe ( 23:42:02 )
    BackProbe lambda.csail.mit.edu ( 23:42:02 )
    Probe ( 08:20:30 )
    BeginStatusWalk [27693] ( 08:20:30 )
       [28366, 0, 0, 0] [28365] ( 08:20:30 )
    EndStatusWalk [27693] ( 08:20:30 )
       [28366, 0, 0, 0] [28365, 0, 0] [1, 0, 0.1] ( 08:20:30 )
    BeginDataWalk [2585437] ( 08:20:30 )
    EndDataWalk [2585437] ( 08:20:30 )
       [1, 0, 0.1] [0, 0, 0, 0] ( 08:20:30 )
    unreachable lambda.csail.mit.edu ( 08:21:56 )
    NewConnectFS lambda.csail.mit.edu ( 08:23:02 )
    NewConnection lambda.csail.mit.edu ( 08:23:02 )
    up lambda.csail.mit.edu ( 08:23:02 )
    BackProbe lambda.csail.mit.edu ( 08:23:02 )
    Probe ( 08:23:03 )
    BackProbe lambda.csail.mit.edu ( 08:23:03 )
    bandwidth lambda.csail.mit.edu 31747 54558 77370 ( 08:23:03 )
    NewConnectFS lambda.csail.mit.edu ( 08:23:08 )
    BackProbe lambda.csail.mit.edu ( 08:23:08 )
    ValidateVols / [1] ( 08:23:08 )
    Probe ( 08:25:41 )
    BackProbe lambda.csail.mit.edu ( 08:25:41 )
    Probe ( 08:28:15 )
    BackProbe lambda.csail.mit.edu ( 08:28:15 )
    Probe ( 08:30:48 )
    BackProbe lambda.csail.mit.edu ( 08:30:48 )
    Probe ( 08:33:21 )
    BackProbe lambda.csail.mit.edu ( 08:33:21 )
    BeginStatusWalk [27693] ( 08:35:28 )
       [0, 28366, 0, 0] [28365] ( 08:35:28 )
    Probe ( 08:39:52 )
    BackProbe lambda.csail.mit.edu ( 08:39:52 )
    Probe ( 08:42:22 )
    BackProbe lambda.csail.mit.edu ( 08:42:22 )

Ahh... it just stopped spinning, and codacon simultaneously output:
    EndStatusWalk [27693] ( 08:43:23 )
       [28366, 0, 0, 0] [28365, 28369, 28369] [1, 28370, 475.4] ( 08:43:23 )
    BeginDataWalk [2585437] ( 08:43:23 )
    EndDataWalk [2585437] ( 08:43:23 )
       [1, 0, 0.0] [0, 0, 0, 0] ( 08:43:23 )
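
A quick cross-check of the timestamps in the log: the status walk began at 08:35:28 and ended at 08:43:23. The small script below computes that interval. The helper name `elapsed_seconds` is made up for this example, and reading the `475.4` in the EndStatusWalk summary as elapsed seconds is only an assumption based on how closely the two numbers agree.

```python
from datetime import datetime


def elapsed_seconds(start: str, end: str) -> float:
    """Seconds between two same-day HH:MM:SS timestamps, as codacon prints them."""
    fmt = "%H:%M:%S"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return delta.total_seconds()


# Timestamps copied from the BeginStatusWalk / EndStatusWalk lines in the log.
walk_seconds = elapsed_seconds("08:35:28", "08:43:23")
print(walk_seconds)  # 475.0
```

If that assumption holds, nearly the entire status walk of roughly eight minutes was spent with venus spinning.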

Does that help?
    -Olin
Received on 2007-05-15 08:47:10