On Fri, Jan 31, 2003 at 10:36:29AM +0100, Vojtech Moravek wrote:
> thanks for the reply, but I still think the problem is somewhere in the
> kernel. To eliminate network problems and other bugs, I describe the
> configuration of the problematic computer.

It could be a kernel problem, but there are many more likely candidates.
For example, rpc2 is still used when a client and a server are on the same
machine, and rpc2 then tries to estimate the RTT of the loopback device.
On your single-CPU machine several applications compete for the CPU, so
the RTTs might still be 'reasonable'. On the SMP machine, however, they
can run in parallel, the RTT might approach 0, and that can cause an
overflow somewhere (a sketch of the kind of arithmetic that can go wrong
is at the end of this message). This is exactly what we saw when we
switched from a shared 10Base-T network to a relatively empty switched
100Base-T network.

> If you have more ideas.. it would be good to hear them :)

Well, you could turn up the debug level with 'vutil -d 100' on the
problematic machine before doing the 'dd', and run 'vutil -d 0' afterwards
to switch it back down to 'sane' levels. Then send me the venus.log file
in a private email. That should show how much time we actually spend in
venus handling the store request. If that is more than 10 seconds, it is
probably not really kernel related.

Here is what I get from the vmstat output:

> -------------2.4.18-smp-14-------
> [root@sklad1 root]# vmstat 1
>    procs                      memory    swap        io     system          cpu
>  r  b  w  swpd     free  buff  cache  si  so  bi    bo   in    cs  us  sy   id
>  0  0  0     0  1641284  7408 204052   0   0  60     7  131    12   0   0   99
>  0  0  0     0  1641284  7408 204052   0   0   0     0  517     6   0   0  100
>  0  0  0     0  1641284  7408 204052   0   0   0     0  517     8   0   0  100
>  0  0  0     0  1641284  7416 204052   0   0   0    89  526    21   0   0  100
>  0  1  0     0  1640828  7424 203032   0   0   0  1075  541   136   0   1   99

Note the 1075 blocks under 'bo': we quickly write 1 MB to disk, which is
probably the write to the container file. 'dd' then closes the file, which
triggers an upcall.

>  0  1  0     0  1640828  7424 203036   0   0   0     0  520    30   0   0  100
>  0  1  0     0  1640828  7424 203056   0   0   0     0  519    90   0   0  100
>  0  1  0     0  1640828  7428 203084   0   0   0     0  521   126   0   0  100
>  0  1  0     0  1640828  7448 203108   0   0   0   180  524   110   0   0  100

Small write bursts at 5-second intervals. By the way, your interrupt load
seems to be on the high side; I typically see somewhere between 100 and 200.

...copied only the entries where data is written...

>  0  1  0     0  1640828  7464 203252   0   0   0   248  522   138   0   0  100
>  0  1  0     0  1640828  7480 203388   0   0   0   216  523   114   0   0  100
>  0  1  0     0  1640828  7480 203420   0   0   0    33  530   152   0   0  100
>  0  1  0     0  1640828  7496 203540   0   0   0   152  526   120   0   0  100
>  0  1  0     0  1640828  7512 203676   0   0   0   176  527   150   0   0  100
>  0  1  0     0  1640828  7528 203812   0   0   0   184  532   110   0   0  100
>  0  1  0     0  1640828  7544 203972   0   0   0   192  524   107   0   0  100
>  0  1  0     0  1640828  7560 204040   0   0   0   156  530    46   0   0  100
>  0  1  0     0  1640824  7560 204048   0   0   0     0  519    26   0   0  100
>  0  0  0     0  1640828  7560 204052   0   0   0    53  530    36   0   0  100

And we're done, after writing another 1.3 MB to disk.

> -------------2.4.18-14-------
>  0  2  0     0  1645608  5972 203496   0   0   0  1032  528    38   0   1   99

Again, 'dd' writes 1 MB and then calls close, which triggers venus to
perform a store.

>  0  0  0     0  1645336  6000 204524   0   0   0  1201  553  2106   3   2   95

Whoosh, about 1.17 MB in one go, but look at the number of context
switches. This involved two processes that were actively triggering each
other. The amount of data is close enough that I assume it is the same
data that was written in the previous sample.

Were these runs done with the Coda server on the same machine as the Coda
client?

Jan
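
The overflow scenario above is easy to illustrate. Below is a minimal C
sketch, assuming a hypothetical estimate_bw() helper; it is not taken from
the rpc2 sources. With a millisecond-range RTT the estimate is sane, but
when the RTT measured over the loopback device approaches zero, a 32-bit
intermediate product wraps and the result is garbage.

/*
 * Illustrative sketch only -- hypothetical code, not taken from the rpc2
 * sources.  It shows the failure mode described above: an estimator that
 * divides by the measured round trip time behaves fine with a network RTT
 * in the millisecond range, but goes wrong when the RTT over the loopback
 * device on an otherwise idle SMP machine approaches zero.
 */
#include <stdio.h>
#include <stdint.h>

/* Hypothetical throughput estimate in bytes per second, derived from one
 * observation of bytes transferred and RTT in microseconds. */
static uint32_t estimate_bw(uint32_t bytes, uint32_t rtt_usec)
{
    if (rtt_usec == 0)
        rtt_usec = 1;          /* avoid an outright division by zero */

    /* The 32-bit intermediate product bytes * 1000000 silently wraps for
     * transfers larger than about 4 KB, so a large transfer over a tiny
     * RTT produces a nonsense estimate. */
    return bytes * 1000000u / rtt_usec;
}

int main(void)
{
    /* Client and server on different hosts: RTT around 2 ms. */
    printf("4 KB over 2000 us -> %10u bytes/s\n", estimate_bw(4096, 2000));

    /* Client and server on the same SMP host: a loopback RTT of a few
     * microseconds exposes the wrapped product. */
    printf("1 MB over    5 us -> %10u bytes/s\n",
           estimate_bw(1024 * 1024, 5));
    return 0;
}

The same pattern applies to any timeout or bandwidth calculation that
divides by, or scales with, a near-zero RTT; the usual fix is to clamp the
measured RTT to a sensible minimum and do the arithmetic in 64 bits.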