On Fri, Jan 31, 2003 at 10:36:29AM +0100, Vojtech Moravek wrote:
> thanks for the reply, but I still think the problem is somewhere in the
> kernel. To eliminate network problems and other bugs, I describe the
> configuration of the problematic computer.

It could be a kernel problem, but there are many more likely candidates.
For example, rpc2 is still used when a client and a server are on the same
machine, and rpc2 then tries to estimate the RTT of the loopback device.
On your single-CPU machine several applications compete for the CPU, so
the RTTs might still be 'reasonable'. On the SMP machine, however, they
can run in parallel, the RTT might approach 0, and that can cause an
overflow somewhere (a sketch of the kind of arithmetic that can go wrong
is at the end of this message). This is exactly what we saw when we
switched from a shared 10Base-T network to a relatively empty switched
100Base-T network.

> If you have more ideas.. it would be good to hear them :)

Well, you could turn up the debug level with 'vutil -d 100' on the
problematic machine before doing the 'dd', and run 'vutil -d 0' afterwards
to switch it back down to 'sane' levels. Then send me the venus.log file
in a private email. That should show how much time we actually spend in
venus handling the store request. If that is more than 10 seconds, it is
probably not really kernel related.

Here is what I get from the vmstat output:

> -------------2.4.18-smp-14-------
> [root@sklad1 root]# vmstat 1
>    procs                      memory    swap        io     system          cpu
>  r  b  w  swpd     free  buff  cache  si  so  bi    bo   in    cs  us  sy   id
>  0  0  0     0  1641284  7408 204052   0   0  60     7  131    12   0   0   99
>  0  0  0     0  1641284  7408 204052   0   0   0     0  517     6   0   0  100
>  0  0  0     0  1641284  7408 204052   0   0   0     0  517     8   0   0  100
>  0  0  0     0  1641284  7416 204052   0   0   0    89  526    21   0   0  100
>  0  1  0     0  1640828  7424 203032   0   0   0  1075  541   136   0   1   99

Note the 1075 blocks under 'bo': we quickly write 1 MB to disk, which is
probably the write to the container file. 'dd' then closes the file, which
triggers an upcall.

>  0  1  0     0  1640828  7424 203036   0   0   0     0  520    30   0   0  100
>  0  1  0     0  1640828  7424 203056   0   0   0     0  519    90   0   0  100
>  0  1  0     0  1640828  7428 203084   0   0   0     0  521   126   0   0  100
>  0  1  0     0  1640828  7448 203108   0   0   0   180  524   110   0   0  100

Small write bursts at 5-second intervals. By the way, your interrupt load
seems to be on the high side; I typically see somewhere between 100 and 200.

...copied only the entries where data is written...

>  0  1  0     0  1640828  7464 203252   0   0   0   248  522   138   0   0  100
>  0  1  0     0  1640828  7480 203388   0   0   0   216  523   114   0   0  100
>  0  1  0     0  1640828  7480 203420   0   0   0    33  530   152   0   0  100
>  0  1  0     0  1640828  7496 203540   0   0   0   152  526   120   0   0  100
>  0  1  0     0  1640828  7512 203676   0   0   0   176  527   150   0   0  100
>  0  1  0     0  1640828  7528 203812   0   0   0   184  532   110   0   0  100
>  0  1  0     0  1640828  7544 203972   0   0   0   192  524   107   0   0  100
>  0  1  0     0  1640828  7560 204040   0   0   0   156  530    46   0   0  100
>  0  1  0     0  1640824  7560 204048   0   0   0     0  519    26   0   0  100
>  0  0  0     0  1640828  7560 204052   0   0   0    53  530    36   0   0  100

And we're done, after writing another 1.3 MB to disk.

> -------------2.4.18-14-------
>  0  2  0     0  1645608  5972 203496   0   0   0  1032  528    38   0   1   99

Again, 'dd' writes 1 MB and then calls close, which triggers venus to
perform a store.

>  0  0  0     0  1645336  6000 204524   0   0   0  1201  553  2106   3   2   95

Whoosh, about 1.17 MB in one go, but look at the number of context
switches. This involved two processes that were actively triggering each
other. The amount of data is close enough that I assume it is the same
data that was written in the previous sample.

Were these runs done with the Coda server on the same machine as the Coda
client?

Jan
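
The overflow scenario above is easy to illustrate. Below is a minimal C
sketch, assuming a hypothetical estimate_bw() helper; it is not taken from
the rpc2 sources. With a millisecond-range RTT the estimate is sane, but
when the RTT measured over the loopback device approaches zero, a 32-bit
intermediate product wraps and the result is garbage.

/*
 * Illustrative sketch only -- hypothetical code, not taken from the rpc2
 * sources.  It shows the failure mode described above: an estimator that
 * divides by the measured round trip time behaves fine with a network RTT
 * in the millisecond range, but goes wrong when the RTT over the loopback
 * device on an otherwise idle SMP machine approaches zero.
 */
#include <stdio.h>
#include <stdint.h>

/* Hypothetical throughput estimate in bytes per second, derived from one
 * observation of bytes transferred and RTT in microseconds. */
static uint32_t estimate_bw(uint32_t bytes, uint32_t rtt_usec)
{
    if (rtt_usec == 0)
        rtt_usec = 1;          /* avoid an outright division by zero */

    /* The 32-bit intermediate product bytes * 1000000 silently wraps for
     * transfers larger than about 4 KB, so a large transfer over a tiny
     * RTT produces a nonsense estimate. */
    return bytes * 1000000u / rtt_usec;
}

int main(void)
{
    /* Client and server on different hosts: RTT around 2 ms. */
    printf("4 KB over 2000 us -> %10u bytes/s\n", estimate_bw(4096, 2000));

    /* Client and server on the same SMP host: a loopback RTT of a few
     * microseconds exposes the wrapped product. */
    printf("1 MB over    5 us -> %10u bytes/s\n",
           estimate_bw(1024 * 1024, 5));
    return 0;
}

The same pattern applies to any timeout or bandwidth calculation that
divides by, or scales with, a near-zero RTT; the usual fix is to clamp the
measured RTT to a sensible minimum and do the arithmetic in 64 bits.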