On Wed, Jun 30, 2004 at 12:52:53AM -0400, Chet Murthy wrote:
> (1) I'm untarring the Linux kernel into my coda directory on my
> client, which is connected to the server via DSL.
...
> (3) Basically, at some point, I see in codacon:
>
>     connection::unreachable lambda.csail.mit.edu
>
> (4) I use the timestamp to look in the venus.log, and see:
>
> [ W(24) : 0000 : 00:36:05 ] *** Long Running (Multi)Store: code = -2001, elapsed = 17576.1 ***
>
> where the operation might be Reintegrate, one of the SHA operations,
> etc. But there are lots of operations that seem to hit this timeout.
> And it doesn't seem like this timeout ALWAYS causes a problem. Often,
> even for long periods, such operations will get such return-codes, and
> things will continue swimmingly.

Correct. The message you pulled out of the log simply tells us that we made a ViceStore MultiRPC call which returned after 17.6 seconds; it doesn't really tell us what the return code was. There are operations that easily take long enough to trigger the message: a reintegration, a getattr on a large file when SHA checksums are enabled, or a store of any large file. Most of the time these return successfully, but sometimes they return ETIMEDOUT because the code assumes we lost connectivity to the server.

The problem is in the RPC2 library and depends a bit on the fact that connectivity is checked by the RPC2 part (librpc2), while file transfers are handled by SFTP (libse). Whenever a remote procedure call with a bulk-transfer side-effect is executed, we send the RPC2 request to the server, which then starts up the side-effect transfer; this is handled by a separate thread on the client. At some point the original thread that sent the RPC2 request becomes impatient and retransmits the request. The server notices that it got a duplicate of a request it is still processing and responds with RPC2_BUSY. Now if we don't see this BUSY response in time, we resend the request up to 5 times and then give up. I think there is also a 15-second upper limit: if we haven't had any response in that time, the server is considered unreachable. The BUSY could have been dropped or delayed by overall network congestion, or the client might be too optimistic with its round-trip time estimate and send its 5 retries off way too fast, only adding to the congestion on the network.

The second problem is that each logical RPC2 connection independently handles its own retransmissions. This works fine when there is only a single request at a time, but when several processes hit the system simultaneously with concurrent requests, each possibly carrying a data transfer, the effective bandwidth per connection is much lower than the estimated bandwidth, and as a result the RTT estimate should be considerably higher.

Finally, the whole RTT estimate is a bit fragile during a client->server data transfer. When we're pulling data we get a bunch of data packets, send a small ack, and then receive the next batch; the timestamps in those packets are almost all useful for estimating the available bandwidth. When sending data, however, we end up queueing a lot in the local network buffers, so the timestamps on sent packets don't really say much by the time they are echoed back in the ack. It is also here that we suffer from the asymmetry of the DSL link, which seems better suited to pulling down bulk data (webpages) than to sending it.
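In rough pseudo-C, the retransmission logic described above looks something like the following. This is an illustrative sketch only, not the actual librpc2 code; send_packet(), await_reply(), rto_ms() and the constants are all made up:

    #include <errno.h>

    #define MAX_RETRIES 5        /* retransmissions before giving up */
    #define DEADLINE_MS 15000    /* upper limit with no response at all */

    enum reply { REPLY_NONE, REPLY_BUSY, REPLY_DONE };

    extern void       send_packet(const void *req);
    extern enum reply await_reply(int timeout_ms);
    extern int        rto_ms(void);  /* retry interval from the RTT estimate */

    int rpc2_call(const void *req)
    {
        int retries = 0, waited = 0;

        send_packet(req);
        for (;;) {
            /* An overly optimistic RTT estimate makes this timeout too
             * short, so the retries fire too fast and only add to the
             * congestion on the network. */
            int timeout = rto_ms();

            switch (await_reply(timeout)) {
            case REPLY_DONE:
                return 0;               /* got the real response */
            case REPLY_BUSY:
                /* The server saw our duplicate request while the original
                 * is still being processed: it is alive, so reset our
                 * patience instead of counting a failure. */
                retries = 0;
                waited = 0;
                break;
            case REPLY_NONE:
                waited += timeout;
                if (++retries > MAX_RETRIES || waited >= DEADLINE_MS)
                    return -ETIMEDOUT;  /* server considered unreachable */
                send_packet(req);       /* retransmit; a server still working
                                         * on the request answers RPC2_BUSY */
                break;
            }
        }
    }

The sketch also shows why a dropped or delayed BUSY hurts: it is the only thing that stops the retry counter from running out while the side-effect transfer is still in flight.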
TCP can deal with this for two reasons. First, a TCP connection is typically used either to pull a lot or to push a lot, but rarely alternates between the two; it effectively sees a link with one stable speed (fast when downloading, slow when uploading) and doesn't have to switch behaviour once it stabilizes. Second, TCP is far more relaxed with respect to timeouts: it doesn't matter if it hasn't heard from the other side for a minute or so, the transfer simply stalls for a while. Coda, on the other hand, knows that some application is seemingly 'frozen' for as long as we're stuck in the RPC2 layer, and so it gives up on an uncooperative network much more quickly.
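To make the contrast concrete, here is a hypothetical sketch of the two timeout policies; all names and numbers are invented for illustration and come from neither codebase:

    #include <errno.h>
    #include <stdbool.h>

    /* Invented helper: wait up to timeout_ms, true if the peer spoke up. */
    extern bool peer_answered(int timeout_ms);

    /* TCP-ish policy: no process is blocked synchronously on the
     * transfer, so back off exponentially and tolerate minutes of
     * silence; the transfer simply stalls until the peer returns. */
    int tcp_style_wait(int rto_ms)
    {
        for (int elapsed = 0; elapsed < 120000; elapsed += rto_ms, rto_ms *= 2)
            if (peer_answered(rto_ms))
                return 0;
        return -1;                  /* only now reset the connection */
    }

    /* RPC2-ish policy: an application is 'frozen' inside this call,
     * so give up after ~15 seconds and report the server unreachable. */
    int rpc2_style_wait(int rto_ms)
    {
        for (int elapsed = 0; elapsed < 15000; elapsed += rto_ms)
            if (peer_answered(rto_ms))
                return 0;
        return -ETIMEDOUT;
    }

Jan
Received on 2004-06-30 17:55:56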