On Thu, Mar 08, 2007 at 07:33:22PM +0100, Enrico Weigelt wrote:
> hmm, right :(
>
> But we could transmit just the differences / changed fields (since last
> transmit). We add a bitmask field to the stat block which tells us which
> fields are actually filled. Empty fields are simply left out.
>
> The new stat request also contains a bitmask which tells the server which
> fields we're actually interested in. With some additional heuristics the
> server could send some more fields than the client requested.

I guess we strayed a bit from the original problem. The slowness was caused by high latency, not necessarily a lack of bandwidth. The only way to overcome the latency is to send more data than the client asks for, such as piggybacking all attributes along with a request for directory data. The drawbacks are that we may be sending information the client already has (wasting bandwidth), or, if we assume limited cache space, we may be sending information that is less useful than the data the client has to discard to make room for it.

Reducing the amount we send or store does not matter much, since the idea was to send more in order to hide the latency. So we would ideally want the client to store an (almost) infinite amount of information so that it never has to discard already cached data. But if we can cache everything without having to discard, we would either be resending data that is already cached, or, if it wasn't cached, it must have changed.

At this point we get to what Intermezzo tried to do: instead of sending callbacks to invalidate cached objects, forcing the client to refetch the ones it is interested in, the server would send a log of all recent changes. This approach works, but probably doesn't scale the same way; at some point the total amount of change in the system is limited by the bandwidth available to propagate those changes to all clients.

Coda's model assumes that we are not that interested in everything that happens on the server. So we are notified in case something may have become invalid (callback) and fetch only those updates we know we are interested in, based on the user's hoard profile or as a result of application requests. In a way everything is a tradeoff. And because application requests will always be serialized, we get penalized on high latency connections.

A hoard profile could potentially avoid the latency cost since it tells the client the complete set of files we are interested in. We're not really exploiting that information though, partly because hoarding is an asynchronous background process: nobody is really waiting for it to complete (most of the time), and by using only a single hoard walking thread we interfere less with any foreground (user) activity. The user may accept 50% of his bandwidth being used by the background hoard fetches, but would probably not appreciate it if his web browser becomes unusable every 10 minutes.
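Just to make the piggybacking idea concrete: on a link with a 100ms round trip time, getting the attributes of 200 directory entries one getattr at a time costs about 20 seconds, while a single reply that carries all 200 attribute blocks costs one round trip plus the extra transfer time. A rough sketch of what such a reply could look like (the names and layout here are made up for illustration, not what Coda/RPC2 actually sends):

    #include <stdint.h>

    /* Sketch: a directory-fetch reply that piggybacks the attributes and
     * version of every entry it lists, so the client pays one round trip
     * instead of one getattr RPC per entry. */
    struct piggy_attrs {
        uint32_t vnode;        /* object identifier within the volume */
        uint32_t uniquifier;
        uint32_t version;      /* lets the client drop entries it already caches */
        uint32_t mode;
        uint32_t owner;
        uint64_t length;
        uint64_t mtime;
    };

    struct piggy_dirent {
        char               name[256];
        struct piggy_attrs attrs;
    };

    struct dir_fetch_reply {
        uint32_t            nentries;  /* number of entries that follow */
        struct piggy_dirent entries[]; /* packed after the count */
    };

The per-entry version is what would let the client throw away blocks it already has, which is exactly the wasted-bandwidth drawback mentioned above.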
> > Also Coda clients use the getattr result to detect conflicting versions,
> > while file and directory contents are only fetched from a single server.
> > So if we piggyback file attributes with the directory data the client
> > would not see differences between replicas until we try to revalidate
> > the cached attributes, or fetch the data from another replica and notice
> > the version mismatch.
>
> Well, we could do this with a separate notification. For example, the
> client gives a list of files it has in cache, and the server then sends
> change notifications automatically. In other words: the client subscribes
> to certain filesystem objects.

That is already done in the form of callbacks. When we first fetch an object, the server remembers this and will send a callback notification whenever it changes. If the client got disconnected, it will first send the local object identifiers and versions with ValidateAttrs to check whether any of them changed during the disconnection, and reestablish callbacks for the unchanged objects.

What I was trying to describe is the replicated server case. When one server knows more (or less) than another and we only get information from that one server, we never get to see that there is a difference. Also, only the server we talked to will inform us about updates. So if some other client has a poor network connection and only updates the other server, we never get told about it. That is, until we disconnect/reconnect and start checking all of our cached objects with ValidateAttrs.

> BTW: TCP would be good for people sitting behind a firewall,
> stream encapsulation (i.e. ssh/ssl), etc.

Firewalls are often there for a reason. I can see a case for TCP as a means to offload data transfers to the kernel, and because people building network routers try to avoid breaking TCP connections. But for RPC operations, which are always request-response, UDP really isn't that bad; DNS uses it all the time. Our only difference is that we try to associate state with the host/port of an incoming UDP packet, but masquerading may change those. There are other solutions: a per-client random key/identifier, a polling 'has anything changed' query, or leases. None of those are insignificant changes.

We also don't need to tunnel Coda traffic through an ssh or ssl tunnel; the existing encryption code should do a pretty fine job already. RPC2's encryption follows the various IPsec RFCs as closely as possible. Some differences are that we operate at the UDP level instead of IP, and our session keys are not between hosts but between RPC2 endpoints. One significant difference is that we use the modified Andrew RPC handshake for key establishment instead of punting that to either a separate daemon or a static key. This handshake is, as far as I know, assumed to be secure; the original Andrew RPC protocol was analyzed by Burrows, Abadi and Needham, who identified a weakness. They suggest an alternate protocol that provides stronger guarantees, which is the 'modified Andrew RPC' protocol that we use.

A Logic of Authentication (1990)
http://citeseer.ist.psu.edu/burrows90logic.html
The Andrew RPC protocol analysis starts at page 26 (28 of the pdf)

Jan