On Sun, Jul 25, 2004 at 02:14:54PM -0400, shivers_at_cc.gatech.edu wrote:
> From: Ivan Popov <pin_at_medic.chalmers.se>
>
> (yet I do not seem to find how big your client caches were)
>
> Pretty big, varying from 100Mb to 10Gb.

I use a 200MB cache, which works pretty well. It translates to about 8000 locally cached objects. But without tweaks, a 10GB Coda client will try to cache up to 420000 files. This in itself shouldn't be a problem, except for the fact that there are a couple of places where every object is compared to every other object. So with 8K objects there are about 64 million comparisons, while with 420K objects there are more than 176 billion.

If each of these comparisons takes a tenth of a microsecond (pulling a random number out of the air here), my venus could run through this loop in about 6 seconds. But your client will need about 17640 seconds (almost 5 hours) for the same comparison loop.

In addition, with 8000 objects I use about 21MB of RVM on the client, but when I extrapolate to 420000 objects there would be 1.1GB of RVM needed. And the any-to-any comparison will thrash through that memory; if there isn't enough RAM, the machine will probably end up swapping itself to death. Even what used to be a simple linear lookup, a find 'foo', will become a very expensive, swap-heavy exercise.

Now the RPC2 layer gives us about 15 seconds for a reply on either the client or the server side before we give up and disconnect. Which client do you think is more likely to have unexpected connectivity problems?

We have been adding special yield operations in such loops to give the rpc2 layer at least a chance to deal with incoming requests and send a busy reply. But these yields don't take time lost due to swapping into account, and since I've never run a client with a 10GB cache, there might be several other places where additional yields are necessary. At some point the original algorithms need to be rethought.
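The back-of-envelope numbers above can be checked with a few lines (Python here just for the arithmetic; the 0.1 microsecond per comparison is the same guessed figure as in the message):

```python
def pass_seconds(n_objects, per_cmp_us=0.1):
    """Seconds for one any-to-any pass: n^2 comparisons at an
    assumed per_cmp_us microseconds each (a made-up figure)."""
    return n_objects ** 2 * per_cmp_us / 1e6

small = pass_seconds(8_000)    # 200MB cache, ~8000 objects  -> 6.4 s
large = pass_seconds(420_000)  # 10GB cache, ~420000 objects -> 17640 s (~4.9 h)
```

That 17640 seconds is the figure quoted in the message, and it dwarfs the ~15 second RPC2 reply window.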
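The yield trick described above can be sketched roughly as follows. This is an illustrative Python sketch, not Coda's actual C/C++ code; `yield_fn` stands in for whatever hook lets the RPC2 layer service incoming requests and send a busy reply before the peer's timeout expires:

```python
def compare_all_pairs(objects, yield_every=1000, yield_fn=None):
    """O(n^2) any-to-any comparison with periodic yields.

    Every `yield_every` iterations, call yield_fn() so that other
    (cooperatively scheduled) threads, such as the RPC2 listener,
    get a chance to run. All names here are illustrative.
    """
    matches = 0
    iterations = 0
    n = len(objects)
    for i in range(n):
        for j in range(n):
            if objects[i] == objects[j]:
                matches += 1
            iterations += 1
            if yield_fn is not None and iterations % yield_every == 0:
                yield_fn()  # let the RPC2 layer answer pending requests
    return matches
```

Note that the yield only bounds the time between scheduling points; as the message says, it does nothing about time lost to swapping, and it does not change the O(n^2) total cost.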
A large cache simply cannot work with O(n^2) operations, and even O(n) should be avoided as much as possible.

> I note that it has been > 10 years. And it appears to be, in some sense,
> a deep part of coda's design philosophy.

Not really part of the design philosophy. It is just a result of the implementation, which did of course start over 10 years ago, probably around the same time that it was great if your computer had a 20MB harddrive. We've been working on fixing such implementation problems within the existing framework, which can be quite difficult at times. Intermezzo was a 'start from scratch' attempt, but I don't think it really took off.

Jan

Received on 2004-07-30 12:54:24
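The usual way to get a lookup like find 'foo' from O(n) down to O(1) is an index keyed by name. A minimal sketch (again in Python, with hypothetical names; this is not Coda's actual object table):

```python
class FsObjectTable:
    """Cache-object table with a name index, so a lookup no longer
    has to scan every cached object."""

    def __init__(self):
        self._by_name = {}  # name -> object; O(1) average-case lookup

    def insert(self, name, obj):
        self._by_name[name] = obj

    def find(self, name):
        # Replaces a linear walk over all cached objects.
        return self._by_name.get(name)
```

With such an index, lookup cost stays flat as the cache grows from 8000 to 420000 objects, instead of scaling with the object count.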