On Wed, Jun 06, 2001 at 03:40:06PM +0200, Steffen Neumann wrote:
> (Server's Client has 250MB Cache,
>  Hoard DB is empty) ...
> The venus log is filled with "RecordReplacement..."
> The client is connected all the time,
> (I don't know if the CPU goes to its knees
>  and thus response lags are long enough
>  to be considered a "slow" link)

Actually, with a 250MB cache the priority-queue calculations for cache
replacement can cause frequent bursts of high CPU usage, which stall the
client long enough that it stops responding to the server's backprobe
RPC messages. It definitely looks like there is something quadratic in
there, i.e.

    100 MB cache = +/-  4000 files =  16 million ops =  1 sec = unnoticeable
    250 MB cache = +/- 10000 files = 100 million ops = 10 sec = very noticeable

So I wouldn't rule out that the client is 'temporarily disconnected'
when these errors occur, but I'll try to reproduce it here.

Another possibility is a kernel module problem. Whenever objects are
thrown out of the cache, Coda informs the kernel. However, with the
Linux 2.2 module the kernel sometimes seemed to ignore these 'purge'
requests and passed stale data to userspace. I just went through several
large parts of our /coda tree, but haven't hit any "No such file or
directory" errors.

> Any ideas ?
> I know, this is stressing coda,
> but that helps debugging it ;-)

Sure does. We already have the basis for something simpler than the
priority queues, using multiple LRU queues (multiple because we still
need to separate 'high priority' hoarded objects from 'low priority'
ones). But I'm still looking for a better solution for the hoard daemon:
its namectxts are bound to all objects, even non-hoarded ones. Even
non-hoarded objects have some implicit 'hoard' priority so that the
hoard daemon can reestablish callbacks for all objects in the cache.

Jan

Received on 2001-06-06 10:30:06