On Sun, Jul 03, 2005 at 01:47:51PM +0200, Florian Schaefer wrote:
> The symptom is that venus will become a zombie at some point and cannot
> be gotten to restart other than with a -init. This happens on both
> clients. Therefore I thinks that my volume may be corrupted in some way.

It doesn't look like you have a corrupted volume.

> I'm sorry, that I cannot serve some back-trace information, but the
> binaries don't contain debug information.

Yeah, rpm helpfully strips all binaries during packaging. Does
installing coda-debug-debuginfo-6.0.11-1-i366.rpm help?

> Is there a cure for this illness?

Not sure. This is what I think might be happening.

The crash happens on a lookup of your realm name in the fake root
volume, at a point where the fake root doesn't have any contents.

You have a fairly small cache (50MB), and are writing one or more 26MB
files in write-disconnected mode. During this your client runs out of
available cache space. The 'cache overflow' messages are shown whenever
venus has dropped everything it could.

One of the dropped objects must have been the data for the fake root
volume that is mounted on /coda. This data is generated as realms are
discovered or expired, and we can't actually refetch it from the
servers. I am setting a couple of flags on these objects that I thought
would prevent the data purge, but clearly they are only (successfully)
preventing the fetch attempt.

In any case, we're not crashing yet at this point, because the kernel
caches directory lookups. Then the reintegration completes, and at this
point something must trigger a kernel cache invalidation/refresh. A
couple of seconds later we hit the lookup that kills the client.

I have two options to fix the problem: avoid purging the fake root
directory contents, or automatically recreate them when we are trying
to refetch the data.
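To put rough numbers on the cache pressure described above, here is a
back-of-the-envelope sketch in Python. It is not how venus actually
chooses what to drop; the function name and the larger cache sizes are
made up for illustration, and it simply assumes that storing a new file
requires freeing that many megabytes of cache.

```python
def fraction_to_evict(cache_mb: float, file_mb: float, free_mb: float = 0.0) -> float:
    """Fraction of the total cache that must be freed to hold one new file.

    Hypothetical helper for illustration only; venus' real replacement
    policy is more involved than this simple ratio.
    """
    if cache_mb <= 0:
        raise ValueError("cache size must be positive")
    if file_mb > cache_mb:
        raise ValueError("file does not fit in the cache at all")
    # Space already free does not need to be reclaimed by evicting objects.
    needed_mb = max(file_mb - free_mb, 0.0)
    return needed_mb / cache_mb


# 50 MB is the cache size from the report; 100 and 500 MB are example
# alternatives chosen for comparison.
for cache_mb in (50, 100, 500):
    share = fraction_to_evict(cache_mb, 26)
    print(f"{cache_mb:>4} MB cache: {share:.0%} of the cache must be discarded")
```

With a full 50MB cache, a single 26MB file forces out roughly half of
everything cached, which makes it much more likely that something
important gets dropped along the way.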
What you could do to mitigate some of the problems is increase the size
of your venus cache to at least double the size of the largest object
you might be working with, but preferably more. Something like
100-500MB (100000-500000 blocks) will significantly reduce the fraction
of cached objects that has to be discarded to make room for a new file.
Instead of having to throw out 50% or more of all cached data, you'd
only need to discard about 5%, which gives venus a better chance to
make a good pick.

Jan

Received on 2005-07-03 17:49:57