On Thu, Sep 15, 2005 at 12:00:31PM +1200, Jeremy Bowen wrote:
> I've had a few repeats of an issue while testing coda lately.
> I'm getting into a situation where the following message is repeated every
> second:
> eg.
> Red zone, stalling writer ( 11:36:41 )
>
> Having a quick look at the code (coda-src/venus/vproc.cc line 572), this could
> be due to the last case regarding a dirty cache.
>
>     redzone = !free_fsos ||
>               free_mles <= MaxWorkers ||
>               free_blocks <= (CacheBlocks >> 4); /* ~94% cache dirty */
>
> I'm adding and deleting a large number of temporary files in order to
> run some benchmarks on the filesystem so I guess this could cause the
> cache to become exhausted.
>
> Once I get into this situation, it doesn't seem easy to get out of without
> killing the client and server :-(
> Does this sound normal ? Is there anything I could do to mitigate the effects
> of this ?

If you get the 'red zone' messages, you're already past the 'yellow zone'
ones, where every write was delayed for a couple of seconds.

The problem seems to be that your client is not pushing updates back to the
server and is running out of local storage space to store the dirty state.
The slowing down/stalling is done in the hope that this will allow the
background reintegration thread to push some of the dirty state back to the
servers.

However, in some cases the reintegration will not proceed. This happens when
the servers are unavailable, or when the servers have detected some conflict
and are blocking reintegration.

Without the yellow/red zone logic, which slows down or even stalls further
local mutations, your Coda client would have crashed. But really it is only a
symptom of the real problem, which is that reintegration is not making
progress fast enough to get rid of the local dirty state.

Of course you could trigger the problem if the cache is relatively small
compared to the files you are creating.
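To make the arithmetic in the quoted check concrete, here is a minimal sketch of the red zone condition as a standalone function. It is not the venus code: the CacheState struct, the function names and the sample numbers are made up for illustration, and the meaning given to each counter in the comments is my reading of the identifier names. The worked example also assumes cache blocks of 1 KB, so a 500MB cache is 512000 blocks.

```cpp
#include <cassert>

// Hypothetical stand-in for the counters the quoted expression reads.
// The field meanings are assumptions based on the identifier names.
struct CacheState {
    long free_fsos;     // free file system object slots
    long free_mles;     // free modify log entries
    long free_blocks;   // free cache blocks
    long cache_blocks;  // total cache blocks (CacheBlocks)
    long max_workers;   // MaxWorkers
};

// CacheBlocks >> 4 is 1/16 of the cache, i.e. 6.25% free,
// which is where the "~94% cache dirty" comment comes from.
constexpr long red_zone_block_threshold(long cache_blocks) {
    return cache_blocks >> 4;
}

// Same three conditions as the quoted expression: any one of them
// being true is enough to stall writers.
constexpr bool in_red_zone(CacheState s) {
    return !s.free_fsos ||
           s.free_mles <= s.max_workers ||
           s.free_blocks <= red_zone_block_threshold(s.cache_blocks);
}

// Worked example for an assumed 500MB cache of 1 KB blocks.
inline void worked_example() {
    // 512000 >> 4 = 32000 blocks, roughly 31MB of free space.
    assert(red_zone_block_threshold(512000) == 32000);
    // Exactly at the threshold: writers are stalled.
    assert(in_red_zone(CacheState{10, 100, 32000, 512000, 20}));
    // One block above the threshold: no longer in the red zone.
    assert(!in_red_zone(CacheState{10, 100, 32001, 512000, 20}));
}
```

Note that the block count is only one of three triggers. Under the same assumptions, a client with plenty of free blocks still lands in the red zone once it has no free object slots left, or no more free log entries than MaxWorkers, which is plausible when creating and deleting a large number of small files.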
If you have a 500MB cache and try to store a 600MB file, it will block
everything until this file has been pushed back to the servers. After that,
the client will refuse to fetch the file because it won't fit in the local
cache.

Jan

Received on 2005-09-22 13:28:39