On Sun, Mar 31, 2002 at 09:28:20PM +0200, Ivan Popov wrote:
> I run into a barrier...
>
> cfs strong; cfs wf -on; find /local/dir | cpio -pvdm /coda/dir
>
> goes for a while then the server is "growing a small list" taking a break
> on answering the client => the client is thrown into write-disconnected
> mode, or even disconnected mode, rapidly accumulating updates in the
> log...
> Then the server wakes up and the client tries to reintegrate, while the
> server tends to take breaks for growing the lists.

Yeah, I know exactly what is going on. RVM doesn't coalesce free
fragments until an allocation fails. When we are growing the small (files)
or large (directories) vnode lists, we make one of the largest memory
allocations that a server can possibly see, so even with mildly fragmented
RVM these are the allocations where we start defragmenting.

The defragmentation basically involves walking the unordered list of free
chunks and comparing each one with all other free chunks; if two are
adjacent, they are merged into a bigger one. This is repeated until nothing
can be merged anymore. It can also create very large RVM transactions that
cannot be logged. There is no RVM realloc, so during all of this we
actually use twice the size of the list we are growing in RVM memory.

Some possibly not-too-hard solutions:

- Add 'LWP_Yield' scheduling points during the defragmentation. This won't
  improve the slowdown or the size of the RVM transaction, but the server
  will at least be able to respond to incoming RPCs and tell the clients
  that it's busy.

- Keep free chunks ordered so we can make the merge phase more efficient.
  This doesn't help in reducing the size of the large RVM transaction
  after a 'defragmenting malloc'.

- Always try to merge when we free a chunk. This will slow down free, and
  possibly allocations, as we are more likely to have to split up larger
  chunks. But we won't have the periodic stall or huge RVM transactions.

Jan

Received on 2002-04-01 14:22:33
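
To make the merge phase described above a bit more concrete, here is a minimal sketch in C. It is not RVM's code: the `struct chunk` layout and the names `coalesce_unordered` and `coalesce_sorted` are made up for illustration, and the free list is a plain in-memory array rather than recoverable memory. The first function repeats an all-pairs scan until nothing merges, as in the unordered case; the second assumes the chunks can be kept (here: sorted) in address order, so one pass is enough.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical free-chunk descriptor; RVM's real free list looks different. */
struct chunk {
    size_t start;   /* offset of the chunk within the heap */
    size_t len;     /* length in bytes, assumed to be > 0 */
};

/* Unordered list: compare every chunk with every other chunk, merge a
 * pair when one ends exactly where the other begins, and start over
 * until a full scan finds nothing to merge. Returns the new count. */
size_t coalesce_unordered(struct chunk *c, size_t n)
{
    int merged;
    do {
        merged = 0;
        for (size_t i = 0; i < n && !merged; i++) {
            for (size_t j = 0; j < n; j++) {
                if (i != j && c[i].start + c[i].len == c[j].start) {
                    c[i].len += c[j].len;   /* absorb chunk j into chunk i */
                    c[j] = c[n - 1];        /* drop j: overwrite with last */
                    n--;
                    merged = 1;
                    break;                  /* restart the scan */
                }
            }
        }
    } while (merged);
    return n;
}

/* qsort comparator: order chunks by start offset. */
static int by_start(const void *a, const void *b)
{
    const struct chunk *x = a, *y = b;
    return (x->start > y->start) - (x->start < y->start);
}

/* Ordered list: once chunks are in address order, adjacent chunks are
 * neighbours in the array, so a single linear pass merges them all. */
size_t coalesce_sorted(struct chunk *c, size_t n)
{
    if (n == 0)
        return 0;
    qsort(c, n, sizeof *c, by_start);
    size_t out = 0;                         /* index of last merged chunk */
    for (size_t i = 1; i < n; i++) {
        if (c[out].start + c[out].len == c[i].start)
            c[out].len += c[i].len;         /* extend the current chunk */
        else
            c[++out] = c[i];                /* gap: begin a new chunk */
    }
    return out + 1;
}
```

Both functions give the same set of merged chunks; the difference is only in how much work the merge takes. Neither says anything about the size of the resulting RVM transaction, which in this sketch is not modelled at all.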