Hi Jan,

Firstly, thank you for the very prompt reply to the email - much appreciated.

When we did our install we were lazy and went with the 5.3.19 RPMs - will it
be OK to just build and deploy the /usr/sbin/codasrv binary and run it with
the files from the 5.3.19 RPMs, or are there additional files or issues that
I need to consider?

[ I'm hoping (fingers crossed) that it will be as simple as deploying the new
/usr/sbin/codasrv and keeping my existing installations. ]

Also, a few things I'm not clear on: we're currently running
30M(log)/500M(data) and I've found that I can't go to 30M(log)/1G(data).
Are there any other RVM data sizes available beyond the 500M one, and if so,
how are the sizes calculated?

[ The workaround each time I've hit this problem has been to use
'norton-reinit ... -dump', then vice-setup-rvm with bigger sizes, then
'norton-reinit ... -load'. It would be good to know if there is a size we can
use the next time we need to work around this problem. ]

Best Regards,

Chris Shiels
Senior Systems Architect
Taglab Ltd.

Jan Harkes wrote:
>On Thu, Jun 06, 2002 at 04:44:51PM +0100, Chris Shiels wrote:
>
>>We're consistently running into problems with our codasrv processes dying
>>with the following error message in /vice/srv/SrvErr:
>>
>>kossoff# tail SrvErr
>> RVMLIB_ASSERT: error in rvmlib_malloc
>>
>
>I know what it is...
>
>Basically caused by fragmentation of RVM, aggravated by a poor judgement
>change I made that went into 5.3.18 or .19.
>
>The quick fix is already committed in CVS; it is a simple one-line
>change to the vnode allocation stride in coda-src/vice/srv.cc. I am still
>working on the necessary cleanups for the real fix, which is to change
>the ever-growing vnode array into a fixed-size hashtable.
>
>>We've now reached what seems to be the maximum RVM log and data sizes
>>available on our platform, and we are unable to detect when this will
>>happen next or resize to higher values as none seem available. Can
>>you please help with this?
>>
>
>If you check out and build the CVS version you should be able to store
>about 4 to 8 times the number of files in RVM before this hits you
>again.
>
>>Incidentally, we don't think we're storing that much data - each volume
>>contains approx. 15000 files for a total size of approx. 180MB per volume.
>>
>
>I found that I typically hit the limit around 30K files in a volume on a
>server with 200MB of RVM data, so 15000 does seem rather low.
>
>>With 2M(log)/90M(data) we'd see the 'RVMLIB_ASSERT: error in rvmlib_malloc'
>>error whilst trying to populate the first volume.
>>
>>With 20M(log)/200M(data) we'd see the 'RVMLIB_ASSERT: error in
>>rvmlib_malloc' error whilst trying to populate the fourth volume.
>>
>
>Log size doesn't really matter, except for the defragmentation step that
>sometimes seems to succeed when RVM is exhausted. In a way these numbers
>match my experimental data pretty well. By using several volumes instead
>of one you've managed to store about 60000 files before hitting the
>severely fragmented case, which is about twice what I got with a single
>volume.
>
>>I'm guessing the RVMLIB_ASSERT error is being caused by filling up all
>>available space in the RVM log or data. Is this correct?
>>
>
>Yes (and no). You didn't really run out of space, but there isn't a
>large enough consecutive chunk to satisfy the allocation. A volume
>consists of an array of pointers to vnodes (file objects) and the
>actual vnodes:
>
>  AAVVVV
>
>When the array is filled, it is resized and new vnodes can be allocated:
>
>  AAVVVVAAAA      (new array allocated, old data is copied)
>  __VVVVAAAA      (old array is freed)
>  __VVVVAAAAVVVV  (new vnodes are allocated)
>
>This is repeated, and gaps start to appear:
>
>  __VVVV____VVVVAAAAAAVVVV
>
>Defragmenting doesn't work because there are vnodes on each side, and the
>way both RVM allocations are satisfied, together with the fact that an
>allocation pool is used to speed up vnode allocation and freeing, simply
>leads to a lot of unused space in RVM.
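To make the grow-copy-free cycle in the diagram concrete, here is a small C
sketch. It is not Coda source and all the sizes are made up; it just models a
heap served by a bump pointer in which freed blocks are only reusable by a
later request of exactly the same size (a crude stand-in for the
allocation-pool behaviour; the real rds allocator is smarter), while the
vnode array doubles whenever it fills. Since the doubling array never repeats
a size, the sketch only has to count the freed bytes rather than keep real
free lists:

#include <stdio.h>

#define HEAP  6000000L   /* pretend RVM data heap, in bytes          */
#define VNODE 64L        /* made-up cost of one vnode                */
#define PTR   8L         /* one slot of the vnode array              */

static long top;         /* bump pointer                             */
static long stranded;    /* freed bytes that nothing will ever reuse */

/* grab 'bytes' from the bump region; -1 if it no longer fits */
static int alloc(long bytes)
{
    if (top + bytes > HEAP)
        return -1;
    top += bytes;
    return 0;
}

int main(void)
{
    long cap = 64, vnodes = 0, failed = 0;

    alloc(cap * PTR);                   /* the initial vnode array   */
    for (;;) {
        failed = VNODE;                 /* one more vnode            */
        if (alloc(VNODE) < 0)
            break;
        if (++vnodes < cap)
            continue;
        failed = 2 * cap * PTR;         /* array full: allocate a    */
        if (alloc(failed) < 0)          /* double-sized copy ...     */
            break;
        stranded += cap * PTR;          /* ... and free the old one, */
        cap *= 2;                       /* whose size never recurs   */
    }

    printf("stored %ld vnodes; a %ld byte request failed although %ld\n"
           "bytes are free (%ld of them stranded in old copies of the array)\n",
           vnodes, failed, (HEAP - top) + stranded, stranded);
    return 0;
}

With these made-up numbers the run ends with a roughly 1MB request for the
next copy of the array failing while about 1.2MB is still free - exactly the
"not out of space, just no large enough consecutive chunk" situation above.
It also shows why the planned real fix (a fixed-size hashtable instead of the
ever-growing array) helps: the ever-larger contiguous allocation goes away.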
>>According to the announcement for 5.3.19 this is done by 'volutil
>>printstats', but I just can't see this information.
>>
>
>After printstats is run, it should be at the end of /vice/srv/SrvLog
>with the header 'RDS statistics'.
>
>> release_segment unmap failed RVM_ENOT_MAPPED
>> rds_zap_heap completed successfully.
>> rvm_terminate succeeded.
>>
>>Whilst the rdsinit was running I could see it had been mapped at 0x50000000
>>by looking at /proc/pid/maps. Additionally, strace -p pid indicated that the
>>call to munmap() was successful with exit status 0.
>>What's going wrong here?
>>
>
>Possibly colliding with shared libraries or something. We did have an
>extremely large server running on NetBSD, which has a slightly different
>initial offset for RVM. It might even have been statically linked to
>create a more compact binary and to avoid shared library issues.
>
>Jan

Received on 2002-06-06 14:28:31
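As a footnote to Jan's last answer: 0x50000000 is only the address the RVM
segment asks to be mapped at, and whether it actually ends up there depends
on what the process has already mapped. The following small C sketch (not
Coda or RVM code; the 64MB segment size is just an example) shows the kind
of collision he is suggesting:

#include <stdio.h>
#include <sys/mman.h>

int main(void)
{
    void  *want = (void *)0x50000000;   /* the address from the report above */
    size_t len  = 64UL * 1024 * 1024;   /* example segment size: 64MB        */

    /* Without MAP_FIXED the address is only a hint; if a shared library or
     * anything else already occupies that range, the kernel maps us at some
     * other address instead of failing. */
    void *got = mmap(want, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (got == MAP_FAILED) {
        perror("mmap");
        return 1;
    }
    if (got != want)
        printf("asked for %p but got %p: that range was already taken\n",
               want, got);
    else
        printf("mapped at the requested address %p\n", got);
    munmap(got, len);
    return 0;
}

If the returned address differs from the requested one, something else - for
example a shared library - already occupies the range, which is also why a
statically linked server, as Jan suggests, sidesteps the issue.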