On Thu, Nov 02, 2000 at 12:59:54PM +0100, Simon Josefsson wrote:
> While small experiments worked fine, scaling this to larger data sizes
> seems problematic. The size recommendation for the "rvm data partition"
> is 3-5 % of the total file data space. I also noted that this
> partition was mmap()ed into memory.
>
> So if I want to store 50GB in coda servers, my servers need to mmap()
> 2.5GB. That won't be fun. Does the 3-5 % figure really scale
> linearly with data storage size?

Pretty much linearly: all metadata, such as ACLs and file attributes, as
well as the directory contents, is stored in RVM. These numbers are based
on an average file size of 16KB and some metric for the ratio of
directories to files. Many real-life installations don't match these
`rules of thumb'. When your files average about 100MB, less metadata is
needed and thus the RVM/filespace ratio is far less than the suggested
3-5%.

> Is there a solution to this problem, or is coda not suitable for
> larger data sizes?

In any case you need a lot of swap. But when the RVM data is in a file,
it is possible to enable the use of private mmaps. This allows for a huge
improvement in startup time, as unmodified RVM-data pages are simply
discarded and re-read from the underlying RVM data file when needed. Only
modified pages are `dirty' and will be written to swap when there is
memory pressure. The swap footprint of a Coda server will slowly grow
over the lifetime of the server.

We have been thinking about `cleaning' up the swap footprint by comparing
the dirty pages with the on-disk pages after rvm-log truncation and
remapping (munmap/mmap) any pages that are identical, but that would cost
some cpu/disk performance, so we are not sure whether the benefits would
really outweigh the cost.

But still, we can't mmap more than about 2-3GB of RVM due to 32-bit
addressing constraints.
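
To illustrate what a private mapping means here, the following is a
minimal Python sketch, not Coda's actual RVM code (which is written in
C). The function name `private_map_demo` and the temporary file standing
in for an RVM data file are hypothetical. It maps a file copy-on-write,
modifies one byte through the mapping, and shows that the file on disk
is left untouched, so only the modified page would ever need swap.

```python
import mmap
import os
import tempfile


def private_map_demo(size=4096):
    """Map a file copy-on-write, modify it in memory, and return
    (byte seen through the mapping, byte still stored on disk)."""
    fd, path = tempfile.mkstemp()
    try:
        # Hypothetical stand-in for an RVM data file: `size` zero bytes.
        os.write(fd, b"\x00" * size)

        # ACCESS_COPY requests a private, copy-on-write mapping:
        # writes change memory but are never written back to the file.
        with mmap.mmap(fd, size, access=mmap.ACCESS_COPY) as m:
            m[0] = 0xFF          # dirties this page in memory only
            in_memory = m[0]

        # Re-read the first byte from the file itself.
        os.lseek(fd, 0, os.SEEK_SET)
        on_disk = os.read(fd, 1)[0]
        return in_memory, on_disk
    finally:
        os.close(fd)
        os.unlink(path)


print(private_map_demo())  # (255, 0)
```

Pages that are never written stay backed by the file and can be dropped
and re-read on demand, which is the behaviour described above.
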
However, it is possible to run multiple codasrv processes on a single
machine, and the 32-bit address space is a per-process constraint. So
exporting up to 500GB of Coda filespace from one machine would involve
running about ten 50GB Coda server processes.

> I noticed the vice-setup script only had templates for 500M-8GB, which
> isn't a lot compared to today's hard drives.

The biggest Coda server I know of was configured to handle around 35GB of
files.

Jan

Received on 2000-11-02 10:44:44
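
The sizing arithmetic in this thread can be sketched in a few lines of
Python. This is only a rough illustration under stated assumptions: the
function names are hypothetical, the 5% default is the upper end of the
3-5% rule of thumb, and the 2.5GB per-process RVM limit is an assumed
value inside the "about 2-3GB" range mentioned above. Real ratios depend
on average file size and on the directory-to-file ratio.

```python
GB = 1024 ** 3


def rvm_bytes(file_space_bytes, percent=5):
    """Estimate RVM size as a whole-number percentage of file space."""
    return file_space_bytes * percent // 100


def server_processes(file_space_bytes, percent=5, rvm_limit=5 * GB // 2):
    """Number of codasrv processes needed so that each process maps
    at most rvm_limit bytes of RVM (assumed limit: 2.5GB)."""
    total = rvm_bytes(file_space_bytes, percent)
    return max(1, -(-total // rvm_limit))  # ceiling division on integers


print(rvm_bytes(50 * GB) / GB)      # 2.5
print(server_processes(500 * GB))   # 10
```

With the lower 3% figure, `server_processes(500 * GB, percent=3)` gives
6, which shows how sensitive the estimate is to the metadata ratio.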