Coda File System

Re: rvm data

From: Jan Harkes <jaharkes_at_cs.cmu.edu>
Date: Thu, 2 Nov 2000 10:29:24 -0500
On Thu, Nov 02, 2000 at 12:59:54PM +0100, Simon Josefsson wrote:
> While small experiments worked fine, scaling this to larger data sizes
> seems problematic.  The size recommendation for the "rvm data partition"
> is 3-5 % of the total file data space.  I also noted that this
> partition was mmap()ed into memory.
> 
> So if I want to store 50GB in coda servers, my servers need to mmap()
> 2.5GB.  That won't be fun.  Does the 3-5 % figure really scale
> linearly with data storage size?

Pretty much linearly. All metadata, such as ACLs and file attributes, as
well as the directory contents, is stored in RVM. These numbers are
based on an average file size of 16KB and some metric for the ratio of
directories to files.

Many real-life installations don't match these `rules of thumb'. When
your files average about 100MB, there is less metadata needed and thus
the RVM/filespace ratio is far less than the suggested 3-5%.
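
To make the scaling argument concrete, here is a rough back-of-the-envelope
sketch. It assumes that the metadata cost is a fixed number of bytes per
file, calibrated so that 16KB files give the quoted ratio, and it ignores
the directory-to-file ratio entirely. The function name and the model are
illustrative assumptions, not how Coda itself sizes RVM.

```python
KB = 1024
MB = 1024 ** 2
GB = 1024 ** 3


def estimate_rvm_bytes(total_file_bytes, avg_file_bytes=16 * KB, ratio_at_16k=0.04):
    """Rough RVM size estimate (hypothetical model, not Coda's own formula).

    Assumes metadata is a fixed cost per file, chosen so that a volume of
    16KB files needs `ratio_at_16k` of its data size in RVM.
    """
    metadata_per_file = ratio_at_16k * 16 * KB
    number_of_files = total_file_bytes / avg_file_bytes
    return number_of_files * metadata_per_file


if __name__ == "__main__":
    for avg in (16 * KB, 100 * MB):
        for ratio in (0.03, 0.05):
            rvm = estimate_rvm_bytes(50 * GB, avg_file_bytes=avg, ratio_at_16k=ratio)
            print(f"avg file {avg} bytes, rule-of-thumb {ratio:.0%}: "
                  f"about {rvm / MB:.1f} MB of RVM")
```

Under this model 50GB of 16KB files at 5% comes out at the 2.5GB from the
question, while the same 50GB stored as 100MB files needs only a tiny
fraction of that. Real numbers will differ because directories also consume
RVM.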

> Is there a solution to this problem, or is coda not suitable for
> larger data sizes?

In any case you need a lot of swap. But when the RVM data is in a file,
it is possible to enable the use of private mmaps. This allows for a
huge improvement in startup time, as unmodified RVM-data pages are
simply discarded and re-read from the underlying RVM data file when
needed. Only modified pages are `dirty' and will be written to swap when
there is memory pressure, so the swap footprint of a Coda server will
grow slowly over the lifetime of the server.
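
The private-mapping behaviour can be illustrated with a small sketch. This
is not the RVM code (which is C); it uses Python's mmap module, where
ACCESS_COPY requests a private copy-on-write mapping. The scratch file
stands in for an RVM data file and the function name is made up for the
example.

```python
import mmap
import os
import tempfile


def demo_private_mapping():
    """Modify a private (copy-on-write) mapping and show the file is untouched."""
    fd, path = tempfile.mkstemp()
    try:
        # Stand-in for the RVM data file: two pages of known content.
        os.write(fd, b"A" * (mmap.PAGESIZE * 2))

        # ACCESS_COPY: writes affect only this process's memory, never the file.
        with mmap.mmap(fd, 0, access=mmap.ACCESS_COPY) as mapping:
            mapping[0:5] = b"dirty"          # first page now differs from disk
            in_memory = bytes(mapping[0:5])

        # Re-read the same bytes from the file itself.
        os.lseek(fd, 0, os.SEEK_SET)
        on_disk = os.read(fd, 5)
        return in_memory, on_disk
    finally:
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    print(demo_private_mapping())
```

Only the page that was written to becomes a private copy that the kernel
may have to push out to swap; the untouched second page can always be
dropped and re-read from the file.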

We have been thinking about `cleaning' up the swap footprint by
comparing the dirty pages with the on-disk pages after rvm-log
truncation and remapping (munmap/mmap) any pages that are identical, but
that would cost some CPU/disk performance, so we are not sure whether
the benefits would really outweigh the cost.
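
The comparison step of such a cleanup could look roughly like the sketch
below. It only covers finding dirty pages whose contents match the on-disk
copy again; the actual munmap/mmap remapping is left out. The 4096-byte
page size, the function name, and the assumption that the set of dirty
page numbers is already known are all hypothetical.

```python
PAGE_SIZE = 4096  # assumed page size, for illustration only


def reclaimable_pages(memory_image, disk_image, dirty_pages, page_size=PAGE_SIZE):
    """Return the dirty page numbers whose in-memory bytes equal the on-disk bytes.

    These are the pages that could be remapped from the file, releasing
    their private copies.
    """
    identical = []
    for page in dirty_pages:
        start = page * page_size
        end = start + page_size
        if memory_image[start:end] == disk_image[start:end]:
            identical.append(page)
    return identical


if __name__ == "__main__":
    disk = bytes(PAGE_SIZE * 3)          # three zero-filled pages on disk
    memory = bytearray(disk)
    memory[PAGE_SIZE] = 1                # page 1 still differs from disk
    print(reclaimable_pages(bytes(memory), disk, dirty_pages=[0, 1, 2]))
```

The cost mentioned above is visible here: every candidate page has to be
read from disk and compared byte for byte.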

But still, we can't mmap more than about 2-3GB of RVM due to 32-bit
addressing constraints. However, it is possible to run multiple codasrv
processes on a single machine, and the 32-bit address space is a
per-process constraint. So exporting up to 500GB of Coda filespace from
one machine would involve running about ten 50GB Coda server processes.
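
The arithmetic behind "about ten" can be written out as a short sketch. It
assumes the 5% end of the rule of thumb and a 2.5GB mappable RVM limit per
process (within the 2-3GB range above); the function name is made up for
the example.

```python
import math
from fractions import Fraction


def server_processes_needed(total_gb, rvm_percent, rvm_limit_gb):
    """Number of server processes needed so each stays under its RVM limit.

    Hypothetical helper: total_gb is the exported file space, rvm_percent
    the RVM/filespace rule of thumb, rvm_limit_gb the per-process limit.
    """
    # Fractions keep the division exact, so ceil() never rounds up spuriously.
    rvm_needed_gb = Fraction(total_gb) * Fraction(rvm_percent) / 100
    return math.ceil(rvm_needed_gb / Fraction(rvm_limit_gb))


if __name__ == "__main__":
    print(server_processes_needed(500, 5, 2.5))
```

With a smaller RVM/filespace ratio, as for large average file sizes, fewer
processes would be needed for the same amount of data.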

> I noticed the vice-setup script only had templates for 500M-8GB, which
> isn't a lot compared to today's hard drives.

The biggest Coda server I know of was configured to handle around 35GB
of files.

Jan
Received on 2000-11-02 10:44:44