Coda File System

Re: [dhowells@redhat.com: [PATCH] CacheFS - general filesystem cache]

From: Ivan Popov <pin_at_medic.chalmers.se>
Date: Wed, 1 Sep 2004 17:37:05 +0200
On Wed, Sep 01, 2004 at 10:37:48AM -0400, Jan Harkes wrote:
>  The Coda kernel module detects an open, sends the request to venus,
>  venus opens (and optionally pins) a file in cachefs and fills it with
>  the data. Then it passes the still open filehandle back to the Coda
>  kernel module which keeps it around until the last reference
>  disappears. Then the Coda kernel module sends the file release upcall
>  and if it was opened for writing, venus reads directly from cachefs and
>  writes the modified data back to the servers after which it unpins the
>  file.
> 
> The only thing then is to replace the flag we use to indicate whether
> we need to fetch the data or not with a stat(2) or access(2) test on the
> container in cachefs.
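
A minimal sketch of that test, assuming the container is an ordinary file
whose mere existence means the data has been fetched. The function name
container_is_cached is made up for illustration and is not part of Venus:

```python
import os

def container_is_cached(container_path):
    """Return True if the container file exists, i.e. no fetch is needed.

    Assumption: existence alone signals "data present"; a real
    implementation may need a stronger check.
    """
    try:
        os.stat(container_path)
    except FileNotFoundError:
        return False
    return True
```

os.access(container_path, os.F_OK) would serve equally well here; stat is
used only because the result (size, times) may be wanted anyway.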

In other words, we could make Venus's life a bit easier by relieving it of
cache reclamation. I think it can work even on systems without cachefs.

If we run a separate daemon that estimates how long each file in the cache
has gone unused, it could do efficient garbage collection and make
intelligent decisions based on available disk space.
The daemon would either get some notification events
or just scan mtime on directories, discovering new files to take care of.

My first impression is that such a daemon can be rather efficient even
without additional hooks, and portable to any system which has separate
mtime and atime on a local filesystem.

It does not need persistent state, if we can afford a rescan
at startup. The scalability would be limited by the time a stat()
system call takes. Otherwise, reading say 1000000 filenames is just several
megabytes of directory data (presumably in a nice tree); it would take
a couple of seconds on a modern computer.
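
To make the startup rescan concrete, here is a rough sketch, assuming the
cache is a plain directory tree on a local filesystem. The name scan_cache
and the (atime, size, path) tuple layout are my own choices for
illustration:

```python
import os

def scan_cache(root):
    """Walk the cache tree and return a list of (atime, size, path).

    Costs one stat() per file, which is the expected bottleneck.
    """
    entries = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except FileNotFoundError:
                continue  # file vanished between readdir and stat
            entries.append((st.st_atime, st.st_size, path))
    return entries
```

Since the whole table is rebuilt from the filesystem, nothing has to be
saved across restarts.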

If it regularly walks the cache file tree and refreshes the file status data,
it can easily maintain a desired cache_size/free_space ratio.
It may monitor free space quite often, maybe even once every few
seconds, which would let it be quite responsive.
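
A sketch of what one such monitoring pass might decide, under simplifying
assumptions: the target is expressed as a minimum free fraction of the
filesystem (not exactly the ratio mentioned above), the file table is a
list of (atime, size, path) tuples, and the least recently accessed files
go first. Both function names are hypothetical:

```python
import shutil

def bytes_to_free(cache_root, min_free_fraction):
    """Return how many bytes must be reclaimed to reach the free-space target."""
    usage = shutil.disk_usage(cache_root)
    wanted = int(usage.total * min_free_fraction)
    return max(0, wanted - usage.free)

def select_victims(entries, need):
    """Pick the least recently accessed files until `need` bytes are covered.

    `entries` is a list of (atime, size, path) tuples (assumed layout).
    """
    victims = []
    freed = 0
    for _atime, size, path in sorted(entries):  # oldest atime first
        if freed >= need:
            break
        victims.append(path)
        freed += size
    return victims
```

The daemon would then unlink the returned paths (skipping pinned or open
files) and sleep a few seconds before checking again.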

In a more elaborate incarnation it can listen for notifications
and react quickly, say when massive file fetching or creation
is eating up disk space. Or just quickly add new files to its tables
when directory contents change (fam can help with that).

> structure, just a single large top-level directory would do fine. It
> should have the option to pin files even when they are not actively
> referenced

Any flag that Venus can set on the files, such as the x-bit, would do.
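
For instance, with the owner x-bit as the pin marker (just one possible
convention; the helper names below are made up), it could look roughly
like this:

```python
import os
import stat

def pin(path):
    """Mark a container as pinned by setting the owner execute bit."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    os.chmod(path, mode | stat.S_IXUSR)

def unpin(path):
    """Clear the owner execute bit so the file may be reclaimed again."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    os.chmod(path, mode & ~stat.S_IXUSR)

def is_pinned(path):
    """Return True if the owner execute bit is set on the file."""
    return bool(os.stat(path).st_mode & stat.S_IXUSR)
```

The garbage collector then only needs the mode bits it already gets from
stat() to know which files to leave alone.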

... and then it would be time to reimplement hoarding :)

Well, doing the right things inside Venus could probably do better, but
splitting functionality between independently maintainable parts is
a gain in the long run.
With the bonus that it would work efficiently on Linux cachefs :)

My 2c,
--
Ivan
Received on 2004-09-01 11:38:51