On Wed, Sep 01, 2004 at 06:45:30AM -0400, David Howells wrote:
> > > I am not sure about persistency across reboots. Also it assumes that
> > > the cache is completely managed by some in-kernel filesystem. So we
> > > would need a lot of hooks and changes before venus can put anything
> > > in there.
>
> Not necessarily. You should be able to do it relatively easily from within
> the kernel. It needs you to declare your indexes (fill_super/put_super) and
> files (iget/clear_inode), and to make calls from readpage() and writepage()
> and releasepage(). Coda would then own its own pages, which would be backed
> by CacheFS; CacheFS reads/writes directly from/to the netfs's pages.

I don't really understand what you are trying to say here; I really should
read the cachefs patch/documentation before trying to discuss it.

> I presume Coda loads a whole file at a time into its cache, messes
> around with it and writes the whole thing back? I could support that;
> and, in fact, I probably need to for my AFS client.

Technically we have a very simple glue layer in the kernel. As far as any
file operations are concerned we only see two events, which are bounced up
to a userspace cache manager (venus):

file open
    Here our cache manager checks if the file is locally cached, and if not
    fetches the complete file from the servers. Once we have a complete copy
    we return an open file handle back to the kernel. From this point on all
    read/write/mmap operations are nothing more than trivial wrappers whose
    main function is to forward the operation to the underlying file object.

file release
    This tells the userspace manager that a file reference was released. If
    the file was opened O_WRONLY/O_RDWR and there are no more writers left,
    we mark the object as dirty and write it back to the server at the next
    opportunity.

I still want to add a file sync operation, so that we can write back a
snapshot (copy) of the file whenever an application calls fsync. It is also
important to catch the moment when a file is closed, as opposed to the last
release: by the time we are notified of the release we can no longer return
errors on close.

But as far as the Coda kernel module is concerned, it doesn't deal with
readpage/writepage/etc.; all of that is left up to the filesystem that
stores the cached files, and this filesystem could possibly be cachefs.

However... when a file is fetched from the servers, the data in the
container file is written by a userspace process, so the (persistent) tmpfs
variant might work here. Of course our code doesn't expect a cache file (or
parts of that file) to disappear when we don't explicitly pin it all the
time, but pinning it all the time would defeat the usefulness of cachefs in
the first place.

I guess if cachefs looks like a 'lossy' mountable filesystem, it might even
work without too many changes. We could just mount it in place of the venus
cache directory and would only need an ioctl to pin any files that are
opened for writing until we're sure that the changes are fed back to the
servers.

There would be no communication between cachefs and the Coda kernel module;
everything still goes through venus. The Coda kernel module detects an open
and sends the request to venus, venus opens (and optionally pins) a file in
cachefs and fills it with the data. Then it passes the still-open file
handle back to the Coda kernel module, which keeps it around until the last
reference disappears. Then the Coda kernel module sends the file release
upcall, and if the file was opened for writing, venus reads directly from
cachefs and writes the modified data back to the servers, after which it
unpins the file.

The only thing left then is to replace the flag we use to indicate whether
we need to fetch the data with a stat(2) or access(2) test on the container
file in cachefs.
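To make that concrete, here is a rough userspace sketch of what venus's side
of this could look like. It is only an illustration of the idea, not working
code: the cache mount point, the CACHEFS_IOC_PIN/UNPIN ioctls and the
fetch/write-back helpers are all made up for this example, since cachefs
doesn't define such an interface for us today.

/*
 * Minimal userspace sketch, not real venus code.  Assumptions: cachefs is
 * mounted at CACHE_ROOT in place of the venus cache directory and exposes
 * a (hypothetical) pin/unpin ioctl pair.  The CACHEFS_IOC_* numbers and the
 * fetch/write-back helpers below are made up for illustration.
 */
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>

#define CACHE_ROOT        "/var/lib/venus.cache"   /* assumed cachefs mount */
#define CACHEFS_IOC_PIN   _IOW('c', 1, int)        /* hypothetical */
#define CACHEFS_IOC_UNPIN _IOW('c', 2, int)        /* hypothetical */

/* Placeholders for the real fetch/reintegration paths in venus. */
static int fetch_from_servers(const char *path)    { (void)path; return 0; }
static int write_back_to_servers(const char *path) { (void)path; return 0; }

/*
 * "file open" upcall: an access(2) test on the container replaces the old
 * "data is cached" flag; refetch if cachefs reclaimed it, pin it when the
 * file is opened for writing, and hand the open fd back to the kernel.
 */
int handle_open_upcall(const char *container, int for_writing)
{
	char path[4096];
	int fd, one = 1;

	snprintf(path, sizeof(path), "%s/%s", CACHE_ROOT, container);

	if (access(path, F_OK) != 0 && fetch_from_servers(path) != 0)
		return -1;

	fd = open(path, for_writing ? O_RDWR : O_RDONLY);
	if (fd < 0)
		return -1;

	/* Don't let cachefs throw away data we still have to write back. */
	if (for_writing && ioctl(fd, CACHEFS_IOC_PIN, &one) != 0)
		fprintf(stderr, "pin %s: %s\n", container, strerror(errno));

	return fd;	/* passed back to the Coda kernel module */
}

/*
 * "file release" upcall: for a file that was open for writing, push the
 * modified container back to the servers, then unpin it so cachefs may
 * reclaim the space again.
 */
int handle_release_upcall(int fd, const char *container, int was_writing)
{
	char path[4096];
	int zero = 0, ret = 0;

	if (was_writing) {
		snprintf(path, sizeof(path), "%s/%s", CACHE_ROOT, container);
		ret = write_back_to_servers(path);
		if (ret == 0)
			ioctl(fd, CACHEFS_IOC_UNPIN, &zero);
	}
	close(fd);
	return ret;
}

If a container disappears because cachefs reclaimed it, the access(2) test
simply makes us refetch it, which is exactly the 'lossy' behaviour described
above.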
> Tell me what you'd like to be able to store in CacheFS, and I'll see what
> I can do to accommodate you.

Ideally it would be a mountable filesystem that is persistent across
reboots. It doesn't necessarily need to support a directory tree structure;
a single large top-level directory would do fine. It should have the option
to pin files even when they are not actively referenced, and it should not
delete little bits from the middle of a file. I guess it would also need a
way to query an object to see whether it is pinned or not, so that we can
check our own metadata against the cache.

As you can see, except for the reclamation, normal filesystems do just fine.
Because we always fetch a whole file, we already know how much to throw out
before we even start to fetch a new file. Our weakness is that we can't do
the same when a file is opened for writing: there is no way to tell how
large a file will be when it is opened, and we won't see the final size
until it is closed.

By having the kernel/cachefs do the reclamation, files would get discarded
whenever space is needed instead of after the fact. Also, the same cache
space could be shared among different filesystems. So I can definitely see
some advantages.

Jan