Coda File System

Re: blocking on open()

From: Jan Harkes <jaharkes_at_cs.cmu.edu>
Date: Thu, 23 Sep 2004 10:55:26 -0400
On Thu, Sep 23, 2004 at 02:32:43AM -0500, Brian Finney wrote:
> However it would seem like there would be little if any harm in
> allowing read-only operations on parts of a file on demand, e.g. while
> loading the file into the cache also allowing a prog to read as the

Correct. Some cases I was thinking of where early access would be
useful:
    unpacking a (large) tarball that is fetched from Coda.
    Windows Explorer wants to open all files in a directory to grab
    icons.
    better quota support for writes, we'd finally know when a process
    wants to grow a file beyond the allowed cache/quota size.

Cases where it could be problematic or wouldn't help:
    writing an iso image from Coda directly to CD.
    zip archives have their information at the end.
    mp3 ID3v1 tags are at the end of a file.
    some programs always open O_RDWR even when only reading a file.
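
To make the zip case concrete, here is a minimal Python sketch (not Coda
code). It builds a small archive in memory and checks whether a reader
holding only a prefix of the file could list it. The helper names
build_demo_zip, eocd_offset and listable_with_prefix are made up for
this illustration.

```python
import io
import zipfile

# Marker that starts the zip "end of central directory" record.
EOCD_SIGNATURE = b"PK\x05\x06"


def build_demo_zip() -> bytes:
    """Build a small uncompressed archive in memory (illustration only)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_STORED) as zf:
        zf.writestr("a.txt", "x" * 1000)
        zf.writestr("b.txt", "y" * 1000)
    return buf.getvalue()


def eocd_offset(archive: bytes) -> int:
    """Return the offset of the last end-of-central-directory record."""
    return archive.rfind(EOCD_SIGNATURE)


def listable_with_prefix(archive: bytes, fetched: int) -> bool:
    """True if the first `fetched` bytes are enough to list the archive."""
    try:
        with zipfile.ZipFile(io.BytesIO(archive[:fetched])) as zf:
            zf.namelist()
        return True
    except zipfile.BadZipFile:
        return False


data = build_demo_zip()
print("archive size:", len(data), "EOCD record at:", eocd_offset(data))
print("listable with first half:", listable_with_prefix(data, len(data) // 2))
print("listable with whole file:", listable_with_prefix(data, len(data)))
```

The directory record sits in the last bytes of the archive, so a reader
that is handed the first half of the file early still has to wait for
the complete fetch before it can do anything useful.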

Any changes in this area will require some thought. I'm not sure
exactly, but there is something that prevents us from fetching the same
file twice, so there must be some lock on either the volume or the
object. We would have to change how things are locked down to deal with
this, i.e. writers need to block until the whole file is fetched, but
concurrent readers might already have partial access. Should we have
a way to explicitly cancel/abort a background fetch whenever the file is
closed? That isn't so easy, but if we don't, we could quickly run out of
worker threads.
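
The locking rule described above (readers may proceed once their byte
range has arrived, writers wait for the whole file, and a fetch can be
cancelled) can be sketched roughly as follows. This is a hypothetical
Python model, not how the Coda cache manager is implemented; the class
PartialFetch and all of its method names are assumptions made for
illustration.

```python
import threading


class PartialFetch:
    """Hypothetical tracker for how much of one file is in the cache."""

    def __init__(self, size):
        self.size = size          # total file length in bytes
        self.fetched = 0          # bytes available so far, counted from 0
        self.cancelled = False
        self._cond = threading.Condition()

    def advance(self, nbytes):
        """Called by the fetching worker as data arrives."""
        with self._cond:
            self.fetched = min(self.size, self.fetched + nbytes)
            self._cond.notify_all()

    def cancel(self):
        """Abort the background fetch, e.g. when the file is closed."""
        with self._cond:
            self.cancelled = True
            self._cond.notify_all()

    def wait_readable(self, offset, length, timeout=None):
        """Block a reader until [offset, offset+length) has been fetched.

        Returns True if the range is available, False on timeout or if
        the fetch was cancelled before the range arrived.
        """
        end = min(self.size, offset + length)
        with self._cond:
            self._cond.wait_for(
                lambda: self.cancelled or self.fetched >= end, timeout)
            return self.fetched >= end

    def wait_writable(self, timeout=None):
        """Block a writer until the whole file has been fetched."""
        with self._cond:
            self._cond.wait_for(
                lambda: self.cancelled or self.fetched >= self.size, timeout)
            return self.fetched >= self.size


f = PartialFetch(size=100)
f.advance(40)
print(f.wait_readable(0, 40, timeout=0))   # range already fetched
print(f.wait_writable(timeout=0))          # file not complete yet
```

The sketch leaves out the hard part raised above: after cancel() the
worker still has to notice the flag and stop, otherwise it stays tied up
until the transfer finishes.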

Right now there is a 1:1 match between blocked processes and worker
threads. If we continue the fetch in the background, a single process
could quickly use up all available workers, and further requests would
end up queued in the kernel. Among those queued requests would be
readers that want to see if they can access deeper into a file, so we'd
end up back in the current situation. On the other hand, if we spawn a
new thread for each fetch, a single application could easily trigger
hundreds of concurrent fetches, making the cache manager use far more
memory and slowing down everyone else.
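
One way to picture the trade-off is a fixed worker pool with a cap on
background fetches per process. The cap is an assumption added here for
illustration and is not something proposed in this thread. The Python
model below only does the bookkeeping, with no real threads or I/O, and
the name FetchScheduler is hypothetical.

```python
from collections import defaultdict, deque


class FetchScheduler:
    """Toy model: fixed worker pool plus a per-client fetch limit."""

    def __init__(self, workers, per_client_limit):
        self.free = workers                 # idle worker threads
        self.limit = per_client_limit       # max active fetches per client
        self.active = defaultdict(int)      # client -> running fetches
        self.queue = deque()                # requests waiting for a worker

    def submit(self, client, path):
        """Start the fetch if allowed, else queue it. True if started."""
        if self.free > 0 and self.active[client] < self.limit:
            self.free -= 1
            self.active[client] += 1
            return True
        self.queue.append((client, path))
        return False

    def finish(self, client):
        """Release a worker and start the first queued request that fits.

        Returns the (client, path) that was started, or None.
        """
        self.free += 1
        self.active[client] -= 1
        for i, (c, p) in enumerate(self.queue):
            if self.active[c] < self.limit:
                del self.queue[i]
                self.free -= 1
                self.active[c] += 1
                return (c, p)
        return None


s = FetchScheduler(workers=4, per_client_limit=2)
started = [s.submit("tar", "f%d" % i) for i in range(5)]
print(started)                      # only the first two tar fetches start
print(s.submit("ls", "dir"))        # another client still gets a worker
print(s.finish("tar"))              # a queued tar fetch takes the free slot
```

With the cap in place a single process cannot take every worker, but its
own later requests still queue behind its earlier ones, so this limits
the damage rather than resolving the problem.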

I still like the idea a lot. I'm just trying to be prepared and know all
the problems that might show up before actually trying to implement
something like this.

Jan
Received on 2004-09-23 10:57:34