On Mon, Aug 15, 2005 at 01:31:43PM -0400, Kris Maglione wrote:
> There's nothing stopping Coda (in theory. I haven't seen the code
> relating to this) from implementing both partial and full file caching.
> Whether it be a knob between two modes of caching, a switch to require
> the fetching of all blocks (with unneeded ones at a lower priority, put
> off until essential data is retrieved), or just a program like hoard
> deciding what files need to be cached fully, and doing so. I'm not
> saying that this should or will be implemented, but it is possible, in
> theory. For Coda and AFS.

Actually there are many reasons not to have block-level caching in Coda.

- VM deadlocks

  Because we have a userspace cache manager, we could get into a
  situation where we are told to write out dirty data, but doing so
  causes us to request one or more memory pages from the kernel, either
  because we allocate memory or simply because we are paging in some of
  the application/library code. The kernel might then decide to give us
  pages whose reuse requires writing back even more dirty state to the
  userspace daemon. We would have to push venus into the kernel, which
  is what AFS did, but AFS doesn't deal with many of the same
  complexities, such as replication and reintegration.

- Code complexity

  It is already a hard enough problem to do optimistic replication and
  reintegration with whole files. The last thing I need right now is the
  additional complexity of reasoning about situations where we only have
  parts of a locally modified file, which might already have been
  partially reintegrated but then overwritten on the server by another
  client, and about how to commit, revert, or merge such local changes
  into the global replica(s), as well as efficiently maintaining the
  required data structures. The current RVM limitations are on the
  number of file objects and are not dependent on file size: as far as
  the client is concerned, caching 100 zero-length files has the same
  overhead as caching 100 files that are 1GB in size.

- Network performance

  It is more efficient to fetch a large file at once than to request
  individual blocks. Available network bandwidth keeps increasing, but
  latency is bounded by the laws of physics, so the 60ms round trip from
  coast to coast will remain. Requesting 1000 individual 4KB blocks one
  after another will therefore always cost at least 60 seconds in round
  trips alone, while fetching the same 4MB as a single file only becomes
  cheaper over time (a back-of-envelope sketch follows at the end of
  this message).

- Local performance

  Handling upcalls is quite expensive: there are at least 2 context
  switches, and possibly some swapping/paging, involved in getting the
  request up to the cache manager and the response back to the
  application. Doing this on individual read and write operations would
  make the system a lot less responsive.

- Consistency model

  It is really easy to explain Coda's consistency model with respect to
  other clients: you fetch a copy of the file when it is opened, and it
  is written back to the servers when it is closed (if it was modified).
  Now try to do the same if the client uses block-level caching. The
  picture quickly becomes very blurry, and Transarc AFS actually had
  (has?) a serious bug in this area that led to unexpected data loss for
  people who assumed it still provided AFS semantics. Also, once a
  system provides block-level access, people start to expect the file
  system to provide something close to UNIX semantics, which is really
  not a very usable model for any distributed filesystem (a second
  sketch at the end of this message illustrates the open/close model).

Jan
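A minimal back-of-envelope sketch of the network-performance argument
above, assuming the 60ms coast-to-coast round trip, serially issued 4KB
block requests, and a few illustrative link speeds. The numbers are
assumptions for illustration, not measurements of Coda itself:

    /* Rough cost of fetching a 4MB file as 1000 serial 4KB block
     * requests versus one whole-file transfer.  RTT and bandwidth
     * values are assumed, not measured. */
    #include <stdio.h>

    int main(void)
    {
        const double rtt        = 0.060;              /* 60ms round trip    */
        const double file_size  = 4.0 * 1024 * 1024;  /* 4MB file           */
        const double block_size = 4.0 * 1024;         /* 4KB block          */
        const double mbits[]    = { 10, 100, 1000 };  /* link speeds Mbit/s */

        for (int i = 0; i < 3; i++) {
            double bytes_per_sec = mbits[i] * 1e6 / 8;
            double blocks        = file_size / block_size;   /* 1000 */
            /* one round trip per block, requests issued back to back */
            double block_fetch = blocks * (rtt + block_size / bytes_per_sec);
            /* one round trip, then the file streams at link speed */
            double whole_fetch = rtt + file_size / bytes_per_sec;
            printf("%5.0f Mbit/s: per-block %6.1fs, whole file %5.2fs\n",
                   mbits[i], block_fetch, whole_fetch);
        }
        return 0;
    }

The per-block figure never drops below the 60 seconds of accumulated
round trips, while the whole-file transfer keeps shrinking as links get
faster.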
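And a small sketch of the open/close consistency model described above,
written against ordinary POSIX calls. The path is made up, and the
comments only mark where a whole-file client like venus would fetch and
store under that model; this is not an actual Coda API:

    /* Illustration of Coda-style open/close (session) semantics.
     * The file path is hypothetical. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        /* open(): the client fetches the whole file into its local
         * cache; from here on, all I/O hits the cached copy. */
        int fd = open("/coda/example.org/tmp/notes.txt", O_RDWR);
        if (fd < 0) {
            perror("open");
            return 1;
        }

        /* Reads and writes see this session's copy, even if another
         * client closes a newer version on the servers meanwhile. */
        char buf[128];
        ssize_t n = read(fd, buf, sizeof(buf));
        if (n >= 0)
            n = write(fd, "local update\n", 13);
        (void)n;

        /* close(): only now is the modified file written back to the
         * servers as a whole. */
        return close(fd);
    }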