On Wed, 20 Jan 1999, Peter J. Braam wrote:

> > AFS deals with this by 'chunking' -- that is, it demand-loads portions of
> > files into the cache as they are needed; I believe it also uses an
> > aggressive read-ahead policy. The net result is more efficient use of the
> > cache for partial file reads or writes, especially for mammoth files.
>
> I just sent a message about this.
>
> > However, that raises consistency issues: currently the resolution of
> > conflicts between file versions is that of entire file system objects
> > (files or directories). Dealing with fine-grained inconsistency severely
> > complicates the repair process, I would guess; it is not even clear if
> > the client would have access to the whole file version it is attempting
> > to integrate. For disconnected operation anyway, it seems like
> > transferring the whole file is more useful, as the chances are high that
> > if you access a bit of the file, you will access all of it (loading it
> > into emacs, writing it out, etc.).
>
> Whoops, this is a good point. However, the conflict resolution mechanisms
> themselves would use the chunk fetching code, so it need not really be a
> problem.

The problem situation I was thinking of was this: Client1 is connected and
retrieves the middle chunk of a file. A write is made to the middle chunk,
but before it can be written back, Client1 goes disconnected. We now have a
pending write on the middle chunk of a file, but only the middle chunk is on
Client1. Client2 now bops up, proceeds to modify the file in some manner,
and succeeds. Client1 now reconnects. A client-server conflict has arisen
and must be resolved before the change can be reintegrated. However, because
only a small part of the entire file is available on Client1, the resolution
process may now be more difficult. Consider, for example, the case where it
is an MS Word file. An application-specific resolver is required, but it
doesn't have access to the two complete versions of the file; it may not
even have access to the old file header :(.

The whole-file-in-cache approach is a simplification for version control
that I think really does make life easier. On the other hand, chunking would
definitely improve performance (especially perceived performance during a
'more' or the like -- the latency to the first available data is much
lower). Maybe this is an appropriate application for the 'client class'
behavior I suggest below, and that we both seem to agree is a large project
and should wait :(.
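To make the "only the middle chunk is on Client1" point concrete, here is a
minimal sketch in C of what a chunked cache entry might have to track. This
is purely illustrative -- none of these structures or names come from Venus
-- but it shows why an application-specific resolver running on Client1
could not even reach the file header: the cache simply does not hold those
bytes.

/*
 * Hypothetical chunked cache entry (not Coda's actual data structures):
 * it records which byte ranges of a file are valid locally and which of
 * those are dirty, i.e. pending reintegration.
 */
#include <stdio.h>

struct chunk {
    long offset;   /* start of the cached byte range            */
    long length;   /* number of bytes cached                    */
    int  dirty;    /* modified locally, not yet reintegrated    */
};

struct cache_entry {
    long         file_length;  /* length as last known from the server */
    struct chunk chunks[8];    /* cached ranges (toy fixed-size table) */
    int          nchunks;
};

/* A whole-file, application-specific resolver can only run if every
 * byte of the file is cached locally.                               */
static int have_whole_file(const struct cache_entry *ce)
{
    long covered = 0;
    int  i;

    for (i = 0; i < ce->nchunks; i++)
        covered += ce->chunks[i].length;  /* assume non-overlapping ranges */
    return covered == ce->file_length;
}

int main(void)
{
    /* Client1 in the scenario above: only the middle 64KB of a 1MB file
     * is cached, and that chunk is dirty (a write awaits reintegration). */
    struct cache_entry ce = { 1024 * 1024, { { 512 * 1024, 64 * 1024, 1 } }, 1 };

    printf("whole file cached locally: %s\n",
           have_whole_file(&ce) ? "yes" : "no");
    return 0;
}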
> > > I was really hoping to have home directories mounted over coda, with
> > > inbox being stored right in the accounts (and also large procmail
> > > filtered mailing-list archived mail folders), but that won't be
> > > feasible until at least write-back caching is available in a connected
> > > state.
> > >
> > > I just got coda running recently, but the initial excitement has faded
> > > somewhat after discovering the above.. :(
> >
> > My suspicion is that the arrangement you describe will suffer from Coda's
> > weak consistency model: if multiple clients are using write-back caching,
> > then conflicts can occur.
>
> Write back caching will have the same semantics as connected Coda.
> If another client comes along, then the one holding the write back token
> will have to reintegrate first.
>
> Conflicts in Coda arise as easily in connected mode as in AFS you would
> overwrite data (last close wins in AFS). The problem with receiving email
> in Coda is locking to avoid conflicts. I don't know how AFS does this,
> but with NFS it is certainly possible to ruin your mailbox easily.

Token-like behavior for file systems is clearly very nice, and would improve
consistency. However, it is a departure from the traditional Coda
consistency model: with replicated servers, how will tokens be allocated,
and by which server(s)? In Coda, conflicts are more easily come upon than
under AFS's last-close behavior, because of replication. Having an
'AFS-class client' that uses last-close and timestamps to manage conflicts
might result in unexpected but at least non-interactive behavior.

> It's a good puzzle to see if Coda's connected semantics allow for the
> atomic creation of a lock file. Perhaps that is just possible. On the
> other hand, I don't really have much more faith in AFS or NFS without lock
> daemons when it comes to my mail.

I would guess that Coda does not allow atomic creation on a replicated
volume, only on an unreplicated one. Even then, only Venus will know whether
it was atomic and successful; if the client is disconnected, the userland
mail process only sees the lock file creation succeed, and doesn't know it
has merely been logged. Similarly, you might have problems with lock files
being left around: a client is connected, creates a lock file, and then goes
disconnected. This is a lot like that nasty Netscape problem of crashing and
leaving lock files all over the place, only in this case the disconnected
mail client still thinks it has an atomic lock :O. As such, a disconnected
system really needs to support lock preemption, possibly notification, and
certainly verification that a lock is still valid (see the dot-locking
sketch at the end of this message). Perhaps an optional distributed lock
manager could be used with Coda (presumably replicated with strong
consistency in the style of Ubik, or using a multi-party lock algorithm).
Disconnected operation still introduces uncomfortable situations, but at
least connected clients could guarantee locks.

My suspicion is, however, that when there are already specific multi-user
locking semantics for a specific application, that application should be
served by its own replication mechanism and not by a file system with weak
consistency. So replicated IMAP servers might be a better solution, with
IMAP's disconnected operation and reintegration techniques. Or a mail reader
that takes advantage of Coda as a message store with weak semantics.

> > This is not to suggest that Coda is not useful in such an environment;
> > its real benefits come in the case of mobile computing. It might be
> > interesting to introduce the concept of different 'classes' of client:
> > that is, the semantics and consistency enforced for a particular client
> > might depend on the role it was expected to play.
>
> Yup, unfortunately, that's a rather major project probably.

It sounds like it. Ideally I see something like this:

    venus -consistency strong
    venus -consistency afs
    venus -consistency codamobile
    venus -consistency slush

In each case, the strongest consistency available would be used, but the
fallback behavior when it wasn't available would differ. That is, if you
started venus with codamobile, you'd get AFS or strong consistency while
connected, but logging and reintegration when disconnected. With AFS
consistency at startup, you'd get strong or AFS semantics while connected,
and when disconnected either everything hangs or obeys last-write based on
timestamps or something. With strong, you'd get strong consistency or hangs.
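For concreteness, the "atomic creation of a lock file" under discussion is
the usual dot-locking idiom built on open(2) with O_CREAT|O_EXCL. The sketch
below is plain POSIX, nothing Coda-specific; the lock file name and the
five-minute staleness cutoff are made up for illustration, and whether
O_EXCL is really honored atomically on a replicated Coda volume (or over
NFS, where it famously was not) is exactly the question raised above. It
also shows the kind of stale-lock preemption a client would need when a
crashed or disconnected peer leaves its lock file behind.

#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <time.h>
#include <unistd.h>

#define STALE_AFTER (5 * 60)  /* seconds before a lock is presumed abandoned */

/* Try to take the lock atomically: O_CREAT|O_EXCL either creates the
 * file or fails with EEXIST if someone else already holds it.        */
static int take_lock(const char *path)
{
    char buf[32];
    int  fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0644);

    if (fd < 0)
        return -1;                          /* EEXIST: lock already held */
    sprintf(buf, "%ld\n", (long)getpid());  /* record the owner           */
    write(fd, buf, strlen(buf));
    close(fd);
    return 0;
}

/* Break a lock whose owner has apparently gone away -- the "crashed
 * Netscape" / disconnected-client case discussed above.              */
static int break_if_stale(const char *path)
{
    struct stat st;

    if (stat(path, &st) < 0)
        return 0;                           /* nothing there to break */
    if (time(NULL) - st.st_mtime > STALE_AFTER)
        return unlink(path) == 0;           /* preempt the stale lock */
    return 0;
}

int main(void)
{
    const char *lock = "INBOX.lock";        /* hypothetical mailbox lock */

    if (take_lock(lock) == 0)
        puts("got the lock");
    else if (errno == EEXIST && break_if_stale(lock) && take_lock(lock) == 0)
        puts("took over a stale lock");
    else
        puts("mailbox is locked by someone else");

    /* ... deliver or read mail here, then unlink(lock) ... */
    return 0;
}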
Robert N Watson                          robert@fledge.watson.org
                                         http://www.watson.org/~robert/
PGP key fingerprint: 03 01 DD 8E 15 67 48 73 25 6D 10 FC EC 68 C1 1C

Carnegie Mellon University               http://www.cmu.edu/
TIS Labs at Network Associates, Inc.     http://www.tis.com/
SafePort Network Services                http://www.safeport.com/