On Wed, Aug 30, 2006 at 10:06:11AM +0200, michel.brabants_at_euphonynet.be wrote:
> * 2 GB file-limit on linux. I've read that there is a patch for windows,
> but the question is how long before a stable patch will be merged with the
> linux-code?

It is not a separate patch, it is already part of the Windows release.

Bigfile support is interesting, it required absolutely no changes to venus. As far as Coda is concerned it is still doing its normal whole-file caching thing, it just has to deal with a bunch of smaller files. The kernel module then transparently merges these files together into a single large image and presents it to the application as if it were a single multi-GB file. It triggers prefetching on uncached chunks that follow the one the application is currently reading.

A Linux implementation most likely wouldn't end up in the Coda kernel module, but could take the form of a preloaded library that intercepts some syscalls. Alternatively it could be a modification to the loopback device driver, something based on the device-mapper, or something using FUSE. There are other filesystems that don't have large file support.

> * The faulty re-integration you mentioned that happens from time to time
> it seems. This is a bad one or can it be detected and solved manually (at
> least it is detected and solvable then). I read about a coda-rewrite, is
> this already done in coda 6.x?

It is pretty bad, but really unusual. I have about 3 clients running in write-disconnected mode for at least the past year and it hasn't happened to any of them. But a week or two ago it hit one of the graduated students from our group who was trying to use Coda over a trans-pacific connection that has very variable bandwidth and latency. Not sure if the bad connectivity is somehow related.

There is no Coda rewrite except for the ongoing development where we replace and extend existing bits and pieces.
Well, there was Intermezzo, which was supposed to provide 90% of Coda's functionality in 10% of the code. But the Intermezzo developers moved on to work on Lustre. It is kind of a pity that nobody has picked up the orphaned Intermezzo codebase, it was an interesting approach.

> I want to add one last thing. I read that if a client is in disconnected
> mode, 2x the space of files is/can be used because of keeping a copy of
> the latest connected-version or so? Maybe I didn't get it and maybe you do
> it already, but couldn't you for example only specify the blocks (on the
> filesystem) that have changed? Maybe there are better solutions.

Yes and no, and possibly. When a file is written we keep a copy of that write around for reintegration. This way, if someone rewrites the file before a reintegration, he cannot 'corrupt' the version we're about to send to the servers. If the new write completes before we have started reintegration, the old version is optimized away, as it would get overwritten by the new version anyway.

We can't track block-level differences when the new write starts, because there aren't any yet. We could possibly discard the old version when the new write operation starts, but if the client crashes before the new write has completed we would end up with neither version of the file.

Btw, I don't think the cache accounting actually tracks these backup copies, so Coda could use more than the configured cache space, but it shouldn't discard cached files because of the backup copies. You would also only use 2x the space if you actually modify every single file in your cache during a disconnection.

> ASR seems to be nice. Reintegration of subversion-repositories could maybe
> be done by using subversion-merge or so in my case.

Yes, technically the ASR could merge diverging versions of a subversion repository as if one of the two created a branch at the point where both started to diverge.
The nice thing about version-control archives is that you actually have complete history, which makes it possible to discover such a branch point. Not sure how to handle clients that have a checked-out working tree from the version that used to be the trunk but has now been moved off to a branch, though. (Another thing I like about git: there is no branch/trunk distinction, so clients end up not caring.)

Jan

Received on 2006-08-30 10:41:10
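
For anyone wondering what the bigfile chunk handling described earlier in this message amounts to, here is a rough sketch of the arithmetic only. It is not taken from the Windows kernel module. The chunk size, the prefetch window, and the function names are all made up for illustration, since the message gives none of them.

```python
# Hypothetical chunk size; the message does not say what the real one is.
CHUNK_SIZE = 64 * 1024 * 1024


def chunks_for_read(offset, length, chunk_size=CHUNK_SIZE):
    """Split a byte range of the large image into per-chunk pieces.

    Returns a list of (chunk_index, offset_within_chunk, nbytes) tuples,
    one for every smaller file that the read touches.
    """
    pieces = []
    end = offset + length
    while offset < end:
        index, within = divmod(offset, chunk_size)
        nbytes = min(chunk_size - within, end - offset)
        pieces.append((index, within, nbytes))
        offset += nbytes
    return pieces


def prefetch_candidates(current_chunk, total_chunks, cached, window=2):
    """Return uncached chunks that follow the one currently being read.

    'window' (how far to look ahead) is an assumption of this sketch.
    """
    last = min(current_chunk + 1 + window, total_chunks)
    return [i for i in range(current_chunk + 1, last) if i not in cached]
```

A read that crosses a chunk boundary turns into one piece per chunk file, and the prefetch list is just the next few chunk indices that are not in the cache yet. Each chunk is still fetched and cached as a whole file, which is why venus would not need to know about any of this.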