On Thu, May 17, 2001 at 11:05:51AM +1200, Steve Wray wrote:
> > From: Shafeeq Sinnamohideen [mailto:shafeeq_at_cs.cmu.edu]
> > On Wed, 16 May 2001, Steve Wray wrote:
> >
> > It doesn't matter what kind of FS is used on the server. Only
> > the client
> [snip]
> > > I'm not sure how to interpret your comment about the client...?
> >
> > The client venus cache partition must be on an ext2, reiser, or ramfs
> > partition for it to work. This is because when the Coda kernel module
> > gets a request, it must be able, in the kernel, to forward it to the
> > file system that contains the container file so it can do the
> > operation.
>
> Which is the container? /vicepa?
> Is that the cache partition on the client?
> I'm still groping around the terminology here...

No, /vicepa is on the server. The server doesn't do any tricky stuff,
so it doesn't matter what type of filesystem is used.

On the client, the file data is stored in 'container files', which are
located under /usr/coda/venus.cache/. When an application opens a file
in /coda, the cache manager opens the associated container file and
passes the file descriptor (before 2.4.4 it was device/inode numbers)
back to the kernel. From that point on, all read/write and mmap
operations operate directly on the container file without sending
upcalls to userspace. This allows Coda to achieve the same performance
for read and write calls as the filesystem on which the container
files are stored.

However, the code that redirects the read/write/mmap operations assumes
that the container file can be accessed using the kernel's generic
read/write and mmap implementation. This assumption is valid for at
least ext2fs, reiserfs, and ramfs (I checked and tested these), but it
is broken for filesystems like tmpfs and vfat.
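To make the redirection concrete, here is a minimal sketch in the
style of a 2.4 kernel module. It is not the actual Coda source, and
the detail that the container file is stashed in ->private_data at
open time is an assumption made for the illustration:

    #include <linux/fs.h>

    /* Sketch only, not the real Coda code.  Assume that when venus
     * passed the container file's descriptor back at open time, the
     * module stashed the corresponding struct file in
     * coda_file->private_data. */
    static ssize_t coda_read_sketch(struct file *coda_file, char *buf,
                                    size_t count, loff_t *ppos)
    {
            struct file *cfile = coda_file->private_data;

            /* No upcall to venus; go straight to the container file.
             * Here is the assumption mentioned above: we call the
             * kernel's *generic* read implementation, which is fine
             * for ext2/reiserfs/ramfs but wrong for filesystems like
             * tmpfs or vfat that supply their own read method. */
            return generic_file_read(cfile, buf, count, ppos);
    }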
> On the toy client I was working with, all partitions
> were LVM/XFS except for those holding rvmlog and rvmdata,
> these were separate logical volumes and were unformatted.

LVM (like RAID) shouldn't have any influence because it operates on a
much lower layer, the block layer. If you didn't see any strange
behaviour, especially when writing to files in /coda, XFS must be
using the generic read/write calls.

> > The overall design of Coda assumes that writes are much less
> > frequent than reads, which is the experience from AFS. Thus Coda
> > is less suited for workloads that write heavily.
>
> ohhhhh so you wouldn't want /usr/*/src on it...
> :)

Just consider the fact that most object/dependency files are very
machine specific: CPU type, compiler version, installed libraries,
available include files, etc. I do tend to keep my source in /coda,
but typically have the object/build tree somewhere on the local disk:

    $ cd /home/jaharkes/build/x-obj
    $ /coda/usr/jaharkes/x/configure
    $ make x

All of the Coda sources and many other applications have no problem
with this.

> > Of course, the server doesn't do anything special for the client
> > running on the same machine, only the bulk data transfers go
> > faster across the "network".
>
> ok so WRT populating a coda directory, it doesn't really matter if
> it's done by the server or some client? (performance-wise)

I believe it doesn't make much difference, as long as your network is
100Base-T or faster.

> > > Does this mean that Linux is particularly bad for Coda?
> > > Is this fixable with any tweaking? Different filesystems?
> >
> > On the BSDs, one can place the RVM log file on a raw disk
> > partition, so accesses will not go through a file system.
>
> That's what I did in Linux...

Linux doesn't really provide raw access to the partition; all data
still goes through the page or buffer caches. Stephen Tweedie made
rawio patches for 2.2, but I haven't looked at those.

> > Generally, placing the log file on a separate physical disk will
> > help, since only the log needs to be appended to synchronously,
> > while the data file and /vicepa can be written lazily.
>
> It can be hard to arrange that on LVM...
> :)
> it kinda makes the disks transparent...

We really should do logging differently. For debugging it is useful
to log unbuffered (or to fsync after every fprintf), but for
production use and performance it would be better to allow both libc
and the OS to buffer writes. Performance really hasn't been a concern
yet, and there are probably many places where we can do a lot better
with simple changes, like reducing the amount of unnecessary
information in the logs and removing the fsyncs after fprintfs.
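As a sketch of that idea (not code from the Coda tree; the log_line
name and the DEBUG switch are made up for illustration):

    #include <stdio.h>
    #include <unistd.h>

    /* One log routine, two behaviours.  With DEBUG defined, every
     * line is flushed out of the stdio buffer and forced to disk,
     * so a crash loses nothing.  Without it, libc and the OS are
     * free to buffer, which is much cheaper. */
    static void log_line(FILE *log, const char *msg)
    {
            fprintf(log, "%s\n", msg);
    #ifdef DEBUG
            fflush(log);              /* stdio buffer -> kernel */
            fsync(fileno(log));       /* kernel buffers -> disk */
    #endif
    }

Jan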