Coda File System

From: Jan Harkes <jaharkes_at_cs.cmu.edu> Date: Tue, 29 May 2001 11:20:37 -0400

On Tue, May 29, 2001 at 03:21:46PM +0200, Mourad De Clerck wrote:
> My questions now are:
> * do you think cable/adsl is adequate for such an application, 
> considering the overhead of coda?

It is definitely adequate. I myself am using a slow dialup link, and
several of the graduate students here at CMU are using Coda over ADSL.

They did identify a problem: Coda uses bandwidth estimates to adapt its
behaviour to the available bandwidth. This estimation code is part of
RPC2 and currently assumes a symmetric link. On an asymmetric link such
as ADSL, however, we effectively have both strong and weak connectivity
at the same time (depending on whether the traffic is reads or writes),
so wrong decisions are being made about whether to switch into
write-disconnected or fully connected mode and about how much data
should be reintegrated at a given time.

> * what happens if a file gets saved again on the same machine, before the 
> previous update has finished propagating? (is the file still sent two 
> times?)

When Coda is in write-disconnected mode, all modifying operations, such
as stores, are kept locally for about 5-10 minutes to allow for
optimizations (newer stores overwriting old ones, create/remove
cancellations, etc.). But once a file is being propagated and a new
store arrives, the store in progress is not aborted. So yes, when the
stores are typically more than 5 minutes apart, the file will be sent
multiple times.
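
To illustrate the aging window, here is a minimal sketch of a pending
operation log. It is a toy model, not Coda's actual data structure: the
ModifyLog class, its method names and the 300 second AGING_WINDOW are
assumptions made for this example (the window is just the low end of the
"about 5-10 minutes" above).

```python
from dataclasses import dataclass, field

AGING_WINDOW = 300  # seconds; assumed value, low end of "5-10 minutes"


@dataclass
class ModifyLog:
    """Toy log of pending operations, stored as (timestamp, op, path)."""
    records: list = field(default_factory=list)

    def append(self, now, op, path):
        if op == "store":
            # A newer store of the same file replaces an older pending one.
            self.records = [r for r in self.records
                            if not (r[1] == "store" and r[2] == path)]
        elif op == "remove":
            created_here = any(r[1] == "create" and r[2] == path
                               for r in self.records)
            if created_here:
                # Create followed by remove cancels out completely.
                self.records = [r for r in self.records if r[2] != path]
                return
        self.records.append((now, op, path))

    def take_aged(self, now):
        """Remove and return the records old enough to be propagated."""
        aged = [r for r in self.records if now - r[0] >= AGING_WINDOW]
        self.records = [r for r in self.records if now - r[0] < AGING_WINDOW]
        return aged
```

In this model two stores of the same file 60 seconds apart leave a single
pending record, while a second store that arrives after take_aged() has
already handed out the first one creates a new record, so the file goes
out twice.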

> * should i setup a coda server at each location, or just one at the 
> "main" server (+ maybe a backup) (the vpn has point-to-point links from 
> each node to each node)

One or two servers in a central location is best. Coda servers like to
be "central" with good connectivity. The caches on the clients should
provide a good buffer to reduce most network traffic.

> * since most of the people work on Mac's i need to get the files on 
> their desktop somehow too... i thought doing this by sharing the /coda 
> directory with netatalk. Could this be a problem?

I haven't tried this. We have successfully shared /coda using userspace
nfsd and samba servers. The main problem is when a conflict is detected
during reintegration of the disconnected operations. There is no way for
the clients of the re-exported share to repair such a problem, and they
will be denied access to the involved files (or directory tree) until
someone on the machine that exports /coda repairs the conflict.

> * i've heard that coda doesn't scale so well - is this true?

Scalability is relative. A Coda server can currently handle many
clients, and lots of data, but not a lot of files, or many files per
directory. The limitations that I've seen hit are around 150000-200000
files for every 100MB of configured RVM. Because of 32-bit address space
limitations it is not realistic to run a server with more than 1-2GB of
RVM, in which case the server would also need at least the same amount
of virtual memory (real + swap). Extending this, a Coda server would max
out at around 2 million files. Other problems are likely to hit before
that point, such as excessively long server-startup times. Directories
tend to fill up with as few as 4000 names.
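
The arithmetic behind that estimate can be written out as follows. This
is only a sketch: it assumes the file count scales linearly with RVM
size, and the function name max_files is made up for the example.

```python
FILES_PER_100MB = (150_000, 200_000)  # observed range quoted above


def max_files(rvm_mb):
    """Return the (low, high) file-count estimate for rvm_mb megabytes of RVM."""
    chunks = rvm_mb / 100
    return tuple(int(chunks * n) for n in FILES_PER_100MB)


for rvm_mb in (100, 1000, 2000):
    low, high = max_files(rvm_mb)
    print(f"{rvm_mb:>5} MB RVM: roughly {low:,} to {high:,} files")
```

With 1GB of RVM this gives 1.5 to 2 million files, which is where the
"around 2 million" figure comes from.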

> * can i use coda without problems for home directories? (kind of a 
> "roaming profile")

Yes. However, some files are continually rewritten in the background by
applications and are prone to unwanted conflicts as soon as a user
works from different machines. These are best replaced by symlinks to a
local disk partition. Examples are ~/.netscape/bookmarks and the
configuration files of Enlightenment and several GNOME applications.
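
A small helper along these lines can do the replacement. It is a sketch
only: the function name relocate_to_local and the directory in the
commented usage line are hypothetical, and it assumes the file's parent
directory in /coda already exists.

```python
import os
import shutil


def relocate_to_local(path, local_dir):
    """Move path onto local disk and leave a symlink to it in its place."""
    if os.path.islink(path):
        return os.readlink(path)  # already relocated, nothing to do
    os.makedirs(local_dir, exist_ok=True)
    target = os.path.join(local_dir, os.path.basename(path))
    if os.path.exists(path):
        shutil.move(path, target)  # keep the current contents
    os.symlink(target, path)
    return target


# Hypothetical usage; the local directory is an assumption:
# relocate_to_local(os.path.expanduser("~/.netscape/bookmarks"),
#                   "/var/local/myuser")
```

The trade-off is that the relocated file is no longer shared: each
machine ends up with its own copy on its local disk.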

> * is there a specific filesystem recommendation for use with coda? 
> (ext2, reiser, xfs)

As far as Coda clients are concerned, ext2 has been used the most and
works well. ReiserFS is supported only by recent Linux kernels (2.4.4
and later). I haven't looked at XFS yet, so it could still have
problems.

For Coda servers life is much better, because the server is a
traditional userspace application. Any filesystem will work.

> * what happens when there's not a lot (relatively - compared to the 
> server) of space on the harddrive for the cache?

It depends more on what the client's working set is. When the client
typically needs 100MB, any smaller local cache will cause many file
refetches. My email folders add up to about 30MB; a smaller cache
causes a storm of fetches every 5 minutes when checks for new mail are
performed.
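
The effect is easy to reproduce with a toy cache model. The sketch below
uses plain LRU replacement as a stand-in, which is an assumption and not
a description of Coda's actual cache replacement policy; the function
name and the folder names and sizes in the usage note are made up.

```python
from collections import OrderedDict


def count_fetches(cache_mb, folder_sizes_mb, rounds):
    """Count fetches when every folder is read once per round (LRU cache)."""
    cache = OrderedDict()  # name -> size, least recently used first
    used = 0
    fetches = 0
    for _ in range(rounds):
        for name, size in folder_sizes_mb.items():
            if name in cache:
                cache.move_to_end(name)  # cache hit, mark as recently used
                continue
            fetches += 1  # cache miss, fetch from the server
            while cache and used + size > cache_mb:
                _, evicted_size = cache.popitem(last=False)
                used -= evicted_size
            if size <= cache_mb:
                cache[name] = size
                used += size
    return fetches
```

With three 10MB folders and 5 mail checks, a 30MB cache fetches each
folder once (3 fetches), while a 25MB cache misses on every access
(15 fetches).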

Jan

Re: coda questions...