Hi!

After having a more or less detailed look into distributed file systems and reading a lot of manuals, I am wondering a little bit about one single point in the current implementations (including Coda): why is there a need for dedicated file servers? I think the system would be easier to use and manage if there were only two types of machines:

- one central server that holds the complete list of files (not the files themselves) and a list of all connected clients
- a lot of clients with caches that hold the files' data

On every read access, the client first tries to serve directly from its cache. If that is not possible, the client cache manager asks the server whether the file exists in the global file tree, and the server returns the client, or the list of clients, where replicas of that file exist. The file is then transferred peer-to-peer between clients, while the server is already free for the next requests.

I think it is best for the server to keep the list of all files as empty directory entries on a local filesystem. Performance should be no worse than NFS in that case, maybe even better, because e.g. under Linux the in-memory filesystem cache would then only hold directory inodes.

For modify or write accesses, there could be a read access first if the file does not exist in the local cache; the server then distributes tags to all clients that hold copies of that file, marking the file as modified so that it is retransferred on the next read/write/modify.

That system could work quite well even on server disconnect, because the local filesystem still works. And you don't need any administration of file servers, or dedicated file servers at all. The server can also keep track of preconfigured minimum and maximum replication counts, in order to minimize both the danger of losing data to a single client's hard disk failure and the danger of overall filesystem overflow (overcommitment).

If I missed any implementation of a distributed file system that has exactly the features mentioned above, please tell me.
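To make the idea concrete, here is a minimal sketch of the read and write paths described above: a central server that only tracks which clients hold a replica of each path, clients that serve reads from a local cache and fetch peer-to-peer on a miss, and write invalidation via "modified" tags. All names (MetadataServer, Client, mark_stale, ...) are hypothetical, in-memory dictionaries stand in for the directory tree, and direct method calls stand in for the RPC a real implementation would need.

```python
class MetadataServer:
    """Central server: knows which clients hold each file, never the data."""

    def __init__(self, min_replicas=2):
        self.replicas = {}            # path -> set of client ids with a copy
        self.min_replicas = min_replicas  # target for re-replication policy

    def lookup(self, path):
        """Return the set of clients holding a replica, or None if unknown."""
        holders = self.replicas.get(path)
        return set(holders) if holders else None

    def register(self, path, client_id):
        """Record that a client now caches a copy of the file."""
        self.replicas.setdefault(path, set()).add(client_id)

    def invalidate(self, path, writer_id, clients):
        """On write: tag every other holder's copy as stale ("modified")."""
        for cid in self.replicas.get(path, set()):
            if cid != writer_id:
                clients[cid].mark_stale(path)
        self.replicas[path] = {writer_id}


class Client:
    """Client node: local cache plus peer-to-peer transfer of file data."""

    def __init__(self, client_id, server, peers):
        self.id = client_id
        self.server = server
        self.peers = peers            # client_id -> Client (p2p links)
        self.cache = {}               # path -> file data
        self.stale = set()            # paths tagged as modified elsewhere

    def mark_stale(self, path):
        self.stale.add(path)

    def serve(self, path):
        """Peer side of a transfer: hand out our cached copy."""
        return self.cache[path]

    def read(self, path):
        # 1. Try the local cache first (unless the copy is tagged stale).
        if path in self.cache and path not in self.stale:
            return self.cache[path]
        # 2. Ask the server which peers hold a current replica.
        holders = self.server.lookup(path)
        if not holders:
            raise FileNotFoundError(path)
        # 3. Fetch peer-to-peer; the server is already free again.
        data = self.peers[next(iter(holders))].serve(path)
        self.cache[path] = data
        self.stale.discard(path)
        self.server.register(path, self.id)
        return data

    def write(self, path, data):
        # Write locally, then have the server tag all other copies stale.
        self.cache[path] = data
        self.stale.discard(path)
        self.server.register(path, self.id)
        self.server.invalidate(path, self.id, self.peers)
```

This is only the happy path: re-replication up to min_replicas, disconnected operation, and concurrent-writer conflicts are exactly the parts where the real design work would be.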
I already tried Coda, and it seems not feasible to install it at our site with over 10 TB of hard disks (4 RAID arrays with 640 GB each, plus about 100 local hard disks with 80 GB each), where each client node is heavily used as a computing server reading a lot of data from disk, and the available system memory of 2 GB per node is mostly used by the computing jobs themselves.

Comments are welcome.

Cheers,
Stephan Kanthak

_____________________________________________________________________________
Dipl.-Inform. Stephan Kanthak                           Tel +49 241 8021618
Department of Computer Science VI, RWTH Aachen, Germany Fax +49 241 8022219
Prof. Dr.-Ing. Hermann Ney   e-mail: kanthak_at_informatik.rwth-aachen.de
http://www.informatik.rwth-aachen.de/I6/Colleagues/kanthak/
_____________________________________________________________________________

Received on 2002-09-04 08:43:44