Coda is a distributed file system, i.e. it makes files available to a collection of client computers as part of their directory tree, but ultimately maintains the authoritative copy of the file data on servers. Coda has some features that make it stand out. It supports disconnected operation, i.e. full access to a cached section of the file space during voluntary or involuntary network or server outages, and it automatically reintegrates the changes made on disconnected clients when they reconnect. Coda also has read/write, failover server replication, meaning that data is stored on and fetched from any of a group of servers, and Coda continues to operate when only a subset of all servers is available. If server differences arise due to network partitions, Coda resolves them automatically to the maximum extent possible and helps users repair what cannot be resolved automatically. Coda is organized very differently from NFS and Windows/Samba shares, but it has many similarities to AFS and DCE/DFS.
All of Coda appears under a single directory /coda on the client (or under a single drive under Windows). Unlike NFS and Samba, Coda does not have different exports or shares that are individually mounted. Under /coda the volumes (aka file sets) of files exported by all the servers (living in your Coda cell) are visible. Coda automatically finds servers; all a client needs to know is the name of one bootstrap server that tells it how to find the root volume of Coda.
A Coda cell is a group of servers sharing one set of configuration databases. A cell can consist of a single server or up to hundreds of servers. One server is designated as the SCM, the System Control Machine. It is distinguished by being the only server that modifies the configuration databases shared by all servers and propagates such changes to the other servers. At present a Coda client can belong to only a single cell. We hope to add a cell mechanism to Coda whereby a client can see files in multiple cells.
File servers group the files in volumes. A volume is typically much smaller than a partition and much larger than a directory. Volumes have a root and contain a directory tree with files. Each volume is "Coda mounted" somewhere under /coda and forms a subtree of /coda. Volumes can contain mountpoints of other volumes. A volume mountpoint is not a Unix mountpoint or Windows drive: there is only one drive or Unix mountpoint for Coda. A Coda mountpoint contains enough information for the client to find the server(s) that store the files in the volume. The group of servers serving a volume is called the Volume Storage Group of the volume.
One volume is special: the root volume, which Coda mounts on /coda. Other volumes are grafted into the /coda tree using cfs mkmount. This command installs a volume mountpoint in the Coda directory tree, and in effect its result is similar to mkdir mountpoint ; mount device mountpoint under Unix. When invoking cfs mkmount, the two arguments given are the name of the mountpoint and the name of the volume to be mounted. Coda mountpoints are persistent objects, unlike Unix mountpoints, which need reinstating after a reboot.
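The way a path under /coda crosses volume mountpoints can be sketched roughly as follows. This is a toy in-memory model, not Coda's actual data structures: the Volume class, the resolve function, and all volume and server names are hypothetical and exist only to illustrate how a mountpoint leads the client from one volume to the next and to that volume's Volume Storage Group.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Volume:
    """Hypothetical stand-in for a Coda volume."""
    name: str
    servers: list[str]  # the Volume Storage Group
    # Maps a path relative to this volume's root to the name of the mounted volume.
    mountpoints: dict[str, str] = field(default_factory=dict)


def resolve(path: str, volumes: dict[str, Volume], root: str) -> tuple[Volume, str]:
    """Return the volume holding `path` and the path relative to that volume's root."""
    parts = [p for p in path.split("/") if p]
    if not parts or parts[0] != "coda":
        raise ValueError("not a Coda path: " + path)
    volume = volumes[root]
    relative: list[str] = []
    for part in parts[1:]:
        relative.append(part)
        target = volume.mountpoints.get("/".join(relative))
        if target is not None:
            # Crossing a Coda mountpoint: continue inside the mounted volume.
            volume = volumes[target]
            relative = []
    return volume, "/".join(relative)


# Made-up volumes: "users" is mounted at /coda/usr, "u.alice" at /coda/usr/alice.
volumes = {
    "root": Volume("root", ["srv1"], {"usr": "users"}),
    "users": Volume("users", ["srv1", "srv2"], {"alice": "u.alice"}),
    "u.alice": Volume("u.alice", ["srv2", "srv3"]),
}
vol, rel = resolve("/coda/usr/alice/notes.txt", volumes, "root")
print(vol.name, vol.servers, rel)
```

In this model the whole tree hangs off the single /coda entry point, and each mountpoint only records which volume to continue in, which mirrors the statement that there is just one Unix mountpoint or Windows drive for all of Coda.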
The servers do not store and export volumes as directories in the local disk filesystem, as NFS and Samba do. Coda needs much more metadata to support server replication and disconnected operation, and it has complex recovery, which is hard to do within a local disk filesystem. Coda servers store files identified by a number, typically all under a directory /vicepa. The metadata (owners, access control lists, version vectors) and directory contents are stored in an RVM data file, which would often be a raw disk partition.
RVM stands for Recoverable Virtual Memory. RVM is a transaction-based library that makes part of the virtual address space of a process persistent on disk and commits changes to this memory atomically to persistent storage. Coda uses RVM to manage its metadata. This data is stored in an RVM data file, which is mapped into memory upon startup. Modifications are made in virtual memory and also written to the RVM LOG file upon committing a transaction. The LOG file contains committed data that has not yet been incorporated into the data file on disk.
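The interplay between the data file and the LOG file can be illustrated with a small toy model. This is only a sketch of the general idea (a data file loaded at startup, committed changes appended to a log, and the log later folded into the data file). It is not the RVM library's API: the ToyRVM class, its methods, and the JSON encoding are all assumptions made for illustration, and it ignores details such as torn log records.

```python
from __future__ import annotations

import json
import os
import tempfile


class ToyRVM:
    """Toy model of a recoverable memory segment: a data file plus a log file."""

    def __init__(self, data_path: str, log_path: str):
        self.data_path = data_path
        self.log_path = log_path
        self.memory: dict[str, str] = {}
        # Startup: load the data file, then replay committed log records.
        if os.path.exists(data_path):
            with open(data_path) as f:
                self.memory = json.load(f)
        if os.path.exists(log_path):
            with open(log_path) as f:
                for line in f:
                    self.memory.update(json.loads(line))

    def commit(self, changes: dict[str, str]) -> None:
        """Append one transaction to the log, then apply it in memory."""
        with open(self.log_path, "a") as f:
            f.write(json.dumps(changes) + "\n")
            f.flush()
            os.fsync(f.fileno())
        self.memory.update(changes)

    def truncate(self) -> None:
        """Fold the logged changes into the data file, then empty the log."""
        tmp = self.data_path + ".tmp"
        with open(tmp, "w") as f:
            json.dump(self.memory, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, self.data_path)
        open(self.log_path, "w").close()


workdir = tempfile.mkdtemp()
data, log = os.path.join(workdir, "DATA"), os.path.join(workdir, "LOG")
rvm = ToyRVM(data, log)
rvm.commit({"file-1.owner": "alice"})
recovered = ToyRVM(data, log)  # simulate a restart before the log is truncated
print(recovered.memory)
```

The restart in the last lines shows why the log matters: the committed change survives even though the data file has not been updated yet.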
Client data is stored somewhat similarly: metadata is kept in RVM (typically in /usr/coda/DATA) and cached files are stored by number under /usr/coda/venus.cache. The cache on a client is persistent. It contains copies of files on the server, gives the client quicker access to data, and allows access to files when the client is not connected to the server.
When Coda detects that a server is reachable again, it validates cached data before using it, to make sure the cached data is the latest version of the file. Coda compares the version stamps cached with each object against the version stamps held by the server.
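The comparison can be pictured with a minimal sketch. It assumes, purely for illustration, that each object has a string identifier and that a version stamp is a plain integer; the stale_objects function and the identifiers are hypothetical, and the real client and server exchange richer version information than this.

```python
from __future__ import annotations


def stale_objects(cached: dict[str, int], server: dict[str, int]) -> list[str]:
    """Return the ids of cached objects whose stamp differs from the server's.

    An object missing on the server is also reported, since its cached
    copy can no longer be confirmed as current.
    """
    return sorted(
        obj for obj, stamp in cached.items()
        if server.get(obj) != stamp
    )


# Made-up object ids and stamps.
cached = {"fid-1": 7, "fid-2": 3, "fid-3": 5}
server = {"fid-1": 7, "fid-2": 4}
print(stale_objects(cached, server))  # ['fid-2', 'fid-3']
```

Objects that pass the check can be served from the cache as before, and only the stale ones need to be fetched again.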
Coda manages authentication and authorization through a token. Much as with a Windows share (although the details are very different), Coda requires users to log in. During the login process, the client acquires a session key, or token, in exchange for a correct password. The token is associated with a user identity; at present this Coda identity is the uid of the user performing the login.
To grant permissions, the cache manager and servers use the token with its associated identity and match it against the privileges granted to this identity in access control lists (ACLs). If a token is not present, anonymous access is assumed, for which permissions are again granted through the access control lists using the System:AnyUser identity.
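A minimal sketch of this lookup follows. The rights_for function, the user names, and the single-letter rights are illustrative assumptions, and the sketch further assumes that an authenticated user also receives whatever System:AnyUser is granted. It ignores groups and any other ACL features.

```python
from __future__ import annotations

ANY_USER = "System:AnyUser"


def rights_for(token_identity: str | None, acl: dict[str, str]) -> set[str]:
    """Return the rights an ACL grants to the holder of a token.

    `token_identity` is None when no token is present (anonymous access).
    """
    # Everyone, authenticated or not, starts with the System:AnyUser rights
    # (an assumption of this sketch).
    rights = set(acl.get(ANY_USER, ""))
    if token_identity is not None:
        rights |= set(acl.get(token_identity, ""))
    return rights


# Hypothetical ACL: one letter per right.
acl = {"System:AnyUser": "rl", "alice": "rlidwk"}
print(sorted(rights_for(None, acl)))     # anonymous access
print(sorted(rights_for("alice", acl)))  # access with alice's token
```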
As with every filesystem, a computer needs kernel support to access Coda files. Coda's kernel support is minimal and works in conjunction with the userspace cache manager, Venus. User requests enter the kernel, which either replies directly or asks Venus to assist in servicing the request.
Typically the kernel code is in a kernel module, which is either loaded at boot time or dynamically loaded when Venus is started. Venus will even mount the Coda filesystem on /coda.
To manipulate ACLs, the cache, volume mountpoints, and possibly the network behaviour of a Coda client, a variety of small utilities is provided. The most important one is the cfs command.
There is also a clog program to authenticate to the Coda authentication server. The codacon program allows one to monitor the operation of the cache manager, and the cmon program gives summary information about a list of servers.
The main program is the Coda fileserver, codasrv. It is responsible for all file operations, as well as the volume location service.
The Coda authentication server, auth2, handles requests from clog for tokens, and password changes from au and cpasswd. Only the auth2 process on the SCM will modify the password database.
All servers in a Coda cell share the configuration databases in /vice/db and retrieve them from the SCM when changes have occurred. The updateclnt program is responsible for retrieving such changes, and it polls the updatesrv on the SCM to see if anything has changed. Sometimes the SCM needs a (non-shared) database from another server to update a shared database. It fetches this through an updatesrv process on that server using updatefetch.
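The polling pattern described above can be sketched as a single poll step. This is not how updateclnt and updatesrv are implemented: the poll_once function, the database names, and the use of integer version numbers to detect changes are all assumptions chosen to keep the example short.

```python
from __future__ import annotations


def poll_once(local: dict[str, tuple[int, bytes]],
              scm: dict[str, tuple[int, bytes]]) -> list[str]:
    """Copy every database whose version on the SCM is newer than the local copy.

    Both arguments map a database name to a (version, content) pair.
    Returns the names of the databases that were refreshed.
    """
    updated = []
    for name, (version, content) in scm.items():
        if name not in local or local[name][0] < version:
            local[name] = (version, content)
            updated.append(name)
    return updated


# Database names and version numbers below are made up for illustration.
scm = {"volumes": (3, b"v3"), "servers": (1, b"s1")}
local = {"volumes": (2, b"v2"), "servers": (1, b"s1")}
print(poll_once(local, scm))  # only "volumes" is out of date
print(poll_once(local, scm))  # nothing left to fetch
```

A real client would repeat this step on a timer. Because only the SCM modifies the shared databases, changes flow in one direction and the non-SCM servers never need to push anything back.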
On the server there are utilities for volume creation and management. These utilities consist of shell scripts and the volutil command. There is also a tool to manipulate the protection databases.