
Getting Started

What is Coda?

Coda is a distributed file system: it makes files available to a collection of client computers as part of their directory tree, while maintaining the authoritative copy of the file data on servers. Several features make Coda stand out. It supports disconnected operation, i.e. full access to a cached section of the file space during voluntary or involuntary network or server outages, and it automatically reintegrates the changes made on disconnected clients when they reconnect. Coda also has read-write, failover server replication: data is stored on and fetched from any of a group of servers, and Coda continues to operate when only a subset of the servers is available. If server differences arise due to network partitions, Coda resolves them automatically to the maximum extent possible and helps users repair what cannot be resolved automatically. Coda is organized very differently from NFS and Windows/Samba shares, but it has many similarities to AFS and DCE/DFS.

Getting clued in with the Coda terminology

A single name space. All of Coda appears under a single directory /coda on the client (or under a single drive under Windows). Unlike NFS and Samba, Coda does not have separate exports or shares that are mounted individually. Under /coda the volumes (aka file sets) of files exported by all the servers in your Coda realm are visible (see Figure 1). Coda finds servers automatically; all a client needs to know is the name of one bootstrap server, which tells it how to find the root volume of a Coda realm.

Figure 1: Coda file-system view at a workstation.

graph TD
  root((/));
  root --- usr;
  root --- bin;
  root --- etc;
  root --- tmp;
  root --- lib;
  root --- afs;
  root --- nfs;
  root --- misc[…];
  root --- coda;
  local([Local Files]);
  usr --- local;
  bin --- local;
  etc --- local;
  tmp --- local;
  lib --- local;
  afs --- shared1([Shared Files]);
  nfs --- shared2([Shared Files]);
  coda --- shared4([Shared Files]);

Coda realm. A Coda realm is a group of servers sharing one set of configuration databases (a single administrative domain). A realm can consist of a single server or up to hundreds of servers. One server is designated as the SCM - the System Control Machine. It is distinguished by being the only server modifying the configuration databases shared by all servers, and propagating such changes to other servers.

Coda volumes. File servers group the files in volumes. A volume is typically much smaller than a partition and much larger than a directory. Volumes have a root and contain a directory tree with files. Each volume is "Coda mounted" somewhere under /coda and forms a subtree of the /coda namespace. Volumes can contain mountpoints of other volumes. A volume mountpoint is not a Unix mountpoint or Windows drive - there is only one drive or Unix mountpoint for Coda. A Coda mountpoint contains enough information for the client to find the servers which store the files in the volume. The group of servers serving a volume is called the Volume Storage Group (VSG) of the volume.

Volume mountpoints. One volume is special: the root volume of a Coda realm, which Coda mounts at /coda/<realm domain name>/. Other volumes are grafted into the /coda tree using cfs mkmount <directory> <volume>. This command installs a volume mountpoint in the Coda directory tree; its effect is similar to mkdir mountpoint; mount device mountpoint under Unix. The two arguments to cfs mkmount are the name of the mountpoint and the name of the volume to be mounted. Coda mountpoints are persistent objects, unlike Unix mountpoints, which need reinstating after a reboot.
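As a rough illustration of how mountpoints graft volumes into one namespace, the Python sketch below maps a path under /coda to the volume that holds it by choosing the deepest matching mountpoint. The realm name example.org, the volume names, and the resolve helper are all hypothetical, and a flat table is only a simplification: the real client discovers mountpoints while walking directories.

```python
from pathlib import PurePosixPath

# Hypothetical mountpoint table: path of each mountpoint -> volume name.
MOUNTPOINTS = {
    "/coda/example.org": "root.vol",
    "/coda/example.org/usr": "usr.vol",
    "/coda/example.org/usr/raiff": "usr.raiff",
}


def resolve(path):
    """Return (volume, path inside the volume) for a path under /coda."""
    p = PurePosixPath(path)
    # Try the path itself, then each parent, so the deepest mountpoint wins.
    for candidate in [p, *p.parents]:
        volume = MOUNTPOINTS.get(str(candidate))
        if volume is not None:
            return volume, str(p.relative_to(candidate))
    raise LookupError(f"{path} is not inside a mounted Coda volume")
```

In this model a file in a user's home directory resolves to the user's own volume, while a sibling directory without its own mountpoint stays in the enclosing volume.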

Data storage. Unlike NFS or Samba, the servers do not store and export volumes as directories in the local disk filesystem. Coda needs much more metadata to support server replication and disconnected operation, and it uses complex recovery mechanisms that are hard to implement within a local disk filesystem. Coda servers store file data, identified by a number, in a directory tree in /vicepa. The metadata (owners, access control lists, version vectors) and directory contents are stored in a memory-mapped RVM data file or partition.

RVM. RVM stands for Recoverable Virtual Memory. RVM is a transaction-based library that makes part of a process's virtual address space persistent on disk and commits changes to this memory atomically to persistent storage. Coda uses RVM to manage its metadata. This data is stored in an RVM data file, which is mapped into memory upon startup. Modifications are made in memory and also written to the RVM log file when a transaction commits. The RVM log file contains committed data that has not yet been incorporated into the RVM data file on disk.
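The relationship between the in-memory copy, the log file, and the data file can be pictured with a toy model. The Python sketch below is not the RVM API: the class and method names are invented for illustration, dictionaries stand in for files, and staging changes until commit is a simplification of what a real implementation does.

```python
class ToyRecoverableStore:
    """Toy model of an RVM-style store; not the real RVM API."""

    def __init__(self):
        self.data_file = {}   # stands in for the RVM data file on disk
        self.log_file = []    # stands in for the RVM log file on disk
        self.memory = {}      # the mapped, in-memory copy
        self._pending = {}    # changes made by the open transaction

    def set(self, key, value):
        # Changes are staged until the transaction commits.
        self._pending[key] = value

    def commit(self):
        # Committing appends the changes to the log and applies them in memory.
        self.log_file.append(dict(self._pending))
        self.memory.update(self._pending)
        self._pending = {}

    def truncate_log(self):
        # Fold committed log records into the data file, then empty the log.
        for record in self.log_file:
            self.data_file.update(record)
        self.log_file = []

    def recover(self):
        # After a crash: reload the data file and replay the log on top of it.
        self.memory = dict(self.data_file)
        for record in self.log_file:
            self.memory.update(record)
        self._pending = {}
```

The point of the model is that a committed change survives a crash even before it reaches the data file, because recovery replays the log, while an uncommitted change is simply lost.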

Client data. Data on the client is stored in a similar way: metadata is kept in RVM (typically in /usr/coda/DATA) and cached files are stored by number under /var/lib/coda/cache. The client cache is persistent and contains copies of files from the server. It gives the client quicker access to data and allows access to files when the client is not connected to the server.

Validation. When Coda detects that a server is reachable again, it validates cached data before using it to make sure the cached data is the latest version of the file. To do so, Coda compares the version information (version vector) cached with each object against the versions held by the servers.
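A version vector comparison can be sketched in a few lines of Python. The representation used here (a dictionary mapping a server name to an update count) and the result labels are assumptions made for illustration, not Coda's internal format.

```python
def compare_version_vectors(cached, server):
    """Compare two version vectors given as {server_name: update_count}."""
    names = set(cached) | set(server)
    cached_ahead = any(cached.get(n, 0) > server.get(n, 0) for n in names)
    server_ahead = any(server.get(n, 0) > cached.get(n, 0) for n in names)
    if cached_ahead and server_ahead:
        return "conflict"      # concurrent updates; neither copy dominates
    if server_ahead:
        return "stale"         # the server copy is newer; refetch it
    if cached_ahead:
        return "cache newer"   # e.g. local changes not yet reintegrated
    return "valid"             # identical vectors; the cached copy is current
```

Only the "valid" outcome lets the cached copy be used as is; the "conflict" outcome corresponds to updates made independently on both sides.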

Authentication. Coda manages authentication and authorization through a token. As with a Windows share (although the details are very different), Coda requires users to log in. During the login process, the client acquires a session key, or token, in exchange for a correct password. The token is associated with a user identity; at present this Coda identity is the uid of the user performing the login.

Protection. To grant permissions the cache manager and servers use the token with its associated identity and match this against privileges granted to this identity through access control lists (ACL). If a token is not present, anonymous access is assumed, for which permissions are again granted through the access control lists using the System:AnyUser identity.

Organization of the client

The kernel module and the cache manager

Like every filesystem, a computer enabled to use the Coda filesystem needs kernel support to access Coda files. Coda's kernel support is minimal and works in conjunction with a userspace cache manager venus. User requests enter the kernel, which will either reply directly or ask the cache manager venus to assist in service.

Typically the kernel code is in a kernel module, which is either loaded at boot time or dynamically loaded when venus is started. venus will also mount the Coda filesystem on /coda.

Utilities

A variety of small utilities is provided to manipulate ACLs, the cache, volume mountpoints and possibly the network behaviour of a Coda client. The most important one is the cfs command.

There is also a clog command to authenticate to the Coda authentication server. The codacon command allows one to monitor the operation of the cache manager, and cmon gives summary information about a list of servers.

Server organization

The main program is the Coda fileserver codasrv. It is responsible for doing all file operations, as well as volume location service.

The Coda authentication server auth2 handles requests from clog for tokens, and changes of password from cpasswd and au. Only the auth2 process on the SCM will modify the password database.

All servers in a Coda realm share the configuration databases in /vice/db and retrieve them from the SCM when changes have occurred. The updateclnt program is responsible for retrieving such changes, and it polls the updatesrv process on the SCM to see if anything has changed. Sometimes the SCM needs a (non-shared) database from another server to update a shared database. It fetches this from an updatesrv process on that server using updatefetch.

On the server there are utilities for volume creation and management. These utilities consist of several shell scripts and the volutil command. There is also pdbtool to manipulate the user and group databases.

Authentication

Once you are logged in to your workstation, you need to get a Coda authentication token by running clog. clog will prompt you for your Coda password and use it to get a token from the authentication server. This token will expire in about 25 hours. After the token expires, you must use clog to obtain a new token for another 25 hours.

The following is an example of running clog twice. The first time, the wrong password was entered:

$ clog
Password:
Invalid login (RPC2_NOTAUTHENTICATED (F))
$ clog
Password:
$

To see your newly acquired token, use ctokens. This will display the tokens and their expiration time.

$ ctokens
Token held by the Cache Manager:

Local uid: 9010
Coda user id: 9010
Expiration time: Thu Apr  6 18:51:35 2000
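If you want a script to warn you before a token runs out, the expiration line can be parsed. The Python sketch below assumes the line has exactly the form shown in the example above and that month and weekday names are in English; the token_time_left helper is hypothetical and is not part of Coda.

```python
from datetime import datetime

# Timestamp layout of the example line, e.g. "Thu Apr  6 18:51:35 2000".
CTIME_FORMAT = "%a %b %d %H:%M:%S %Y"


def token_time_left(expiration_line, now):
    """Return the time remaining for an 'Expiration time:' line."""
    label, _, stamp = expiration_line.partition(":")
    if label.strip() != "Expiration time":
        raise ValueError("not an expiration line")
    expires = datetime.strptime(stamp.strip(), CTIME_FORMAT)
    return expires - now
```

A negative result means the token has already expired and clog must be run again.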

Use the cpasswd command to change your Coda password. As with passwd, cpasswd will prompt for your current password, then ask you to enter a new password twice.

$ cpasswd
Changing password for raiff
Old password:
New password for raiff:
Retype new password:
Password changed, it will be in effect in about 1 hour

You can "log out" of Coda by using the cunlog command to tell venus to forget your tokens. Once you run cunlog, you will have the same privileges as an anonymous Coda user until you acquire a new authentication token.

Coda File Protection

Coda provides a close approximation to UNIX protection semantics. An access control list (ACL) controls access to directories by granting and restricting the rights of users or groups of users. An entry in an access list maps a member of the protection domain into a set of rights. A user's rights are determined by the rights of all the groups of which he or she is a direct or indirect member. In addition to the Coda access lists, the three owner bits of the file mode are used to indicate readability, writability, and executability. You can use chmod to set these permissions on individual files. Coda rights are given as a combination of rlidwka where:

  • r - Read allows the user to read any file in the directory.
  • l - Lookup allows the user to obtain status information about the files in the directory. An example is to list the directory contents.
  • i - Insert allows the user to create new files or subdirectories in the directory.
  • d - Delete allows the user to remove files or subdirectories.
  • w - Write allows the user to overwrite existing files in the directory.
  • k - Lock. The lock right is obsolete and only maintained for historical reasons.
  • a - Administer allows the user to change the directory's access control list.

Coda also has negative rights, which deny access. Any of the normal rights listed above can also be negative.

Access control lists are managed with the cfs command with the listacl and setacl options. They can be abbreviated as la and sa respectively. To see the access control list of any directory in a Coda file system, use cfs la. The following example displays the current directory's ACL:

$ cfs la .
  System:AnyUser  rl
           raiff  rlidwka

The displayed list shows that the user "raiff" has all of the access rights possible on the directory and that the group System:AnyUser has read and lookup privileges. System:AnyUser is a special Coda group that includes all users.

A second example shows another group, System:coda. Anyone who is a member of the group will have the group's access rights:

$ cfs la /coda
     System:coda  rlidwka
  System:AnyUser  rl

Use cfs sa to change or set a directory's access control list. Options to cfs sa include -negative to assign negative rights to a user and -clear to clear the access list completely before setting any new access rights. You can also use all or none to specify all rights or no rights respectively.

To remove System:AnyUser access to the current directory, you would issue the following command:

cfs sa . System:AnyUser none

To give System:AnyUser read and lookup rights, use:

cfs sa . System:AnyUser rl

To deny rights to a user, use the -negative switch:

cfs sa -negative . baduser rl

This will deny baduser read and lookup rights, even though every other user has these rights. Note that negative rights are maintained separately from the normal rights, so to reinstate baduser's read and lookup access, you must use:

cfs sa -negative . baduser none

If you omit the -negative switch, then baduser will still be denied read and lookup access.
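The interaction between normal and negative rights can be modeled in a few lines of Python. This is a sketch of the semantics described above, not Coda's implementation: the effective_rights helper and the dictionary layout are assumptions made for illustration.

```python
ALL_RIGHTS = "rlidwka"


def effective_rights(user, groups, positive, negative):
    """Return the rights a user holds on a directory, in rlidwka order.

    positive and negative map a user or group name to a rights string.
    groups lists every group the user belongs to, directly or indirectly.
    """
    granted = set()
    denied = set()
    for name in [user, *groups]:
        granted |= set(positive.get(name, ""))
        denied |= set(negative.get(name, ""))
    # A negative right removes the right no matter who granted it.
    return "".join(r for r in ALL_RIGHTS if r in granted and r not in denied)
```

Because the two lists are kept apart, adding a normal entry for baduser changes only the positive side; the negative entry keeps removing r and l until it is itself set to none.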

Disconnected Operation

If all of the servers that an object resides on become inaccessible, then the client will use the cached copy of the object (if present) as a valid replica. When the client does this, it is operating in disconnected mode.

Disconnected mode may be the result of a network failure or it could be the result of intentionally removing a laptop from the network. If you make sure all of the files you want to use are cached on your laptop, you can travel with it and access your files as if you were still on the network.

Unfortunately, a cache miss while operating in disconnected mode is not maskable, and you will get a Connection timed out error message. Coda allows you to mark (hoard) files with caching priorities to help keep the ones you want in the cache.

When you are in disconnected mode, you may want to checkpoint the client modify log (CML) that Coda keeps to track which files and directories have changed while disconnected. Use cfs checkpointml to do this.

Checkpointing the modify log will ensure that changes you have made will not be lost if the cache manager crashes severely. Coda uses this modify log when it re-integrates with the servers.
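To make the role of the modify log concrete, here is a toy Python model of recording changes while disconnected, checkpointing them, and replaying them at reintegration. The class, the operation names store and remove, and the JSON checkpoint format are all invented for illustration and do not reflect the cache manager's real log format; conflict handling is left out.

```python
import json


class ToyModifyLog:
    """Toy stand-in for a client modify log; not the real format."""

    def __init__(self):
        self.records = []

    def record(self, op, path, data=None):
        # Each disconnected change is appended in the order it happened.
        self.records.append({"op": op, "path": path, "data": data})

    def checkpoint(self):
        # Serialize the log so that it could be saved somewhere safe.
        return json.dumps(self.records)

    @classmethod
    def restore(cls, saved):
        log = cls()
        log.records = json.loads(saved)
        return log

    def reintegrate(self, server_files):
        # Replay the logged changes, in order, against the server's files.
        for rec in self.records:
            if rec["op"] == "store":
                server_files[rec["path"]] = rec["data"]
            elif rec["op"] == "remove":
                server_files.pop(rec["path"], None)
        self.records = []
        return server_files
```

A log restored from a checkpoint replays exactly the same changes as the original, which is why a checkpoint protects work done while disconnected.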

Coda adapts easily to low-bandwidth connections such as PPP or SLIP modem links. You can use such a connection to periodically reintegrate changes or cache new files when you are on a trip.

When you reintegrate after operating in disconnected mode, run codacon to see the progress of your reintegration.

If reintegration was not successful, the files that you modified will be put in a tar file in /var/lib/coda/spool/<uid>. Reintegration fails, for example, when you modified a file in disconnected mode while someone else also modified that file on the servers. Read Reintegrating After Disconnection for more information on reintegration.

Hoarding

Coda allows you to advise the cache manager, venus, of critical files that it should try to keep in the cache. You indicate the relative importance of the files by assigning priorities to them. This is known as hoarding. venus maintains an internal hoard database of these files. Hoarding a file helps to ensure that it will be available when operating in disconnected mode. See the hoard.1 manual page and the Constructing a hoard file and Hoarding for a Weekend sections in this document for an example of how to set up your hoard database. A convenient way of setting up your hoard database is by creating a file with commands for hoard. This file is known as a hoard file.
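The effect of priorities on a cache of limited size can be sketched as follows. This Python model is an assumption-laden simplification: the choose_cached_files helper is hypothetical, sizes are abstract units, and a real cache manager may weigh other factors, such as recent use, alongside the hoard priority.

```python
def choose_cached_files(files, hoard_priorities, capacity):
    """Pick which files to keep in a cache of limited size.

    files maps path -> size; hoard_priorities maps path -> priority
    (higher means more important); unhoarded files get priority 0.
    """
    # Highest priority first; ties are broken by path for a stable order.
    ranked = sorted(files, key=lambda p: (-hoard_priorities.get(p, 0), p))
    kept, used = [], 0
    for path in ranked:
        if used + files[path] <= capacity:
            kept.append(path)
            used += files[path]
    return kept
```

With three equally sized files and room for two, the two hoarded files are kept and the unhoarded one is the first to go.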

Repairing Conflicts

As a result of Coda's optimistic replica management, object replicas can conflict between servers. A conflict arises when the same object is updated in different partitions of a network. For instance, suppose a file is replicated in two locations (say, serverA and serverB). If these two sites become partitioned and a user on each side of the partition updates the file (userA updates the file on serverA while userB updates the file on serverB), the file will be in conflict when the partition ends. Conflicts may also arise at the end of disconnected operation.

Coda guarantees conflict detection at the first request for an object when both servers are accessible. When a conflict is detected, Coda attempts to perform automatic conflict resolution. In simple cases, the conflict will be resolved automatically, a process which is transparent to the user except for a time delay in accessing the object. However, in more difficult cases, automatic conflict resolution fails and the object is marked in conflict. File system calls on an object which is in conflict fail with the same error code as if the object were a dangling, read-only symbolic link (usually, File not found (ENOENT)). The conflict must be resolved by a user who has appropriate access to the object. To help users resolve conflicts, Coda provides a repair tool which is discussed in Repairing an Inconsistent Directory.