Coda File System

Re: Somehow I'm missing some docs ...

From: Jan Harkes <jaharkes_at_cs.cmu.edu>
Date: Mon, 2 Apr 2001 15:25:41 -0400
On Mon, Apr 02, 2001 at 07:36:02PM +0200, Garry Glendown wrote:
> During vice-setup I've defined the 8G of data storage - starting venus,
> all I get is a 20M /coda filesystem - is that correct, or should the
> command 
> 
> "- setup a client: venus-setup dig 20000 "
> 
> which is displayed by the setup tool look different? (up to now,
> neither the Howto nor the regular manual have been too explicit about
> certain things - maybe I'm just looking at the wrong places ... :-/ At
> least the docs seem to be a bit outdated) Changing that value and
> restarting venus results in larger /coda filesystems ...

df simply shows the size of the local cache. There really is no way of
telling how much disk space is available, considering that it is
possible to mix singly and multiply replicated volumes.

When all servers store only singly-replicated volumes, the total size
of the cluster could for instance be 24GB (8GB across 3 servers). However,
when everything is triply-replicated, the total available size is only 8GB.
Now, when a server hosts a mix of volumes with different replication
factors, what should it report as 'available'? And how would the
client add up the numbers reported by individual servers so that they
actually make sense?
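The capacity arithmetic above can be sketched as follows (hypothetical
cluster: 3 servers with 8GB of data storage each):

```python
# Hypothetical cluster from the example above.
servers = 3
per_server_gb = 8
raw_gb = servers * per_server_gb  # 24 GB of raw disk in the cluster

def effective_capacity_gb(replication_factor):
    # Each logical block occupies 'replication_factor' raw blocks,
    # one on each server that holds a replica of the volume.
    return raw_gb // replication_factor

print(effective_capacity_gb(1))  # singly replicated: 24
print(effective_capacity_gb(3))  # triply replicated: 8
```

With mixed replication factors there is no single number a server could
report that makes this division unambiguous, which is why df on /coda
does not even try.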

To see how much server space is available, use "cfs lv
/coda/path/to/volume"; this shows how many blocks the given volume could
use up if no other volume hosted by any of the servers that hold a
'replica' of the volume were to grow.

'df' simply reports local cache usage, which is more interesting as far
as the client is concerned. There was some logic behind the Used and
Available numbers, let me see... Used is the number of blocks that
cannot be replaced, i.e. blocks associated with files that have pending
modifications which haven't been propagated to the servers. Available is
the number of blocks that are still completely unallocated. 'df -i'
output shows how many fso's (file system objects) are currently cached.

So "Used" will normally stay very low for fully-connected clients.
However when the client is disconnected, it is a good indication of how
close we are getting to filling up the complete cache with 'pending
modifications'.
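On a client with /coda mounted, the two views described above can be
checked with the following (no sample output shown, since the numbers
depend entirely on the local cache):

```shell
# Block view: Used = dirty blocks pending reintegration,
#             Available = completely unallocated cache blocks.
df /coda

# Inode view: the number of fso's currently held in the cache.
df -i /coda
```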

> Maybe some information on what I'm trying to do ... I want to set up a
> shared area for webservers, kind of a little cluster. As such, I'm
> currently talking about some 1-2G of webspace to be stored in coda and
> made available to one or two backup machines, maybe later used for load
> balancing and the likes ... as such, I'd also like to know what I have
> to watch out for in order to get server initiated services (like Apache)
> to have access to the area, as well as local FTP daemons being able to
> write to it (thus allowing updates to be uploaded from elsewhere)

For read access, simply have an ACL for "System:AnyUser rl" on all
publicly accessible directories.
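For instance, assuming a (hypothetical) web tree mounted at
/coda/example.com/www, the ACL could be set and then checked with:

```shell
# Grant read and lookup rights to anonymous users on a public directory
# (the path is a placeholder; "cfs sa" is short for "cfs setacl").
cfs sa /coda/example.com/www System:AnyUser rl

# Verify the resulting ACL ("cfs la" is short for "cfs listacl").
cfs la /coda/example.com/www
```

Remember that ACLs apply per directory, so every publicly accessible
directory needs the entry.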

For write access you really need a Coda token, because the web/ftp
servers (i.e. Coda clients) can switch to weakly-connected operation,
and this can happen even on 'strong' networks. The switch to weak
operation is commonly triggered by sluggish server responses, which can
be caused by excessive network congestion, server CPU usage, or swap
activity.

Obtaining these tokens is a problem in itself. One way is to have a
cronjob running under the right userid that re-authenticates, but that
requires a clear-text password, so a solution with su is probably
better.

    echo "password" | su ftp -c "clog -pipe anonuser"
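As a sketch, a root crontab entry along these lines (the schedule and
the 'ftp'/'anonuser' names are placeholders for your setup) would
refresh the token periodically:

```shell
# Re-obtain a Coda token for the ftp account twice a day.
# The password sits here in clear text, so protect this crontab
# with appropriately restrictive permissions.
0 */12 * * * echo "password" | su ftp -c "clog -pipe anonuser"
```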

Another solution might be to use a Coda PAM module, which is generally
considered insecure because Coda passwords are not really encrypted and
therefore should not be the same as normal account passwords. But it
does solve the problem of automatically obtaining tokens when a user
logs into the system.

Jan
Received on 2001-04-02 15:27:08