If I had gotten past the fsck issue, I would have been setting up Coda for the first time, and I'd like to give you a laundry list of points that came up in planning my installation. If your goal is to have Coda and InterMezzo used widely, attention to the problems of the generic sysadmin, and particularly the distro builder, is important.

For reference, this is coda-5.3.19, lwp-1.9, rpc2-1.13, rvm-1.6, installed on Linux, SuSE 7.3 (kernel 2.4.16) with heimdal-0.4d. I used the module (can't tell what version) that came with the kernel, rather than linux-coda-5.2.3 from the web site.

Some users on the mailing list use the traditional client-server model, as with AFS or NFS, while others have a "peer to peer" organization. I planned to have a Coda server on my laptop and another on the base machine, with data migrating according to whichever one I used most recently. The docs imply that either model will work.

Since I'm in a hostile environment (wireless network shared with students), I planned to use Kerberos with Coda. However, my distro comes with Heimdal-0.4d, and your code (coda-src/auth2/krbsupport.c) wouldn't compile. Here are the complaints. I'm not a Kerberos expert, and I don't know what the MIT API says, so I don't know whether to blame you or Heimdal.

At line 353, struct krb5_keyblock (=EncryptionKey) has no member named "length" nor "contents". I get the impression from reading the header file that the Heimdal people want you to use a macro to access these.

At line 443, struct krb5_ticket has no member named "enc_part2". (The struct in heimdal/krb5.h is nice, neat and lacking that data member.) Perhaps there's an accessor function for it.

There's a hardcoded constant K5KINIT, mentioned in the docs, giving an absolute path to kinit. Better to call it using execvp, or at least have a ./configure option for setting it (and for engaging Kerberos support as a whole). I wonder what it is used for?
The auth daemon shouldn't call kinit on behalf of a user, and its own key should be in a keytab.

I tried recompiling using Kerberos-4, but Heimdal (at least what's in my distro) doesn't seem to support compiling Kerberos-4 clients, although the server can serve them according to the docs. So I omitted Kerberos entirely, for testing. My plan was to dump Heimdal and install MIT Kerberos later, once I had gained confidence that Coda was working out.

I found the paper about security in the docs, but I couldn't ascertain whether the presently implemented Coda has client-server encryption turned on by default or by a conf file option. I assume this is independent of Kerberos. The legal status of encryption has changed a lot, for the better, since that paper was written. Be the first on your block to use Rijndael :-)

The location of the runtime data needs to be straightened out. /usr/coda is not acceptable; /usr mostly belongs to the distro, and will be on a readonly partition stuffed with system binaries. Small amounts of writeable data, like log and PID files, belong in /var/log and /var/run, respectively. I prefer to have small amounts of static config files in /etc (preferably in their own directory, /etc/coda) so I can back up /etc easily, without having to handle /usr/coda/etc as a special case that also has to skip the commingled log files.

For config files, I prefer key value pairs, one per line, with a provision for comments. The location of all databases and writeable files (like the log and PID files) should be settable there. See /etc/ssh/ssh_config or /etc/httpd/httpd.conf. There should be a command line argument on each daemon so files in odd locations can be accommodated. I notice your new environment variable for this purpose, and that's good, but the -f option is more traditional.

It's really hard to understand the Venus config file, and some important parameters can't be set there.
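
To make the config file suggestion concrete, here is a minimal sketch in Python of the kind of parser I have in mind: key value pairs, one per line, "#" comments, and a -f option to point at a file in an odd location. The key names, the default paths, and /etc/coda/venus.conf are all hypothetical, chosen only to match the layout proposed above; none of them are taken from Coda.

```python
import argparse
from pathlib import Path

# Hypothetical defaults following the layout proposed above;
# these are NOT the paths or key names Coda actually uses.
DEFAULTS = {
    "logfile": "/var/log/venus.log",
    "pidfile": "/var/run/venus.pid",
    "cachedir": "/var/coda/venus.cache",
}


def parse_config(text):
    """Parse 'key value' lines. '#' starts a comment; blank lines are skipped.

    Keys are case-insensitive. A literal '#' inside a value is not supported
    in this sketch, because everything after '#' is treated as a comment.
    """
    settings = dict(DEFAULTS)
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        parts = line.split(None, 1)
        if len(parts) != 2:
            raise ValueError(f"line {lineno}: expected 'key value', got {raw!r}")
        key, value = parts
        settings[key.lower()] = value.strip()
    return settings


def load_settings(argv):
    """Read the file named by -f (hypothetical default path) and parse it."""
    parser = argparse.ArgumentParser()
    parser.add_argument("-f", dest="config", default="/etc/coda/venus.conf")
    args = parser.parse_args(argv)
    path = Path(args.config)
    # A missing file simply yields the built-in defaults.
    text = path.read_text() if path.exists() else ""
    return parse_config(text)
```

A daemon would call load_settings(sys.argv[1:]) at startup and take every writeable location from the result, so the sysadmin never has to patch a compiled-in path.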

The first time through the installation instructions I missed where the prewritten startup scripts (coda-src/scripts/venus.init etc.) were located. SuSE has a lot of nice features in their startup scripts (supporting Linux Standard Base) which I would need to put in.

I would need to research how to rotate the log files for the auth and update daemons and for Venus. Killing them to rotate the log seems to take them out of service. Venus, at least, seems quite chatty, so the log file would need to be rotated at least daily. You provide a rotation function for the server, but I have my own style of rotation, criteria, and retention period, and I would suggest letting the sysop handle that, at least optionally. Methods of detecting a rotated log file:

a. The traditional SIGHUP.

b. Stat the file by name, and if the inode number has changed (or the file can't be statted), close and re-open it by name. You would want to do this at most once per minute -- bypass if done recently.

c. Stat the file by FD; perhaps a mode bit could have a magic meaning. This is kind of cowboy programming. Or if the ctime changes, it means the file has potentially been rotated. But that precludes this good rotation procedure:

    cp -p log log.new         # Preserving permissions
    cp /dev/null log.new
    ln -f log log.0           # Changes ctime, hiss, boo
    mv log.new log            # Actually switching log files
    mv log.1 log.2            # However many you want to keep
    mv log.0 log.1

This procedure would work with method b above.

Some "cowboy programming" styles of log file rotation are helped if the log file is opened in append mode, so you can do "cp /dev/null logfile" and it will actually get rewound. I didn't check if you already do this.

"Monthly tapes are saved for eternity" (Admin manual, sect. 12.4).
It's important to have a finite retention period to limit the amount of work you have to do in response to a subpoena, and it's important to formally adopt the policy and publish it (giving your adversaries constructive notice), to cover your ass in case of obstruction of justice charges. Think "Arthur Andersen".

It's important to be able to start a server in the isolated state, and to let in clients and/or peer servers one by one. For major repairs, the docs suggest starting the server and then setting it isolated, by which time partners have already noticed it and data has started to flow. For repairs, you will want to have only your own client interacting with the server.

The procedure for flushing the client cache seems unreliable. Here's a protocol suggestion:

a. The servers maintain a "cache version number" which is incremented when the cache should be flushed.

b. On reconnection, clients are told the current number, and they flush if they have the wrong one.

c. When an emergency is declared, the servers poll the clients recently heard from, tell them the new number, and make them flush. But the servers won't deliver any files until repairs are finished.

Making changes to a replicated readonly volume is very intimidating. It would seem feasible to turn the readonly state of a replicated group on and off, so most of the time the clients can elide checking for changes, but dynamically (when the sysadmin is installing content) they can be made to do it.

I noticed a 30 second timeout in one of the header files. When I'm shutting down my two systems for the night, I would prefer not to have to wait 30 seconds for the second one to realize that the first one has gone down, before it too can exit. Does a server that's exiting or becoming isolated poll all recently heard from clients and servers, and positively notify them that it's going down? Similarly, Venus should notify the servers that it's disconnecting.
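
The cache version number protocol suggested above (steps a to c) can be modeled in a few lines. This is only a toy sketch in Python under my own assumptions: the class and method names are invented for illustration, the "network" is a direct method call, and nothing here reflects how Venus or the Coda servers are actually structured.

```python
class Server:
    """Toy model of a server holding the cache version number (step a)."""

    def __init__(self):
        self.cache_version = 1
        self.in_repair = False
        self.recent_clients = []  # clients recently heard from

    def connect(self, client):
        # Step b: on (re)connection the client is told the current number.
        if client not in self.recent_clients:
            self.recent_clients.append(client)
        client.learn_version(self.cache_version)

    def declare_emergency(self):
        # Step c: bump the number, poll recent clients, refuse to serve files.
        self.cache_version += 1
        self.in_repair = True
        for client in self.recent_clients:
            client.learn_version(self.cache_version)

    def finish_repairs(self):
        self.in_repair = False

    def fetch(self, name):
        if self.in_repair:
            raise RuntimeError("server under repair; no files delivered")
        return f"contents of {name}"


class Client:
    """Toy model of a client that flushes when its number is wrong."""

    def __init__(self):
        self.cache_version = None
        self.cache = {}

    def learn_version(self, version):
        if version != self.cache_version:
            self.cache.clear()  # flush: our cached data may be stale
            self.cache_version = version
```

The point of the design is that a client which was unreachable during the emergency still flushes the next time it reconnects, because step b compares numbers rather than relying on the client having received the poll in step c.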

The script MAKECODA mentioned in the manual was not found in any of the component distribution files (lwp, rpc2, rvm, coda).

For RVM data and log, the docs mention that a plain file can be used (for trying it out), but certain consistency checks have to be omitted. On Linux you can preallocate a plain file (dd if=/dev/zero of=plainfile count=1440) and then "mount" it using "losetup", causing a block device /dev/loop0 or /dev/loop/0 (with devfs) to appear. You can have several of these. My plan was to try using a loopback device for RVM, to get the advantage of not having to repartition my disc before having committed to Coda, while also getting the missing consistency checks.

The kernel module needs a version string built in. I'm not sure whether I prefer noisy or silent modules. Probably a developmental module should syslog (printk) its identity and version when loaded.

The docs mention that vutil assumes that certain files are in /usr/coda/venus.cache. Since there's a vutil.in, where @prefix@ is substituted, it should be easy to similarly adjust the writeable data area. I would recommend /var/coda/venus.cache, except the PID should be in /var/run and the log file should be in /var/log.

One of the doc files has a section that says "you need LWP and RPC2 and...". I suggest you include the documentation package in the list.

I was surprised to find no per-cell subdirectory under /coda, like AFS has. I have separately administered nets at home and at work, and I would frequently want to have both mounted at the same time. Adding the "multi-homed" capability to Venus should get a high priority.

James F. Carter          Voice 310 825 2897    FAX 310 206 6673
UCLA-Mathnet;  6115 MSA; 405 Hilgard Ave.; Los Angeles, CA, USA  90095-1555
Email: jimc@math.ucla.edu    http://www.math.ucla.edu/~jimc (q.v. for PGP key)

Received on 2002-05-12 19:29:10