As a thank you to this mailing list and the Coda developers, I would like to pass on my thoughts and experiences with Coda. I ran Coda for about 8 months on a small office cluster of 4 workstations and one server. I really liked the promise of disconnected operation and looked forward to running a Coda client on my Win98 laptop, but it never sounded stable enough to bother loading it up.

I had three major Coda problems over the 8 months.

The first was due to clock skew. Coda needs the clocks on all machines to be set pretty closely, and we kept getting reconnect conflicts until we started running ntpd. The daylight saving time rollover was really a pain.

The second problem was ongoing: the clients would continually disconnect and reconnect, even on a fast network connection. This left no end of random clients running disconnected. A major problem with Coda is that there is no way for a casual user (the developers on our network) to quickly tell whether they are running connected or disconnected. There should be some obvious alarm when a client disconnects, something like the popup dialogs that UPS software provides when the power fails. Losing your network connection is easily as critical. Anyway, I spent a lot of time helping people get their clients reconnected. (I saw an e-mail on the list which suggested that this problem was fixed in the latest version, but I gave up before I got to try it.)

The final problem was the killer. One day the Coda server core dumped with an assert and wouldn't restart. I fooled around with it for a day and got the server running, and found that read operations from clients would work OK but that the server core dumped again on the first write operation. Anyway, I gave up, copied my files out to an NFS partition, and unloaded Coda from our site. This was the third time I had had such a massive failure in about 6 weeks.
My basic conclusion is that Coda is not usable by anyone other than very dedicated researchers until you get rid of all the asserts in the software and replace them with meaningful error messages. My biggest frustration was trying to track down what a particular assert really meant. One example was trying to install a Coda server on a machine and specifying an RVM data partition that was larger than the available memory on the machine (I had X and some programs running). Instead of saying "out of memory" or any such error message, I got an assert message and spent a couple of hours figuring out what it meant. It is really scary to have your users asking you when the server will be back up when all you have to look at is an assert statement that you have to wait for Jan Harkes (who was always very helpful) to interpret.

Dumping the asserts, adding an alert mechanism to report when clients disconnect, and modernizing the conflict repair mechanism are the three shortcomings that I would suggest working on first to make Coda ready for the real world. As it was, I (a Ph.D. in Computer Science, used to advanced system administration problems) was just spending way too much time keeping it going, and I didn't see how any of my users would ever be able to take over any Coda system administration. When those shortcomings are addressed, I'm ready to give it another try.

Thanks for all your help,
Doug

--
Douglas C. MacKenzie, Ph.D.
Mobile Intelligence Corporation
33150 Schoolcraft Road, Suite 108
Livonia, MI 48150-1646
Voice: +1 734 367-0430
Fax: +1 734 367-0431
Cell: +1 248 225-0288
mailto:doug_at_mobile-intelligence.com
http://www.mobile-intelligence.com

Received on 2001-01-26 16:08:03