(Illustration by Gaich Muramatsu)
Hi all,

The hassle with libdb is about to come to an end.

It is impossible to keep supporting the 'standard' libdb 1.85, which is being deprecated very aggressively. We already have to pull in special 'oldlibs' or 'compat' packages, and now even the development libraries and headers are being dropped.

Newer libdb's from Sleepycat Software are not very confidence inspiring, beginning with how aggressively they've been 'unsupporting' the old db1 format: changing the names of the header files (db.h -> db1.h -> db185.h) and of the library (libdb.so, libdb1.so, libdb1-compat, etc.), and sometimes providing a wrapper around db2 under the old library name, which results in databases that cannot be read by systems that still use 1.85.

Just one look at the configure.in test for a compatible libdb is enough to give instant headaches, and it doesn't even cover all cases...

At the moment my system has libdb2, libdb3, and libdb4.0. What are the differences? Are there just API differences, or file format changes as well? And even if a libdb4.0 application can read a db2 file, will a libdb2-linked application still be able to read it after the newer library has updated it? And a lot of the newly introduced functionality, like transactions, is really not needed by most applications.

Too many questions, so I've been looking around at how other projects are dealing with this situation. And they are dealing badly. Samba built their own trivial database library (tdb), Enlightenment came up with libedb, some projects went back to gdbm, etc.

I finally decided that Coda's needs are both incredibly limited and very specialized. We currently only use libdb for accessing the user and group membership information.

- We only need a simple 'name/uid' to 'object' mapping; a trivial hash-based lookup table is more than sufficient.
- Predominantly read access, very infrequent writes.
- A concurrent readers, single writer model is not a problem at all.
- Databases need to be shareable across heterogeneous systems by a simple file copy across the network (updateclnt/updatesrv).
- Can't use thread libraries, as they conflict with LWP.

Based on these criteria, most of the libraries I looked at were not applicable. The only promising exception was Dan Bernstein's CDB. It has a very simple file layout and is extremely efficient for lookups, even when dealing with large datasets. And the basic file format has been stable since 1996. The only problem was that it required a bunch of separate tools to convert a read/write 'master copy' into the efficiently indexed 'constant database' format.

I started off with a read-only implementation written according to the specifications. This ended up at around 370 lines of code, and it is the part of the code that is used most of the time. Adding write support bumped the total size to about 716 lines. The generated databases are fully compatible with the original CDB specifications; they can be read with the official tools, and vice versa.

This 'rwcdb' library has been committed to CVS (coda/lib-src/rwcdb). The sources are released under the LGPL, as it might be as useful for other projects as it is for Coda.

To prepare for the format change, it is best to keep an exported version of the user and group databases alongside the prot_{index,user}.pdb files. Simply run 'pdbtool export coda.users coda.groups', and remember to re-export whenever the pdb databases are updated, i.e. when new users or groups are added. Once the new format becomes active, the new database can be built with 'pdbtool import coda.users coda.groups'.

I already have the necessary changes for pdbtool, auth2 and codasrv ready, and will probably commit them to CVS later this week.

Jan

Received on 2003-01-28 02:08:00