Mention was made of using Ubik to replicate Protection database entries across multiple sites. I'd venture to suggest that we not re-implement Ubik, and instead use a slightly different protocol for replication.

The problem with Ubik is that it makes it difficult to consider storing PTS, Backup and VLDB data in anything other than record-oriented, DBM-style data stores. AFS and DFS sites have a lot of their administrative data trapped inside fast, reliable but unwieldy databases, with not-so-simple APIs to program against when automating cell operation. Much of the data stored in the Protection Server, Backup Server and Fileset Location database would profit from being stored in a SQL server, or in LDAP-browseable directories. Ubik makes it difficult to engineer such a solution easily.

I think a better solution would be to use D2PC (distributed two-phase commit). To do this, each database site would export a set of methods on the database (through an RPC interface). In addition, each database site would implement a local transaction manager. In the case of BSD DB 2.0, we already have this. In the case of a SQL server, we can issue a "BEGIN TRANSACTION" statement to the SQL backend. In other cases, we can rely on an implementation of logging.

In this scenario, the "Sync Site", which is elected by the Ubik algorithm, would be responsible for acting as the transaction coordinator for each participating replication site. That is, the sync site would generate a new global transaction ID (trid) for each operation. The sync site would then contact each participating replication site and ask it to begin a new transaction. From that point on, all insert, update and delete operations that are fanned out to each site are covered by that trid. The sync site can then tell all participating sites to commit or abort. If anything fails along the way, everyone rolls back.
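To make the flow above concrete, here is a minimal in-memory sketch in Python. Everything in it is hypothetical: the class names ReplicaSite and SyncSite, the method names, and the fail_on_prepare switch are made up for illustration and do not correspond to Ubik, DB 2.0, or any existing RPC interface. A real site would put its local transaction manager (DB 2.0, a SQL "BEGIN TRANSACTION", or a log) behind these methods and would record its votes and the final decision durably.

```python
import itertools


class ReplicaSite:
    """Hypothetical replication site with a tiny in-memory transaction manager."""

    def __init__(self, name, fail_on_prepare=False):
        self.name = name
        self.data = {}        # committed key -> value
        self.pending = {}     # trid -> list of staged operations
        self.fail_on_prepare = fail_on_prepare  # test hook: force an abort vote

    def begin(self, trid):
        self.pending[trid] = []

    def apply(self, trid, op, key, value=None):
        # Operations are only staged; nothing is visible until commit.
        self.pending[trid].append((op, key, value))

    def prepare(self, trid):
        # Phase 1 vote: True means "ready to commit".
        return not self.fail_on_prepare

    def commit(self, trid):
        # Phase 2: make the staged operations visible.
        for op, key, value in self.pending.pop(trid):
            if op == "delete":
                self.data.pop(key, None)
            else:             # "insert" and "update" both store the value
                self.data[key] = value

    def abort(self, trid):
        # Roll back by discarding whatever was staged under this trid.
        self.pending.pop(trid, None)


class SyncSite:
    """Hypothetical coordinator: hands out trids and drives both phases."""

    def __init__(self, sites):
        self.sites = sites
        self._trids = itertools.count(1)

    def run(self, ops):
        trid = next(self._trids)
        started = []
        try:
            for site in self.sites:
                site.begin(trid)
                started.append(site)
            for op in ops:                      # fan each operation out
                for site in self.sites:
                    site.apply(trid, *op)
            if not all(site.prepare(trid) for site in self.sites):
                raise RuntimeError("a site voted to abort")
        except Exception:
            for site in started:                # everyone rolls back
                site.abort(trid)
            return False
        for site in self.sites:
            site.commit(trid)
        return True
```

With two healthy sites, SyncSite([a, b]).run([("insert", "user:jrd", 1001)]) returns True and both sites end up holding the same record. If any site votes no in prepare, run returns False and no site's committed data changes.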
Finally, in the case of recovery, a replication site that joins a quorum can be recovered by simply comparing database version numbers, and then replaying the sync site's log to the recovering site over LSNs [k,k'], where the last-known version number at the crashed site was k-1.

The distinction is this: Ubik exposes the interface to raw storage over the network, hiding the details of the transaction and recovery. The D2PC approach abstracts the problem to transactions that cover method invocations on objects.

I don't think it would be hard to build a mini TP monitor to do this. It would be a fun and exciting project for someone who wants to learn more about distributed transactions. In addition, I know that Margo and Keith would be very interested in a mini two-phase commit implementation for DB 2.0.

-- Jim

+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Jim Doyle                     Boston University Information Technology
Systems Analyst/Programmer    email: jrd_at_bu.edu
Distributed Systems           tel. (617)-353-8248
-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-++--+-+-+-+-+-+-

Received on 1998-02-15 22:39:25