This is a message about improvements I have made to the RVM handling, with a view to dramatically increasing our debugging capability.

Introduction:
=============

Coda manages persistent meta data using an RVM-based heap allocation mechanism, which combines transactions and persistence with the usual "malloc/free" metaphor. After initializing RVM you can persistently allocate on the RDS heap as follows:

    rvmlib_begin_transaction(restore _or_ no_restore)
    p = rvmlib_rec_malloc(sizeof(vnode))
    copy stuff to *p
    rvmlib_end_transaction(flush _or_ no_flush)

The vnode is now persistently in the RVM segment. Of course one should also store "p" somewhere in order to find the vnode again upon restart, but I kept it simple.

Problem:
========

If you free twice in RVM or commit other sins, the RVM package will likely assert only in rvmlib_end_transaction. How do you trace the assertion back to your sins?

Solution:
=========

I have hacked for almost two days to get our RVM under control. I found it next to impossible to track memory problems before, but now I think I have got a handle on it.

Let me remind you that rvmlib_rec_free does a _fake_ deallocation; only upon committing the transaction is the memory actually released by rds_do_free. The problem this causes is that a wrong guard on the region is detected only much later: not when rvmlib_rec_free is called, but sometimes dozens of procedures and 5 files later.

I did the following to get past these problems. rvmlib_rec_{malloc,free} are now macros which optionally (start with "venus -rdstrace") print out the file and line number at which they were invoked.

Here is an example, the one which "locates" my bug.

    grep rdstrace /usr/coda/venus.cache/venus.log

creates output with entries:

    rdstrace: rec_malloc addr 2115aa8c size 20 file /home/braam/ss-dir/coda-src/venus/fso_dir.cc line 175
    .....
    rdstrace: rec_free addr 2115aa8c file /home/braam/ss-dir/coda-src/ndir/dir.c line 1271
    rdstrace: rec_free addr 2115aa8c file /home/braam/ss-dir/coda-src/venus/fso1.cc line 2084
    ....

Finally, when the transaction ends, rds_do_free also prints out what it is doing:

    rdstrace: start do_free
    rdstrace: addr 0x2115cfcc size 40
    rdstrace: addr 0x2115cf8c size 40
    rdstrace: addr 0x2115cf4c size 40
    rdstrace: addr 0x2115cf0c size 40
    rdstrace: addr 0x2115cecc size 40
    rdstrace: addr 0x2115ce8c size 40
    rdstrace: addr 0x2115cb4c size 40
    rdstrace: addr 0x2115cb0c size 40
    rdstrace: addr 0x2115ca8c size 40
    rdstrace: addr 0x2115ca4c size 40
    rdstrace: addr 0x2115c9cc size 40
    rdstrace: addr 0x2115c98c size 40
    rdstrace: addr 0x2115c94c size 40
    rdstrace: addr 0x2115c90c size 40
    rdstrace: addr 0x2115cacc size 40
    rdstrace: addr 0x2115b6cc size 40
    rdstrace: addr 0x2115b38c size 40
    rdstrace: addr 0x2115a24c size 840
    rdstrace: addr 0x2115aa8c size 40
    .. CRASH .. (assertion in rds_do_free)

Clearly we crashed when we wanted to free 0x2115aa8c for the second time. The above locates exactly where I committed my double free sins.

Part of this code was built by David Steere, but I don't think he got it into this sort of shape.

Let's hope that we will find lots of bugs now!

- Peter -

Received on 1998-04-02 20:58:54
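
For readers who want to see the tracing idea from the message in isolation, here is a minimal sketch in plain C. It is not the Coda or RVM source: it uses ordinary malloc/free instead of the RDS heap, and every name in it (traced_malloc, traced_free, commit_frees, REC_MALLOC, REC_FREE, trace_enabled, MAX_PENDING) is hypothetical. It only illustrates the two points above: macros that record the call site with __FILE__ and __LINE__, and a deferred free whose double-free error would otherwise surface only at commit time.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical sketch; none of these names come from Coda or RVM. */
#define MAX_PENDING 64

static int trace_enabled = 1;        /* stands in for a switch like -rdstrace */
static void *pending[MAX_PENDING];   /* frees queued until the "commit" */
static int npending = 0;

/* Allocate and, if tracing is on, log the call site passed in by the macro. */
static void *traced_malloc(size_t size, const char *file, int line)
{
    void *p = malloc(size);
    if (trace_enabled)
        fprintf(stderr, "trace: rec_malloc addr %p size %lu file %s line %d\n",
                p, (unsigned long)size, file, line);
    return p;
}

/* "Fake" free: log the call site and queue the address, release nothing yet.
 * Returns 0 on success, -1 if the queue is full. */
static int traced_free(void *p, const char *file, int line)
{
    if (trace_enabled)
        fprintf(stderr, "trace: rec_free addr %p file %s line %d\n",
                p, file, line);
    if (npending == MAX_PENDING)
        return -1;
    pending[npending++] = p;
    return 0;
}

/* Commit: release the queued blocks. Duplicates are found before anything
 * is freed, counted, and skipped; the count of double frees is returned. */
static int commit_frees(void)
{
    int doubles = 0;
    assert(npending >= 0 && npending <= MAX_PENDING);

    for (int i = 0; i < npending; i++) {
        if (pending[i] == NULL)
            continue;
        for (int j = i + 1; j < npending; j++) {
            if (pending[j] == pending[i]) {
                pending[j] = NULL;   /* mark the repeated request */
                doubles++;
            }
        }
    }
    for (int i = 0; i < npending; i++) {
        if (pending[i] != NULL) {
            if (trace_enabled)
                fprintf(stderr, "trace: do_free addr %p\n", pending[i]);
            free(pending[i]);
        }
    }
    npending = 0;
    return doubles;
}

/* The macros are what capture the caller's file and line. */
#define REC_MALLOC(size) traced_malloc((size), __FILE__, __LINE__)
#define REC_FREE(p)      traced_free((p), __FILE__, __LINE__)
```

Calling REC_FREE twice on the same pointer and then commit_frees() makes the commit report one double free, and the two "rec_free" trace lines written earlier to stderr show which call sites requested it. Unlike the behaviour described in the message, this sketch skips the repeated free rather than asserting.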