On Wed, May 07, 2003 at 11:25:30PM -0700, Steve Simitzis wrote:
> i came home from dinner tonight, and found several pages of this in
> the logs:
>
> 0x360f0a08 : fid = ((0x7f000001.0x1fffe.0x17dc2)), comp = sepia03.jpg, vol = 35180988
...
> followed by a bunch of stats:
>
> VFS Operations
> Operation Counts Times
> Root : 0 [ 0 0 0] : 0.0 ( 0.0)
> OpenByFD : 0 [ 0 0 0] : 0.0 ( 0.0)
> Open : 13357503 [ 162 30 1] : 1.1 ( 79.9)
> Close : 13356008 [ 0 0 73] : 0.2 ( 36.6)
> [...]
>
> then it dies.

Actually it died before all of this got dumped. This is output from the
'VenusPrint' function, which is called when something is terribly wrong.

> [ F(05) : 0000 : 23:17:59 ] ***** FATAL SIGNAL (11) *****
>
> it died right away:
>
> 23:17:59 Fatal Signal (11); pid 11743 becoming a zombie...
> 23:17:59 You may use gdb to attach to 11743
>
>
> i attached gdb to the zombie, and here's what i found:
>
> 0x420292e5 in sigsuspend () from /lib/i686/libc.so.6
> (gdb) where
> #3 0x4002b30b in coalesce (tid=0x8103810, err=0x15093ec0) at rds_coalesce.c:71
> #4 0x4002bbbe in rds_do_free (list=0x811b754, mode=no_flush) at rds_free.c:204
> #5 0x080c7365 in strcpy ()
> #6 0x08085fd6 in strcpy ()

All these strcpy frames are probably the result of the venus binary
having no symbols (i.e. it was stripped), so gdb just reports the
nearest symbol it can find.

What the trace does show is that we crashed while trying to defragment
RVM after a chunk was freed. This could be something like a double
free: perhaps venus crashed earlier, after freeing some RVM memory but
before destroying the object that referenced it.

If I remember correctly, RVM free operations use their own transaction,
so we can't commit the free and the removal of the referencing pointer
in a single operation. Whenever venus crashes between the two, we can
end up with either this case (a double free) or a memory leak, though I
would rather have the memory leak (venus still starts) than the double
free (venus needs a reinit).

Jan

Received on 2003-05-08 10:07:56
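
The two outcomes described above (double free versus leak) depend on which of the two separately committed steps lands before the crash. The following is a minimal toy sketch of that ordering problem, not Venus or RVM code: `PersistentHeap` and `simulate` are hypothetical names invented for illustration, and each method call stands in for one independently committed transaction.

```python
class PersistentHeap:
    """Toy stand-in for a recoverable heap (hypothetical, not the RVM API).

    Each method call models one transaction that commits on its own.
    """

    def __init__(self):
        self.allocated = set()
        self.next_id = 1

    def alloc(self):
        chunk = self.next_id
        self.next_id += 1
        self.allocated.add(chunk)
        return chunk

    def free(self, chunk):
        if chunk not in self.allocated:
            raise RuntimeError(f"double free of chunk {chunk}")
        self.allocated.remove(chunk)


def simulate(free_first):
    """Crash between the free and the pointer removal, then recover."""
    heap = PersistentHeap()
    obj = {"data": heap.alloc()}  # persistent object referencing a chunk

    # Only the first of the two steps commits before the simulated crash.
    if free_first:
        heap.free(obj["data"])    # chunk freed; obj["data"] now dangles
    else:
        obj["data"] = None        # pointer cleared; chunk never freed

    # After restart, recovery destroys the half-removed object.
    try:
        if obj["data"] is not None:
            heap.free(obj["data"])
    except RuntimeError:
        return "double free"
    return "leak" if heap.allocated else "clean"


if __name__ == "__main__":
    print("free committed first:   ", simulate(free_first=True))
    print("pointer cleared first:  ", simulate(free_first=False))
```

In this model, clearing the pointer first turns a crash into a leak, which is the less harmful of the two outcomes named above. Whether venus can actually order the operations that way is not something this sketch establishes.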