Coda File System

Tracking down ISR deadlock

From: Benjamin Gilbert <bgilbert_at_cs.cmu.edu>
Date: Fri, 06 Jul 2007 14:38:01 -0400
I'm working on tracking down the ISR deadlock that was mentioned in the 
last meeting.  I have a potential interim fix, but I'd rather not apply 
a band-aid until I have a better idea of the cause, and I've had trouble 
reproducing the deadlock.

I'm therefore asking anyone who encounters the deadlock to generate some 
debugging information and send it to me.  If you are able to reproduce 
the deadlock in a consistent (or even intermittent) way, I'd also like 
to hear about it.  It appears to occur under heavy I/O load, and is 
probably aggravated by having a large memory image for the guest OS.


The deadlock has these symptoms:

1.  VMware (and probably your X session) is unresponsive, but it is 
still possible to connect to the host via ssh.

2.  A ps listing shows vulpes in D-state (uninterruptible sleep).

3.  Running
"/usr/lib/openisr/readstats /sys/class/openisr/openisra/states"
shows all of the cache lines in LOAD_META and STORE_META states, and the 
output does not change if you run the command several times.
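
Symptoms 2 and 3 can be checked from the ssh session with something like 
the sketch below.  The function names and the 5-second default interval 
are my own choices for illustration, not part of OpenISR, and the script 
makes no assumptions about the readstats output format: it only compares 
two snapshots for equality, so you still need to look at the states 
yourself.

```shell
#!/bin/sh
# Sketch only: helper names and the sleep interval are illustrative.

# Read "pid stat comm" lines on stdin and print those that belong to
# vulpes and whose state begins with D (uninterruptible sleep).
filter_d_state() {
    awk '$2 ~ /^D/ && $3 == "vulpes" { print $1, $2, $3 }'
}

# Succeed if two readstats snapshots are identical (no progress made).
states_unchanged() {
    [ "$1" = "$2" ]
}

# Run both checks on the live host.  Optional argument: seconds to wait
# between the two readstats snapshots (default 5).
check_symptoms() {
    readstats=/usr/lib/openisr/readstats
    states=/sys/class/openisr/openisra/states

    echo "vulpes processes in D-state:"
    ps -eo pid=,stat=,comm= | filter_d_state

    first=$("$readstats" "$states")
    sleep "${1:-5}"
    second=$("$readstats" "$states")
    if states_unchanged "$first" "$second"; then
        echo "readstats output unchanged; compare it against symptom 3"
    else
        echo "readstats output changed; the cache is still making progress"
    fi
}
```

Source the file and call check_symptoms on the host.  An empty D-state 
list or changing readstats output suggests you are seeing something 
other than this deadlock.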


If you encounter the deadlock, please do the following:

1.  ssh to the host and check that the above conditions are met.

2.  Run "echo w > /proc/sysrq-trigger" as root.

3.  Run "dmesg", and copy and paste the output into a file on the machine 
you're ssh'ing from.

4.  Send me the output of dmesg, along with the version numbers of your 
Linux distribution, the kernel ("uname -r"), VMware Player ("vmplayer 
--version"), and OpenISR ("isr version").
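
If you would rather not copy things by hand, steps 2 through 4 could be 
gathered in one go with a sketch like the following.  The helper names 
are hypothetical, and reading /etc/*release for the distribution version 
is an assumption that may not hold on every distribution; the remaining 
commands are the ones listed above.  It must run as root so that the 
write to /proc/sysrq-trigger succeeds.

```shell
#!/bin/sh
# Sketch only: helper names are illustrative, not part of OpenISR.

# Print a labelled section containing the output of a command.  A
# failing or missing command is noted instead of aborting the report.
section() {
    label=$1
    shift
    printf '== %s ==\n' "$label"
    "$@" 2>&1 || printf '(command failed: %s)\n' "$*"
}

# Write the whole report to stdout.  Run as root on the deadlocked host.
collect_report() {
    # Step 2: ask the kernel to log the blocked (D-state) tasks.
    echo w > /proc/sysrq-trigger || return 1

    # Step 3: the kernel log, which now includes the task dump.
    section "dmesg" dmesg

    # Step 4: version information.
    section "distribution" sh -c 'cat /etc/*release'   # assumed location
    section "kernel" uname -r
    section "VMware Player" vmplayer --version
    section "OpenISR" isr version
}
```

Source the file and call collect_report, redirecting stdout to a file.  
Running it through ssh from your own machine and redirecting there puts 
the report on the machine you're ssh'ing from, as in step 3.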


Thanks
--Benjamin Gilbert
Received on 2007-07-06 15:48:28