On Mon, Jan 16, 2012 at 01:42:17PM -0500, Greg Troxel wrote:
> Are you running with DIAGNOSTIC?

Just out-of-the-box, whatever it is...

> What provokes the crash?

Unfortunately I don't know - simple/few operations do not. The crash
happens when I am running scripts fetching and starting many binaries
and mmap()-ing a lot of libraries. The actual test was an attempt to
start vncserver, which runs among others bash, perl, Xvnc with its
libraries and a number of X clients (unsure how far it gets, probably
far from having started all of the programs involved).

> linux binary reading a directory in coda

Does not crash on its own.

> or
>
> linux binary in coda being run, not trying to read data from coda

Cannot exclude this, but I assume that people are running Linux
binaries on NetBSD (without Coda) successfully. Thus - hardly.

> or
>
> you have to do both?

I actually do both, as my binaries are Linux-ABI and are on Coda.
Note though that the majority of the file system accesses are not
directory/file enumeration but plain open(), chdir(), execve(), that
is, the lookup is done implicitly in the kernel.

I wonder if it can be the concurrency (which the scripts do trigger,
a shell pipeline means several parallel execve()s) combined with the
"unexpected" (Coda) file system latency, which can trigger races.
I recall that on Linux, Coda triggered corner cases in the file system
framework, with the bugs persisting for a very long time. Finally
somebody managed to observe the same problem, as a very rare case,
with local file systems too.

Another apparent suspect is the broken directory reading code, which
can lead to misaligned data and reading quite unexpected things?

> The big issue with coda and NetBSD now is that getcwd() seems not to
> work right.

I remember submitting a patch to uClibc long ago, as this was a
problem on Linux too.

Regards,
Rune

Received on 2012-01-17 02:39:50