On Fri, Jan 11, 2002 at 10:35:59AM +0800, Xuefeng wrote:
> I am trying to deploy Coda as the storage for a Apache web server in Linux
> Redhat 7.1 with kernel 2.4.12. During my web performance test using WebBench
> 4.1 software with 2 Win2K clients flooding the HTTP requests to the web
> server. WebBench reports some errors of "404 Object not found". There is no

Could you check from the Apache logs whether these missing objects are
typically the same ones? It could be that Apache caches open file
descriptors to frequently requested files, and we have a hard time
invalidating objects that are still held open. OTOH I'm assuming that the
data is not concurrently updated from another client, so this should not be
causing these problems.

It could also be related to the concurrent access/symlink traversal
problems that Ivan Popov has reported. I have no lead on why this would
happen.

Does the client have the whole 'working set' cached, or is it refetching
files during the WebBench run? (codacon output should show this.) This would
be interesting to know, because if everything is cached locally there are a
whole bunch of codepaths that can be eliminated from the search for the
problem.

Our Apache server at www.coda.cs.cmu.edu has been running without
significant problems for a year or two now. Of course, the load is probably
a lot less than what a WebBench benchmark can throw at it. The only problem
I've seen is an as yet unexplained 'cache leak', where the client assumes it
has less cache space in use than it actually does. Restarting the client
corrects this.

> be in such a scenario. Is there a hard limit on the number of simultaneous
> accesses to the same file under Coda FS? The Apache web server log file

Yes, there is a hard limit of 20 worker threads that can concurrently handle
userspace requests. However, we don't drop or reject additional requests;
they are queued in kernel space until a worker thread becomes available and
picks them up.

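To make the queueing behaviour described above concrete, here is a minimal sketch in Python. It is not Venus or kernel module code. It only models the general pattern of a fixed pool of 20 workers fed from a queue that holds excess requests instead of rejecting them. The names `run_requests` and `MAX_WORKERS`, the use of `time.sleep` as a stand-in for servicing a request, and the request count are all assumptions made for illustration.

```python
import queue
import threading
import time

MAX_WORKERS = 20  # the limit quoted above; everything else here is illustrative


def run_requests(num_requests, work_seconds=0.01):
    """Feed num_requests through a fixed pool of MAX_WORKERS threads.

    Returns (handled, peak), where peak is the highest number of requests
    that were being serviced at the same moment.
    """
    pending = queue.Queue()  # unbounded: extra requests wait, none are dropped
    lock = threading.Lock()
    stats = {"active": 0, "peak": 0, "handled": 0}

    def worker():
        while True:
            request = pending.get()
            if request is None:  # sentinel: shut this worker down
                pending.task_done()
                return
            with lock:
                stats["active"] += 1
                stats["peak"] = max(stats["peak"], stats["active"])
            time.sleep(work_seconds)  # stand-in for servicing the request
            with lock:
                stats["active"] -= 1
                stats["handled"] += 1
            pending.task_done()

    threads = [threading.Thread(target=worker) for _ in range(MAX_WORKERS)]
    for t in threads:
        t.start()
    for i in range(num_requests):
        pending.put(i)  # never blocks and never fails: requests just queue up
    for _ in threads:
        pending.put(None)  # one sentinel per worker, queued behind the requests
    pending.join()
    for t in threads:
        t.join()
    return stats["handled"], stats["peak"]


if __name__ == "__main__":
    handled, peak = run_requests(100)
    print(f"handled={handled} peak_concurrency={peak}")
```

In this model every request is eventually handled, and the number in service at once never exceeds the pool size. If the real system behaves the same way, a burst of simultaneous accesses would show up as added latency, not as "No such file or directory" errors, which is why the limit by itself seems an unlikely explanation for the 404s.
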
> complained a "No such file or directory" error and denied the file accesses.
> I have also traced the log file at "/usr/coda/etc/venus.log" with a higher
> debug level and found that the file that caused the error in WebBench has
> been be accessed 127 times with "fsobj::Open" and "fsobj::Release" etc. I
> don't know yet whether this "127" is just a number by coincident or it is a
> magic number in Coda implementation. Can you tell me the possible causes of
> this problem and how to solve it for the time being?

Did the log file show any ENOENT errors? These could also be logged as the
numeric error code '2'.

> BTW, the Coda versions I have tried with Linux kernel 2.4.12 include 5.3.17,
> 5.3.15 and 5.3.13. A combination of kernel 2.2.18 and Coda 5.3.13 give no
> such problem. Anything related to the kernel 2.4 upgrade?

Interesting that 2.2.18 doesn't show the problem. Could you try 2.2.18 with
Coda 5.3.17? That way we can check whether this problem is purely related to
the kernel version.

There are many changes in 2.4 to the Coda kernel module, but the VM and
generic VFS changes could also result in similar problems. For example, the
Coda kernel module might occasionally be unable to allocate buffers for
upcalls due to some VM problem. Are any 0-order allocation failures showing
up in the 'dmesg' output?

Jan

Received on 2002-01-11 12:43:36