Coda File System

Re: Coda for Apache Web server in 2.4 kernel. Anyone?

From: Jan Harkes <jaharkes_at_cs.cmu.edu>
Date: Fri, 11 Jan 2002 12:43:29 -0500
On Fri, Jan 11, 2002 at 10:35:59AM +0800, Xuefeng wrote:
> I am trying to deploy Coda as the storage for an Apache web server in Linux
> Redhat 7.1 with kernel 2.4.12. During my web performance test using WebBench
> 4.1 software with 2 Win2K clients flooding the HTTP requests to the web
> server. WebBench reports some errors of "404 Object not found". There is no

Could you check in the Apache logs whether these missing objects are
typically the same ones? It could be that Apache caches open
file descriptors to frequently requested files and we have a hard time
invalidating objects that are still held open. OTOH I'm assuming that
the data is not concurrently updated from another client, so this should
not be causing these problems. It could also be related to the
concurrent access/symlink traversal problems that Ivan Popov has
reported. I have no lead on why this would happen.
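
Something along these lines would answer the "same objects?" question. It
is only a rough sketch: it assumes the access log is in Common Log Format,
and the function name count_404s is made up for this example.

```python
import re
import sys
from collections import Counter

# Matches the request and status fields of a Common Log Format line, e.g.
#   10.0.0.1 - - [11/Jan/2002:10:35:59 +0800] "GET /a.html HTTP/1.0" 404 208
REQUEST_RE = re.compile(r'"(?:[A-Z]+) (\S+)[^"]*" (\d{3}) ')

def count_404s(lines):
    """Return a Counter mapping request path -> number of 404 responses."""
    misses = Counter()
    for line in lines:
        m = REQUEST_RE.search(line)
        if m and m.group(2) == "404":
            misses[m.group(1)] += 1
    return misses

if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python count_404s.py /path/to/access_log
    with open(sys.argv[1]) as log:
        for path, n in count_404s(log).most_common(20):
            print(n, path)
```

If a handful of paths account for nearly all of the 404s, that points at
specific objects rather than at load in general.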

Does the client have the whole 'working set' cached, or is it refetching
files during the WebBench run? (codacon output should show this.) This
would be interesting to know, because if everything is cached locally
there are a whole bunch of codepaths that can be eliminated from the
search for the problem.

Our Apache server at www.coda.cs.cmu.edu has been running without
significant problems for a year or two now; of course the load is
probably a lot less than what a WebBench benchmark can throw at it. The
only problem I've seen is an as yet unexplained 'cache leak' where the
client assumes it has less cache space in use than it actually does.
When the client is restarted this is corrected.

> be in such a scenario. Is there a hard limit on the number of simultaneous
> accesses to the same file under Coda FS? The Apache web server log file

Yes, there is a hard limit of 20 worker threads that can concurrently
handle userspace requests. However, we don't drop or reject additional
requests; they are queued in kernel space until a worker
thread becomes available and picks them up.
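
To illustrate the queue-instead-of-reject behaviour, here is a user-space
analogy using a fixed pool of 20 threads. This is not Venus or kernel
module code; the names run_requests and handle are hypothetical and the
sleep merely stands in for servicing a request.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

MAX_WORKERS = 20  # the worker-thread limit mentioned above

def run_requests(n_requests, max_workers=MAX_WORKERS):
    """Submit n_requests to a fixed pool; return (completed, peak_concurrency)."""
    lock = threading.Lock()
    state = {"active": 0, "peak": 0}

    def handle(i):
        with lock:
            state["active"] += 1
            state["peak"] = max(state["peak"], state["active"])
        time.sleep(0.01)  # stand-in for servicing one request
        with lock:
            state["active"] -= 1
        return i

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # submit() never rejects work; excess requests wait in the pool's queue
        futures = [pool.submit(handle, i) for i in range(n_requests)]
        completed = sum(1 for f in futures if f.result() is not None)
    return completed, state["peak"]
```

With 50 requests all 50 complete, but never more than 20 are in flight at
once, so a burst of accesses to one file should show up as latency rather
than as errors.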

> complained a "No such file or directory" error and denied the file accesses.
> I have also traced the log file at "/usr/coda/etc/venus.log" with a higher
> debug level and found that the file that caused the error in WebBench has
> been accessed 127 times with "fsobj::Open" and "fsobj::Release" etc. I
> don't know yet whether this "127" is just a number by coincidence or it is a
> magic number in Coda implementation. Can you tell me the possible causes of
> this problem and how to solve it for the time being?

Did the log file show any ENOENT errors? These could also be logged as the
numeric error code '2'.
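
For reference, the mapping between the symbolic name, the number and the
message text can be checked with Python's standard errno module. The
helper name describe_errno is made up for this example, and how venus.log
actually formats the error is not assumed here.

```python
import errno
import os

def describe_errno(code):
    """Return 'number = NAME (message)' for a numeric error code."""
    name = errno.errorcode.get(code, "unknown")
    return f"{code} = {name} ({os.strerror(code)})"

print(describe_errno(errno.ENOENT))
```

On Linux the message part is the same "No such file or directory" text
that Apache reported, so either form in the log refers to the same error.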

> BTW, the Coda versions I have tried with Linux kernel 2.4.12 include 5.3.17,
> 5.3.15 and 5.3.13. A combination of kernel 2.2.18 and Coda 5.3.13 give no
> such problem. Anything related to the kernel 2.4 upgrade?

Interesting that 2.2.18 doesn't show the problem. Could you try 2.2.18
with Coda 5.3.17? That way we can check whether this problem is purely
related to the kernel version. There are many changes in 2.4 to the Coda
kernel module, but even the VM and generic VFS changes could result in
similar problems, e.g. the Coda kernel module might occasionally be
unable to allocate buffers for upcalls due to some VM problem. Are any
0-order allocation failures showing up in 'dmesg' output?
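
A small filter like the following could pick such lines out of saved
'dmesg' output. It is a sketch under one loud assumption: that the kernel
message contains the phrase "N-order allocation fail..."; the exact
wording may differ between kernel versions. The function name
allocation_failures is hypothetical.

```python
import re

# Assumed message shape: "<N>-order allocation fail..." somewhere in the line.
ALLOC_FAIL_RE = re.compile(r"\b(\d+)-order allocation fail", re.IGNORECASE)

def allocation_failures(lines):
    """Yield (order, line) for each line reporting a page allocation failure."""
    for line in lines:
        m = ALLOC_FAIL_RE.search(line)
        if m:
            yield int(m.group(1)), line.rstrip("\n")
```

Feeding it the lines of a file created with 'dmesg > dmesg.txt' right
after a failing WebBench run would show whether any failures coincide
with the 404s.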

Jan
Received on 2002-01-11 12:43:36