On Thu, Sep 30, 2004 at 10:56:28AM +0200, Ivan Popov wrote:
> If Venus would use _only_ totally standard and traditional syscalls
> like open() read() write(), then we certainly could run the cache manager
> and the tools via any abi.
>
> Just speculating, if it would make the things hard to implement,
> if we use a smaller subset of syscalls?
> I can think that instead of ioctl(fd, OP, inoutdata)
> we could do write(fd,"OPoutdata") followed by read(fd,"indata")
> Well, twice as many context switches... but arbitrary length data...

Interesting thought. We're already using the special file '/coda/.CONTROL'
to perform the ioctls on, because we can't use regular ioctls on device
nodes, symlinks or directories. Our pioctl code is mainly a wrapper that
calls ioctl('/coda/.CONTROL', ...).

I have never tried, but since Linux passes down an open fd, it doesn't
have to be a container file and could be an open socket or pipe. So the
pioctl wrapper could work as follows.

    pioctl opens a magic file /coda/.CONTROLPIPE
    venus returns an open socket or pipe.
    pioctl writes request
    pioctl reads reply
    pioctl closes magic file

This requires no kernel changes on Linux. If we want the pioctl using
application to use select instead of a blocking read, some additional
changes will probably be needed.

One interesting thing is that with the existing implementation this even
works when several pioctl using clients are active. They will not share
the same endpoint because we keep the 'container' file handle associated
with the Coda file handle and redirect the read and write calls at the
file level instead of the inode level. Only mmap will try to work on the
inode level; we never allow mmap to succeed if another container file
handle is already mapped to the inode, and since sockets can't be mmapped
this shouldn't even be a problem.

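To make the open/write/read/close flow concrete, here is a rough userspace
sketch in Python. It does not touch Coda at all: a socketpair stands in for
the endpoint venus would hand back on opening the magic file, and a thread
stands in for the venus side. The framing (two 32-bit fields, then the
payload) and the names open_control_pipe, serve_one and echo_handler are made
up for illustration; this is not the existing viceioctl layout.

```python
import socket
import struct
import threading

# Hypothetical framing: (opcode or status, payload length), network byte order.
HEADER = struct.Struct("!II")


def recv_exact(sock, n):
    """Read exactly n bytes, or raise if the peer closes early."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed before full message arrived")
        buf.extend(chunk)
    return bytes(buf)


def send_msg(sock, code, payload):
    """Write one framed message: header followed by the payload bytes."""
    sock.sendall(HEADER.pack(code, len(payload)) + payload)


def recv_msg(sock):
    """Read one framed message and return (code, payload)."""
    code, length = HEADER.unpack(recv_exact(sock, HEADER.size))
    return code, recv_exact(sock, length)


def serve_one(endpoint, handler):
    """Stand-in for the venus side: answer one request, then close."""
    with endpoint:
        op, data = recv_msg(endpoint)
        status, reply = handler(op, data)
        send_msg(endpoint, status, reply)


def open_control_pipe(handler):
    """Stand-in for opening the magic file: every open gets its own endpoint."""
    client_end, venus_end = socket.socketpair()
    threading.Thread(
        target=serve_one, args=(venus_end, handler), daemon=True
    ).start()
    return client_end


def pioctl(handler, op, data):
    """Open the endpoint, write the request, read the reply, close."""
    with open_control_pipe(handler) as fd:
        send_msg(fd, op, data)
        return recv_msg(fd)


def echo_handler(op, data):
    # Hypothetical operation: report status 0 and return the payload reversed.
    return 0, data[::-1]
```

Calling pioctl(echo_handler, 7, b"abc") performs one full round trip. Because
each call gets its own socket pair, concurrent callers never share an
endpoint, and the length field lets the payload be of arbitrary size.
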
Of course handling this in venus will require some work. It has to
recognize the open for the magic file (the .CONTROL file is dealt with in
the kernel itself), but as the top-level volume is already 90% magic it
shouldn't really need too many changes. Then we need a listener thread
for the socket endpoint, probably quite similar to the existing
MarinerPort stuff, and we need a protocol, maybe just write the existing
binary viceioctl data.

Finally, all this is of course of no use if we can't implement the same
thing on *BSD. I don't know enough of those kernels to tell whether their
VFS has enough of a similar structure to allow for this abuse.

Alternatively we could try to use container files and have pioctl open
/coda/.CONTROLFILE-XXX, but then how can we tell when the reply is ready?
This seems to be a non-workable solution.

A final solution I can think of would be to allow (non-venus) processes
to open /dev/cfs0, forward every write to venus as a CODA_PIOCTL upcall,
and return the reply on the next read. This of course would need some
kernel hacking and some way to distinguish venus from normal processes.

Jan

Received on 2004-09-30 11:19:35