Jan Harkes wrote:
> > Think about an ftp site (using the CODA kernel module and podfuk+ftpfs)
> > which has large INDEX files (~60MB). A file manager likes to determine
> > the datatype - it reads the first two bytes - but that triggers the
> > download of all 60MB.
> > Quite efficient, isn't it ? ;-(((
> >
> > (Possible) solution:
> > Allow random access (read/write etc.) to remote files.
> > What about an interface to the SYSV-VFS layer, e.g. moving more stuff
> > from the kernel to the userspace layer ?
>
> This would actually introduce far too much overhead in context switches
> between the application and the cache manager. Besides, as far as Coda
> is concerned, it makes it impossible to guarantee consistency. However,
> there is another solution which we've been thinking of.
>
> When a large file is opened, CODA_OPEN could return early, for instance
> when the first 8-64KB have been fetched. The kernel would get a `lease'
> on accessing these first pages (both read and write), while the cache
> manager pulls in the rest of the file.
>
> When the application accessing the file seeks or reads past its `lease',
> it is blocked until the data is available and the cache manager has
> returned a new lease. However, when the application is done and closes
> the file before everything has been fetched, the ongoing fetch can be
> aborted.
>
> In this model, we can keep streaming data into the container files as
> efficiently as possible, while at the same time allowing some early
> access to the container file. One of the big problems is that most
> applications don't handle read/write errors very well, so an interrupted
> transfer (disconnection) would lead to silently truncated files, mostly
> due to user `error': someone opens a file in an editor, makes some
> changes, doesn't notice that the end of the file was lost, and saves
> it back.

I was thinking about a completely different solution:
Let venus/podfuk decide at OPEN whether to download the complete file to
the local system (e.g. into the "cache") or to handle each operation
(read/write/seek/locking etc.) itself and map it to the remote file.
This would give us control over whether file operations are executed on
the "cached" file or on the remote file.

This would be a more universal approach, as it supports both your idea
and mine. For example: venus decides at OPEN to access the remote file
read-only but triggers the download into the cache in _parallel_. Once
the download is complete, it switches from the remote file to the cached
file. It would also allow other projects to use the same interface, using
cached operations, direct operations, or both.

----

Bye,
Roland

-- 
  __ .  . __
 (o.\ \/ /.o) Roland.Mainz_at_informatik.med.uni-giessen.de
  \__\/\/__/  gisburn_at_informatik.med.uni-giessen.de
  /O /==\ O\  MPEG specialist, C&&JAVA&&Sun&&Unix programmer
 (;O/ \/ \O;) TEL +49 641 99-13193 FAX +49 641 99-41359

Received on 2000-10-17 12:41:14
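
To make the OPEN-time switch from the reply above a bit more concrete, here is a minimal Python sketch. It is only a toy model under stated assumptions: RemoteFile, HybridHandle, fetch_step and bytes_sent are hypothetical names invented for illustration and are not part of venus, podfuk or the Coda kernel interface, and the "parallel" download is simulated by explicit steps rather than by a real background transfer.

```python
class RemoteFile:
    """Hypothetical stand-in for a file on a remote server (e.g. behind ftpfs)."""

    def __init__(self, data: bytes):
        self.data = data
        self.bytes_sent = 0  # traffic counter, so the effect of each policy is visible

    def read_range(self, offset: int, size: int) -> bytes:
        chunk = self.data[offset:offset + size]
        self.bytes_sent += len(chunk)
        return chunk


class HybridHandle:
    """Serve reads from the remote file until the cache copy is complete."""

    def __init__(self, remote: RemoteFile, chunk_size: int = 64 * 1024):
        self.remote = remote
        self.chunk_size = chunk_size
        self.cache = bytearray()  # plays the role of the local container file
        self.pos = 0

    @property
    def cached(self) -> bool:
        # True once the whole remote file has been copied into the cache.
        return len(self.cache) == len(self.remote.data)

    def fetch_step(self) -> None:
        # One step of the download; a real cache manager would run this
        # concurrently with the application's reads (assumption).
        if not self.cached:
            self.cache += self.remote.read_range(len(self.cache), self.chunk_size)

    def seek(self, offset: int) -> None:
        self.pos = offset

    def read(self, size: int) -> bytes:
        if self.cached:
            # Download finished: operate on the cached file only.
            out = bytes(self.cache[self.pos:self.pos + size])
        else:
            # Download still running: map the read to the remote file.
            out = self.remote.read_range(self.pos, size)
        self.pos += len(out)
        return out
```

With this model, a file manager that reads only the first two bytes of a large file causes two bytes of remote traffic, not a full download. After enough fetch_step() calls, later reads are served from the cache and cause no further traffic. Writes, locking and the consistency concerns raised in the quoted text are deliberately left out.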