On Wed, Aug 15, 2001 at 09:31:56AM +0800, Jeremy Malcolm wrote:
> I am running Coda 5.3.on Red Hat Linux 6.2. It seems to be installed OK
> (after a fair amount of trial and error) and so now I am trying to copy
> all my data into the coda store. I am doing that from the coda server
> which is also running venus. While copying, periodically I will get:
>
> 08:23:22 Cache Overflow: (52, -214828)

You have got over 200MB worth of data that doesn't fit in the client
cache. This is probably due to the client going disconnected and logging
modifications locally. Because a client with a 20MB cache is not expected
to hold so much data, the number of FSOs (cacheable objects) and CML
entries (modification log entries) is very limited.

Because of some C++ peculiarities, namely that object initializers are
still run when an allocation fails and returns a NULL pointer, we are
pretty much unable to avoid crashing when the client exceeds the FSO or
CML limits. The only 'official' solution to the C++ allocation problems
is to use exceptions, which weren't implemented in gcc until 2.95, and
what is implemented doesn't work with threaded (or at least LWP threaded)
programs.

> But then after about five or ten minutes I will get an error about the
> device being full (sorry, it had scrolled out of my buffer so I can't
> copy it into this email), followed by a message like:

Your client probably ran out of FSOs because there were too many pending
reintegrations.

> 08:23:20 Local inconsistent object at
> /coda/programs/distfiles/Windows/fireworks4-TBYB.exe, please check!
> ...snip...
> 08:23:38 Cache Overflow: (52, -221652)
> cp: preserving permissions for ./dreamweaver4/Dreamweaver
> 4/Configuration/Objects/Frames/Left Top.gif: No such device

And about here venus dies, probably because it couldn't allocate another
CML entry.

> The disk is not full, and neither would coda's store be full.
> df shows:

The limit for the client is not really disk space, but the number of
cacheable objects.

> After I kill the copying process (which is still churning through
> "device not configured" lines) the last line changes to:
>
> Coda 9000000 0 9000000 0% /coda

The STATFS upcall fails and the kernel module falls back on returning
fake information to avoid locking up your system.

> and when I do ls I get:
>
> ls: /coda: Input/output error
>
> This happens even after I restart venus. After doing so, venus shows up
> as a process, but when I run codacon I get:

Did you kill the old venus process and unmount /coda before restarting?
Venus cannot reattach to a 'running' filesystem, because some files might
be open, etc. Forcing you to unmount the FS, which the Linux kernel
doesn't allow as long as a file is still open, in turn forces you to kill
the processes that still have references to files in /coda.

> There are no errors in /usr/coda/etc/venus.log, at this stage, however.
> In /vice/srv/SrvLog I have lots of messages like this:
>
> 08:27:37 VLDB_Lookup: no more records in VLDB

This is strange. It indicates that a lookup is performed for a volume
that doesn't exist, or at least one that this server doesn't know about.
Are the /vice/db/VRDB and /vice/db/VLDB files on both servers the same?

> Even more interestingly it seems to be only when I delete the file on
> the other machine that venus finally shuts down on the main server and I
> get the following in its venus.log:

Ok, so the old venus was probably still hanging around, or the new venus
was being 'busied' by one of the servers. The callback that resulted from
the delete operation probably triggered the release of some lock, which
allowed progress.
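
On the VRDB/VLDB question above, a byte-for-byte comparison is a quick way
to check. This is only a minimal sketch, assuming you have already copied
the second server's /vice/db/VLDB and /vice/db/VRDB into some local
directory. The function name compare_db_files and the directory
/tmp/other-server-db are made up for illustration; they are not part of
Coda.

```shell
#!/bin/sh
# Sketch: compare the local Coda volume databases against copies taken
# from the other server. Nothing here is a Coda tool; it only uses cmp.

# compare_db_files LOCAL_DIR OTHER_DIR
# Prints one status line per database file and returns non-zero if any
# file is missing or differs between the two directories.
compare_db_files() {
    local_dir=$1
    other_dir=$2
    rc=0
    for name in VLDB VRDB; do
        if [ ! -f "$local_dir/$name" ] || [ ! -f "$other_dir/$name" ]; then
            echo "$name: missing"
            rc=1
        elif cmp -s "$local_dir/$name" "$other_dir/$name"; then
            echo "$name: same"
        else
            echo "$name: differs"
            rc=1
        fi
    done
    return $rc
}
```

It could be called as 'compare_db_files /vice/db /tmp/other-server-db',
where the second argument is the (hypothetical) directory holding the
copies from the other server. A "differs" or "missing" line would suggest
the two servers do not agree on which volumes exist.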
> [ X(00) : 0000 : 08:39:49 ] fsobj::Recover: invalid fso
> (fireworks4-TBYB.exe, (0xffffffff.0xfffffffe.0x2)), attempting to
> GC...0x20206b88 : fid = ((0xffffffff.0xfffffffe.0x2)), comp =
> fireworks4-TBYB.exe, vol = 20212e88

Recovery only happens during startup; I'm not sure why you are seeing it
this late. You might have to reinitialize the crashed venus.

> To finally get coda working again on the main machine, I seem to have to
> reboot. Shutting down and restarting the coda services doesn't cut the
> mustard (even if I check with ps that they are all dead. I also check
> if there are any processes still listening on port 370).

    killall -9 venus
    umount /coda    # has to succeed, otherwise the new venus process won't start
    venus -init &

The umount will most likely fail because some processes still have open
references to files in Coda. Sometimes they can be found using
'lsof | grep /coda'.

To minimize the chance of switching to logging mode and running the
client out of FSO/CML objects, use 'cfs strong' before copying more data
than would fit in the client cache. 'cfs adaptive' returns the client to
its normal behaviour, where it switches to write-disconnected mode when
the estimated network bandwidth is low or when the server is getting
loaded.

Jan

Received on 2001-08-15 09:38:55