On 05/17/04, Jan Harkes <jaharkes_at_cs.cmu.edu> wrote:
> That last one is unusual, although I don't immediately see how it could
> affect anything. I'd like to know more about the reintegrations, for
> instance is convert running for a long time and is the initial CREATE
> operation already reintegrated before we're done writing the file.

i've suspected the possibility that a process running for a long time is
the culprit, and convert is just a particularly bad offender, but it's
hard to catch it in action.

> > so now i have convert writing the file to /tmp, and then i'm renaming
> > the file into coda. since making this change to our web application a
> > few days ago, i haven't had a single crash. (still crossing my
> > fingers!)
>
> Well, if that works, definitely keep it like that for now. I'll try to
> simulate the convert behaviour with a small test program and see if I
> can trigger the problem.

it seems to work better than before. venus has crashed once since i made
the change, but nowhere near with the frequency as before.

it's possible that it's a general problem with long running process
writing to files. i'm going to audit our web app and find any instances
where we're writing to files (i believe unstuff is used somewhere, for
example) and see what i can find.

what continues to be curious is how this only happens when df -i /coda
reports 94%.

i wish i could get a core file out of venus. the assertion crashes are
nice because i can poke around in gdb while the process is still running,
but i can't seem to get a core file out of it to save for later.

-- 
steve simitzis : /sim' - i - jees/
pala : saturn5 productions
www.steve.org : 415.282.9979
hath the daemon spawn no fire?

Received on 2004-05-17 21:10:19
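
For reference, the workaround described in this message (let the long-running writer produce its output on local disk, then move the finished file into /coda in one step) might be sketched roughly as below. This is only an illustration, not the web application's actual code: the helper name `write_then_move` and the `produce` callback are hypothetical, and Python is used purely for brevity. Since /tmp and /coda are normally separate filesystems, a plain rename(2) would be expected to fail with EXDEV, so the sketch uses `shutil.move`, which falls back to copy-and-delete in that case.

```python
import os
import shutil
import tempfile


def write_then_move(produce, dest_path, tmp_dir=None):
    """Run produce(path) against a local scratch file, then move it to dest_path.

    produce is a hypothetical callback standing in for the slow writer
    (for example, something that runs convert with the given output path).
    """
    # Create the scratch file on local disk (the system temp dir by default).
    fd, tmp_path = tempfile.mkstemp(dir=tmp_dir)
    os.close(fd)
    try:
        # The slow, incremental writes only ever touch the local file.
        produce(tmp_path)
        # Tries a rename first; copies and deletes when the destination
        # is on a different filesystem.
        shutil.move(tmp_path, dest_path)
    except BaseException:
        # Do not leave a partial scratch file behind on failure.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
    return dest_path
```

With this shape, the destination only ever sees one short burst of writes for a complete file, rather than a file that stays open while the writer runs.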