Coda File System

RE: coda server crashed and won't recover

From: Stephan Koledin <SKoledin_at_fool.com>
Date: Wed, 16 Aug 2000 17:40:49 -0400
Thanks for the help - I still have plenty to learn. I'm back up and running,
but included a few notes below...

> From: Jan Harkes [mailto:jaharkes_at_cs.cmu.edu]
> Sent: Wednesday, August 16, 2000 12:00 PM
> 
> > 16:22:46 Entering DCC(0x1000004)
> > Magic wrong in Page i           
> > 16:22:46 DCC: Bad Dir(0x1000004.6d.68e9) in rvm...Aborting
> > 16:22:46 JE: directory vnode 0x1000004.6d.68e9: invalid entry ; 
> > 16:22:46 JE: child vnode not allocated or uniqfiers dont match; cannot happen
> 
> Ok, this is the bad volume. Create a file called /vice/vol/skipsalvage
> with the following content.
> 
> -8<--/vice/vol/skipsalvage------------------------------------
> 1
> 1000004
> -8<-----------------------------------------------------------
> 

I could not get the skipsalvage file to work. The server would start up the
exact same way with the same errors. I tried tweaking some of the server
flags (such as -forcesalvage and -quicksalvage), but with no success. It
appeared as if codasrv never checked the file. I verified that the
skipsalvage handling exists in the source, but could not figure out what
the problem was. Any ideas why this didn't work?
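
For reference, this is roughly how the file quoted above could be generated.
It is only a sketch: write_skipsalvage is a made-up helper name, and it
assumes the first line is the number of volumes followed by one volume id per
line in hex without the 0x prefix, which is how I read the example above.

```shell
# Hypothetical helper, not part of Coda: write a skipsalvage-style file.
# Assumption: line 1 is the volume count, then one hex volume id per line.
write_skipsalvage() {
    out="$1"; shift                      # remaining arguments are volume ids
    {
        printf '%s\n' "$#"               # number of volumes to skip
        for vol in "$@"; do
            printf '%s\n' "${vol#0x}"    # strip a leading 0x if present
        done
    } > "$out"
}

# Usage with the path and volume id from this thread:
#   write_skipsalvage /vice/vol/skipsalvage 0x1000004
```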

> When the server is up and running, we have a nice list of volumes to
> purge, but we can't really do this while the server is running. So we
> have to shut the server down, and use another app to purge the volume
> from RVM.
> 
> $ volutil shutdown
> ...wait for the server to shut down...
> $ cat /vice/srv.conf
> -rvm <logdevice> <datadevice> <datasize>
> $ norton <logdevice> <datadevice> <datasize>
> Loading rvm...
> norton> delete volume 0x1000004
> norton> quit
> $ rm /vice/vol/skipsalvage
> $
> 

This worked great. Although I couldn't get the server up, I was able to
modify the RVM offline with norton and mark the offending volume for
deletion. On next startup, it dropped the volume and went about its
business. Wish I had known about norton before - wouldn't have had to bother
the list.
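
In case it saves someone else some typing, here is a rough sketch of pulling
the RVM parameters out of srv.conf to build the norton command line. The
function name norton_cmd is made up, and it assumes the -rvm flag sits on its
own line followed by exactly the three values shown in the quoted example. It
only prints the command; it does not run norton.

```shell
# Hypothetical helper, not part of Coda: print the norton invocation that
# matches the "-rvm <logdevice> <datadevice> <datasize>" line of a conf file.
# Assumption: the -rvm flag and its three values are on one line by themselves.
norton_cmd() {
    conf="${1:-/vice/srv.conf}"          # default path taken from this thread
    awk '$1 == "-rvm" && NF >= 4 { print "norton", $2, $3, $4; found = 1; exit }
         END { if (!found) exit 1 }' "$conf"
}

# Usage, after "volutil shutdown" has completed:
#   norton_cmd /vice/srv.conf     # prints: norton <logdevice> <datadevice> <datasize>
```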

> 
> Ah, after several requests we've modified the server to not become a
> zombie when it crashes. Create a file /vice/srv/ZOMBIFY (which makes
> startserver pass a -zombify flag to codasrv) to get the old
> behaviour back.
> 

I'll enable this flag now, so if I have any more problems I can try to
provide a little more info about the source of the crash.

Thanks again for all the help.

Stephan B. Koledin
The Motley Fool
Systems DORC
skoledin_at_fool.com
http://www.fool.com
Received on 2000-08-16 17:55:17