Coda File System

Re: Venus error : mount system call failed

From: Jan Harkes <jaharkes_at_cs.cmu.edu>
Date: Wed, 24 Jul 2002 10:56:18 -0400

On Wed, Jul 24, 2002 at 04:28:49PM +0200, Yann Bloch wrote:
> I have three machines : codascm (192.168.0.182), codarep1
> (192.168.0.176) and codarep2 (192.168.0.101).
> 
> There is only one volume (the root volume), which is replicated on the
> three servers.
> 
> I have a client installed on codascm and codarep1. Sometimes, when I try
> to start the client on codascm (but I remember seeing the same thing
> already on codarep1), I get the message : "CHILD: mount system call
> failed. Killing parent.". What does that mean ?

It typically means that the client is unable to get access to the root
of the 'rootvolume'.

...
> 16:19:39 Venus starting...
> 16:19:47 ResolveMax exceeded...returning EWOULDBLOCK

Ouch, you have a conflict in the root of that rootvolume. Perhaps the
servers cannot see each other, or one of your clients is only talking to
a subset of the rootservers.

Conflicts in the rootvolume are a pain to repair. The /coda mountpoint
isn't really 'owned' by our FS (mount borrows the inode from the
filesystem we mounted on). And once /coda is turned into a symlink, the
repair tools can't even access the hidden /coda/.CONTROL file, which is
used to pass special ioctls to venus.

Here at CMU we typically don't do any read-write stuff in the
rootvolume except for creating some initial mountpoints like /coda/usr,
/coda/projects, etc. I then set the ACL to only allow sysadmins to
write to /coda.

If you have an actual conflict in the root of the rootvolume, the
following steps are needed to repair it.

- create a (temporary) volume, let's say codaroot.repair.
- force a client to mount this as its rootvolume
    (add "rootvolume=codaroot.repair" to /etc/coda/venus.conf and
    restart venus with the -init flag).
- mount the old root volume in this temporary volume.
- repair the conflict.
- remove the rootvolume= config line and reinit venus.

As you can see it's not an easy type of repair.
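
The only purely mechanical parts of the steps above are adding and later
removing the rootvolume= line. A minimal Python sketch of that edit follows.
It assumes venus.conf is made of plain key=value lines, and the helper names
set_rootvolume and clear_rootvolume are made up for illustration. It only
transforms the text; writing the file back and restarting venus with -init
is left to the admin.

```python
def _without_rootvolume(conf_text: str) -> list[str]:
    # Drop every active rootvolume= line; commented-out lines are kept.
    return [ln for ln in conf_text.splitlines()
            if not ln.strip().startswith("rootvolume=")]


def set_rootvolume(conf_text: str, volume: str) -> str:
    """Return the config text with exactly one active rootvolume= line."""
    kept = _without_rootvolume(conf_text)
    kept.append(f"rootvolume={volume}")
    return "\n".join(kept) + "\n"


def clear_rootvolume(conf_text: str) -> str:
    """Return the config text with the rootvolume= override removed."""
    kept = _without_rootvolume(conf_text)
    return "\n".join(kept) + "\n" if kept else ""
```

Calling set_rootvolume(text, "codaroot.repair") before the repair and
clear_rootvolume(text) afterwards corresponds to the second and last steps.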

> 16:19:47 Entering RecovDirResolve (0x7f00000b.0x1.0x1)
> 16:19:47 ComputeCompOps: fid(0x7f00000b.1.1)

I'm guessing your rootvolume is 0x7f00000b.
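
That guess comes from reading the fid as volume.vnode.unique, so the first
component is the volume id. A small Python sketch of that reading, assuming
every component is hexadecimal with or without a 0x prefix (the helper name
parse_fid is hypothetical):

```python
def parse_fid(fid: str) -> tuple[int, int, int]:
    """Split a fid such as '(0x7f00000b.0x1.0x1)' into (volume, vnode, unique)."""
    parts = fid.strip("() ").split(".")
    if len(parts) != 3:
        raise ValueError(f"expected volume.vnode.unique, got {fid!r}")
    # Assumption: all three fields are hex; int(..., 16) accepts an optional 0x.
    volume, vnode, unique = (int(p, 16) for p in parts)
    return volume, vnode, unique


volume, vnode, unique = parse_fid("(0x7f00000b.0x1.0x1)")
print(hex(volume))
```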

> 16:20:17 GetVolObj: VGetVolume(7f000002) error 103
> 16:20:17 GetVolObj: VGetVolume(7f000007) error 103
> 16:20:17 GetVolObj: VGetVolume(7f000008) error 103

But some clients are still asking for other volumes. Either the
VRDB/VLDB files are not the same on all servers, or some clients
still have cached data for purged volumes.
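
To see which volume ids clients keep asking for, the server log can be
scanned for these VGetVolume errors. The Python sketch below assumes the log
lines look exactly like the ones quoted in this message; the function name
failed_volumes is made up for illustration.

```python
import re
from collections import Counter

# Matches e.g. "GetVolObj: VGetVolume(7f000002) error 103"
VGET_RE = re.compile(r"VGetVolume\(([0-9a-fA-F]+)\) error (\d+)")


def failed_volumes(log_lines):
    """Count failed lookups per (volume id, error code) pair."""
    counts = Counter()
    for line in log_lines:
        m = VGET_RE.search(line)
        if m:
            counts[(int(m.group(1), 16), int(m.group(2)))] += 1
    return counts


sample = [
    "16:20:17 GetVolObj: VGetVolume(7f000002) error 103",
    "16:20:17 GetVolObj: VGetVolume(7f000007) error 103",
    "16:20:17 GetVolObj: VGetVolume(7f000008) error 103",
]
for (vol, err), n in sorted(failed_volumes(sample).items()):
    print(f"{vol:#x} error {err} x{n}")
```

Any id in the result that differs from the current rootvolume id is a
candidate for a stale client cache or a VRDB/VLDB mismatch.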

Jan
Received on 2002-07-24 10:57:44