Coda File System

Problems with replication on two servers.

From: Marc SCHLINGER
Date: Thu, 23 Apr 2009 11:42:22 +0200

I'm trying to get Coda working with one replica server.
On the SCM: coda-server-6.9.4-1.i386
On the replica: coda-server-6.9.4-0.3.rc2.i386

I installed both servers with the vice-setup script.
All went well.
At the end of the installation, I gathered the /vice/db/vicetab and
/vice/db/servers files and restarted all servers.

The root volume is created at the end of the SCM installation, so I guess
it's not replicated on the replica.

Then I created a volume like this:
root_at_scm# createvol_rep test scm.myrealm.yeh/vicepa 

On my client I've edited /etc/coda/realms:
myrealm.yeh      scm.myrealm.yeh   replica.myrealm.yeh

Then I've executed:
root_at_client# venus-setup myrealm.yeh 20000
root_at_client# clog -coda admincoda
root_at_client# cfs mkmount /coda/myrealm.yeh/test test
root_at_client# ls /coda/myrealm.yeh/test

Up to this point everything works: I can create files in volume test.

Things go wrong when I block all traffic on the SCM using iptables.
With tcpdump, I can see the client start sending messages to the replica.
But when I unblock the traffic on the SCM, I always get the same error.
On the SCM:
18:25:10 GetVolObj: Volume (1000002) already write locked
18:25:10 RS_LockAndFetch: Error 11 during GetVolObj for 1000002.1.1
18:25:46 LockQueue Manager: found entry for volume 0x1000002

On the replica:
18:34:36 Going to spool log entry for phase3

18:34:38 CheckRetCodes: server returned error 11
18:34:38 ViceResolve: Couldnt lock volume 7f000001 at all accessible servers
18:34:38 Entering RecovDirResolve 7f000001.1.1

18:34:38 ComputeCompOps: fid(0x7f000001.1.1)

18:34:38 RS_ShipLogs - returning 0
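
To compare the two logs side by side, the lines that carry an error code can be filtered out with a small script. The following is only a sketch: the "HH:MM:SS message" line format is assumed from the excerpts above, and LINE_RE, ERROR_RE, parse_line and error_lines are hypothetical names chosen for illustration, not part of Coda.

```python
import re

# Assumed line format, taken from the excerpts above: "HH:MM:SS message".
LINE_RE = re.compile(r"^(\d{2}:\d{2}:\d{2})\s+(.*)$")
# Matches "Error 11" or "error 11" and captures the number.
ERROR_RE = re.compile(r"\berror\s+(\d+)", re.IGNORECASE)


def parse_line(line):
    """Return (timestamp, message, error code or None), or None if the line does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    timestamp, message = m.groups()
    err = ERROR_RE.search(message)
    return timestamp, message, int(err.group(1)) if err else None


def error_lines(log_text):
    """Collect the parsed entries that carry an error code."""
    entries = (parse_line(line) for line in log_text.splitlines())
    return [e for e in entries if e is not None and e[2] is not None]


# Log excerpts copied from this message.
SCM_LOG = """\
18:25:10 GetVolObj: Volume (1000002) already write locked
18:25:10 RS_LockAndFetch: Error 11 during GetVolObj for 1000002.1.1
18:25:46 LockQueue Manager: found entry for volume 0x1000002
"""

REPLICA_LOG = """\
18:34:36 Going to spool log entry for phase3
18:34:38 CheckRetCodes: server returned error 11
18:34:38 ViceResolve: Couldnt lock volume 7f000001 at all accessible servers
"""

if __name__ == "__main__":
    for name, text in (("scm", SCM_LOG), ("replica", REPLICA_LOG)):
        for timestamp, message, code in error_lines(text):
            print(f"{name} {timestamp} error={code}: {message}")
```

Run on the excerpts, this keeps only the RS_LockAndFetch line from the SCM and the CheckRetCodes line from the replica, which makes it easier to see that both sides report the same error code.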

On the client, I get a dangling symlink for volume test.

My question is: isn't Coda fault tolerant? Or am I missing something in my
installation/configuration?

Thanks for your great work, and your help.

Received on 2009-04-23 06:08:13