Coda File System

Re: New troubles in coda land

From: Jan Harkes <jaharkes_at_cs.cmu.edu>
Date: Tue, 5 Jul 2005 10:53:51 -0400
On Tue, Jul 05, 2005 at 08:31:31AM -0600, Patrick Walsh wrote:
> > > 1) cfs getpath fid_at_realm
> > > 
> > > 	This command works fine on consistent objects, but not at all on
> > > inconsistent objects.  So when you get a log entry that looks like
> > > this:  
> > ...
> > > VIOC_GETPATH: No such file or directory
> > 
> > Yeah, getpath doesn't set the 'GetInconsistent' flag when it calls
> > fsdb::Get. 
> 
> 	The reason this has come up is we are watching the logs for
> inconsistencies.  When we see an inconsistency, it is labeled by its
> fid.  Unfortunately it sounds like there is no way to take this fid and
> figure out where in the filesystem it resides so we can do conflict
> resolution (short of doing a find command across the entire realm).  Or
> can you think of a way that we can lookup this information?  Perhaps a
> new "cfs getinconsistentpath" command?

I think that the part where we associate parent directory information
during lookup instead of getattr is already done in some experimental
code which changes the way conflicts are expanded.

> > > 2) cfs getmountpoint volid
> > Oh, and it probably would be 'volid_at_realmname' because I could have
> > 7f000002 volumes in the coda.cs.cmu.edu and testserver.coda.cs.cmu.edu
> > realms and the client isn't psychic.
> 
> 	Ah, makes sense.  We should probably update the usage message which
> currently looks like this:
> 
> > Usage: cfs getmountpoint <volid> [<volid> <volid> ...]
> 
> 	To something more like this:
> 
> Usage: cfs getmountpoint <volid_at_realm> [<volid_at_realm> <volid_at_realm> ...]

Ah, yes, and the getpath/getpfid calls have the wrong help text as well.

> > > find /coda/realm -noleaf -lname '@*'
> > > 
> > So what is happening here... After the update is made on the servers,
> > they send out callback messages to all clients. There probably is a
> > reasonable chance that one of the find's happens to traverse that part
> > of the tree before the COP2 message arrives, and the client will notice
> > that the version-vectors are not (yet) identical. So the client triggers
> > resolution, which normally should have no problem resolving this because
> > the combination of different version vectors, but identical store
> > identifiers is identified as a missing COP2 update and the VVs are
> > set to be identical and the client is happy. But then the delayed COP2
> > hits, and the versions are different again.
> 
> 	I don't understand why the versions are different after the delayed
> COP2 hits.  When the VVs are set to be identical, then the COP2 hits, it
> should hit with the same VV.  So why would this cause a conflict?

COP2 indicates on what other servers an operation completed, so it
updates the slots in the version vector that do not refer to the local
server. Consider a volume with two replicas during a normal operation:

		server1		server2
 Initial state	[0 0]		[0 0]
 Store foo	[1 0]		[0 1]
 COP2		[1 1]		[1 1]


Now when we add the resolve before the COP2:

		server1		server2
 Initial state	[0 0]		[0 0]
 Store foo	[1 0]		[0 1]
 Resolve	[1 1]		[1 1]
 COP2		[1 2]		[2 1]
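
A minimal sketch that replays the two tables above, in case it helps to
see the arithmetic. This is a toy model and not the actual server code:
the function names are made up for illustration, and it assumes that a
store bumps only the local slot, that COP2 bumps every non-local slot,
and that resolve sets each replica to the element-wise maximum.

```python
# Toy model of the version-vector tables above (not real Coda code).
# vvs[i] is the version vector held by server i; slot j refers to server j.

def store(vvs):
    """Each server records the store in its own slot only."""
    for i, vv in enumerate(vvs):
        vv[i] += 1

def cop2(vvs):
    """COP2 updates the slots that do not refer to the local server."""
    for i, vv in enumerate(vvs):
        for j in range(len(vv)):
            if j != i:
                vv[j] += 1

def resolve(vvs):
    """Assumed resolution rule: make all VVs identical (element-wise max)."""
    merged = [max(column) for column in zip(*vvs)]
    for vv in vvs:
        vv[:] = merged

def normal_operation():
    vvs = [[0, 0], [0, 0]]
    store(vvs)
    cop2(vvs)
    return vvs

def resolve_before_cop2():
    vvs = [[0, 0], [0, 0]]
    store(vvs)
    resolve(vvs)   # client triggered resolution before COP2 arrived
    cop2(vvs)      # the delayed COP2 then bumps the remote slots again
    return vvs

if __name__ == "__main__":
    print("normal:", normal_operation())
    print("resolve before COP2:", resolve_before_cop2())
```

In the second run the replicas end up with different vectors again, which
is the situation described above.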

> 	I'll try that rpc2 patch, but it seems there's still some sort of
> problem with logins killing existing write operations.  I don't know why

Strange, obtaining new tokens shouldn't interrupt any pending RPC2
calls. The connection should get dropped once the reply is received and
the next operation will set up a new connection based on the new
token.

> this should be.  It seems to me that if a user relogs in when already
> logged in the server should extend the expiration on their tokens but
> otherwise do nothing.

Can't do that: every token contains a unique key which is used to
negotiate the session keys of the connections. We can't keep using the
old token/key. Besides, the new token might be for a different Coda
identity, so we might have completely different access rights.

Jan
Received on 2005-07-05 10:54:46