Coda File System

Re: New troubles in coda land

From: Patrick Walsh <pwalsh_at_esoft.com>
Date: Tue, 05 Jul 2005 08:31:31 -0600
> > 1) cfs getpath fid_at_realm
> > 
> > 	This command works fine on consistent objects, but not at all on
> > inconsistent objects.  So when you get a log entry that looks like
> > this:  
> ...
> > VIOC_GETPATH: No such file or directory
> 
> Yeah, getpath doesn't set the 'GetInconsistent' flag when it calls
> fsdb::Get. 

	The reason this has come up is that we are watching the logs for
inconsistencies.  When we see an inconsistency, it is labeled by its
fid.  Unfortunately, it sounds like there is no way to take this fid and
figure out where in the filesystem it resides so we can do conflict
resolution (short of running a find command across the entire realm).  Or
can you think of a way that we can look up this information?  Perhaps a
new "cfs getinconsistentpath" command?
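
	For reference, the "find across the entire realm" fallback could be
scripted roughly as below.  This is only a sketch: it assumes, as the
`find ... -lname '@*'` command further down suggests, that an
inconsistent object shows up as a symlink whose target starts with '@',
and it further assumes that the target text contains the fid as it is
printed in the log.  The function name and the fid format used when
calling it are made up for illustration.

```python
import os


def find_inconsistent(root, fid):
    """Return paths under root that are symlinks whose target names fid.

    Assumption: an inconsistent object appears as a symlink whose target
    starts with '@' and contains the fid string from the log entry.
    """
    matches = []
    # os.walk does not follow symlinks by default, so each link is only
    # inspected, never descended into.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                continue
            target = os.readlink(path)
            if target.startswith("@") and fid in target:
                matches.append(path)
    return matches
```

	This still walks the whole tree, so it is no cheaper than the find
command; it only automates matching the logged fid against the result.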

> > 2) cfs getmountpoint volid
> Oh, and it probably would be 'volid_at_realmname' because I could have
> 7f000002 volumes in the coda.cs.cmu.edu and testserver.coda.cs.cmu.edu
> realms and the client isn't psychic.

	Ah, makes sense.  We should probably update the usage message, which
currently looks like this:

> Usage: cfs getmountpoint <volid> [<volid> <volid> ...]

	To something more like this:

Usage: cfs getmountpoint <volid_at_realm> [<volid_at_realm> <volid_at_realm> ...]

	This did fix the problem, by the way.

> > find /coda/realm -noleaf -lname '@*'
> > 
> So what is happening here... After the update is made on the servers,
> they send out callback messages to all clients. There probably is a
> reasonable chance that one of the find's happens to traverse that part
> of the tree before the COP2 message arrives, and the client will notice
> that the version-vectors are not (yet) identical. So the client triggers
> resolution, which normally should have no problem resolving this because
> the combination of different version vectors, but identical store
> identifiers is identified as a missing COP2 update and the VVs are
> set to be identical and the client is happy. But then the delayed COP2
> hits, and the versions are different again.

	I don't understand why the versions are different after the delayed
COP2 hits.  When the VVs are set to be identical, then the COP2 hits, it
should hit with the same VV.  So why would this cause a conflict?
Shouldn't coda just assume that the files are identical?
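
	To make sure I am reading the quoted explanation correctly, here is
the classification step as I understand it, written as a toy model.
This is not Coda's resolution code: the function name, the tuple
representation of version vectors, and the result labels are all
assumptions made for illustration.  The only rule taken from the
explanation above is that different VVs with identical store
identifiers are treated as a missing COP2 update.

```python
def classify(vv_a, store_id_a, vv_b, store_id_b):
    """Compare two replicas of one object in a toy version-vector model.

    vv_a and vv_b are equal-length tuples of per-server counters.
    """
    if vv_a == vv_b:
        return "identical"
    # Rule from the quoted explanation: differing VVs but the same store
    # identifier means the last update is the same and only the COP2 is
    # missing, so the VVs can simply be made identical.
    if store_id_a == store_id_b:
        return "missing-cop2"
    # Standard version-vector dominance check (general technique, not
    # taken from the Coda source).
    if all(x >= y for x, y in zip(vv_a, vv_b)):
        return "a-dominates"
    if all(y >= x for x, y in zip(vv_a, vv_b)):
        return "b-dominates"
    return "conflict"
```

	In this model the "missing-cop2" case is harmless on its own, which
is why I would expect a late COP2 to change nothing.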

> It is fixable, but this is a bit harder. I think the server already
> keeps track of the COP2s/store-ids that it is expecting. I guess
> resolution has to clear that entry to make sure that the delayed COP
> message will be dropped on the floor. Not too familiar with this part of
> the code though, so I don't know if the solution is really as simple as
> it sounds.

	Seems to me like the solution could be on the client side of things,
ignoring COP2s with VVs identical to existing VVs...  Or am I not
getting how this works?
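
	For comparison, the server-side idea from the quoted reply (keep
track of the store-ids for which a COP2 is expected, and have resolution
clear the entry so that a delayed COP2 is dropped) might look something
like this in outline.  Again, this is only a sketch under assumptions:
the class and method names are hypothetical and do not correspond to
anything in the Coda server source.

```python
class PendingCop2Table:
    """Toy table of store-ids for which a COP2 message is still expected."""

    def __init__(self):
        self._pending = set()

    def expect(self, store_id):
        # Called when the first phase of an update has been applied.
        self._pending.add(store_id)

    def resolved(self, store_id):
        # Called when resolution has already made the VVs identical;
        # clearing the entry means a late COP2 no longer matches.
        self._pending.discard(store_id)

    def on_cop2(self, store_id):
        """Return True if the COP2 should be applied, False if dropped."""
        if store_id in self._pending:
            self._pending.remove(store_id)
            return True
        return False
```

	With this table, a COP2 that arrives after resolved() has run for the
same store-id returns False and would be dropped on the floor, while a
COP2 that arrives in time is applied exactly once.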

> > 4) We're back to having issues with clog (or so I believe).  To
> > reproduce this, you need to log in to coda (as the same user) over and
> > over again every two seconds or so, while in another window
> ...
> > while [ 1 ] ; do mv WORK/s/* WORK/t; mv WORK/t/* WORK/s; done
> ...
> > This will eventually kill coda and require a server restart.
> 
> No idea on that one. It actually succeeds at killing the server, that's
> pretty impressive.
> 
> Actually, I have one idea... If your server is still running, but
> becomes unresponsive, the following rpc2 patch may fix the problem.
> 
> http://www.coda.cs.cmu.edu/cgi-bin/viewcvs.cgi/rpc2/rpc2-src/secure.c?r1=4.14&r2=4.15

	I'll try that rpc2 patch, but it seems there's still some sort of
problem with logins killing existing write operations.  I don't know why
this should be.  It seems to me that if a user logs in again while already
logged in, the server should extend the expiration on their tokens but
otherwise do nothing.

	Thanks for looking into all of this, by the way.

..Patrick
-- 
Patrick Walsh
eSoft Incorporated
303.444.1600 x3350
http://www.esoft.com/

Received on 2005-07-05 10:32:23