On Wed, May 05, 2010 at 07:25:04AM +0200, Wolfgang.Liebich_at_siemens-enterprise.com wrote:
> When I move some files into a directory on my coda volume, I get a
> local/global conflict (I only have one server for this volume), and
> venus.log says
>
> [ W(17) : 0000 : 07:08:39 ] ClientModifyLog::MarkFailedMLE: failed reintegrating: chown <filename>

Interesting. Well, chown is a 'privileged' operation; I think as far as
Coda is concerned only members of the System:Administrators group are
allowed to use it. But you are probably copying files as root, or with
an application that just calls chown and figures that, when it is not
running as root, the call will fail, return an error that is ignored,
and be a no-op.

Coda works a bit differently, because a client cannot be sure whether
your Coda identity is a member of the System:Administrators group. So it
dutifully logs the chown operation and acts as if it succeeded, but then
when you reintegrate the server rejects the operation, causing a
conflict, because your client's state is now based on the false
assumption that chown worked.

There are some other odd things around ownership. Files are initially
'owned' by the local userid, and after reintegration the owner is
changed to Coda's internal 'Coda userid' value. This confuses some
applications (OpenOffice.org), because they believe they cannot write to
the file if the uid doesn't match, and they don't use something like
access(2) or simply try to open with O_RDWR, both of which would work.

So maybe in the long run it would be in Coda's best interest to start
_completely_ ignoring user id values. We already mostly ignore modebits:
only owner 'r--' bits are interpreted as overriding the write directory
acl, and we explicitly strip setuid bits in both client->server and
server->client directions. Not replacing the uid with some internal Coda
uid which has no relation to anything happening on the clients when a
file is created or written to seems to make sense to me.
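To make the access(2) versus uid-comparison point concrete, here is a
minimal Python sketch. The function names are made up for illustration
and this is not taken from OpenOffice.org or from Coda; it only contrasts
the fragile ownership guess with the two checks mentioned above, on a
POSIX system.

```python
import os


def naive_can_write(path):
    """Fragile guess: assume we can write only if we own the file.

    This is the pattern that breaks when a filesystem reports an owner
    uid that has nothing to do with the local user.
    """
    return os.stat(path).st_uid == os.geteuid()


def can_write(path):
    """Ask the filesystem via access(2) instead of comparing uids."""
    return os.access(path, os.W_OK)


def can_open_rdwr(path):
    """Simply try to open the file read-write and see if it works."""
    try:
        fd = os.open(path, os.O_RDWR)
    except OSError:
        return False
    os.close(fd)
    return True
```

An application that uses can_write() or can_open_rdwr() does not care
which uid the file appears to be owned by, only whether the write is
actually permitted.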
Because clients don't have synchronized /etc/passwd files and pretty
much everyone's local UID on their own machine is 1000, almost every
file is probably going to end up with uid 1000. This could lead to some
end-user confusion.

What to do with chown is probably a slightly harder decision. One option
would be to always return permission denied on the client, but some
applications may actually check the return code and fail badly. Another
option is to report success but not actually change the uid, which I
think will break rpm and dpkg. The third option is to just let anyone
chown any file (that they have write permission for), because if they
control their local machine and have write ACL rights they could have
written to the file with whatever uid they please, effectively changing
the uid to whatever they want if we aren't replacing it with the
internal Coda userid anymore. This is something I need to ponder over a
little longer.

> The "console" log says
>
> 07:08:39 Reintegrate: users:liebichw, 2/3 records, result = Permission denied
>
> When I try to do a repair, I get the error message "pathname not
> leftmost".

In this case, a local-global reintegration conflict that failed on a
permission denied error, you probably just want to run
'cfs discardlocal /path/to/volume'. It will drop only the first entry in
the modification log, which should be the rejected chown operation.

> When I call "cfs beginrepair" and then "cfs endrepair", I kill venus.
>
> Last log message in console:
>
> Assertion failed: 0, file "fso1.cc", line 1388
>
> WTF ????
> Dazed and confused,

Turning a file/directory into a dangling symlink and then into an
expanded directory with the original file as a child really messes with
some of the internal parent/child linkage. When repair ends all the
temporary files and directories are removed, but the original object
isn't immediately linked back into the directory hierarchy.
There are various reasons for this: it may have been discarded or moved
during repair, and some of the necessary information got lost during the
expansion process. The client relies on path traversal from the root to
the conflicting object to reconstruct the right linkage, but in some
cases, when there is an active reference to an in-kernel directory cache
entry, we don't actually get to see every single directory lookup. So
when you tried to expand again we did not yet have the conflict linked
back into the tree, and we failed to find the parent directory.

Jan

Received on 2010-05-05 11:48:55