Coda File System

Re: odd assertion keeping server from starting - any clues?

From: Jan Harkes <jaharkes_at_cs.cmu.edu>
Date: Wed, 1 Mar 2006 13:05:01 -0500
On Wed, Mar 01, 2006 at 09:34:30AM -0500, Greg Troxel wrote:
> Jan Harkes <jaharkes_at_cs.cmu.edu> writes:
> > The log messages are probably still remnants from the time Coda used a
> > dedicated partition to store container files instead of a filesystem
> > tree and directly accessed container files with some userspace
> > implementation of iget/vget.
> 
> So the log message should be fixed, or perhaps inoder should print out
> the container file name.
> 
> The RT bug tracker on the web site seems non-functional (I couldn't
> view the existing tickets with 'open tickets', and search got me a
> perl stack backtrace.)  What's the canonical place to record bugs like this?

Interesting, I upgraded it and it seemed to work fine after the upgrade,
but I guess it only works right for logged-in users.

You should be able to send to bugs_at_coda.cs.cmu.edu. I'm looking at what
might have broken the RT stuff.

> I did find inode 1562 with a link count of 1, and 1563 was apparently
> unallocated.

Ok, 1562 probably is associated with a completely different file. In
this case 1563 was not known to be allocated, but the file actually did
exist on disk.

> Ah, so container file changes need to be part of RVM, but that's hard.
> Perhaps a new container file needs to be allocated on each update with
> a RVM transaction to point the fid at it, or something like that.

It is more difficult. I think Coda servers update the container files
and bitmaps in such a way that we should not lose data when the RVM
transaction is aborted, but we might leave a newly created container on
disk without anything referencing it.
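
To illustrate the leftover-container case, here is a minimal sketch of how
such orphans could be spotted. It is not the server's code: the function
name and the two plain Python collections standing in for the on-disk
containers and the RVM references are assumptions made for the example.

```python
def orphaned_containers(on_disk, rvm_referenced):
    """Return container inode numbers present on disk but unreferenced in RVM.

    on_disk        -- iterable of inode numbers that exist as container files
    rvm_referenced -- iterable of inode numbers that some RVM entry points at
    Both arguments are hypothetical stand-ins for the real server state.
    """
    # A container created before an aborted transaction shows up on disk
    # only, so a set difference is enough to find it.
    return sorted(set(on_disk) - set(rvm_referenced))
```

With on-disk containers 1562 and 1563 and only 1562 referenced from RVM,
this reports 1563 as the leftover.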

> > If the server is trying to create an empty file container removing the
> > container file should fix the problem. If the fso happens to be a
> > directory it might crash later on because the newly created object
> > doesn't have '.' and '..' entries and such.

Actually, Satya already corrected me privately on this one: directories
are not stored in container files, so this can't happen as the result of
removing a container file.

> I'm back up - I guess we'll see.
> 
> I did decrement an inode, but apparently I should not have.  I suspect
> I should reinit my server, but perhaps I can first add a second
> replicated server, take this one out and add it back to avoid having
> to restore.

If you decrement to any value larger than 0, the server should be able
to fix up the refcount during salvage because it will notice that the
file has N places in RVM that point at it, but only N-1 references for
the on-disk copy. But I think a side-effect of dropping the refcount to
0 is that the container is removed, so some file would have lost its
contents and during salvage the server will recreate an empty container
for it.
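
Roughly, the fix-up described above could be modelled like this. This is
only a sketch under my own assumptions, not the actual salvager: the
function name and the two dictionaries (RVM reference counts and on-disk
link counts, both keyed by inode number) are made up for illustration.

```python
def salvage_refcounts(rvm_refs, disk_links):
    """Reconcile on-disk link counts with the references recorded in RVM.

    rvm_refs   -- dict: inode number -> number of places in RVM pointing at it
    disk_links -- dict: inode number -> link count of the on-disk container;
                  an inode missing here means the container was removed
    Returns (fixed_links, recreated), where recreated lists the inodes that
    had to get a new, empty container.
    """
    fixed_links = {}
    recreated = []
    for inode, wanted in rvm_refs.items():
        have = disk_links.get(inode, 0)
        if have == 0:
            # The refcount hit 0, so the container is gone. The contents are
            # lost and an empty container takes its place.
            recreated.append(inode)
        # In both cases the link count ends up matching the RVM references,
        # e.g. N-1 on disk is bumped back to N.
        fixed_links[inode] = wanted
    return fixed_links, sorted(recreated)
```

So a container with 2 references in RVM and a link count of 1 on disk is
simply corrected to 2, while one that was decremented to 0 comes back as an
empty file.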

> Or can backup/restore preserve acls and mod times now?

Backup/restore always did ACLs, but it restores volumes as read-only
non-replicated volumes, and I don't know yet what would be needed to
turn these back into read-write volume replicas which can then be used
to reconstruct the replicated volume.

codadump2tar doesn't yet know how to deal with ACLs. It probably
could map Coda user/group IDs to names by looking them up in the pdb
database and then add a shell script to the tarball which is placed in
the top-level directory of the restored volume that contains a bunch of
'cfs setacl ./path/to/dir username/groupname acl-rights' commands to
restore the ACLs.
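
A minimal sketch of what generating such a script might look like.
codadump2tar does not do this today, and everything here is assumed for the
example: the function name, the shape of the `acls` input, and the `names`
dictionary standing in for a pdb lookup. The command layout just follows
the 'cfs setacl' form given above.

```python
import shlex


def acl_restore_script(acls, names):
    """Build a shell script of 'cfs setacl' commands, one per ACL entry.

    acls  -- dict: directory path -> list of (coda_id, rights) tuples
    names -- dict: coda_id -> user or group name (hypothetical pdb lookup)
    """
    lines = ["#!/bin/sh"]
    for path in sorted(acls):
        for coda_id, rights in acls[path]:
            # Fall back to the numeric id when no name is known for it.
            name = names.get(coda_id, str(coda_id))
            # Quote every argument so odd directory names survive the shell.
            lines.append("cfs setacl {} {} {}".format(
                shlex.quote(path), shlex.quote(name), shlex.quote(rights)))
    return "\n".join(lines) + "\n"
```

The resulting text would be written to a file in the top-level directory of
the tarball and run from there after the restore.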

Jan
Received on 2006-03-01 13:06:46