Coda File System

Re: Q: Is Partial Replication Possible?

From: Jan Harkes <jaharkes_at_cs.cmu.edu>
Date: Wed, 26 Feb 2003 16:31:53 -0500
On Wed, Feb 26, 2003 at 02:28:47PM -0500, Brian White wrote:
> > There have been, of course, a number of research projects on similar
> > things (see Darrell Long's Swift, TicketTAIP, and CHEOPS, to name just
> > three).  AFAIK, none of that has been integrated with Coda.  Would make
> > an interesting project.
> 
> It would.  It's actually one I started some serious thinking about at one
> point, but the reality is that I don't have the time or the knowledge of
> the linux kernel to actually undertake a project like that.

99% of the Coda code runs in userspace, and the remaining one percent only
covers the interaction between a Coda client and the kernel. So hacking the
Coda servers doesn't require any Linux kernel knowledge.

Besides, if you really did need to hack the kernel for such a project, keep
in mind that we don't just have a Linux kernel module, but also kernel
modules for FreeBSD, NetBSD, Solaris, Win9x and WinNT/2000/XP.

Which makes any kernel change a big pain in the behind ;)

On Wed, Feb 26, 2003 at 02:35:36PM -0500, Brian White wrote:
> A new server in the group would initially start out as just a blank holding
> disk.  When it needed a file, that file would be fetched and cached locally.

The client would have a hard time keeping track of which servers might hold
a copy. It would also be harder to make sure that all copies get updated
and to keep track of the resulting version changes. So it would add a
considerable amount of overhead just to record where recent copies are
and, when something is updated, which server(s) are responsible for
notifying the clients.
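To make that bookkeeping concrete, here is a minimal sketch (hypothetical
names, not actual Coda code) of the per-object state a client would have to
maintain under such a scheme: a list of replica locations, each tagged with
a version vector, so the client can tell which copies are stale:

```python
# Hypothetical sketch, not Coda code: per-object metadata a client would
# need if any server could hold (or later drop) a replica of the object.

def dominates(vv_a, vv_b):
    """True if version vector vv_a subsumes every update recorded in vv_b."""
    servers = set(vv_a) | set(vv_b)
    return all(vv_a.get(s, 0) >= vv_b.get(s, 0) for s in servers)

class ObjectLocation:
    def __init__(self):
        self.replicas = {}  # server name -> version vector of its copy

    def record(self, server, vv):
        self.replicas[server] = dict(vv)

    def fresh_servers(self):
        """Servers holding a copy that no other copy strictly supersedes."""
        def strictly_newer(a, b):
            return dominates(a, b) and not dominates(b, a)
        return [s for s, vv in self.replicas.items()
                if not any(strictly_newer(other, vv)
                           for other in self.replicas.values())]
```

Even this toy version shows the problem: every update has to propagate a
new version vector to everyone tracking the object, which is exactly the
metadata traffic that hurts scalability.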

The same holds for servers: during conflict resolution they need to find
all other copies of an object. And it would be a mess when a server doesn't
have a full directory tree, because then we would need some form of
distributed loop detection.

Sure, it is possible, but it is a fairly radical change compared to the
current replication strategies... It would be more of a P2P application,
where there is less of a distinction between clients and servers and
questions like 'who do you trust' come up. Perhaps cluster-oriented
filesystems such as Intermezzo or Lustre, rather than wide-area
distributed filesystems, are closer to what you are looking for.

> Once held on this new server, another server could purge its copy to make
> space for something else it would like to cache.  As long as there were
> at least three (for example) other copies out there somewhere, a server
> would know that it was free to purge its local copy and the system as
> a whole would still meet the minimum required redundancy.

Then you need to keep track of where the copies are, which means either a
centralized location service, which would still give you the same
scalability bottleneck (data transfers are pretty efficient; it is the
metadata that hinders scalability), or some distributed scheme, which
would be a whole other research project by itself.
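For what it's worth, the centralized variant of the purge rule quoted above
is simple to sketch (hypothetical names, not Coda code): a location service
that records which servers hold each object and only lets a server drop its
copy when enough other replicas would remain:

```python
# Hypothetical sketch of the centralized variant: one location service
# tracks which servers hold each object; a server may only purge its
# copy when at least MIN_REPLICAS other copies remain.

MIN_REPLICAS = 3

class LocationService:
    def __init__(self):
        self.holders = {}  # object id -> set of servers holding a copy

    def add_copy(self, obj, server):
        self.holders.setdefault(obj, set()).add(server)

    def may_purge(self, obj, server):
        others = self.holders.get(obj, set()) - {server}
        return len(others) >= MIN_REPLICAS

    def purge(self, obj, server):
        if not self.may_purge(obj, server):
            return False  # would drop below the required redundancy
        self.holders[obj].discard(server)
        return True
```

Note that every add_copy and purge for every object goes through this one
service, which is precisely the metadata bottleneck mentioned above; the
distributed version avoids that but is a research project of its own.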

Jan
Received on 2003-02-26 16:37:18