Coda File System

Re: coda very slow on roadwarrior

From: Enrico Weigelt <weigelt_at_metux.de>
Date: Thu, 8 Mar 2007 19:33:22 +0100
* Jan Harkes <jaharkes_at_cs.cmu.edu> wrote:

Hi,

> > So we could either increase the size of the filetable (maybe such
> > partial information could be stored a little more efficiently)
> > or additionally store these replies as file-data.
> 
> Since the file system doesn't know which attributes the application is
> interested in when it calls stat, we would still need to get all of it.

hmm, right :(

But we could transmit just the differences, i.e. the fields changed
since the last transmit. We add a bitmask field to the stat block
which tells us which fields are actually filled. Empty fields are
simply left out.

The new stat request also contains a bitmask which tells the server
which fields we're actually interested in. With some additional heuristics
the server could send more fields than the client requested.
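
A rough sketch of the idea, in Python just for brevity. The field names
and bit values are made up for illustration and are not Coda's actual
wire format; the server side only sends requested fields that changed,
and the client merges them into its cached copy:

```python
# Hypothetical bit assignments, one bit per stat field (not Coda's wire format).
FIELD_BITS = {"mode": 0x01, "uid": 0x02, "gid": 0x04, "size": 0x08, "mtime": 0x10}


def encode_stat(current, last_sent, wanted_mask):
    """Server side: return (mask, values) holding only the requested
    fields whose value differs from what was sent last time."""
    mask = 0
    values = []
    for name, bit in FIELD_BITS.items():
        if wanted_mask & bit and current[name] != last_sent.get(name):
            mask |= bit
            values.append(current[name])
    return mask, values


def decode_stat(mask, values, cached):
    """Client side: merge the transmitted fields into the cached stat block."""
    merged = dict(cached)
    remaining = iter(values)
    for name, bit in FIELD_BITS.items():
        if mask & bit:
            merged[name] = next(remaining)
    return merged
```

Fields are packed in the fixed order of FIELD_BITS, so the mask alone
is enough for the receiver to know which value belongs to which field.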

> Also Coda clients use the getattr result to detect conflicting versions,
> while file and directory contents are only fetched from a single server.
> So if we piggyback file attributes with the directory data the client
> would not see differences between replicas until we try to revalidate
> the cached attributes, or fetch the data from another replica and notice
> the version mismatch.

Well, we could do this with a separate notification. For example, the
client gives a list of files it has in cache, and the server then sends
change notifications automatically. In other words: the client subscribes
to certain filesystem objects.
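
To make the subscription idea concrete, here is a toy model in Python.
The class and method names are invented for this mail and say nothing
about how the Coda servers are actually implemented; the "outbox" just
stands in for whatever transport would carry the notifications:

```python
from collections import defaultdict


class NotifyingServer:
    """Toy server that remembers which client caches which object."""

    def __init__(self):
        self.versions = {}                    # path -> version counter
        self.subscribers = defaultdict(set)   # path -> set of client ids
        self.outbox = defaultdict(list)       # client id -> pending notifications

    def subscribe(self, client_id, cached_paths):
        # The client hands over the list of objects it holds in its cache.
        for path in cached_paths:
            self.subscribers[path].add(client_id)

    def update(self, path, writer_id):
        # Bump the version and queue a notification for every subscriber
        # except the client that made the change.
        self.versions[path] = self.versions.get(path, 0) + 1
        for client_id in self.subscribers[path]:
            if client_id != writer_id:
                self.outbox[client_id].append((path, self.versions[path]))
```

With something like this, a client would learn about a changed version
without having to revalidate its cached attributes first.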

> Once the subdir is opened once the in-kernel directory cache will
> contain all the necessary information and we don't need the lookups
> anymore. But if nothing is cached the path is resolved by doing
> repeated lookups,
> 
>     d1 = lookup(root, 'coda')
>     d2 = lookup(d1, 'coda.cs.cmu.edu')
>     d3 = lookup(d2, 'usr')
>     ...
> 
> And to do these lookups we need to get the directory contents, because
> only directory data can tell us how to map the name 'coda' in the root
> directory to the next-level object.

hmm, seems this is a kernel issue: if the kernel could tell the fs
driver that some subdir is accessed (i.e. passing the whole pathname
instead of walking through the dirs), we could handle this in one step.
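
A small Python sketch of the difference, counting requests to the fs
driver. The lookup_path call is purely hypothetical (no such kernel
interface is implied here), and the tree is a toy stand-in for real
directory data:

```python
# Toy directory tree: each directory maps a name to its child object.
TREE = {"coda": {"coda.cs.cmu.edu": {"usr": {}}}}


class ToyFs:
    def __init__(self, tree):
        self.tree = tree
        self.requests = 0   # number of requests that reach the fs driver

    def lookup(self, directory, name):
        # One request per path component, as in the repeated lookups above.
        self.requests += 1
        return directory[name]

    def lookup_path(self, directory, path):
        # Hypothetical: the whole pathname arrives in a single request.
        self.requests += 1
        for name in path.split("/"):
            if name:
                directory = directory[name]
        return directory


def walk(fs, directory, names):
    """Resolve a path the current way, one lookup per component."""
    for name in names:
        directory = fs.lookup(directory, name)
    return directory
```

Walking "coda/coda.cs.cmu.edu/usr" costs three requests with walk()
and one with lookup_path(); the driver still needs the same directory
data, it just gets asked once.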

> Nowadays servers have plenty of memory efficient ways to poll many
> sockets so TCP may have become a feasible alternative, but there are
> still some properties about the existing UDP based protocol that are
> useful to us such as a predictable detection of dead servers, the ability
> to have 1000's of mostly inactive connections between clients and
> servers, etc., and we get to query our networking layer about observed and
> estimated latency and bandwidth values to various servers.

We could provide both TCP and UDP and leave the decision to the 
administrator. 

BTW: TCP would be good for people sitting behind a firewall,
for stream encapsulation (e.g. ssh/ssl), etc.

> A client typically has many (logical) connections to a server, one for
> each internal thread for each user, so a client with 2 or 3 users can
> easily have 40-60 open connections to a server, multiply that by a
> hundred clients or more and you'd start hitting fd limits on many
> systems. And most connections are not used that much, so TCP connections
> would probably be somewhat sluggish (slow-start) whenever the first rpc
> is sent.

One TCP session per host and user is probably not good for larger
installations (for my site it would be okay). Maybe one session
per host.
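
One session per host would mean multiplexing the logical connections
over a single stream. A minimal Python sketch of how that could be
framed; the header layout (32-bit connection id, 16-bit payload length)
is an assumption for illustration and has nothing to do with the
existing RPC2 packet format:

```python
import struct

# Hypothetical frame header: connection id (uint32) and payload length
# (uint16), both in network byte order.
HEADER = struct.Struct("!IH")


def frame(conn_id, payload):
    """Prefix a payload with its logical connection id and length.
    Payloads over 65535 bytes do not fit the 16-bit length field."""
    return HEADER.pack(conn_id, len(payload)) + payload


def deframe(stream):
    """Split received bytes into (conn_id, payload) pairs.
    Returns the complete frames plus any trailing incomplete bytes."""
    frames = []
    offset = 0
    while len(stream) - offset >= HEADER.size:
        conn_id, length = HEADER.unpack_from(stream, offset)
        end = offset + HEADER.size + length
        if end > len(stream):
            break   # frame not fully received yet
        frames.append((conn_id, stream[offset + HEADER.size:end]))
        offset = end
    return frames, stream[offset:]
```

That keeps the fd count at one per client host, at the price of all
logical connections sharing one stream's ordering and congestion state.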


cu
-- 
---------------------------------------------------------------------
 Enrico Weigelt    ==   metux IT service

  phone:     +49 36207 519931         www:       http://www.metux.de/
  fax:       +49 36207 519932         email:     contact_at_metux.de
  cellphone: +49 174 7066481
---------------------------------------------------------------------
Received on 2007-03-08 13:37:52