On Wed, Sep 10, 2008 at 01:13:52PM +0000, Markus Pfeiffer wrote:
> On Tue, Sep 09, 2008 at 12:06:38PM -0400, Jan Harkes wrote:
> > On Mon, Sep 08, 2008 at 09:40:51PM +0000, Markus Pfeiffer wrote:
> > > What is the current status of ipv6 support for coda? I have seen that rpc2 has
> > > ipv6 support and that some of the daemons seem to communicate via ipv6 but
> > > some other things obviously do not work.
> >
> > Coda clients and servers use the ipv4 addresses as a 32-bit unique
> > identifier and the volume location query responses don't contain
> > hostnames, but ipv4 addresses.
> >
> > There is quite a bit of work left to get the clients and servers to the
> > point where we can actually use ipv6 addresses.
> > ...
>
> Are there proposals or whitepapers discussing the necessary protocol changes?
> Is there anything I could do to bring such efforts forward? I have a bit of
> spare time and coding experience.

I started working on this a while ago, and implemented what is most
likely the only protocol-level change necessary by adding an rpc call,

    ViceGetVolumeLocation(IN VolumeId volid, OUT RPC2_BoundedBS HostPort);

The server-side code for this call was implemented almost 2 years ago,
and is available in any Coda server since Coda-6.9.1.

The idea of this call is that when the client queries the volume
information for a replicated volume, it currently gets a list of volume
replica identifiers as well as ipv4 addresses. We can then ignore the v4
addresses and use ViceGetVolumeLocation to obtain a string that contains
the hostname (or address) and optionally a port number, sort of similar
to how the hostname/port number part is specified in a URL, i.e.

    just a hostname:        hostname
    hostname with port:     hostname:2432
    ipv4 address + port:    1.2.3.4:2432
    ipv6 address + port:    [2002:8002:ce58::42]:2432

(not sure if an ipv6 address without a port should use brackets).
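
To make the four string forms above concrete, here is a rough sketch of
how a client might split such a HostPort string. It is only an
illustration in Python, not code from Coda. The function name
parse_hostport is made up, the default port 2432 is an assumption taken
from the examples above, and accepting an unbracketed ipv6 address
without a port is just one possible answer to the open question about
brackets.

```python
def parse_hostport(s, default_port=2432):
    """Split a HostPort string into (host, port).

    Hypothetical helper: default_port is an assumption based on the
    port number used in the examples, not a documented default.
    """
    s = s.strip()
    if s.startswith("["):
        # Bracketed ipv6 literal, optionally followed by ":port".
        end = s.find("]")
        if end == -1:
            raise ValueError("missing closing bracket in %r" % s)
        host, rest = s[1:end], s[end + 1:]
        if rest == "":
            return host, default_port
        if not rest.startswith(":"):
            raise ValueError("unexpected text after bracket in %r" % s)
        return host, int(rest[1:])
    if s.count(":") == 1:
        # "hostname:port" or "1.2.3.4:port".
        host, port = s.split(":")
        return host, int(port)
    # No colon: bare hostname or ipv4 address.
    # Several colons: treated as an unbracketed ipv6 literal without a
    # port (an assumption, the format leaves this case undecided).
    return s, default_port


if __name__ == "__main__":
    for text in ("hostname", "hostname:2432", "1.2.3.4:2432",
                 "[2002:8002:ce58::42]:2432"):
        print(text, "->", parse_hostport(text))
```

The brackets are what make the ipv6 case unambiguous: without them
there is no way to tell whether a trailing ":2432" is a port or the
last group of the address.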
In any case, what the string could contain doesn't really matter that
much at the moment, because there is no client that makes this rpc2
call, so nobody is actually using the results. The current server-side
implementation simply copies any hostnames as it finds them in the
/vice/db/servers file.

So now that we can get a hostname (and optionally a port number) instead
of just a single ipv4 address, we get a nice level of indirection,
because this name can map to any number of ipv4 and/or ipv6 addresses
and the DNS results could even be changed on a geographic or network
location basis (intranet vs. public addresses for servers).

Some of the client changes that are necessary have to do with the fact
that srvent (the data structure representing a Coda server) is currently
not stored persistently. All places in the code that want access to the
srvent use the 32-bit ipv4 address as a lookup key, which is the only
information needed to create a new srvent object if it is missing. But
with hostnames we probably want to store them as part of the srvent in
RVM and have all places that refer to srvents by ipv4 address use either
a srvent* or some randomly assigned lookup key. Either way, reference
counting is probably needed.

Once we have a hostname-based server identifier we can use the hostname
whenever we create a new rpc2 connection. Of course this means that the
client is now doing a name lookup whenever a new RPC2 connection is
created. And these lookups are blocking, which is sort of a hindrance
for a userspace-threaded application because we cannot handle kernel or
network requests until the resolver is done.

So that is where I sort of got stuck, trying to find an asynchronous
resolver library that would allow linking with rpc2's LGPLv2 licensed
code and which is somewhat easy to use (like getaddrinfo) but still
allows a certain amount of control over caching.

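
To illustrate the blocking problem, here is a minimal sketch of one
generic way to keep a blocking getaddrinfo call out of the main loop:
run it on a small pool of worker threads and hand back a future that
can be polled. It is written in Python only to show the shape of the
idea and is not the approach taken in Coda. The names resolve_async and
addresses are made up, and it assumes real OS threads are available,
which may not fit a client built on userspace threads.

```python
import socket
from concurrent.futures import ThreadPoolExecutor

# Small pool of OS threads that are allowed to block in the resolver.
# Using OS threads at all is an assumption of this sketch.
_resolver_pool = ThreadPoolExecutor(max_workers=2)


def resolve_async(host, port):
    """Start a lookup and return a Future without waiting for it."""
    return _resolver_pool.submit(
        socket.getaddrinfo, host, port, socket.AF_UNSPEC, socket.SOCK_DGRAM)


def addresses(future):
    """Collect (family, sockaddr) pairs from a finished lookup.

    Calling this before the lookup is done blocks, so a main loop
    would check future.done() first.
    """
    return [(family, sockaddr)
            for family, _type, _proto, _canon, sockaddr in future.result()]


if __name__ == "__main__":
    pending = resolve_async("127.0.0.1", 2432)
    # ... the main loop would keep servicing other requests here ...
    print(addresses(pending))
```

Because the lookup goes through the ordinary system resolver, this gives
no extra control over caching, which was one of the requirements
mentioned above.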
The caching of lookup results is interesting, because if we assume
mobility of the clients the DNS-level timeout may be too generous, and
we probably also need to invalidate the cache whenever we move between
networks. On the other hand, maybe that is a non-issue if people already
use a nameserver cache like nscd or a DNS proxy such as dnsmasq.

Another option would be to fork off one or more helper processes,
similar to what squid does (used to do?), which can perform plain old
blocking DNS lookups with an off-the-shelf libresolv and maybe do some
caching of results and such.

Server-side there is a little bit related to callbacks and such, however
none of this needs to be stored persistently, so any necessary changes
should, as far as I can tell, be considerably easier.

> Also, if there are other pressing issues which might be easier to tackle I
> would also offer my help.

I don't really have a good list lying around. There is the list of
known/reported bugs, but the hardest part there is often figuring out if
the problem is still relevant, what may have been the cause, and finding
a way to reproduce the problem.

    http://www.coda.cs.cmu.edu/trac/report/1

There are of course a lot of things that need to be done,

- Faster cache revalidations and reducing the overhead of tracking
  outstanding callbacks by keeping track of all updates in a log and
  having clients query the log after reconnection, when they receive a
  volume-level callback, or periodically.

- Improve the way directory data is stored. Avoid size issues by storing
  directories in container files instead of RVM. Use RVM to track
  uncommitted updates to maintain the existing consistency guarantees.
  Teach the kernel modules to read directly from the container-file
  representation instead of having venus translate the in-RVM directory
  data structure into a BSD-FFS 'inspired' on-disk representation, which
  is in turn parsed by the kernel modules into whatever the VFS needs.

- Allow people to use LDAP instead of pdbtool to manage Coda users and
  groups. This would enable users to manage their own groups and
  possibly improve integration into existing systems.

Some experiments with how a Coda user could interact with an LDAP-based
backend can be found in /coda/coda.cs.cmu.edu/usr/jaharkes/ldap/. There
is a script 'codapts' which provides functionality similar to AFS's pts
(and Coda's pdbtool) commands. It is also hardcoded to use a single
server which is not even running an ldap daemon anymore.

The 'token.py' script in the same directory was working towards creating
an openldap plugin that could return valid Coda tokens for authenticated
LDAP users. The idea being that it should be possible to even take auth2
out of the loop.

Jan

Received on 2008-09-11 17:03:31