On Thu, Nov 04, 2010 at 10:04:35PM +0100, u-codalist-f7q1_at_aetey.se wrote:
> I would like to revisit the old discussion about the dependency of Coda
> on the servers' static IPv4 addresses. It is mentioned among others in
...
> - let us make the presence of DNS SRV records mandatory (no big deal
> nowadays) and postulate that _all_ of the realm servers be present there,

- SRV records are not really supported everywhere. For one, as far as I
  know, our cs.cmu domain still doesn't support them and my home router
  has an older dnsmasq that definitely doesn't, etc.
- If you were really serious you probably should also require DNSsec.
- And for the really, really serious use, publish not just Coda/realm
  servers but also use DNS for the Coda volume name to server mapping,
  i.e. publish the existing VRDB/VLDB data through DNS.

But I don't see how this helps for the static IPv4 case at all, we aren't
really having trouble finding servers.

> - a client even currently fetches server addresses for a realm at the
> first access to the realm, this information is "mostly static", it can
> be also unambiguously ordered - let the client cache this information,
> including port numbers, per realm, until shutdown, refresh it on the
> first access after the next startup, failing this - go disconnected;

That is what the client mostly does, no server data is persistent, when we
start up we only have references to the ipv4 address in a few places
(mostly volume data structures).

Besides, I have a client on my laptop that hardly ever actually shuts
down, it gets suspended/resumed along with the OS, but I only reboot if I
need to run a new kernel. DNS caching lifetimes are very different from
Coda client lifetimes.
> - when the client resolves a volume location and expects to receive a
> server's IPv4 ip number, the volume location service would provide not
> an address but an index into the realm's servers' names/addresses array,
> the index will comfortably fit into the available 4 bytes;

This is not necessary. A lot of thought and work has already gone into
this, and you have an alternate solution for parts that are already
solved, but it does not address the hard parts.

Let me try to explain what we have and what (in my mind) still has to be
done.

done:
- We have a 'new' RPC2 call, 'ViceGetVolumeLocation', which was added
  around the end of 2006 and has been present since coda-6.9.1. This
  call, when given a volume replica id, will return the name (and
  optionally port) of the server as an ASCII string. The returned string
  is actually read from /vice/db/servers, i.e. the same as what the
  server itself resolves when it tries to discover the IPv4 addresses of
  all realm servers.

needed:
+ After obtaining the list of volume replicas that exist for a given
  replicated volume, the client should use ViceGetVolumeLocation to
  obtain the server names that host each replica.
+ Instead of allocating a non-persistent data structure for server
  information, the client should persistently store the server's name in
  a server-specific struct. The existing places where we store an IPv4
  address should be changed to store a pointer to this structure.
+ The client can use DNS to resolve the server name, store the returned
  addresses (addrinfo), and iterate over them when trying to contact a
  server.
+ To avoid blocking the client, resolution should be performed by either
  a pthreaded helper process (harder to implement in a reliable
  cross-platform fashion), or by using an asynchronous DNS library (last
  time I checked I didn't find an appropriately licensed version to be
  integrated into Coda or preferably RPC2).
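
To make the 'needed' items above a bit more concrete, here is a minimal
sketch in Python. It is not Coda or RPC2 code. ServerEntry, resolve,
resolve_in_background and first_reachable are hypothetical names invented
for this illustration, and a plain helper thread stands in for whatever
non-blocking resolution mechanism would really be used. The idea is only
that the per-server record keeps the name, resolution fills in a list of
addresses off the main path, and contacting a server means walking that
list.

```python
import socket
import threading
from dataclasses import dataclass, field


@dataclass
class ServerEntry:
    # Hypothetical stand-in for the persistent, server-specific struct.
    name: str   # server name, e.g. the string ViceGetVolumeLocation returns
    port: int   # port to contact (assumed to be known by the caller)
    addrs: list = field(default_factory=list)  # resolved (family, sockaddr) pairs
    lock: threading.Lock = field(default_factory=threading.Lock, repr=False)


def resolve(entry):
    """Blocking lookup; keeps the previously cached addresses on failure."""
    try:
        # SOCK_DGRAM is used here because RPC2 runs over UDP.
        infos = socket.getaddrinfo(entry.name, entry.port, type=socket.SOCK_DGRAM)
    except socket.gaierror:
        return False
    with entry.lock:
        entry.addrs = [(family, sockaddr) for family, _, _, _, sockaddr in infos]
    return True


def resolve_in_background(entry):
    """Run the lookup in a helper thread so the caller never blocks on DNS."""
    worker = threading.Thread(target=resolve, args=(entry,), daemon=True)
    worker.start()
    return worker


def first_reachable(entry, probe):
    """Try each cached address in order; probe is a caller-supplied callable
    that returns True when the server answered on that address."""
    with entry.lock:
        candidates = list(entry.addrs)
    for family, sockaddr in candidates:
        if probe(family, sockaddr):
            return sockaddr
    return None
```

Because getaddrinfo accepts literal addresses as well as hostnames, the
same path would cover a server entry that is just an IPv4 or IPv6 address
string.
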

I think there will still be a few surprises, but apart from the
GetVolumeInfo there are really not that many RPC operations that pass
around IPv4 addresses, and RPC2 is already able to work with IPv6.

This approach also correctly deals with multihomed servers as well as
differences when resolving a DNS name on an internal vs. external network.
Finally, in the extreme case someone should be able to put a single
ASCII-formatted IPv4/IPv6 address for a server in the /vice/db/servers
file if he has a static address but cannot get a resolvable hostname for
each server.

> It is quite possible that I am missing something of importance
> but in my eyes this would work, wouldn't need a heavy rewrite

The real problems are in the reverse direction. Servers have no reliable
way to know which incoming client connections represent the same client.
So the server doesn't know if a new callback connection is necessary, and
every time a client uses a different address to connect to the server,
the server has to treat it as a new client. This can lead to excessive
cache revalidation and long timeouts when old callback connections have
to be cleaned up (similar to what we're seeing with NAT gateways).

Using the same address doesn't necessarily mean it is the same context
either. A server may have multiple names/realms that could lead to it,
and as it doesn't know the context, callbacks may not be sent to every
cached copy of a file on a client.

An approach which should solve both of these cases was to log individual
file callbacks on the server, to provide a fast alternative for clients
to handle short disconnections, and use either client polling or at most
send only volume-level callbacks. With fewer callbacks we can send them
more aggressively, i.e. send a volume-level callback for each incoming
RPC2 connection that we know has cached data for the volume, but we then
don't have to send any further ones until the connection is used for
revalidation.
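
The "one volume-level callback per connection until it revalidates" rule
can be sketched as a small piece of server-side bookkeeping. This is an
illustration in Python only, not existing Coda code. The class
VolumeCallbackTracker and its method names are made up for the example,
and connections and volumes are represented by plain strings.

```python
class VolumeCallbackTracker:
    """Hypothetical bookkeeping: at most one outstanding volume-level
    callback per (connection, volume) until that connection revalidates."""

    def __init__(self):
        self.caching = {}     # volume id -> set of connections with cached data
        self.pending = set()  # (connection, volume) pairs already notified

    def note_cached(self, conn, volume):
        """Record that this connection has cached data for the volume."""
        self.caching.setdefault(volume, set()).add(conn)

    def volume_changed(self, volume):
        """Return the connections that need a volume callback right now."""
        to_notify = []
        for conn in sorted(self.caching.get(volume, ())):
            if (conn, volume) not in self.pending:
                self.pending.add((conn, volume))
                to_notify.append(conn)
        return to_notify

    def revalidated(self, conn, volume):
        """The connection was used for revalidation; callbacks resume."""
        self.pending.discard((conn, volume))
```

Since nothing here depends on recognizing that two connections belong to
the same client, a client that shows up from a new address simply counts
as one more connection with cached data.
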

> - a server changing its ip address will need all its clients to _restart_
> instead of _reinit_ (nothing short of reinit helps me today...);

Technically a restart should do it, because IPv4 addresses should only be
persistently stored in volume information, which should be revalidated
when we reconnect. However, there are either more places that store the
IPv4 address (realm servers data structure?) or we have an optimization
somewhere where cached volume location information is not revalidated.

> Of course, refreshing the result of DNS resolution of the server name
> list without taking the client down would make it even more attractive
> and make a possible server move fully transparent for the clients.

That is very hard to do. Even playing with very short DNS timeouts, there
will be some period where some clients would still use one address and
others already use the new address. Maybe if a server could have both
addresses for some time, that would make such a move truly transparent.

Jan

Received on 2010-11-06 12:38:34