On Fri, Oct 26, 2007 at 10:29:50AM +0200, u+codalist-p4pg_at_chalmers.se wrote:
> I have discovered a sad consequence of how Venus is using DNS.
>
> - it relies on DNS to find realm servers -- perfectly correct
> - it tries to refresh its knowledge about the ip-addresses of the servers,
>   which is very, very good
>
> Unfortunately, as soon as we physically disconnected a machine
> from the net so that it can not reach DNS, it becomes hardly possible to
> use Coda. Venus is making DNS queries all the time and waiting for answers
> which never come.

There must be something very different in the way your system is set up
compared to mine.

First of all, as far as I know we only refresh the realm addresses when
the client is restarted, so if you don't actually shut down the client or
the machine, it works just fine when the network disappears (modulo the
60-90 second RPC2 timeout).

Now if venus is restarted but the network is not available, DNS queries
actually do time out if there is no response from the DNS servers. In that
case Coda falls back on previously cached information for the realm.

The DNS timeout could be quite long, because there are several levels of
fallback going on. At the highest level, venus first tries SRV records and
then falls back on doing a normal A record lookup. Below that, the resolver
library may try various aliases that are defined by the 'search' option in
/etc/resolv.conf (although I think I've tried to disable that type of
expansion), as well as sequentially trying each defined upstream DNS
server. So if you have 3 servers it actually iterates over each of those
before it gives up. And on each following query it will probably just try
all of the servers again. libresolv by itself doesn't do any caching, so
each lookup has to go across the net.
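To get a feel for how those levels of fallback multiply, here is a rough
back-of-the-envelope sketch in Python. It is only a simplified model, not
what venus or libresolv actually compute: it assumes every query waits the
full timeout on every server for every attempt, with no backoff. The
function name and its parameters are made up for illustration; the 5 second
timeout and 2 attempts are the documented resolv.conf defaults.

```python
# Rough worst-case wait for one realm lookup when no DNS server answers.
# Assumption (simplified model, not taken from venus or libresolv source):
# each query waits the full timeout on each server for each attempt.

def worst_case_wait(record_types, search_names, servers, timeout_s, attempts):
    """Seconds spent before every level of fallback has given up."""
    # One query per record type (SRV, then A) per candidate name.
    queries = record_types * search_names
    # Each query walks all servers, once per attempt.
    per_query = servers * timeout_s * attempts
    return queries * per_query


# SRV + A lookups, search expansion disabled (1 name), 3 nameservers,
# resolv.conf defaults of timeout:5 and attempts:2.
print(worst_case_wait(record_types=2, search_names=1, servers=3,
                      timeout_s=5, attempts=2))  # prints 60
```

With 'search' expansion enabled the number of candidate names goes up and
the total grows by the same factor.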
Having a local DNS cache helps a lot because it caches both successful
and failed lookups and avoids a lot of network traffic, and in some
cases it also handles things like only sending DNS queries to servers that
are known to be reachable. I've successfully used both dnsmasq and
pdns.

> One workaround is to ensure that the network is logically shut down,
> there are no interfaces/routes for Venus to use.

Something like ifplugd or networkmanager can automatically bring the
interface up or down when the network cable is connected or removed.

> Another workaround is to ensure that /etc/resolv.conf does not contain
> any DNS server addresses.
>
> Both have two major drawbacks though:
> - require root privileges and possibly questionnable changes
>   in the local setup
> - represent an extra burden
>
> (Apparently, some setups tend to do it automatically at network disconnection
> but we can not rely on that!)
>
> Any other workaround-via-client-setup like using "realms" file seems
> as bad or worse. Disconnected operation is an inherent feature of Coda
> and should be supported out of the box without relying on extra local tweaks.

Install networkmanager or ifplugd, or use a local DNS cache like dnsmasq,
pdnsd, or even a caching-only bind/named.

> Venus can gracefully handle rpc2 timeouts,
> it might be possibly taught to handle DNS timeouts gracefully as well?
>
> How hard is it? Do we need a non-standard/non-existent DNS-resolver
> library?

Right, the lack of DNS caching is really a libresolv issue. DNS has some
knowledge about how long a record is valid. You can see it if you use
'dig -ta coda.cs.cmu.edu'; the number before 'IN' is the number of seconds
after which that record needs to be revalidated. But this information is
not available to an application that uses the standard gethostbyname or
getaddrinfo libc calls. So really the application can't be expected to
correctly handle caching.
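As an illustration of why an application can only guess here, below is a
minimal Python sketch of an in-process cache that remembers both successful
and failed lookups. It is not how venus or dnsmasq work. The class name
LookupCache and the ok_ttl/fail_ttl values are hypothetical, and the fixed
lifetimes are exactly the guesswork described above, since getaddrinfo
returns addresses without any TTL.

```python
import socket
import time


class LookupCache:
    """Tiny positive/negative cache in front of a resolver function."""

    def __init__(self, resolve=None, ok_ttl=300.0, fail_ttl=30.0,
                 clock=time.monotonic):
        # ok_ttl and fail_ttl are arbitrary guesses, not DNS record TTLs.
        self._resolve = resolve or self._system_resolve
        self._ok_ttl = ok_ttl
        self._fail_ttl = fail_ttl
        self._clock = clock
        self._entries = {}  # host -> (expires_at, addresses or None)

    @staticmethod
    def _system_resolve(host):
        # getaddrinfo yields (family, type, proto, canonname, sockaddr)
        # tuples; none of the fields carries the record's TTL.
        infos = socket.getaddrinfo(host, None, socket.AF_INET,
                                   socket.SOCK_STREAM)
        return sorted({info[4][0] for info in infos})

    def lookup(self, host):
        """Return a list of addresses, or None for a (cached) failure."""
        now = self._clock()
        hit = self._entries.get(host)
        if hit is not None and hit[0] > now:
            return hit[1]  # still fresh, no network traffic
        try:
            addrs = self._resolve(host)
            self._entries[host] = (now + self._ok_ttl, addrs)
        except OSError:  # socket.gaierror is a subclass of OSError
            addrs = None
            self._entries[host] = (now + self._fail_ttl, None)
        return addrs
```

A caller would create one LookupCache() and call lookup("coda.cs.cmu.edu");
repeated calls within the guessed lifetime return the remembered answer,
including a remembered failure, without querying the resolver again.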
But libc6 or libresolv in many cases isn't caching either; maybe it does
if you install something like nscd, but I found that installing dnsmasq
tends to be simpler and more predictable (and more configurable).

Jan

Received on 2007-10-27 01:36:34