On Mon, Oct 17, 2005 at 09:51:20PM -0500, Jerry Amundson wrote:
> OK. I thought the whole point was that the client wouldn't *need* to talk to
> one server, or any for that matter. Hence, "disconnected operation". Having
> "everything die" is not what I'm looking for here...

Well, you do need to somehow connect to the replicated group at least
once. From the startup log messages it was clear that your client was
starting from scratch and had never talked to the servers before. So it
didn't have any information cached about what the name of the root volume
was.

Also, the servers are pretty dumb, and although they can resolve some
types of conflicts, they have to be told where the conflict is in the
first place by a client. This can only happen when a client can see all
servers at the same time.

> I've re-initialized everything and put /etc/hosts the way it was (the server
> as localhost). As expected, startserver fails, killing vice-setup. That's

Ehh, I just said in my previous email that having the server resolve to
the localhost address is wrong. So why did you put it back the way it
was?

> [root_at_aspen vice]# getvolinfo docs aspen
> RPC2 connection to docs:2432 failed with RPC2_NOBINDING (F).

Swapped args. I see in your follow-up email that you already figured that
one out.

> Ugh. Sorry for the rambling. I just don't see why the "ipaddress" setting is
> described in server.conf, as I get the impression Coda servers just won't
> work with the server as 127.0.0.1 in /etc/hosts? Yes, perhaps RedHat is
> brain-damaged for defaulting it that way, but it's what I have to work
> with...

Well, I think that in a way RedHat is slightly brain-damaged for
defaulting that way, but the real problem is that on a very low level,
Coda is doing something wrong. We are passing volume location information
around based on the IPv4 address of the server, and are doing the
name->address mapping by resolving on the server.
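
To check whether a given server host is affected, one can look at what its own hostname resolves to locally. The following is only a minimal Python sketch, not part of Coda; the function names `resolve_ipv4` and `loopback_addresses` are made up for illustration. It resolves a name the way the server host itself would and picks out any loopback results, which are the addresses that are useless to a remote client.

```python
import ipaddress
import socket


def resolve_ipv4(hostname):
    """Return the IPv4 addresses the local resolver gives for hostname."""
    infos = socket.getaddrinfo(hostname, None, socket.AF_INET, socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr),
    # and for IPv4 the sockaddr is an (ip, port) pair.
    return sorted({info[4][0] for info in infos})


def loopback_addresses(addresses):
    """Return only the addresses that fall inside 127.0.0.0/8."""
    return [a for a in addresses if ipaddress.ip_address(a).is_loopback]
```

Run on the server itself, `loopback_addresses(resolve_ipv4(socket.gethostname()))` should come back empty; if it returns 127.0.0.1, the /etc/hosts entry for the server name is the thing to fix.
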
Now 127.0.0.1 is a perfectly fine address to use to contact the
Coda-server if you happen to be on the same machine. But a client is
typically not on the same machine and 127.0.0.1 is useless/misleading
information in that case.

To really fix the problem, the volume location information should be
passed to the client in the form of the unresolved fqdn hostname of the
server. The client can then resolve that to (the list of) valid
address(es) for that server. If the client happens to run on the same
machine, it can just as well use 127.0.0.1, if it is on the private
network it might use 192.168.x.x, and remotely it may use the real public
ip-address, or even several fallback addresses when we're talking to
multi-homed hosts.

But changing that in the code isn't totally straightforward. Right now
the RPC2 GetVolumeInfo request doesn't have the space to store the fqdn
names, the client internally indexes servers based on a single 32-bit
value which is assumed to be the ipv4 address, etc. And we'd have to deal
with things like timing out stale DNS data, or re-resolving when we move
from one network to another.

So the simple change right now is to make sure that the server hostname
maps to a single public ip-address where the server can be reached.

Jan

Received on 2005-10-18 16:12:33
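
As a rough illustration of the client-side approach described in the message above, here is a Python sketch. It is not Coda code: `ServerEntry`, its `ttl` value, and the injectable `resolver` and `clock` parameters are all hypothetical. The idea is that the client keeps the server's name, resolves it itself, and re-resolves when the cached answer is stale or after a network change. The helper `ipv4_to_key` shows the kind of single 32-bit value a dotted-quad address packs into, which has room for exactly one address and none for a name.

```python
import socket
import struct
import time


def ipv4_to_key(address):
    """Pack a dotted-quad IPv4 address into one unsigned 32-bit integer."""
    return struct.unpack("!I", socket.inet_aton(address))[0]


def system_resolver(hostname):
    """Resolve hostname to a list of IPv4 addresses using the local resolver."""
    infos = socket.getaddrinfo(hostname, None, socket.AF_INET, socket.SOCK_STREAM)
    return sorted({info[4][0] for info in infos})


class ServerEntry:
    """Hypothetical client-side record keyed by server name, not by address."""

    def __init__(self, fqdn, resolver=system_resolver, ttl=300.0,
                 clock=time.monotonic):
        self.fqdn = fqdn
        self._resolver = resolver
        self._ttl = ttl        # assumed lifetime of a cached answer, in seconds
        self._clock = clock
        self._addresses = []
        self._expires = None   # None means nothing is cached yet

    def addresses(self):
        """Return the cached addresses, re-resolving once they are stale."""
        now = self._clock()
        if self._expires is None or now >= self._expires:
            self._addresses = list(self._resolver(self.fqdn))
            self._expires = now + self._ttl
        return self._addresses

    def network_changed(self):
        """Drop the cache, for example after moving to another network."""
        self._expires = None
```

Because each client resolves the name for itself, a client on the server machine may end up with 127.0.0.1 while a remote client gets the public address, and neither answer is handed to the other.
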