On Tue, Sep 07, 2004 at 04:26:35PM +0200, Markus Wiesecke wrote:
> I am running a codaserver which I have inherited from a colleague leaving
> the group. Thus I am not very experienced with this stuff.
>
> From time to time, the codasrv crashes, and I do not see any reason for
> it, for example tonight. The CodaSrv-Process was still running, but no
> longer answering queries.
> The LogFile from the time reads as follows (sorry if I am pasting too
> much, but I do not want to miss the point):

I don't really see any indication that it crashed. The fact that you
could shut it down with the init script means that it was still accepting
connections, as we use 'volutil shutdown', which sends a shutdown RPC
command and not a signal.

I'm not entirely sure why it is getting NAK messages from the clients and
why there is no indication that clients are trying to rebind to the
server.

> 18:37:22 Callback failed RPC2_NAKED (F) for ws 129.70.139.98:32811
> 18:37:23 Callback failed RPC2_NAKED (F) for ws 129.70.139.45:2430
> 18:57:24 Callback failed RPC2_NAKED (F) for ws 129.70.139.61:32771
> 00:01:25 Callback failed RPC2_NAKED (F) for ws 129.70.139.44:2430
> 00:01:27 Callback failed RPC2_NAKED (F) for ws 129.70.139.164:2430
> 00:01:27 Callback failed RPC2_NAKED (F) for ws 129.70.139.75:2430
> 00:01:27 Callback failed RPC2_NAKED (F) for ws 129.70.139.166:2430

> Can you see any reason for the crash? The IP 129.70.138.34 belongs to a
> laptop, which I suppose was shut down at the time the RPC2_DEAD
> appeared in the logs - may this be a reason for a crash?

RPC2_DEAD simply means that we were unable to reach the client, so that
would coincide with it being a laptop that is taken offline. The NAK
messages indicate that the client doesn't think the connection is active
anymore, so they are more worrisome. It almost looks like the server is
still able to talk to the clients, but the clients are unable to connect
back.
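To see at a glance which clients are producing the NAK responses and which are simply unreachable, the callback failures can be tallied per client and per error code. The following is only a rough sketch: the line pattern is assumed from the excerpt quoted above, the function name `summarize_callback_failures` is made up for illustration, and real server logs may contain variations this does not handle.

```python
import re
from collections import Counter

# Pattern assumed from the quoted log excerpt:
#   HH:MM:SS Callback failed RPC2_<CODE> (F) for ws <ip>:<port>
CALLBACK_FAILED = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}) Callback failed (?P<code>RPC2_\w+) "
    r"\(\w+\) for ws (?P<ip>\d+\.\d+\.\d+\.\d+):(?P<port>\d+)"
)


def summarize_callback_failures(lines):
    """Count callback failures per (error code, client IP).

    Lines that do not match the assumed pattern are ignored.
    """
    counts = Counter()
    for line in lines:
        match = CALLBACK_FAILED.match(line.strip())
        if match:
            counts[(match.group("code"), match.group("ip"))] += 1
    return counts


# Three of the lines quoted in this thread, used as sample input.
SAMPLE = [
    "18:37:22 Callback failed RPC2_NAKED (F) for ws 129.70.139.98:32811",
    "18:37:23 Callback failed RPC2_NAKED (F) for ws 129.70.139.45:2430",
    "18:57:24 Callback failed RPC2_NAKED (F) for ws 129.70.139.61:32771",
]

if __name__ == "__main__":
    for (code, ip), n in sorted(summarize_callback_failures(SAMPLE).items()):
        print(f"{code:12} {ip:16} {n}")
```

Feeding it the lines of the server log instead of SAMPLE would show whether the RPC2_NAKED failures are spread across all clients or concentrated on a few, which is a useful first distinction from the RPC2_DEAD case of a single machine going offline.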
Since none of the network-related state is stored persistently, a restart
should fix most if not all of the problems. I just wonder what the
problem is. Is there anything in /vice/srv/SrvErr?

Jan

Received on 2004-09-07 16:33:07