On Fri, Aug 13, 2004 at 02:03:56PM -0700, Steve Simitzis wrote:
> the reliability of what aspect?
>
> On 08/13/04, Jan Harkes <jaharkes_at_cs.cmu.edu> wrote:
> >
> > I'm wondering if there are some flags to relax the rpc2 timeouts to a
> > minute or more (instead of the current 15 seconds). That should add a
> > bit to the overall reliability.

The "connection timed out" problems you mentioned earlier. If we
increase the timeout, RPC2 will be more patient and not give up when it
doesn't receive a response from the server within 15 seconds.

Of course this also means that it takes longer before we realize that a
request or reply packet was lost and that we need to retransmit. This
can be compensated for by increasing the number of retries over the
timeout period. However, increasing the number of retries won't help
when the packet was simply delayed because the server was busy, and if
it was really lost due to network congestion we're likely to only make
the congestion worse and end up losing even more packets.

So it is really a double-edged sword: increasing the timeout makes us
more resilient, but increases the delay observed by users when there is
packet loss or when a server has crashed. Increasing the number of
retries helps if we lost packets, but if the loss was due to congestion
we're only adding to the problem.

Jan

Received on 2004-08-13 18:43:28
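
To put rough numbers on the trade-off described above, here is a minimal sketch in Python. It is a toy model and not RPC2's actual retransmission logic: it assumes retransmissions are spaced evenly across the timeout window, and the function names and the retry count of 4 are made up for illustration.

```python
def retry_schedule(timeout_s, retries):
    """Send times in seconds for the first packet plus `retries` retransmissions.

    Assumption: sends are spaced evenly over the timeout window. This is a
    simplified model for illustration, not how RPC2 schedules retries.
    """
    if timeout_s <= 0 or retries < 0:
        raise ValueError("timeout must be positive and retries non-negative")
    interval = timeout_s / (retries + 1)
    return [i * interval for i in range(retries + 1)]


def summarize(timeout_s, retries):
    """Summarize what a given timeout/retry setting costs in this model."""
    sends = retry_schedule(timeout_s, retries)
    return {
        # How long the user waits before a dead server is reported.
        "give_up_after_s": timeout_s,
        # How long a lost packet goes unnoticed before the next retransmit.
        "loss_detected_after_s": timeout_s / (retries + 1),
        # Packets put on the wire if the server never answers in time.
        "packets_sent_if_server_silent": len(sends),
    }


if __name__ == "__main__":
    # Hypothetical settings: (timeout in seconds, number of retries).
    for timeout_s, retries in [(15, 4), (60, 4), (60, 19)]:
        print(timeout_s, retries, summarize(timeout_s, retries))
```

In this model, 15 seconds with 4 retries notices a lost packet after 3 seconds. Stretching the timeout to 60 seconds with the same 4 retries pushes that to 12 seconds, and getting back to 3 seconds takes 19 retries, which means 20 packets sent at a server that may simply be busy or sitting behind a congested link.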