On Wed, Jun 06, 2001 at 06:37:57AM -0700, Ed Kuo wrote:
>
> Hello
>
> I have encounter a problem of similar situation with
> "Big Server" mail in codadev mail list.
>
> After error in making mozilla.....
> ...
> [chris@cluster1 /coda]# mkdir 12345
> mkdir: cannot create directory `12345': No space left
> on device

Totally different error, but you're on the right track.

> [chris@cluster1 /coda]# volutil setlogparms 0x1000002
> reson 4 logsize 16384
> V_BindToServer: binding to host cluster1

Correct, the server ran out of resolution log entries. These are still
used by singly replicated volumes. But they are not thrown out when the
COP2 message is missing (second phase of the 2-phase commit), and there
is never a reason for resolution, so they tend to hang around forever.

I've tried to add a hack to createvol_rep that disables resolution for
newly created singly replicated volumes. It basically tries to do
'volutil setlogparms <newvolume> reson 0', but that doesn't seem to
work.

The volutil command might have failed because the server died when it
ran out of reslog entries and wasn't restarted.

> [chris@cluster1 /coda]#
> (There are totally about 4000 directories under
> mozilla tree)

A tree is no problem; it is the 4000-7000 files in a single directory
that Coda doesn't handle. The low limit on the number of directory
entries is only really a problem in a few cases (my maildir format
email directories, or Greg's RFC mirror).

The current directory format isn't that useful for directories with
many entries anyway. Coda uses a simple 128-bucket hash for directory
lookups.
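
To put rough numbers on the cost of a fixed bucket count, here is a
minimal Python sketch. It only models the arithmetic of a 128-bucket
table; the function names are made up for illustration, and crc32 is a
stand-in hash, not whatever Coda's directory code actually uses.

```python
import zlib

NUM_BUCKETS = 128  # bucket count of the directory hash described above


def average_chain_length(num_entries, num_buckets=NUM_BUCKETS):
    """Mean number of entries per bucket, assuming an even spread."""
    return num_entries / num_buckets


def simulate_chain_lengths(names, num_buckets=NUM_BUCKETS):
    """Count how many names land in each bucket.

    crc32 is only a stand-in hash for this illustration; it is NOT
    Coda's real directory hash function.
    """
    chains = [0] * num_buckets
    for name in names:
        bucket = zlib.crc32(name.encode("utf-8")) % num_buckets
        chains[bucket] += 1
    return chains


if __name__ == "__main__":
    # Average chain length grows linearly with the directory size.
    for n in (1000, 4000, 7000):
        print(n, "entries ->", round(average_chain_length(n), 1), "per chain")

    # Hypothetical maildir-style file names, 7000 of them in one directory.
    chains = simulate_chain_lengths(f"msg{i:05d}" for i in range(7000))
    print("shortest chain:", min(chains), "longest chain:", max(chains))
```

Since the bucket count never changes, every lookup walks a chain whose
length grows in step with the number of entries in the directory.
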
With +/- 7000 entries, every hash chain has an average length of about
54 entries, so IMHO lookup performance is already starting to become
pretty bad around this point.

> My coda configuration is:
> cluster1:SCM, cluster3:non-SCM
> Only root volume:coda.root is provided, /vice:300M /vicepa:10G
> created by: "createvol_rep coda.root E0000100 /vicepa"
> after codasrvs on both cluster1 and cluster3 started up
> and /vice/db/servers and /vice/db/VSGDB modified.

Hmm, if this is really a replicated volume, there must have been some
network flakiness that kept the servers out of sync. The crashed server
has over 4000 operations for which it didn't know whether they reached
the second machine. And clients didn't detect any differences between
the replicas, because that would have triggered resolution, which would
have truncated the resolution logs.

> (originally it was RvmLog/RvmData:30M/315M before
> enlarged trying to solve the no-space-left error)

The RVM log really doesn't have to be that large; our servers typically
run with a log of between 2MB and 6MB. The log is only used to record
ongoing transactions, and the servers tend to apply logged
modifications to the data segment pretty often.

> I wonder if there is anything not set well. Any
> suggestion? Or it is some limitation problem?
>
> Helps are greatly appreciated.

First thing would be to extend the resolution log size like you were
trying to do using 'volutil setlogparms reson 4 logsize 16384'.

Then, on a client, run 'cfs cs ; cfs strong ; ls -lR /coda'. This
should trigger resolution for the parts of the tree that are out of
sync between the servers.

On Wed, Jun 06, 2001 at 06:37:57AM -0700, Ed Kuo wrote:
> Chris

Identity crisis? ;)

Jan

Received on 2001-06-06 10:02:25