Coda File System

okay, what am I doing wrong?

From: Rod Van Meter <Rod.VanMeter_at_nokia.com>
Date: 08 Jan 2003 16:34:23 -0800

I have two clients (my laptop and my desktop) and one server, all
running 5.3.20 on Red Hat 8.0 (WITHOUT my IPv6 code :-).  Right now, I'm
in a bad state: I can create files on my laptop that never get
reintegrated properly, and my laptop also doesn't see files created on
my desktop.

The files from the desktop are getting correctly pushed to the server
(they appear in /vicepa), but the ones from the laptop don't.

The laptop moves around from network to network, but even being rebooted
while connected to the same network as the other two doesn't help.

When I did clog rdv, I got:

[root@localhost rdv]# 16:10:37 Checkpointing developers:rdv
16:10:37 to /usr/coda/spool/500/developers_rdv@_coda_rdv.tar
16:10:37 and /usr/coda/spool/500/developers_rdv@_coda_rdv.cml
16:10:37 Reintegrate: developers:rdv, 100/114 records, result = Unknown error 198
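
When hunting through a long console capture, it can help to pull the Reintegrate lines apart mechanically. The following is only a minimal sketch: the regular expression is derived from the single line shown above, not from any documented Venus output format, and the function name `parse_reintegrate` and the field names are made up for illustration. It does not interpret what the two record counts mean.

```python
import re

# Pattern inferred from the one console line quoted above (an assumption,
# not a documented format).
REINT_RE = re.compile(
    r"Reintegrate: (?P<volume>[^,\s]+), (?P<first>\d+)/(?P<second>\d+) records, "
    r"result = (?P<result>.+)"
)

def parse_reintegrate(line):
    """Return the volume, the two record counts and the result text, or None."""
    m = REINT_RE.search(line)
    if m is None:
        return None
    return {
        "volume": m["volume"],
        "counts": (int(m["first"]), int(m["second"])),
        "result": m["result"].strip(),
    }

line = ("16:10:37 Reintegrate: developers:rdv, 100/114 records, "
        "result = Unknown error 198")
print(parse_reintegrate(line))
```

Running every line of a capture through this and keeping the non-None results gives a quick list of which volumes failed and with what result string.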

=====

[root@localhost rdv]# cfs lv /coda
  Status of volume 0x7f000000 (2130706432) named "codaroot"
  Volume type is ReadWrite
  Connection State is Connected
  Minimum quota is 0, maximum quota is unlimited
  Current blocks used are 3
  The partition has 22625968 blocks available out of 22626568
  Write-back is disabled

[root@localhost rdv]# cfs cs
Contacting servers .....
All servers up

=====

The tail of /usr/coda/etc/venus.log on my laptop:

        Rename : sid = (ba26f17.562), time = 1041955171, uid = 500 tid = -1 bytes = 265
                pred = (0, 0), succ = (0, 0)
                to_be_repaired = 0
                repair_mutation = 0
                frozen = 0, cancel = 0, failed = 0, committed = 0
                spfid = (0x7f000001.0x3.0x42), sname = (labbook-200301.txt)
                tpfid = (0x7f000001.0x3.0x42), tname = (labbook-200301.txt~)
                sfid = (0x7f000001.0xfffffffe.0x800b8)
                spvv = [ 0 0 0 0 0 0 0 0 ] [ 0 0 ] [ 0 ]
                tpvv = [ 0 0 0 0 0 0 0 0 ] [ 0 0 ] [ 0 ]
                svv = [ 0 0 0 0 0 0 0 0 ] [ 0 0 ] [ 0 ]
        Create : sid = (ba26f17.563), time = 1041955171, uid = 500 tid = -1 bytes = 246
                pred = (0, 0), succ = (0, 0)
                to_be_repaired = 0
                repair_mutation = 0
                frozen = 0, cancel = 0, failed = 0, committed = 0
                pfid = (0x7f000001.0x3.0x42), name = (labbook-200301.txt)
                cfid = (0x7f000001.0xfffffffe.0x800ba), mode = 664
                pvv = [ 0 0 0 0 0 0 0 0 ] [ 0 0 ] [ 0 ]
        Store : sid = (ba26f17.650), time = 1042067193, uid = 500 tid = -1 bytes = 228
                pred = (0, 0), succ = (0, 0)
                to_be_repaired = 0
                repair_mutation = 0
                frozen = 0, cancel = 0, failed = 0, committed = 0
                fid = (0x7f000001.0xfffffffe.0x800ba), length = 3391
                vv = [ 0 0 0 0 0 0 0 0 ] [ 0 0 ] [ 0 ]
                rhandle = (0,0,0)       ph = 0.0.0.0 (-1)
        Create : sid = (ba26f17.651), time = 1042068719, uid = 500 tid = -1 bytes = 234
                pred = (0, 0), succ = (0, 0)
                to_be_repaired = 0
                repair_mutation = 0
                frozen = 0, cancel = 0, failed = 0, committed = 0
                pfid = (0x7f000001.0x1.0x1), name = (delme2)
                cfid = (0x7f000001.0xfffffffe.0x800d9), mode = 664
                pvv = [ 0 0 0 0 0 0 0 0 ] [ 0 0 ] [ 0 ]
        Store : sid = (ba26f17.654), time = 1042069566, uid = 500 tid = -1 bytes = 228
                pred = (0, 0), succ = (0, 0)
                to_be_repaired = 0
                repair_mutation = 0
                frozen = 0, cancel = 0, failed = 0, committed = 0
                fid = (0x7f000001.0xfffffffe.0x800d9), length = 54
                vv = [ 0 0 0 0 0 0 0 0 ] [ 0 0 ] [ 0 ]
                rhandle = (0,0,0)       ph = 0.0.0.0 (-1)
[ I(23) : 0000 : 16:10:37 ] IncReintegrate: (developers:rdv,-106) result = Unknown error 198, elapsed = 514.1 (80.8, 46.9, 386.4)
[ I(23) : 0000 : 16:10:37 ]     new stats = [  49,  10.9,  2885.5,  51,  12.0], [  52,  11.6,    33.5,   60,  16.0]

[ W(19) : 0000 : 16:11:07 ] Cachefile::SetLength 1536
[ W(19) : 0000 : 16:11:07 ] fsobj::StatusEq: ((0x7f000001.0x54.0x2b)), Owner 500 != 82191

[ T(01) : 0124 : 16:11:20 ] BeginRvmFlush (1, 37356, T)
[ T(01) : 0124 : 16:11:20 ] EndRvmFlush

[ W(19) : 0000 : 16:11:29 ] Cachefile::SetLength 1536
[ W(19) : 0000 : 16:11:29 ] repvol::LogRemove: record cancelled, labbook-200301.txt~, size = 247
[ W(19) : 0000 : 16:11:29 ] Cachefile::SetLength 3690
[ W(19) : 0000 : 16:11:29 ] repvol::LogRemove: record cancelled, .#labbook-200301.txt, size = 248
[ W(19) : 0000 : 16:11:48 ] Cachefile::SetLength 4636
[ W(19) : 0000 : 16:11:48 ] repvol::LogRemove: record cancelled, .#labbook-200301.txt, size = 248

[ T(01) : 0131 : 16:12:13 ] BeginRvmFlush (1, 28436, T)
[ T(01) : 0131 : 16:12:13 ] EndRvmFlush

[ T(01) : 0135 : 16:12:48 ] BeginRvmTruncate (207, 66028, I)
[ T(01) : 0135 : 16:12:48 ] EndRvmTruncate

[ H(06) : 0003 : 16:13:13 ] HDBDaemon just woke up
[ H(06) : 0003 : 16:13:13 ] DataWalk:  Restarting Iterator!!!!  Reset availability status information.
[ H(06) : 0003 : 16:13:13 ] Tally for vuid=0:
[ H(06) : 0003 : 16:13:13 ] BeginRvmFlush (1, 160, F)
[ H(06) : 0003 : 16:13:13 ] EndRvmFlush
[ H(06) : 0003 : 16:13:13 ] Tally for vuid=0:
[ H(06) : 0003 : 16:13:13 ]     Priority=600: Available=516 Unavailable=0 TotalSize=516 Unknown=0

[ H(06) : 0004 : 16:13:13 ] HDBDaemon about to sleep on hdbdaemon_sync

[ T(01) : 0138 : 16:13:18 ] BeginRvmTruncate (4, 320, I)

[ V(04) : 0139 : 16:13:18 ] repvol::CheckLocalSubtree: (developers:rdv)reset has_local_subtree flag!

[ T(01) : 0138 : 16:13:18 ] EndRvmTruncate
[ T(01) : 0138 : 16:13:18 ] BeginRvmFlush (1, 60, T)
[ T(01) : 0138 : 16:13:18 ] EndRvmFlush

[ T(01) : 0139 : 16:13:23 ] BeginRvmTruncate (0, 220, I)
[ T(01) : 0139 : 16:13:23 ] EndRvmTruncate
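
The `time =` values in the CML records above look like Unix epoch seconds (they fall in early January 2003, matching this message), so converting them shows how long each record has been waiting to be reintegrated. This is a minimal sketch under that assumption; the regular expression is inferred from the excerpt only, and `cml_record_times` is a hypothetical helper name, not part of any Coda tool.

```python
import re
from datetime import datetime, timezone

# Header of a CML record as it appears in the excerpt above, e.g.
# "Rename : sid = (ba26f17.562), time = 1041955171, uid = 500 ..."
# (pattern inferred from this log only; treat it as an assumption).
RECORD_RE = re.compile(
    r"(\w+)\s*:\s*sid = \(([0-9a-fA-F.]+)\), time = (\d+)"
)

def cml_record_times(log_text):
    """Return (operation, sid, UTC datetime) for each record header found."""
    records = []
    for op, sid, epoch in RECORD_RE.findall(log_text):
        # Assumes the value is seconds since the Unix epoch.
        when = datetime.fromtimestamp(int(epoch), tz=timezone.utc)
        records.append((op, sid, when))
    return records

sample = """\
        Rename : sid = (ba26f17.562), time = 1041955171, uid = 500
        Store : sid = (ba26f17.650), time = 1042067193, uid = 500
"""

for op, sid, when in cml_record_times(sample):
    print(f"{op:<8} {sid:<14} {when:%Y-%m-%d %H:%M:%S} UTC")
```

Feeding the whole of /usr/coda/etc/venus.log through the same function and sorting by time would show the oldest record still pending.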

======
And then later, on the console:

[root@localhost rdv]# 16:23:27 volume developers:rdv has unrepaired local subtree(s), skip checkpointing CML!

This is despite the fact that, as far as I can tell, there should be no
conflicts. Here is what repair says:

[rdv@localhost rdv]$ repair
This repair tool can be used to manually repair server/server
or local/global conflicts on files and directories.
You will first need to do a "beginrepair" to start a repair
session where messages about the nature of the conflict and
the commands that should be used to repair the conflict will
be displayed. Help message on individual commands can also be
obtained by using the "help" facility. Finally, you can use the
"endrepair" or "quit" to terminate the current repair session.
repair > beginrepair
Pathname of object in conflict? []: /coda/rdv
Could not allocate new repvol: Object not in conflict
beginrepair failed.
repair > beginrepair
Pathname of object in conflict? []: /coda/rdv/nokia
Could not allocate new repvol: Object not in conflict
beginrepair failed.
repair > quit

=====

Advice on how to go about debugging this?  What should I be looking for,
and where should I look for it?

		--Rod
Received on 2003-01-08 19:44:41