Hi all,

For a few weeks now, we have repeatedly had problems with one Coda client that doesn't seem to push its updates to the server. We have monitoring on every client and get a call when the number of CML entries goes over 25. I've found what I think is a local/global conflict. I'll just post some info, as I'm not sure what you need to be able to point me in the right direction.

We have two servers and currently about 8 clients. The problem client is called cmp06. The volume with the conflict is named cmpprod.

This has happened before. The last two times, we resorted to stopping all apps using files in /coda, stopping venus, uninstalling venus, running "rm -rf /var/log/coda /var/lib/coda /var/cache/coda" and then reinstalling venus from scratch. This worked for a while: modifications were correctly pushed to the servers and showed up on other clients.

Output of commands run on cmp06:

    root@cmp06:/# ctokens
    Tokens held by the Cache Manager for root:
    @nkh.spup.net
    Coda user id: 10001
    Expiration time: Sat Apr 2 21:37:02 2011

    root@cmp06:/# cfs cs
    Contacting servers .....
    All servers up

    root@cmp06:/# cfs lv /coda/nkh.spup.net/cmpprod
    Status of volume 7f000004 (2130706436) named "cmpprod"
    Volume type is ReadWrite
    Connection State is Reachable
    Reintegration age: 0 sec, time 15.000 sec
    Minimum quota is 0, maximum quota is unlimited
    Current blocks used are 2965098
    The partition has 7823104 blocks available out of 11756312
    *** There are pending conflicts in this volume ***
    There are 30 CML entries pending for reintegration (3617288 bytes)

The command

    cfs listlocal /coda/nkh.spup.net/cmpprod

never returns and gives no output at all (I waited a little over 30 minutes).

The directory containing the conflict shows:

    root@cmp06:/coda/nkh.spup.net/cmpprod/voicemail/company8184892/217684# ls -alFh 20110401-130733-31546631891-1301656053.482641-386.wav
    lrw-r--r-- 1 root nogroup 29 Apr 1 20:41 20110401-130733-31546631891-1301656053.482641-386.wav -> @7f000004.000035ce.00002610@n

The client has coda-client 6.9.5 installed from your Debian package; the servers have the coda-server and coda-update Debian packages, version 6.9.4.

The /var/log/coda/venus.log is filled with entries like these:

    [ W(177) : 0000 : 21:08:54 ] WAIT OVER, elapsed = 5005.9
    [ W(177) : 0000 : 21:08:54 ] WAITING(VOL): cmpprod, state = Reachable, [0, 0], counts = [0 0 5 0]
    [ W(177) : 0000 : 21:08:54 ] CML= [30, 103], Res = 0
    [ W(177) : 0000 : 21:08:54 ] WAITING(VOL): shrd_count = 0, excl_count = 0, excl_pgid = 0

And the /var/log/coda/venus.err contains:

    21:00:02 volume cmpprod has unrepaired local subtree(s), skip checkpointing CML!
    21:02:27 DispatchWorker: signal received (seq = 654736)
    21:10:02 volume cmpprod has unrepaired local subtree(s), skip checkpointing CML!

So I ran repair, with the following transcript:

    root@cmp06:/coda/nkh.spup.net/cmpprod/voicemail/company8184892/217684# repair
    This repair tool ... <cropped> ... the current repair session.
    repair > beginrepair
    Pathname of object in conflict? []: /coda/nkh.spup.net/cmpprod/voicemail/company8184892/217684/20110401-130733-31546631891-1301656053.482641-386.wav

It does not give any result; I have already waited over 10 minutes. The directory listing doesn't show any expanded replicas, only the broken symlink. The other clients all show the above-mentioned file with a size of 0 bytes.

I'm not sure whether this is too much, too little or "sufficient" debug info. If anyone needs more info, please let me know so I can provide it.

Thank you very much in advance for your effort.

Kind regards,

Simon de Hartog
Special Technical Services
SpeakUp B.V.
http://www.speakup.nl/

Received on 2011-04-01 15:48:33
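For anyone wanting to reproduce the kind of check mentioned in the post (alerting when the CML grows past 25 entries), here is a minimal sketch in Python. It only parses text in the `cfs lv` format quoted above; it is not the monitoring actually in use on these clients. The function names, the threshold constant and the choice to treat a missing CML line as zero pending entries are all assumptions made for illustration.

```python
import re

# Threshold from the post: a call goes out above 25 CML entries.
# The constant name is hypothetical.
CML_ALERT_THRESHOLD = 25

# Matches the line format seen in the quoted `cfs lv` output, e.g.
# "There are 30 CML entries pending for reintegration (3617288 bytes)".
_CML_RE = re.compile(
    r"There are (\d+) CML entries pending for reintegration \((\d+) bytes\)"
)

# Sample text copied from the `cfs lv` output in the post.
SAMPLE = """\
Status of volume 7f000004 (2130706436) named "cmpprod"
Volume type is ReadWrite
Connection State is Reachable
*** There are pending conflicts in this volume ***
There are 30 CML entries pending for reintegration (3617288 bytes)
"""


def parse_cml(cfs_lv_output):
    """Return (entries, bytes) from `cfs lv` text.

    Assumption: if the CML line is absent, nothing is pending, so (0, 0).
    """
    match = _CML_RE.search(cfs_lv_output)
    if match is None:
        return (0, 0)
    return (int(match.group(1)), int(match.group(2)))


def has_conflicts(cfs_lv_output):
    """True if the output carries the pending-conflicts banner."""
    return "There are pending conflicts in this volume" in cfs_lv_output


def should_alert(cfs_lv_output, threshold=CML_ALERT_THRESHOLD):
    """Alert on a CML backlog above the threshold or on any pending conflict."""
    entries, _ = parse_cml(cfs_lv_output)
    return entries > threshold or has_conflicts(cfs_lv_output)
```

In practice the text would come from running `cfs lv` on the volume path (for example via `subprocess.run`) rather than from a hard-coded sample. Alerting on the conflict banner as well as on the entry count is a design choice of this sketch: a conflict like the one described blocks reintegration, so the backlog may stay under the threshold for a while even though nothing is being pushed.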