On 04/02/2011 02:15 PM, Simon de Hartog wrote:
> we are quite worried about dropping data, but if the current situation
> only becomes worse, we choose to discard some data indeed.

There are periodic CML checkpoint files in /var/lib/coda/spool that
typically can be used to recover files that were not yet reintegrated.
There is a user-specific directory which contains per-volume .cml and
.cpio files. The .cpio contains the file data, the .cml lists the
operations. I typically just grab a copy of the .cpio and extract it so
that at least newly created files are recovered.

Of course the checkpoint is only made when the volume doesn't have a
write lock, so in your failure case it might not have been updated.
Normally you can also force checkpointing with

    cfs ck /coda/path/to/vol

> voicemails. What happens is that a file is created to store the .wav
> info and then after it has been closed it is quickly renamed so that the
> filename contains the length of the voicemail in seconds.
>
> Could this be an issue that not always goes smoothly, i.e., creating a
> file and renaming it before it is created and reintegrated on the
> servers? I can't really imagine it would be a problem, but I'm not
> familiar with Coda sources.

Rename is very reliable when both the source and destination are in the
same directory, and it is still pretty good when the destination
directory is a child of the source directory. There are some situations
where a rename to a sibling or parent directory can be problematic.

This is because resolution works per directory. When both the source
and target are in the same directory, everything can resolve in a single
operation. If they are in different directories and we fail to resolve
the target directory first, we use the resolution logs to find and try
to resolve the source directory, but there is a limit to how deeply we
recurse when resolution fails.
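To make the three rename cases above concrete, here is a minimal sketch in Python that sorts a rename into one of them by comparing the directories of the source and destination paths. It is only an illustration of the categories described in this message, not anything taken from the Coda sources: the function name classify_rename, the category labels, and the example paths are all made up for this sketch.

```python
from pathlib import PurePosixPath


def classify_rename(src: str, dst: str) -> str:
    """Classify a rename by how the destination directory relates to the
    source directory.

    Hypothetical helper for illustration only: it compares path strings
    and knows nothing about Coda volumes, replicas, or resolution logs.
    """
    src_dir = PurePosixPath(src).parent
    dst_dir = PurePosixPath(dst).parent

    if dst_dir == src_dir:
        # Both names live in one directory, so a single per-directory
        # resolution covers the whole operation.
        return "same-directory"
    if dst_dir.parent == src_dir:
        # Destination directory is a direct child of the source directory.
        return "child-directory"
    # Sibling, parent, and anything else. Deeper descendants also land
    # here because the message above does not say how they behave.
    return "other-directory"


if __name__ == "__main__":
    # Made-up paths modelled on the voicemail case: the file is renamed
    # in place so that its name carries the length in seconds.
    print(classify_rename("/coda/vm/msg0001.wav", "/coda/vm/msg0001-42s.wav"))
```

By this classification, the voicemail rename described in the quoted question stays within one directory, which is the case described above as very reliable.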
Then when the automatic resolution fails and we are left with a
server-server conflict, we have to manually repair the source first and
the default suggested repair option (to recreate missing files on all
replicas) actually won't work because the file is technically not
missing but located in a different directory.

Jan

Received on 2011-04-03 02:06:58