Coda File System

Serious Coda bugs

From: Jan Harkes <>
Date: Thu, 30 Dec 1999 23:57:44 -0500
Today I discovered 2 serious bugs. One has only been in the development
tree for a couple of weeks. The other must have been present for the
past 6-8 months.

The `development' bug affects only store operations to replicated
servers. At some point we switched the file transfer code from
FILEBYNAME to FILEBYFD, but the code in sftp for the BYFD case has
possibly had a bug ever since multi-rpcs were introduced. It didn't
handle the fact that the fileoffset is shared between the multiple
transfers. So the first server would get the first window, the second
server the second window, and so on until the whole file had been read,
at which point the servers complain about a length discrepancy and
return an error. As a result, the client switches to write-disconnected
operation and reintegrates the store. The backfetches are normal rpcs
and the reintegration succeeds.
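
To illustrate the shared-offset problem, here is a minimal Python sketch. It is not the sftp code: the function names, the 4-byte window, and the two-server setup are all made up for the example, and `os.pread` assumes a Unix-like system. The first function reads through one descriptor for every server, so the windows get dealt out round-robin; the second keeps a private offset per server.

```python
import os
import tempfile

WINDOW = 4  # hypothetical window size in bytes, tiny for illustration


def send_shared_offset(fd, n_servers, window=WINDOW):
    """Every server's transfer reads via read() on the same descriptor,
    so all transfers advance one shared file offset."""
    received = [b""] * n_servers
    os.lseek(fd, 0, os.SEEK_SET)
    done = False
    while not done:
        for i in range(n_servers):
            chunk = os.read(fd, window)  # moves the shared offset
            if not chunk:
                done = True
                break
            received[i] += chunk
    return received


def send_private_offsets(fd, n_servers, window=WINDOW):
    """Each server tracks its own offset and uses pread(), which does
    not touch the descriptor's shared offset."""
    offsets = [0] * n_servers
    received = [b""] * n_servers
    active = True
    while active:
        active = False
        for i in range(n_servers):
            chunk = os.pread(fd, window, offsets[i])
            if chunk:
                offsets[i] += len(chunk)
                received[i] += chunk
                active = True
    return received


def demo():
    data = b"AAAABBBBCCCCDDDD"
    with tempfile.TemporaryFile() as f:
        f.write(data)
        f.flush()
        fd = f.fileno()
        shared = send_shared_offset(fd, 2)
        private = send_private_offsets(fd, 2)
    return data, shared, private
```

Calling `demo()` shows each simulated server ending up with only half of the file in the shared case, which is the kind of length discrepancy described above, while the per-server offsets deliver the full file to both.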

The other bug is related to shadowing files during reintegration. When a
file is opened for writing but involved in a reintegration, a new
container is created and the data is copied over. Then a swap is
performed to avoid interfering with a possibly ongoing backfetch. This
swap is not recorded in RVM, so the next time venus starts, it attaches
the fso to the discarded shadow file. If the fso was marked dirty, an
assertion is triggered; otherwise the file is refetched.

Also, due to how shadow file names are created, if a file is shadowed
twice, the containerfile is truncated during the copy. Instant data loss.
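
One way such a truncation can arise is sketched below in Python. This is an assumption, not a description of the venus code: it supposes the shadow name is derived deterministically from the file's identity, so that after the first swap the second shadow copy uses the same path as both source and destination. The helper names and the `.shadow` suffix are hypothetical.

```python
import os
import tempfile


def naive_copy(src, dst):
    # Opening dst with "wb" truncates it before src is read.
    with open(dst, "wb") as out, open(src, "rb") as inp:
        out.write(inp.read())


def shadow_once(live, workdir, fso_id):
    """Copy the live container to the shadow path and swap roles.
    Returns the path that is live after the swap."""
    # Hypothetical naming scheme: one fixed shadow name per file.
    target = os.path.join(workdir, fso_id + ".shadow")
    naive_copy(live, target)
    return target  # the copy is now live; the old file serves the backfetch


def demo():
    with tempfile.TemporaryDirectory() as workdir:
        live = os.path.join(workdir, "container")
        with open(live, "wb") as f:
            f.write(b"payload")
        live = shadow_once(live, workdir, "fso1")
        size_after_first = os.path.getsize(live)
        # Second shadowing: source and destination are the same path.
        live = shadow_once(live, workdir, "fso1")
        size_after_second = os.path.getsize(live)
    return size_after_first, size_after_second
```

Under these assumptions the first shadowing preserves the 7 bytes of data and the second leaves an empty file, because the destination is truncated before the source is read. Checking `os.path.samefile(src, dst)` before copying, or generating a unique shadow name each time, would avoid it.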

It is a testament to the robustness and failure tolerance of Coda's
design that these problems have gone mostly unnoticed. I've committed
experimental patches to the CVS for both bugs, but have only tested them


ps. There is one known Y2K problem, but it is quite harmless. Look in
your codacon in the new year, it will show 01/01/100 :) (except if you
run bleeding edge with the aforementioned other bugs).
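
For the curious, a date like 01/01/100 is what you get when a years-since-1900 counter (as in C's `struct tm`, where `tm_year` is the year minus 1900) is printed directly as if it were a two-digit year. That this is what codacon does is an assumption; the Python sketch below only reproduces the symptom, with hypothetical function names.

```python
def format_date_buggy(year, month, day):
    # Mimics printing a years-since-1900 field directly.
    years_since_1900 = year - 1900
    return "%02d/%02d/%02d" % (month, day, years_since_1900)


def format_date_fixed(year, month, day):
    # Reduce to two digits explicitly before printing.
    return "%02d/%02d/%02d" % (month, day, year % 100)
```

With these definitions, `format_date_buggy(1999, 12, 31)` gives `12/31/99`, `format_date_buggy(2000, 1, 1)` gives `01/01/100`, and `format_date_fixed(2000, 1, 1)` gives `01/01/00`.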
Received on 1999-12-31 00:03:34