Hello Satya,

On Fri, Mar 23, 2007 at 01:22:41PM -0400, M. Satyanarayanan wrote:
> But seriously, is computing SHA-1 on fetch and each close() too much
> overhead on typical machines of people on codalist?
>      -- Satya

As I perceive it, a hash is (even if currently not mandatory) part of
the file's metainformation, and as such computing it on fetch is not
always necessary. Given the other problems one faces while modifying
bigger files on Coda, the extra hash calculations at store would
certainly not make any noticeable difference, especially if the
calculations are done lazily, just before the store is sent to the
servers, after all other possible optimizations.

---

A different, related matter: the checksums should include a per-file
random IV, so that given the contents one could not predict the
corresponding Coda hash. Otherwise the metainformation indirectly
reveals the file contents (anyone who sees the hash could confirm a
guess at the contents simply by hashing the guess). The IV should be
made available to the client along with the file contents, but not
otherwise. (It would also be necessary for the creation of usable
lookaside data sets.) A sketch of such salted hashing is appended at
the end of this message.

That would sometimes imply a double hash calculation at store: once
with the old IV, to discover that the file has changed, then again
with a new random IV, storing the new IV and the new hash along with
the contents. I think that would be ok.

Note that the first calculation, to verify that the file is intact, is
only necessary with "ill-behaved" applications that open read/write:
with a read-only open Venus knows for sure that the store is
unnecessary, and a simple check for changed size/time would already
cover a huge part of the read/write cases. (The second sketch below
outlines this decision logic.) The calculation of the new hash, on the
other hand, should imho be mandatory. The server could possibly help
old clients by calculating the hash itself during the migration
period...

Of course, there is the further question of whether the server has
some place to store that extra information, and whether the protocol,
the current one or a compatible successor, can handle it.

Regards,
Rune
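
Appendix 1: a minimal sketch of the salted hash idea, in C with
OpenSSL's SHA-1. The coda_* names, the 16-byte IV size, and the
H(IV || contents) prefix construction are my assumptions for
illustration only, not anything in the current Coda code or protocol.
Whether a plain prefix or an HMAC-style construction is preferable is
a separate design choice; the point here is just that the IV enters
the digest before the data does.

/*
 * Sketch: per-file salted SHA-1, assuming H(IV || contents).
 * All names here are illustrative, not Coda's actual API.
 * Build with: cc sketch.c -lcrypto
 */
#include <stdio.h>
#include <openssl/sha.h>
#include <openssl/rand.h>

#define CODA_IV_LEN 16  /* assumed size of the per-file random IV */

/* Hash the IV first, then stream the file contents in chunks so
 * large files never have to fit in memory. Returns 0 on success. */
static int coda_salted_hash(const unsigned char iv[CODA_IV_LEN],
                            FILE *fp,
                            unsigned char out[SHA_DIGEST_LENGTH])
{
    SHA_CTX ctx;
    unsigned char buf[64 * 1024];
    size_t n;

    if (!SHA1_Init(&ctx))
        return -1;
    SHA1_Update(&ctx, iv, CODA_IV_LEN);          /* salt goes in first */
    while ((n = fread(buf, 1, sizeof(buf), fp)) > 0)
        SHA1_Update(&ctx, buf, n);
    if (ferror(fp))
        return -1;
    SHA1_Final(out, &ctx);
    return 0;
}

/* A fresh random IV is drawn whenever new contents are stored, so a
 * published (IV, hash) pair tells an observer nothing unless he
 * already holds both the contents and the IV. */
static int coda_new_iv(unsigned char iv[CODA_IV_LEN])
{
    return RAND_bytes(iv, CODA_IV_LEN) == 1 ? 0 : -1;
}

int main(int argc, char **argv)
{
    unsigned char iv[CODA_IV_LEN], md[SHA_DIGEST_LENGTH];
    FILE *fp;
    int i;

    if (argc != 2 || !(fp = fopen(argv[1], "rb"))) {
        fprintf(stderr, "usage: %s file\n", argv[0]);
        return 1;
    }
    if (coda_new_iv(iv) || coda_salted_hash(iv, fp, md)) {
        fclose(fp);
        return 1;
    }
    fclose(fp);
    for (i = 0; i < SHA_DIGEST_LENGTH; i++)
        printf("%02x", md[i]);
    putchar('\n');
    return 0;
}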
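
Appendix 2: a sketch of the store-time decision described above. The
struct and the venus_* helpers are hypothetical; only the control flow
mirrors the proposal (read-only open: no store; size/time changed:
skip the first hash, the file has certainly changed; otherwise hash
with the old IV to check intactness; on any real change, draw a fresh
IV and hash again for the store).

/*
 * Sketch of the lazy store-time decision. Everything here (the
 * struct, the venus_* helpers) is hypothetical, for illustration.
 */
#include <stdbool.h>
#include <string.h>
#include <sys/types.h>
#include <time.h>
#include <openssl/sha.h>

#define CODA_IV_LEN 16

struct cache_file {
    bool     opened_rw;                     /* false: r/o open only */
    off_t    size, cur_size;                /* recorded vs. current size */
    time_t   mtime, cur_mtime;              /* recorded vs. current mtime */
    unsigned char iv[CODA_IV_LEN];          /* IV under which `hash` was made */
    unsigned char hash[SHA_DIGEST_LENGTH];  /* salted hash known to the server */
};

/* Hypothetical helpers, assumed to exist elsewhere. */
int venus_salted_hash(const struct cache_file *cf,
                      const unsigned char iv[CODA_IV_LEN],
                      unsigned char out[SHA_DIGEST_LENGTH]);
int venus_new_iv(unsigned char iv[CODA_IV_LEN]);
int venus_send_store(struct cache_file *cf);

/* Decide, lazily at close/store time, whether anything must go to
 * the servers, and with which (IV, hash) pair. */
int venus_maybe_store(struct cache_file *cf)
{
    unsigned char md[SHA_DIGEST_LENGTH];
    bool changed;

    /* r/o open: Venus knows for sure the store is unnecessary. */
    if (!cf->opened_rw)
        return 0;

    if (cf->cur_size != cf->size || cf->cur_mtime != cf->mtime) {
        /* Size/time changed: covers a huge part of the r/w cases,
         * the first hash calculation can be skipped entirely. */
        changed = true;
    } else {
        /* An ill-behaved application opened r/w but size/time look
         * untouched: first calculation, with the OLD IV, to verify
         * that the file is intact. */
        if (venus_salted_hash(cf, cf->iv, md) != 0)
            return -1;
        changed = memcmp(md, cf->hash, SHA_DIGEST_LENGTH) != 0;
    }
    if (!changed)
        return 0;                           /* intact, nothing to do */

    /* Second calculation, with a fresh random IV: this (IV, hash)
     * pair is stored alongside the new contents. */
    if (venus_new_iv(cf->iv) != 0 ||
        venus_salted_hash(cf, cf->iv, cf->hash) != 0)
        return -1;
    return venus_send_store(cf);
}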