On Thu, Jun 16, 2005 at 08:12:19AM -0400, Greg Troxel wrote:
> Don't worry - expanding replicas seems way too hairy. I'm going to
> try moving LOG, DATA and vicepa.
>
> On another logical machine (the one whose old hw will be the new
> server), I moved all the venus state with rsync, and found wrong
> contents and had the machine lock up. So I imagine venus has the
> inode #s hardwired and passes those to the kernel which might not
> check well enough (NetBSD 2.99.15). I'll mklka (thanks again Satya
> for doing this) and reinit.

Yes, on *BSD venus passes down a device/inode number pair which is
used to access the container file*. The inode numbers are stored in
RVM, probably to avoid the overhead of calling stat() on the container
file before we pass the information to the kernel.

One way to fix this would be to refresh the RVM cached inode numbers
by replacing the 'inode == tstat.st_ino' test in
CacheFile::ValidContainer with an assignment. The other way would be
to not store the inode numbers in RVM and stat() the container
whenever we return from the CODA_OPEN upcall.

* The Windows kernel modules don't work based on device/inode
  numbers, so they use CODA_OPEN_BY_NAME and we return the name of the
  container file. Linux is different as well: it uses CODA_OPEN_BY_FD
  and we return an open file descriptor. Inode numbers are not
  guaranteed to be unique, or may not exist at all, on filesystems
  that live completely in the pagecache, like ramfs and tmpfs. I think
  there also was a problem with journalling filesystems (reiserfs)
  which didn't actually commit any writes to disk if we directly
  accessed the container file by device/inode number pair, because the
  journal commit was only performed when a filehandle was closed.

Jan

Received on 2005-06-16 10:35:58
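
To make the two suggested fixes in the reply above concrete, here is a rough sketch in Python rather than venus's own C++. It is not Coda code: `Container`, `valid_strict`, `valid_refresh`, `open_reply` and `simulate_rsync_move` are hypothetical names, and the `cached_inode` attribute only stands in for the inode number that the mail says is kept in RVM. The copy step imitates what an rsync of the venus state does to inode numbers.

```python
import os
import shutil
import tempfile


class Container:
    """Hypothetical stand-in for a cache container file record."""

    def __init__(self, path):
        self.path = path
        # Inode number remembered persistently (the role RVM plays in the mail).
        self.cached_inode = os.stat(path).st_ino

    def valid_strict(self):
        # Comparison in the style of 'inode == tstat.st_ino':
        # a stale cached inode makes the container look invalid.
        return self.cached_inode == os.stat(self.path).st_ino

    def valid_refresh(self):
        # First suggested fix: assign the current inode instead of comparing.
        self.cached_inode = os.stat(self.path).st_ino
        return True

    def open_reply(self):
        # Second suggested fix: do not trust any cached value,
        # stat() the container when answering the open request.
        st = os.stat(self.path)
        return (st.st_dev, st.st_ino)


def simulate_rsync_move():
    """Copy a container file and point the record at the copy."""
    workdir = tempfile.mkdtemp()
    try:
        old = os.path.join(workdir, "old_container")
        new = os.path.join(workdir, "new_container")
        with open(old, "wb") as f:
            f.write(b"cached data")

        record = Container(old)
        shutil.copyfile(old, new)  # the copy is a new file with its own inode
        record.path = new

        strict_ok = record.valid_strict()      # cached inode is now stale
        record.valid_refresh()                 # fix 1: refresh the cache
        refreshed_ok = record.valid_strict()
        st = os.stat(new)
        reply_ok = record.open_reply() == (st.st_dev, st.st_ino)  # fix 2
        return strict_ok, refreshed_ok, reply_ok
    finally:
        shutil.rmtree(workdir)
```

Calling `simulate_rsync_move()` shows the strict comparison failing on the copied file while both the refreshed cache and the stat-at-open reply describe the file that is actually there. The second variant costs one stat() per open, which is the overhead the reply guesses the RVM copy was meant to avoid.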