On Mon, Apr 04, 2005 at 09:55:14AM -0600, Patrick Walsh wrote:
> > > Our total storage need in coda will be around 40gb.
> >
> > Then you want to run rvmsizer to check - probably you will be fine with
> > one server process, then use the maximal rvm size available, 1G.
>
> rvmsizer suggests a rvm size of 70mb (and that's with a little cushion
> added by me).  Would you recommend bumping it up to 500mb or 1G anyway?
> Note that these machines have 1GB of RAM and if the RVM must reside in
> memory, then it seems that it ought to be smaller than 1GB (provided the
> number of files and directories is sufficiently small).  Is that the
> correct thinking?  Or is the RVM metadata information no longer
> completely mapped into memory?

It is still completely mapped in memory, but it is typically possible to
fit at least 1GB into the available 4GB of address space. Anything larger
becomes more difficult and requires tricks such as statically linking the
binaries so that shared libraries aren't loaded in the places where we
want RVM, tweaking the base address where RVM is placed so it doesn't
bump into the stack, etc.

By the way, on my servers: one group holds about 42GB of file data and
uses 226MB of RVM, the other holds 36GB of data and uses 158MB of RVM.
The volumes on these servers are a mix, and contain pretty much
everything from the Coda webpages/ftp and public CVS to user home
directories.

> > > * Second: I think I remember reading something about avoiding ext3.
> > >   Is that for the actual files?  Or just for rvm metadata and logs?
> >
> > It is no problem nowadays.
>
> Since I'm creating a partition just for file data for coda, is there a
> best-performing fs type to use?  ext2 or ext3?  Or does it make
> absolutely no difference?

For servers it shouldn't matter all that much, but the ext3 journalling
might make recovery a bit more reliable.
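[For a rough sense of scale, the two server configurations quoted above can be turned into a back-of-the-envelope estimate. This is only an extrapolation from those two data points, not an official sizing rule; rvmsizer remains the authoritative tool:]

```python
# Back-of-the-envelope RVM sizing based on the two server groups
# mentioned above. The ~0.5% ratio is an extrapolation from these
# numbers only, not a documented Coda sizing formula.

servers = [
    ("group 1", 42 * 1024, 226),  # 42GB of file data (in MB), 226MB RVM
    ("group 2", 36 * 1024, 158),  # 36GB of file data (in MB), 158MB RVM
]

for name, data_mb, rvm_mb in servers:
    print(f"{name}: {rvm_mb / data_mb:.2%} RVM per unit of file data")

# Extrapolate to the 40GB deployment discussed in this thread.
avg_ratio = sum(rvm / data for _, data, rvm in servers) / len(servers)
estimate_mb = 40 * 1024 * avg_ratio
print(f"estimated RVM for 40GB: ~{estimate_mb:.0f}MB")
```

[Both servers land around half a percent of RVM per byte of file data, which is consistent with the 70MB figure rvmsizer reported for the poster's data set being well under the 1GB maximum.]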
Basically, during startup the server checks for every file in RVM
whether the corresponding file exists in /vicepa, and typically triggers
an assertion if this isn't the case, mostly to prevent one corrupt
server in a replicated group from spreading the corruption to the other
servers. If the data is still correct on another server, it is often
safer to just destroy the corrupt volume on the crashed server, then
recreate the underlying replica and resolve its contents back from the
other server(s).

> Thanks for the help.  I'm on my third test rollout of coda and I'm
> getting a better handle on what I'm doing.  At some point I'll be back
> here with questions about backing up.  I'm a bit unclear as to why
> standard dump type utilities can't be run on a /coda filesystem.  Also,

Our file identifiers are 128-bit, but the Linux VFS (and probably other
Unixes) only exports 32-bit inode numbers, so we hash the 128-bit
identifiers to mostly unique 32-bit ones. Userspace tools like dump and
tar use the inode number to identify hard-linked files, so any collision
will be interpreted as a hard link, and any file that happens to have
the same 32-bit inode number as a previously backed up file will end up
getting skipped. One solution would be to modify the backup tools to
check whether 'nlink == 1', in which case the second file clearly cannot
be a hard-linked copy of the first one.

> I thought that having multiple replicating servers provided automatic
> backup.  But I'm not ready to tackle these questions fully just yet...

But updates might have reached only one of the replicas; we only detect
version skew between replicas when a client checks the file or directory
attributes, which contain the version vector.
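[The suggested 'nlink == 1' check can be sketched as a toy model of a tar/dump-style scanner. The Entry tuple and plan_backup function are illustrative stand-ins, not actual dump code:]

```python
from collections import namedtuple

# Toy stand-in for what a backup tool sees per file: the (possibly
# colliding) 32-bit inode number and the link count from stat().
Entry = namedtuple("Entry", ["path", "ino", "nlink"])

def plan_backup(entries):
    """Return the paths the scanner would archive."""
    seen, archive = set(), []
    for e in entries:
        if e.ino in seen and e.nlink > 1:
            # Repeated inode number with nlink > 1: assume a hard link
            # to an already-archived file and skip it. (A naive scanner
            # skips on the repeated inode number alone.)
            continue
        # nlink == 1 proves this file cannot be a hard-linked copy of an
        # earlier one, so a repeated inode number is just a hash
        # collision and the file must still be archived.
        seen.add(e.ino)
        archive.append(e.path)
    return archive

# Two distinct Coda files whose 128-bit identifiers hashed to the same
# 32-bit inode number; without the nlink check, b.txt would be skipped.
entries = [Entry("a.txt", 0xDEADBEEF, 1), Entry("b.txt", 0xDEADBEEF, 1)]
print(plan_backup(entries))  # ['a.txt', 'b.txt']
```

[Note the check only helps when the colliding file genuinely has a single link; a collision between two files that both have nlink > 1 would still be misread as a hard link.]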
If a client only committed updates on a single server, because of
network problems or because the other servers were down, then those
differences will only be resolved when, at some later point in time, a
client that is connected to all replicas happens to look at the
divergent objects.

Jan
Received on 2005-04-04 17:30:28
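[The version-vector comparison described in the message can be illustrated with a minimal sketch. The list layout and function name are illustrative, not Coda's actual data structures: each replica keeps one update counter per server, and comparing two vectors tells a client which replica is stale, or that they have diverged:]

```python
def compare_vv(a, b):
    """Compare two version vectors (one update counter per server).
    Returns 'equal', 'a-dominates', 'b-dominates', or 'conflict'."""
    a_ge = all(x >= y for x, y in zip(a, b))
    b_ge = all(y >= x for x, y in zip(a, b))
    if a_ge and b_ge:
        return "equal"
    if a_ge:
        return "a-dominates"  # b missed updates; resolvable from a
    if b_ge:
        return "b-dominates"  # a missed updates; resolvable from b
    return "conflict"         # concurrent updates on both replicas

# A client that could only reach the first server bumps only that slot,
# so a later fully-connected client sees replica b is simply stale:
print(compare_vv([2, 1, 1], [1, 1, 1]))  # a-dominates
# If different clients updated different replicas, neither dominates:
print(compare_vv([2, 1, 1], [1, 2, 1]))  # conflict
```

[The "a-dominates" case is what server-server resolution repairs automatically; a genuine conflict is surfaced to the user instead.]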