On Wed, Jun 27, 2007 at 10:36:35AM -0700, Yan Seiner wrote:
> I'm trying to build a pair of coda "appliances" - basically embedded
> boxes with a VPN and coda server/client, each acting as a samba server
> to its network. The goal is to have two identical replicas of the same
> data.
>
> One side would be the server, the other would be the client. Otherwise
> the boxes would be identical.
>
> I've got coda built and installed, and now I'm trying to map out my
> approach.
>
> The hardware consists of a 200 MHz ARM CPU with 32 MB of RAM. The data
> consists of approximately 300 GB of CAD files.
>
> Is this enough RAM? Can the RVM metadata be kept in a swap partition or
> do I need physical RAM for it?

Sounds like your hardware is in the same ballpark as the Linksys NSLU2
I have at home. I guess it 'could' run a server, but I really haven't
tried.

The metadata is VM-backed, so having swap space is definitely useful.
The server doesn't really care about physical RAM, except that swapping
will slow it down, which in turn would cause the client to switch to
disconnected or weakly connected operation even over a well-connected
network. We use a private mmap of the RVM data file, so in low-memory
situations clean in-memory pages are simply discarded and paged back in
on demand, while dirty pages are written to swap.

A problem with your setup is that the box that runs the server will
also have to run a client in order to provide local access. That means
the metadata is cached both by the server and by the client, and
possibly again in the kernel and in the Samba daemon. I think that
would get a bit tight with only 32 MB of memory.

Another problem is that clients connected to the Samba daemon won't be
able to repair conflicts, so a conflict is pretty much deadly in such a
setup (and in Coda's optimistic model, unavoidable). Also, unlike the
samba and nfsd daemons, Coda servers are stateful: they remember which
clients fetched a copy of which objects and send callbacks if any of
those files change. Every callback requires a bit of allocated memory,
and with many files times many clients that does add up, but in your
case you'd have only two, maybe three, clients.

Our old Coda deployment ran on reasonably modest hardware. The Coda
test server was a Pentium 90 with 64 MB of memory and it didn't really
have much trouble, although it did have swap, the RVM log, RVM data,
and the file data (/vicepa) on separate spindles (4 SCSI drives). The
main server group used to consist of machines like a PII 200 MHz with
128 MB of memory, but again we spread swap, RVM log, RVM data, and file
data across different disks.

> Also, how should I structure this so that all the data is available to
> both sides, even in the event of a VPN failure? (These boxes would be
> pretty much on opposite sides of the globe, so I can't really be sure
> the VPN will be available 100%.) This means that the client would have
> to actively hoard all the data? Is that practical? Or should I use a
> different approach?

It really depends on how many file objects you are talking about. If
each file is 1 GB, then we're talking about only ~300 files, and I
don't see any problem with hoarding everything. If each file is ~4 KB,
that is roughly 75 million objects, and I don't think it is feasible
(at the moment): the client won't be able to keep all that metadata in
memory, and the hoard walk every 10 minutes would basically bring the
device to a virtual halt in a swap frenzy.

Have you considered a setup that periodically mirrors or syncs both
sites with something like unison or rsync?
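Just as a sketch, nothing Coda-specific: a cron-driven one-way rsync
from the site where the files are edited to the other site could look
something like the line below. The hostname and path are placeholders
for wherever the CAD tree actually lives; with two writable sites you
would want unison, or rsync running in alternating directions, instead.

    # hourly crontab entry on site A (siteb.example.com and /srv/cad
    # are hypothetical placeholders; substitute your own host and path)
    # -a preserves metadata, -z compresses over the WAN, --delete
    # mirrors removals, --partial keeps partial files on a flaky VPN
    0 * * * *  rsync -az --delete --partial /srv/cad/ siteb.example.com:/srv/cad/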
I just think that if your clients are going to use a stateless
filesystem to access the data on the appliances, they would suffer
from the drawbacks of Coda's weaker consistency model (no file
locking, files becoming inaccessible due to conflicts) without really
benefiting from Coda's features (persistent local disk cache, fast
access to cached file data, write-back logging and log optimizations,
directory ACLs for access control).

Jan