i have the same problem whenever i do something like tar'ing up /coda.
this is on a regular ol' 100 meg ethernet network, with no firewalling
or anything else suspect going on. i'm also not really sure what to do.

On 07/20/04, shivers_at_cc.gatech.edu wrote:
> I am trying to set up a coda filesys. My experience is that when the
> system works, it's very nice. This is, however, rare. Mostly the system
> acts in flaky & sensitive ways that require constant intervention. Can
> anyone advise me?
>
> I am running all the latest stuff: release 6.0.6-1 on both client & server.
> Server & clients all run on linux boxes. The server has 1Gb of RVM, in a file,
> not a partition, due to hints I've seen in the docs & on the mailing list
> about paging, mmapping, etc. (1Gb of RVM is an undocumented option handled by
> the vice-setup scripts.) The RVM log is 25Mb, on a raw partition. The files
> live in a 400Gb ext3 filesys on /vicepa. For the initial trial, I started with
> my personal music collection -- 9Gb of mp3 & flac files. The mp3 files are
> roughly 1-5 Mb each; the flac files are 5-30Mb each. So: a small number of big
> files.
>
> The server is sitting in a real machine room, with real network connectivity:
> gigabit ethernet to a lan, and a real pipe to the internet from there.
>
> I set up several clients:
> CM: Client "CM" is a linux box sitting behind a standard home cable modem.
>     It has constant connectivity ranging up to 1Mb/s.
>
> LOCAL: Client LOCAL is on the server; no network in the picture.
>
> WAN: Client WAN is a linux box sitting on an ethernet in my office at
>     Georgia Tech -- no cable-modem between it & the Net.
>
> I mention the cable-modemness of the client connection, because Jan has posted
> earlier saying that the asymmetry of cable-modem bandwidth confuses coda's
> bandwidth measurements -- it assumes incoming bandwidth equals outgoing.
>
> -------------------------------------------------------------------------------
> Failure 1:
>
> The first thing I did was copy 9Gb of music files into my coda fs on the LOCAL
> client -- that is, the files were copied from a local ext3 filesys into a coda
> filesys *on the machine where the coda server runs*. This worked, though it
> was a little weird to see "red zone -- stalling blah blah" messages on such
> net-less operation.
>
> I was able to access these files from client CM & WAN in onesies & twosies
> with no problem. Then I tried, on client WAN (that is, the client that
> communicates with the server over a long-distance Internet connection, but
> doesn't have a wimpy cable modem connecting it to the Net):
>
>     find . -type f -exec md5sum {} \;
>
> At first, it ran like a champ. Then it didn't. Here's the tail of the
> recursive md5sum walk:
>     4c507b84f2191ad0c9e8921e0f543ac7 ./affection/cd.db
>     152c03eacd67ba3f28462abcacd85453 ./affection/track08.cdda.flac
>     8cd0cf505aa05c7bebbac9fa94560289 ./affection/track08.cdda.mp3
>     841ab15fa7bde3e019c06f7b0394351d ./affection/audio_02.inf
>     9e6b48f2d7fe648f83bec2a221a8e5d8 ./affection/track11.cdda.flac
>     ae7ca0a4a359a8f92d9a079a7cc8e364 ./affection/audio_12.inf
>     0ccaf063cbf89f7345955257d96134ad ./affection/cdp-q
>     4c1d9cc34e790c8fb52975a9973ce10d ./affection/track13.cdda.mp3
>     4f9377b72053579bcb80e59bb5ad610e ./affection/audio_10.inf
>     44e6c8a3596b59f589d934c624141e8d ./affection/track07.cdda.flac
>     9bea35fbbb5f679a8de559bdfd37bf6c ./affection/track10.cdda.mp3
>     4b0fb02ca5944804cc403b6ff1f3797a ./affection/audio_01.inf
>     md5sum: ./affection/track05.cdda.flac: Connection timed out
>     find: ./affection/audio_08.inf: Connection timed out
>     find: ./affection/track05.cdda.mp3: Connection timed out
>     find: ./affection/audio_11.inf: Connection timed out
>     find: ./rampal1: Connection timed out
>     find: ./rampal2: Connection timed out
>     find: ./sleepbeauty+toyshop: Connection timed out
>     find: ./porter-on-mind: Connection timed out
>     find: ./th-md5s: Connection timed out
>     find: ./mozart-horn-concerti3: Connection timed out
>     find: ./th: Connection timed out
>     find: ./algreen: Connection timed out
>     find: ./bush-story: Connection timed out
>     find: ./mozart-wind-concerti: Connection timed out
>     find: ./oconor-piano: Connection timed out
>     find: ./eagles-hits: Connection timed out
>     find: ./anything-goes-yoyo: Connection timed out
>     find: ./beeth-piano1: Connection timed out
>
> Again, note that this lossage occurred on a system with no cable modem, and
> presumably symmetric bandwidth to the server.
>
> -------------------------------------------------------------------------------
> Failure 2:
>
> On client CM -- the system connected to the Net via a home cable-modem --
> I attempted to copy a really large dir of media (about 200Gb) into my coda
> filesys:
>
>     cp -prv . /coda/lambda.csail.mit.edu/shivers/music-npx
>
> It managed to copy about 15 files, then codacon began blatting out
>
>     Red zone, stalling writer ( 00:33:35 )
>
> messages, and then the client went write-disconnected.
>
>     % cfs lv ~/c
>     Status of volume 0x7f000000 (2130706432) named "coda:root"
>     Volume type is ReadWrite
>     Connection State is WriteDisconnected
>     Minimum quota is 0, maximum quota is unlimited
>     Current blocks used are 10324696
>     The partition has 371998976 blocks available out of 382693232
>     Write-back is VIOC_STATUSWB: Invalid argument
>
> Note the weirdo final line -- "VIOC_STATUSWB: Invalid argument"? What's that?
>
> During this time, the net connection was completely solid. I mean, I might
> have gotten less than my nominal 1Mb/sec, but the connection was always there.
>
> So the real-world operation of coda here is that if you start writing a lot of
> data, you disconnect, and then your writes just fail. So you can't ever count
> on some operation actually working; it could very easily fail mid-stream.
>
> -------------------------------------------------------------------------------
> Failure 3:
>
> On client CM -- the one connected via cable-modem -- I also did a
>
>     find . -type f -exec md5sum {} \;
>
> in the coda dir holding the 9Gb of music. It won for a couple of files, then
> began to barf out msgs like this
>
>     md5sum: ./thelonious/track03.cdda.flac: Connection timed out
>     find: ./thelonious/track03.cdda.mp3: Connection timed out
>     find: ./thelonious/audio_17.inf: Connection timed out
>     find: ./thelonious/track07.cdda.mp3: Connection timed out
>     .
>     .
>     .
>
> -------------------------------------------------------------------------------
>
> What I find, in general, is that I cannot rely on file ops completing. Apps
> that access my coda files sometimes win and sometimes seem to drive the
> system into disconnected state, and then I must go through a
>     cfs wr
>     cfs cs
>     cfs lv .
> dance to reconnect. This happens when I am on a client with a completely
> stable connection to the ethernet. We are not talking phone lines here.
> This essentially renders coda unusable.
>
> I tried jacking up the timeout & retry values on the client and server to
> see if that would help. Maybe it did, some. But I am still definitely losing.
>
> I also tried doing a
>     cfs strong
> I don't have a super-clear idea of how this would affect my operation --
> the one-line description with the cfs doc is that it prevents the system
> from ever going into weak-connectivity mode, but that doesn't mean it would
> prevent the system from going write-disconnected. In any event, when I do
> this, my client becomes more or less coda-catatonic.
>
> Some questions:
> 1. Am I doing something wrong?
>
> 2. Do other people lose in this way? / Are other people winning?
>    I do not see similar reports on this mailing list. Is it that no one
>    is hammering on their servers with big files? Is it that no one is
>    connecting via cable modems? I don't have a good feeling for how many
>    people are really using coda and in what configurations.
>
> 3. Is coda not ready for really big repositories (800Gb filesys, 1Gb rvm
>    metadata)?
>
> 4. Any advice at all?
>
> I'm a little dismayed to be losing at such a simple stage of usage. I'm
> not having problems with reintegration conflicts or any of the real voodoo.
> I'm getting hosed just reading & writing files *while connected*.
>
> BTW, I'm also surprised that coda is having problems with asymmetric network
> connections like cable modems in 2004. The lion's share of mobile connections
> these days at private residences is through connections of this sort.
> -Olin

--
steve simitzis : /sim' - i - jees/
pala : saturn5 productions
www.steve.org : 415.282.9979
hath the daemon spawn no fire?

Received on 2004-07-21 01:57:19