On Tue, 15 Mar 2005 16:36:37 -0500, Jan Harkes <jaharkes_at_cs.cmu.edu> wrote:
> On Mon, Mar 14, 2005 at 10:05:31AM -0300, Gabriel B. wrote:
> > > And once codasrv is started it asks if it can create the rootvolume.
> >
> > I'm using the .deb package from CM servers. It never asked me about root volume. I even opened a thread asking if the docs were outdated because of this.
>
> And in that thread I responded,
>
> "Things have changed in the hope to simplify the initial setup."
> http://www.coda.cs.cmu.edu/maillists/codalist/codalist-2005/7198.html
>
> The change happened with the release of Coda-6.0.7,
>
> "createvol_rep doesn't use the VSGDB anymore, instead of specifying a server group with the mystical E0000XXX numbers, the servers are now explicitly named; 'createvol_rep <volumename> <server>[/<partition>] ...'"
> http://www.coda.cs.cmu.edu/pub/coda/Announcement.6.0.7
>
> > > > pdbtool
> > > > nu bossnes
> > > > ng www 1076 (bossnes id)
> > > >
> > > > createvol_rep / camboinha.servers/vicepa -3
> > > >
> > > > that didn't work. I waited 2 hours and Ctrl+C'ed it.
> > >
> > > Where did you get that '-3'?
> >
> > from the list command:
> >
> > GROUP www OWNED BY bossnes
> >   * id: -3
> >   * owner id: 1076
> >   * belongs to no groups
> >   * cps: [ -3 ]
> >   * has members: [ 1076 ]
>
> Right, so createvol_rep interprets that as a server named '-3', and because it is '-3' we fail to catch it with the following test,

Why is that?

camboinha# createvol_rep
bad args: createvol_rep <volname> <server>[/<partition>] [<server>[/<partition>]]* [groupid]

How can I specify a group id then?

> # Validate the server
> grep $SERVER ${vicedir}/db/servers > /dev/null
> if [ $? -ne 0 ]; then
>     echo Server $SERVER not in servers file
>     exit 1
> fi
>
> So we end up running
>
>     volutil -h "$SERVER" getvolumelist "/tmp/vollist.$$"
>
> Which then tries to contact a server named '-3'. Now on my machine it quickly returns with '-3 is not a valid hostname'.

Hum.. here it hangs. Running the script with bash -x I see it hangs at:

++ sed 's/[^\/]*\(.*\)/\1/'
+ PART=
+ grep -3 /vice/db/servers

So, indeed, it treated the -3 as a server/partition argument. The partition is null and the -3 is then picked up as an option by some grep Linuxism; the fact that yours continues processing means you are probably on some other Unix. (A small sketch of a safer check is further down.)

> > > > volutil create_rep /vicepa / 00001
> > > > bldvldb.sh
> > > >
> > > > (a valid workaround?)
> > >
> > > No it is not, since this only creates the underlying volume replica (which should be named "/.0"). And again, where does that strange 00001 number come from? The createvol_rep script does this first, but then creates the replicated volume by dumping the existing (currently empty) VRDB into the /vice/db/VRList file, appending an entry that describes which replicas are part of the replicated volume, and recreating a new VRDB file from the data in the updated /vice/db/VRList.
> >
> > hum... is it in binary form or human readable? can you send an example?
>
> Again, where does that strange 00001 come from? That is setting the replicated volume id to 1, but you can't actually have a replicated volume with the volume id 1, as replicated volumes are supposed to always have a volume id that looks like 0x7f0000nn.
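By the way, about the grep hang above: just as an illustration (this is not the actual createvol_rep code, only a sketch reusing the $SERVER and ${vicedir} names from the snippet you quoted), a check along these lines would have rejected the '-3' instead of hanging, since "--" stops grep from parsing the value as an option:

    # Sketch of a more defensive server check (hypothetical, not the real script)
    case "$SERVER" in
        -*|"")
            echo "Server name '$SERVER' looks invalid" >&2
            exit 1
            ;;
    esac
    if ! grep -q -- "$SERVER" "${vicedir}/db/servers"; then
        echo "Server $SERVER not in servers file" >&2
        exit 1
    fi

It still does a substring match like the original test; it only avoids the option-parsing trap.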
> The first byte in the 4-byte volume id number is used to map to the specific server identifier in /vice/db/servers on which the volume replica is located, and 0x7f (127) is reserved to indicate that this is a replicated volume that isn't located on any particular single server but represents a group of individual replicas. We need this, because in some cases we get just the volumeid and we don't know if it is supposed to be a replicated volume or some underlying volume replica.
>
> So although the VRList file is human readable and to a certain extent can be edited by hand, I don't think it would be a wise thing to do so without really knowing how it is used to glue individual replicas together. There are a lot of constraints on what is considered valid or not, and a single misplaced character can break the parser that has to convert it back to the VRDB file.

Hum, nice info, thanks! Now that I can use createvol_rep I'm sticking to your advice and not editing those files by hand. (I put a tiny sketch of how I read the volume id convention further down, just to check I understood it.) Except this one time, when I ran out of inodes (I was using the 2M-files ftree): I manually blanked all those files and created a new vicepa with 16M files. Now I can create the volumes by the book:

createvol_rep h.album.i.big camboinha.servers/vicepa

> > > The only 'worst case' that I know of is when we initially contact a realm, since we are hit by multiple RPC2 timeouts: one when we try to get the rootvolume name, one when we fall back on getvolumeinfo. At this point the mount fails, but with a colorizing ls we get an additional readlink and getattr on the mountpoint, both of which also trigger an attempt to mount the realm (i.e. another 4 timeouts). So we end up blocking for about 6 minutes if the realm is unreachable.
> >
> > I just left a "cfs lv /coda/camboinha.servers" running Friday. It's Monday and I had to Control+C it. The server is still running though.
>
> On the client run,
>
>     strace -e trace=network -p `pidof venus`    (optionally add "-o strace.dump")
>
> This should show all the network related stuff that the client is doing. Here is what I get when I run 'cfs lv /coda/coda.cs.cmu.edu':
>
> # strace -e trace=network -p `pidof venus`
> Process 17369 attached - interrupt to quit
> sendto(8, ..., 92, 0, {... sin_port=htons(2432), sin_addr=inet_addr("128.2.222.111")}, 16) = 92
> sendto(8, ..., 92, 0, {... sin_port=htons(2432), sin_addr=inet_addr("128.2.209.199")}, 16) = 92
> sendto(8, ..., 92, 0, {... sin_port=htons(2432), sin_addr=inet_addr("128.2.191.192")}, 16) = 92
> recvfrom(8, ..., 4360, 0, {... sin_port=htons(2432), sin_addr=inet_addr("128.2.222.111")}, [16]) = 156
> recvfrom(8, ..., 4360, 0, {... sin_port=htons(2432), sin_addr=inet_addr("128.2.209.199")}, [16]) = 156
> recvfrom(8, ..., 4360, 0, {... sin_port=htons(2432), sin_addr=inet_addr("128.2.191.192")}, [16]) = 156
> Process 17369 detached
>
> Now coda.cs.cmu.edu is mapped by an entry in /etc/coda/realms to a group of 3 servers, so we're sending the request to all three servers, and then get three replies back. The Coda client decides which reply to actually use.
>
> If your server isn't responding you would only see sendto's at exponentially increasing intervals for about 60 seconds, at which point the RPC2 layer gives up. This then percolates up and we end up returning ETIMEDOUT to userspace.
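About that /etc/coda/realms entry you mention: if I read the docs right, mapping a realm name to a group of servers is just the realm name followed by the server names on one line, something like the sketch below (the hostnames are made up and the exact syntax should be checked against the Coda documentation, this is only how I understand the format):

    coda.cs.cmu.edu    server1.coda.cs.cmu.edu server2.coda.cs.cmu.edu server3.coda.cs.cmu.edu

And as I understand it, a realm that isn't listed there just falls back to a normal hostname lookup.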
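And going back to the volume id convention you described earlier, just to check that I understood it (this is only my own sketch in shell arithmetic, not code from Coda):

    volid=0x7f000001                     # made-up example id
    high=$(( (volid >> 24) & 0xff ))     # first (most significant) byte
    if [ "$high" -eq 127 ]; then         # 0x7f => replicated volume
        echo "replicated volume, not tied to a single server"
    else
        echo "volume replica, maps to server identifier $high in /vice/db/servers"
    fi

So the 00001 from before comes out with a high byte of 0, not 0x7f, which is why it can't be a replicated volume id.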
> > now, I did a cunlog and a clog, and "time ls -la /coda/"
> > <ctrl+c>
> >
> > real    14m44.403s
> > user    0m0.001s
> > sys     0m0.003s
>
> You could try to use /bin/ls, which shouldn't be colorizing and as such doesn't try to stat every entry in coda, readlink all unmounted mountpoints, and then stat every link destination.
>
> Also, is there anything in the venus.log file which could indicate that we've already run out of worker threads (the message looks something like),
>
>     DispatchWorker: out of workers (max 20), queueing message

Hum, sorry, I can't try those. I wiped that install (from the .deb) and built from the sources. I don't have those weird errors anymore.

> > > > i then created two more volumes. now venus reports "2 volume replicas"
> > >
> > > Did you mount those volumes then? How would venus know about the newly created volumes? Those 2 replicas are probably the one that is at /coda and the one at /coda/camboinha.servers.
> >
> > It's shown in the venus startup. And I'm starting it with -init every time. cfs hangs as well, so I will never know which volumes it claims to have found.
>
> Those volumes are 'CodaRoot@localhost' (the volume/directory that is mounted at /coda) and 'Repair@localhost', the volume that is used during local/global repair. Both are internal volumes that always exist. Both volumes are in the 'localhost' realm, which is an invalid name since localhost represents 'this machine' and as such is not usable in Coda's global volume naming scheme.
>
> > Did someone have success using the .deb version? I tried it with 3 sets of servers/clients, each more troublesome than the last.
>
> Not counting 'testserver.coda.cs.cmu.edu' and the 6 servers responsible for 'coda.cs.cmu.edu' and the 2 servers for 'isr.coda', all running the server packages on debian-testing?
>
> I am using both client and server debian packages on a machine at home, my laptop and my desktop at work (debian-unstable), although I tend to alternate with recent CVS builds. Also one of the students in our group is using the Coda client debian packages on something like 12-15 laptops to move data for his experiments.
>
> Now a lot of things depend on whether you have a traditional /dev, devfs or udev, if your kernel is 2.4 or 2.6, if your machine has a static or dynamic IP address, if the network connection is permanent or intermittent, if you have a multi-homed machine and how exactly the multi-homing is set up (since there are about 3-4 different variations on that theme), if there are (possibly masquerading) firewalls in your network, and much more.
>
> There are literally thousands of combinations that might make or break a seemingly simple setup. Coda servers are by far the most sensitive, since they are expected to be reachable through a single static IP address and to have reliable, fairly fast connections to each other (i.e. located in the same machine room), and there are some assumptions like 'gethostbyname(hostname())' returning a usable IP address that we can pass to a client instead of 127.0.0.1.
>
> A Coda client is a lot less picky, as it is assumed to be mobile and as such can hop from one network to another and possibly has unreliable connections. My laptop switches quite a bit between various wired and wireless networks as well as running the Coda traffic through an openvpn tunnel. I used to use dialup almost daily, but nowadays it is a cable modem connection.
> But still, I've configured everything so that it never tries to route to the Coda servers over multiple networks at the same time, as the servers wouldn't know where to send the replies to.

Nice to hear all that, truly. Unfortunately, here I have almost everything you pointed out above, but the scripts never behaved nicely with the .deb; simply installing from source solved most of my problems. I'm even running a client on the server. Now I only have to deal with the hassle of using millions of files...

With the .deb I was constantly fighting with the scripts; now I can simply follow the docs :)

Thanks!
gabriel

Received on 2005-03-16 09:29:14