Hello Jan,

It is great to see your reflections.

On Thu, Jul 10, 2014 at 04:43:36PM -0400, Jan Harkes wrote:
> On Wed, Jun 04, 2014 at 11:21:33AM +0200, u-codalist-rcma_at_aetey.se wrote:
> > First of all: I do _not_ suggest rewriting Coda from scratch :)
>
> Actually I would love a rewrite of (parts of) Coda from scratch.

I wish we had the prerequisites you calculated on the corresponding
web page!

> But replication wouldn't be a priority on my list. If you look at
> Coda's development history replication was added first to improve
> resilience to failing servers, and disconnected operation was added
> later.

I am aware that the disconnected mode and replication somewhat
overlap in the resulting functionality (resilience).

Nevertheless, from my perspective as a deployer it is extremely
precious to be able to take a server down for maintenance or
replacement without ever disrupting the service (and without the
need to announce the downtime in advance, remind the users and
answer questions - this takes a lot of resources, both mine and
the service desk's).

In fact the whole Coda server setup and operation here fundamentally
relies on server replication. I'd be much less enthusiastic about
deploying Coda if I didn't have this feature.

It is not scalability or performance I am looking for, but handling
a client cache miss during server downtime. With replication it is
transparent (often even without a noticeable delay); without
replication it becomes an error.

> It would be quite interesting to drop all the server replication and
> resolution. No more server-server conflicts to cause problems for
> reintegration etc.

Surely there are gains to collect, but this would cost a very
valuable feature.

I pondered which features could be dropped from Coda to make it
simpler - without losing its applicability to where we use it.
The only ones I could live without (and actually would like to)
are backup and callbacks.

I like the orifs' approach to replication - not making any
distinction between replication and backup (and making conflicts
possibly easier to resolve). Introducing a similar approach in Coda
would make it both simpler and more easily manageable (aka
attractive :) by dropping the backup subsystem.

> A lot of the smarts in Coda is
> already handled by the client, not having to deal with replication and
> resolution should make the servers even simpler. Mostly reintegration
> and callbacks.

Another feature which can and should be dropped in Coda is callbacks
(an example of a cheap and reliable replacement can be seen in
orifs-like regular syncs; venus already polls the servers anyway).

We talked about callbacks several years ago and I know you like the
feel of consistency which they provide - but they do not and cannot
give any guarantee of consistency. So you cannot trust them,
especially if you allow disconnections, and they still cause real
pain (when the consistency "pseudo-promise" stalls the workflow
while trying to inform a dead client).

> or from an http server. Especially now with 10GB+
> in a context where we talk about protocols
> networking and things like tcp offloading, sending bulk data using small
> UDP packets with SFTP just doesn't seem to make much sense to me anymore.

I fully agree, the data transfer should be done better. I wonder how
much we would lose if we let an rpc2-alike work over tcp instead.
There would be extra/bigger/less-controllable delays at
disconnections, but this is possibly an acceptable price for simpler
code and better performance?

> C++ doesn't necessarily make it friendly, the biggest issues that I know
> of are Coda's dependence on userspace cooperative threading (LWP), which
> in turn makes it incompatible with things like C++ exception handling
> and kernel level threads which is what you are quite likely going to
> find in a lot of non trivial code. It prevented us from successfully
> integrating OpenLDAP with the servers, etc.

Anyway, since then you ported lwp to pthreads. :)

And btw I am happy that Coda did not get any dependency on OpenLDAP,
with its extra layer of complexity. In a way LDAP is like Kerberos -
having a quite low rate of well-understood deployment. It also
assumes certain policies and approaches which easily become harmful.

By contrast, I am very much impressed by your rwcdb, which provides
the proper functionality in a very lightweight fashion.

At least for tens of thousands of accounts we do not see any
problems synchronising identity and group databases from sources
external to Coda. Any known bottlenecks are in the external
databases. Using LDAP queries would just make it more complex, less
reliable and most probably even less current/consistent :)

Regards,
Rune

Received on 2014-07-11 08:00:48