[Linux-ha-dev] Future of /proc/ha

Alan Robertson alanr@bell-labs.com
Thu, 28 Oct 1999 21:16:00 -0600


Volker Wiegand wrote:
> 
> Dear colleagues,
> 
> As I was continuing my work on /proc/ha during the last days, I ran across
> a couple of problems and I would like to get some input from others before
> I decide how to proceed.
> 
> (1) The /proc interface is a moving target. In order to support kernels
> down to 2.0.1 (which seems reasonable to me), I am already running
> three versions. And supporting other open operating systems is completely
> out of reach.
> 
> (2) Especially when it comes to handling resources, even on an average
> sized cluster the amount of data in one "file" can exceed 4 KB, which is
> the allocation size in /proc. This means I cannot read atomically any more.
> 
> (3) The work /proc/ha is doing is to provide information for user land
> programs which comes from ... user land programs. Debugging can only be
> accomplished by printk() outputs. And a post-mortem after a node crashed
> is a bit ... difficult to get :-)
> 
> (4) I have no chance to make the information persistent even if I wanted
> to do so eventually.
> 
> All this leads me to the view that we should abandon /proc/ha. Alas,
> what is then the way to go? I have recently been toying around with the
> Berkeley DB code (db-2.7.7.tar.gz) from Sleepycat and it looks very very
> promising. They also have a license which would allow us to incorporate
> the code in our work free of charge.
> 
> Does anyone want to convince me that I should continue /proc/ha? And does
> anyone want to convince me that we should not further investigate B-DB as
> the underlying storage module for cluster/node/resource state including
> transactional support?

Volker,

My only concern about the Berkeley DB code is that most databases are
*highly* prone to corruption during crashes.  I would recommend that no
permanent state be kept in a database.  When a node comes up after a crash,
its old idea of the cluster topology is invalid anyway, so I don't envision
this being a problem.
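
As a rough illustration of that recommendation (a sketch only, not code from
this thread): the state store can be recreated empty every time the node
starts, so nothing written before a crash is ever read back.  Python's
standard dbm module stands in for Berkeley DB here, and the function name
open_fresh_state and the key/value layout are assumptions made for the
example.

```python
import dbm
import os
import tempfile


def open_fresh_state(path):
    """Open the state store, discarding anything a previous run left behind.

    Hypothetical helper: the name and the use of Python's dbm module are
    assumptions for illustration, not part of any heartbeat code.
    """
    # Flag "n" always creates a new, empty database, so stale or
    # crash-damaged contents from an earlier run are never consulted.
    return dbm.open(path, "n")


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "cluster-state")

    # First "boot": record some volatile cluster state.
    db = open_fresh_state(path)
    db[b"node1"] = b"up"
    db.close()

    # Second "boot" (for example, after a crash): the store starts out empty.
    db = open_fresh_state(path)
    print(len(db.keys()))
    db.close()
```

With this approach the database is only a convenient in-run index; the
current cluster topology has to be relearned from the other nodes after
every restart.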

A good API for accessing the data is essential.

Go for it.

	-- Alan Robertson
	   alanr@bell-labs.com