[Linux-ha-dev] Re: Fence agents converge
lmb at suse.de
Wed Oct 8 04:58:44 MDT 2008
On 2008-10-07T17:29:09, Dejan Muhamedagic <dejanmm at fastmail.fm> wrote:
> > He is referring to failure in low-memory situations. fork() might not be
> > possible then, and memory needs to be pre-allocated - this is one of the
> > primary reasons why stonithd loads C plugins and instantiates them prior
> > to actually using them.
> > But it turns out that C code seems to be mostly too hard for people to
> > write.
> This is true, but I'm not sure how relevant. Interface to
> external stonith plugins has been available for 3-4 years and so
> far there were three contributed plugins (two in /bin/sh, one in
If you look at the full list of plugins though, the external/*
interfaces are more popular by now I think, and have been growing
faster. When I see scripts from consulting (which, alas, they don't
always contribute back), they are all external too.
> > Yes, we'd be enforcing a single scripting language. (And not even one I
> > personally like much.) But I think it'd be worth it.
> This is somewhat contradictory to the argument that people can't
> contribute because they find C intimidating or too hard. I'm sure
> there are some who find python awkward, so I don't see how
> imposing python could help.
I think the number of people who find python awkward is decreasing; and
the number of people who can write python code is definitely higher than
the number who can write C plugins.
We need to make scripting languages easier.
Those who _can_ write C plugins can still do so; python can load C code.
(SWIG etc, although I've never done that with python myself ;-) And
there's no reason why the python code could not also call out to other
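
To illustrate the "python can load C code" point: a minimal sketch using
the standard ctypes module rather than SWIG (a swapped-in technique,
chosen only because it needs no build step). It assumes a POSIX system
where CDLL(None) exposes the libc symbols already loaded into the
process; c_strlen is a hypothetical helper name, not part of any stonith
interface.

```python
import ctypes

# Assumption: POSIX. Passing None asks the dynamic loader for the symbols
# already present in the running process, which includes libc.
libc = ctypes.CDLL(None)

# Declare the C signature so ctypes converts arguments and results correctly:
#   size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def c_strlen(data: bytes) -> int:
    """Return the length of a NUL-terminated byte string, computed in C."""
    return libc.strlen(data)
```

A real plugin would load its own shared object instead of libc, but the
declaration pattern (argtypes, restype) would be the same.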
> > For the regression testing which honzaf wanted to write, said classes
> > would simply only be allowed to interact with the external world through
> > telnet/ssh/snmp/... input/output abstraction, which would allow us to
> > easily record and replay during unit tests.
> About fencing and mlock: I've often wondered how relevant this
> is in today's computing. Can't recall any incident of the
> kind, i.e. that the host to fence another one was so short on
> memory that the fencing operation failed. Typically, such a host
> has to take over some heavy resources right after fencing (rdbms,
> web server), that'd surely make a hundred-fold bigger memory
That is actually a good point. The mlock()ability is only a (desirable)
I'm interested in the ability to do those regression tests uniformly,
which would benefit from common IO abstractions, which _are_ easier in
an OO world.
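
A rough sketch of what such an IO abstraction could look like, assuming
every plugin talks to its device only through one exchange() call. All
names here (RecordingTransport, ReplayTransport, FakePowerSwitch,
FenceAgent) and the "off"/"status" commands are made up for
illustration; none of this is existing stonith code.

```python
class RecordingTransport:
    """Wraps a real transport and logs every (request, reply) pair."""

    def __init__(self, inner):
        self.inner = inner
        self.log = []

    def exchange(self, request):
        reply = self.inner.exchange(request)
        self.log.append((request, reply))
        return reply


class ReplayTransport:
    """Serves replies from a recorded log; rejects any unexpected request."""

    def __init__(self, log):
        self._pending = list(log)

    def exchange(self, request):
        if not self._pending:
            raise AssertionError("unexpected extra request: %r" % (request,))
        expected, reply = self._pending.pop(0)
        if request != expected:
            raise AssertionError("expected %r, got %r" % (expected, request))
        return reply


class FakePowerSwitch:
    """Hypothetical stand-in for a device reached via telnet/ssh/snmp."""

    def __init__(self):
        self.outlets = {}

    def exchange(self, request):
        verb, _, outlet = request.partition(" ")
        if verb == "off":
            self.outlets[outlet] = "off"
            return "OK"
        if verb == "status":
            return self.outlets.get(outlet, "on")
        return "ERR"


class FenceAgent:
    """Plugin logic; it only ever sees the transport interface."""

    def __init__(self, transport):
        self.transport = transport

    def fence(self, outlet):
        # Switch the outlet off, then verify the device agrees.
        if self.transport.exchange("off " + outlet) != "OK":
            return False
        return self.transport.exchange("status " + outlet) == "off"


# Record once against the (fake) device, then replay without it.
recorder = RecordingTransport(FakePowerSwitch())
recorded_ok = FenceAgent(recorder).fence("node2")
replayed_ok = FenceAgent(ReplayTransport(recorder.log)).fence("node2")
```

The replay run never touches a device, so the same recorded log could be
checked into the tree and used as a regression test for the plugin.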
> It is also debatable what demands more memory: a python (think
> garbage collection) instantiating objects or a process doing a
> fork. If I were to place a bet, I think I'd go with the former.
Total memory, sure. But it will already be allocated, and it is not
possible to pre-allocate forks.
Theoretically speaking, this problem is going to get worse when more
cluster file systems depend on us to fence in their recovery path. That
essentially places the cluster stack into the write-out path.
> Also, which of the two would you consider more predictable?
Actually - the pre-allocated python instance. ;-)
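
To make "pre-allocated instance" concrete, a minimal sketch of the
ordering only: everything is instantiated at daemon startup, and the
fencing path just uses what already exists. PluginRegistry and
NullPlugin are hypothetical names. Note the assumption: plain Python
still allocates small objects at call time, so this does not by itself
give the guarantee that mlock()ed C plugins give.

```python
class NullPlugin:
    """Hypothetical plugin that reserves its scratch buffer at construction."""

    def __init__(self):
        # Reserved at startup, reused for every later request.
        self._buffer = bytearray(4096)

    def reset(self, target):
        message = ("reset " + target).encode("ascii")
        if len(message) > len(self._buffer):
            raise ValueError("target name too long for the reserved buffer")
        # Same-length slice assignment: the buffer is overwritten, not resized.
        self._buffer[:len(message)] = message
        return bytes(self._buffer[:len(message)])


class PluginRegistry:
    """Instantiates every plugin up front instead of on first use."""

    def __init__(self, factories):
        self._instances = {name: factory() for name, factory in factories.items()}

    def fence(self, name, target):
        # No import, no fork, no construction here: only a lookup and a call.
        return self._instances[name].reset(target)


registry = PluginRegistry({"null": NullPlugin})
result = registry.fence("null", "node2")
```

A fork()-based external plugin cannot be arranged this way, since the
child process only comes into existence at the moment it is needed.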
> Looks like you've already discussed this matter in Prague.
> I think I need some time to process it ;-)
Oh, I may be quite wrong. It's just my idea. ;-) Maybe the others don't
like it and we want to stick to C as the suggested language for those
(Though even then I'd suggest we do revise the current code. I dislike
it - I would like to see PILS go, and much rather statically link the
additional modules in. There has _never_ been a single case where
someone contributed a C plugin which was installed by any means other
than simply upgrading the package. We jump through a number of hoops for
flexibility which no one ever used. And the stonith package pulls in all dependencies
anyway, so not even disk-space is saved.)
> > > > - HB * add status of light-out
> > > This could be nice to have for informational purposes.
> > We actually need it for the ability to find out whether nodes are
> > expected to be up or down;
> How does this depend on the node having power (apart from the
Verifying that the node is actually in the state which we want it to be
Teamlead Kernel, SuSE Labs, Research and Development
SUSE LINUX Products GmbH, GF: Markus Rex, HRB 16746 (AG Nürnberg)
"Experience is the name everyone gives to their mistakes." -- Oscar Wilde