[Linux-ha-dev] adding/deleting nodes from the system
alanr at unix.sh
Thu Mar 8 06:53:03 MST 2001
bmartin at penguincomputing.com wrote:
> On Wed, Mar 07, 2001 at 07:40:09PM -0700, Alan Robertson wrote:
> > bmartin at penguincomputing.com wrote:
> > First, the current heartbeat cluster management code gets confused by more
> > than two nodes in the cluster. If you're planning on rolling your own
> > cluster manager, this is no problem. Caveat emptor!
> Well, yes, I failed to mention my hacks to the ResourceManager and
> mach_down scripts.
> Basically I changed the haresources to have a standby node listed and
> then when mach_down is called only the standby will pick up the resources.
> Crude but effective.
I can see that you're my kinda guy ;-)
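For anyone following along, the hacked haresources might look roughly like
this. Note the extra standby column is purely illustrative (stock haresources
has no such field), and the node names and address are made up:

```
# Stock haresources: the first field names the node that prefers the resources.
node1 192.168.85.3 httpd

# Hacked variant (illustrative only): a second node name marks the one
# standby that mach_down will let pick up the resources.
node1 node2 192.168.85.3 httpd
```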
> > All further discussion presumes that this has somehow been dealt with...
> > Second, your idea isn't necessarily a bad one, but I've been thinking about
> > the whole thing off and on and have several related thoughts I'll go into
> > below...
> > The "dangerous" aspect could be dealt with by saying such operations are
> > only available to certain client names as you said, or to certain user ids,
> > or whatever.
> > We could just let anyone who knows the shared secret join the cluster, and
> > eliminate the administration of node names altogether. You'll see some
> > vestigial code for that purpose in the base. It's #ifdefed out right now.
> Yeah, I saw that. It would seem that the only reason not to do this would
> be security. And once the secret has been compromised, we've already said
> goodbye to that :)
> > Regarding having clients change configuration...
> > I believe that the configuration module should either be a loadable module
> > like lots of the others or be controlled by a client, or some combination of
> > the two.
> > A loadable module would certainly be flexible... Then you could use
> > duplicated flat configuration files like now, or a distributed database, or
> > a file on a shared filesystem, or...
> This sounds like a very good idea, but I think clients should be able to
> muck things up too :)
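To make the loadable-module idea a bit more concrete, here's a rough sketch of
what such a pluggable configuration-source interface might look like. Every
name in it is invented; nothing like this exists in heartbeat today. The point
is just that a flat-file reader, a shared-filesystem reader, or a distributed
database could all hide behind one small set of function pointers:

```c
/* Hypothetical sketch of a pluggable configuration-source interface.
 * None of these names exist in heartbeat; they only illustrate how
 * different back ends could sit behind one function-pointer table. */
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct config_source {
	const char *name;
	/* Fetch the value for a key; return 0 on success. */
	int (*get)(const char *key, char *val, size_t len);
	/* Push a changed value back to the store; return 0 on success. */
	int (*set)(const char *key, const char *val);
};

/* Trivial in-memory back end, standing in for a flat-file reader. */
static char stored_val[64] = "bcast eth0";

static int mem_get(const char *key, char *val, size_t len)
{
	(void)key;	/* toy store: single key only */
	strncpy(val, stored_val, len - 1);
	val[len - 1] = '\0';
	return 0;
}

static int mem_set(const char *key, const char *val)
{
	(void)key;
	strncpy(stored_val, val, sizeof(stored_val) - 1);
	stored_val[sizeof(stored_val) - 1] = '\0';
	return 0;
}

struct config_source mem_source = { "memory", mem_get, mem_set };
```

Swapping in a "real" back end would then just mean supplying a different
struct, without the rest of heartbeat caring where the bytes came from.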
> > Client configuration also has advantages. It would allow it to be
> > arbitrarily flexible. However, certain things need to be known from the
> > beginning (probably via command line), or you can't bootstrap. Things like
> > knowing the network topology are probably best gotten via a loadable
> > module. But things like knowing the resource configuration may be best done
> > by the client.
> Yes I agree.
By the way, I forgot to mention that the testing job gets an order of
magnitude harder once you have client-initiated reconfigurations. This is
nothing to sneeze at.
> So is a machine part of the network topology or a resource?
I would tend to think that things like network interfaces and the like are
part of the network topology. Curiously enough I tend to think of the set
of machines as *not* being part of the network topology.
> Along these lines, I have managed to do a bit of a dance around the
> current ResourceManager script to allow me to change the resources on
> the fly, through explicitly telling heartbeat to grab or release
> the resources at the right time.
> Now this too is a hack that is meant for the time being.
This is the kind of thing I want to avoid until the resource management is
separated out into a separate client module.
> If there were a set of extensions to the heartbeat client api that
> allowed clients to add/remove nodes from the system, and also add/remove
> resources from the system, I wouldn't need such hackery.
See comment above.
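Just so we're talking about the same shape of thing: such API extensions might
look something like the sketch below. These entry points do not exist in the
current client API; the names and the toy in-memory membership list are all
invented for illustration:

```c
/* Hypothetical add/remove-node extensions to the heartbeat client API.
 * Nothing here exists in heartbeat; the membership list is a toy. */
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_NODES 16

static char nodes[MAX_NODES][32];
static int n_nodes = 0;

/* Ask the cluster to admit a node; 0 on success, -1 if full or duplicate. */
int hb_api_addnode(const char *node)
{
	for (int i = 0; i < n_nodes; i++)
		if (strcmp(nodes[i], node) == 0)
			return -1;
	if (n_nodes >= MAX_NODES)
		return -1;
	strncpy(nodes[n_nodes], node, sizeof(nodes[0]) - 1);
	nodes[n_nodes][sizeof(nodes[0]) - 1] = '\0';
	n_nodes++;
	return 0;
}

/* Administratively remove a node; 0 on success, -1 if unknown. */
int hb_api_delnode(const char *node)
{
	for (int i = 0; i < n_nodes; i++) {
		if (strcmp(nodes[i], node) == 0) {
			memmove(&nodes[i], &nodes[i + 1],
				(size_t)(n_nodes - i - 1) * sizeof(nodes[0]));
			n_nodes--;
			return 0;
		}
	}
	return -1;
}

int hb_api_nodecount(void)
{
	return n_nodes;
}
```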
> > Another option would be to have a named client register to give permission
> > for nodes to join the cluster. Then a client would have to approve any
> > given node joining the cluster before it was allowed to join.
> This would be good for the really paranoid types.
Or for the 2-node cluster manager that wasn't prepared to deal with a third
node.
> > I tend towards just letting any node that knows the shared secret and is
> > talking on our network/port join the cluster. This is *way* simple.
> > Simplicity is a great virtue when one looks at very inexpensive clusters.
> > Inexpensive clusters don't generally have expensive sysadmins running them.
> Yes I think this is a good idea as well. Is there really any reason not to?
> Like I said above, if someone has the shared secret, it's all over anyways.
> I suppose with the current ResourceManager in place, adding another node
> can really break things.
Hard to argue with that ;-)
> But that's the only reason I can see for this being a bad idea.
> > Of course, one can always have a flag which says whether this behavior is
> > permitted.
But one STILL needs an administrative way of telling the cluster to remove a
node.
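The admission decision itself is tiny. The sketch below is purely an
illustration: the real authentication (Mitja's code) works on signed packets,
not a bare string compare, and both function names and the "autojoin" flag are
invented. The comparison is constant-time only so a joiner can't learn the
secret byte by byte from timing:

```c
/* Illustrative "anyone with the shared secret may join" policy.
 * All names are invented; real heartbeat authenticates signed packets. */
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return 1 if the presented secret matches ours, 0 otherwise.
 * Constant-time in the length of our secret. */
int secret_matches(const char *ours, const char *theirs)
{
	size_t ol = strlen(ours), tl = strlen(theirs);
	unsigned char diff = (unsigned char)(ol != tl);

	for (size_t i = 0; i < ol; i++)
		diff |= (unsigned char)(ours[i] ^ theirs[i % (tl ? tl : 1)]);
	return diff == 0;
}

/* Join allowed iff the flag permits it AND the secret matches.
 * `autojoin_allowed` models the "flag which says whether this
 * behavior is permitted". */
int may_join(int autojoin_allowed, const char *ours, const char *theirs)
{
	return autojoin_allowed && secret_matches(ours, theirs);
}
```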
> > By the way, it is probably the case that the clients ought to get informed
> > when heartbeat restarts, so they can reconnect and restart their interface.
> > It might even be the case that they should ACK this message before the
> > restart is allowed to proceed. This is generally a good idea, as Brian can
> > probably attest.
> Yes. This would be good. Although just so you know, I'm not planning any
> tricks involving restarting anymore :)
Moreover, it's necessary - even if no tricks are being played. I also think
that heartbeat should stop and start named clients when it stops and starts.
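The ACK-before-restart handshake could be as simple as the sketch below:
announce the restart, then refuse to proceed until every registered client has
ACKed. All of these names are made up; this is just the shape of the protocol,
not anything in the tree:

```c
/* Invented sketch of a restart handshake: heartbeat announces a pending
 * restart and waits for every registered client to ACK before proceeding. */
#include <assert.h>
#include <stdbool.h>

#define MAX_CLIENTS 8

static int n_clients = 0;
static bool acked[MAX_CLIENTS];

/* A named client registers for restart notifications; returns its id. */
int restart_register(void)
{
	if (n_clients >= MAX_CLIENTS)
		return -1;
	acked[n_clients] = false;
	return n_clients++;
}

/* Heartbeat announces a pending restart: clear all outstanding ACKs. */
void restart_announce(void)
{
	for (int i = 0; i < n_clients; i++)
		acked[i] = false;
}

/* A client acknowledges the announcement. */
void restart_ack(int id)
{
	if (id >= 0 && id < n_clients)
		acked[id] = true;
}

/* Restart may proceed only once every registered client has ACKed. */
bool restart_may_proceed(void)
{
	for (int i = 0; i < n_clients; i++)
		if (!acked[i])
			return false;
	return true;
}
```

A real version would of course need a timeout so one wedged client can't hold
the restart hostage forever.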
> BTW - I won't break anything if I compile with MITJA defined, will I?
Not in my copy you won't ;-).
Maybe you should ask Mitja...
Mitja was my first feature developer for heartbeat. He authored the
authentication code that makes the #ifdef MITJA code even reasonable.
If you test this, let us know how it works out.
-- Alan Robertson
alanr at unix.sh