[Linux-ha-dev] Starting heartbeat when interfaces are down
Graham, Simon
Simon.Graham at stratus.com
Tue Oct 23 20:14:50 MDT 2007
>
> And indeed, the cluster does come up - without a node. A more accurate
> summation is that "a single node in the cluster doesn't come up". So,
> the _cluster_ does recover from this error. It just does it without
> that node. So, service is not interrupted.
>
>
At the end of the day, then, I think my problem comes down to the fact
that I am not using static IP addresses for the NICs. I know you
consider the use of DHCP (and, I would guess, zeroconf) addresses a
bad thing. However, consider the case where one is trying to automate
the cluster config/setup: there, the actual IP addresses used for the
NICs are completely irrelevant to anyone other than the hb code
(because users of the cluster should ONLY be using the cluster alias
address).

If you use DHCP/Zeroconf and a NIC does not have link at boot time,
it will not get an address assigned, and HB will refuse to start with
this error:
Oct 17 05:41:47 heartbeat[10189]: 2007/10/17_05:41:49 ERROR: glib: Get broadcast for interface eth1 failed: Cannot assign requested address
Oct 17 05:41:47 heartbeat[10189]: 2007/10/17_05:41:49 ERROR: glib: IP interface [eth1] does not exist
Oct 17 05:41:47 heartbeat[10189]: 2007/10/17_05:41:49 ERROR: Illegal bcast [UDP/IP broadcast] in config file [eth1]
Oct 17 05:41:47 heartbeat[10189]: 2007/10/17_05:41:49 ERROR: Heartbeat not started: configuration error.
Oct 17 05:41:47 heartbeat[10189]: 2007/10/17_05:41:49 ERROR: Configuration error, heartbeat not started.
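
For illustration, the condition behind the first error line can be probed before heartbeat is started. The sketch below is not taken from the heartbeat source. It assumes a Linux host and assumes that "Get broadcast ... failed: Cannot assign requested address" corresponds to a SIOCGIFBRDADDR ioctl failing with EADDRNOTAVAIL on an interface that has no IPv4 address. The function names broadcast_address and unready_interfaces are hypothetical.

```python
import errno
import fcntl
import socket
import struct

# Linux ioctl request number for "get broadcast address of interface".
SIOCGIFBRDADDR = 0x8919


def broadcast_address(ifname):
    """Return the IPv4 broadcast address of ifname as a string.

    Returns None when the interface has no IPv4 address assigned
    (EADDRNOTAVAIL) or does not exist (ENODEV). Hypothetical helper,
    Linux only.
    """
    # struct ifreq starts with a 16-byte, NUL-terminated interface name.
    ifreq = struct.pack("256s", ifname.encode()[:15])
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
            result = fcntl.ioctl(sock.fileno(), SIOCGIFBRDADDR, ifreq)
    except OSError as exc:
        if exc.errno in (errno.EADDRNOTAVAIL, errno.ENODEV):
            return None
        raise
    # The sockaddr_in follows the name: 2 bytes family, 2 bytes port,
    # then the 4-byte address at offset 20.
    return socket.inet_ntoa(result[20:24])


def unready_interfaces(ifnames, probe=broadcast_address):
    """List the interfaces for which the probe finds no broadcast address."""
    return [name for name in ifnames if probe(name) is None]
```

A wrapper around the init script could call unready_interfaces(["eth1"]) and log or wait when the list is non-empty. This only detects the condition; it does not change how heartbeat itself reacts to it.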
This can actually lead to HB not starting anywhere. Consider the case of
a two-node cluster with a direct cable connection for one of the NICs: if
one node is powered off, the other one will not have link on that
NIC and therefore will not assign an address.

I'd be interested in more discussion on why DHCP/Zeroconf is considered
anathema.

I'd also be interested in knowing if anyone is working on supporting
IPv6 broadcast/multicast for the hb comms links (in which case a static
address can be allocated with no configuration required).
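
One way to read "a static address ... with no configuration required" is the IPv6 link-local address that can be derived from a NIC's MAC address. The sketch below assumes the modified EUI-64 scheme; some systems generate link-local addresses differently (for example with privacy-oriented interface identifiers), so treat it as an illustration only. The function name link_local_from_mac is hypothetical.

```python
import ipaddress


def link_local_from_mac(mac):
    """Derive the fe80::/64 link-local address from a 48-bit MAC address.

    Uses the modified EUI-64 scheme: flip the universal/local bit of the
    first octet and insert ff:fe between the two halves of the MAC.
    """
    octets = bytes(int(part, 16) for part in mac.split(":"))
    if len(octets) != 6:
        raise ValueError("expected a MAC address with six octets")
    interface_id = (
        bytes([octets[0] ^ 0x02]) + octets[1:3] + b"\xff\xfe" + octets[3:6]
    )
    # 2 bytes of prefix (fe80) + 6 zero bytes + 8 bytes of interface id.
    return ipaddress.IPv6Address(b"\xfe\x80" + b"\x00" * 6 + interface_id)


print(link_local_from_mac("00:16:3e:12:34:56"))
# prints fe80::216:3eff:fe12:3456
```

Because the address depends only on the MAC, it is the same at every boot and does not need a DHCP server or a peer to be present on the wire.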
> This is the rationale for this behavior. It's not perfect behavior, but
> it's not completely irrational either...
>
> --
> Alan Robertson <alanr at unix.sh>
Thanks for the explanation - it helps a lot and is exactly what I was
looking for.
Simon