[Linux-HA] monitoring before start process !

Max Hofer max.hofer at apus.co.at
Fri Sep 21 05:10:59 MDT 2007


On Friday 21 September 2007, Matthieu FEROUL wrote:
> Hello,
>
> I have a problem with heartbeat version 2.
> Heartbeat always monitors services before it starts them.
>
> I added a "start_delay" parameter to the monitor section of my cib.xml
> but, when heartbeat starts, it monitors first and starts the services after.
This very first monitor is called a probe (start_delay only takes effect 
after the resource has been started successfully for the first time).

When you start heartbeat on a node, it calls a monitor operation on that node 
for ALL CONFIGURED resources (even if a resource could never run on the node 
because of constraints you configured).

The purpose of the probe is to check whether the resource is running on the 
machine or not (sometimes resources are still running, for example after an 
unclean stop of heartbeat, or because they were started by the operating 
system and not by heartbeat).

Under normal conditions (resources are usually not running when heartbeat is 
started) all probes return "not running".

If you write a custom RA, your monitor operation has to return one of the 
values "not running", "error" or "running" under all conditions.
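
As a rough illustration of such a monitor, here is a minimal sketch of a 
pidfile based check. The names mydaemon_monitor and PIDFILE are made up for 
this example and do not come from any shipped RA; the numeric codes are the 
usual OCF ones (0 = running, 1 = generic error, 7 = not running). Treating a 
stale pidfile as "not running" is an assumption, your agent may prefer to 
report an error there.

```sh
#!/bin/sh
# OCF exit codes used by monitor.
OCF_SUCCESS=0        # resource is running
OCF_ERR_GENERIC=1    # something is wrong
OCF_NOT_RUNNING=7    # resource is cleanly not running

# Hypothetical pidfile location, only for illustration.
: "${PIDFILE:=/var/run/mydaemon.pid}"

mydaemon_monitor() {
    # No pidfile: never started here, or stopped cleanly.
    [ -f "$PIDFILE" ] || return $OCF_NOT_RUNNING

    pid=$(cat "$PIDFILE" 2>/dev/null)

    # Empty or non-numeric pidfile: we cannot tell, report an error.
    case "$pid" in
        ''|*[!0-9]*) return $OCF_ERR_GENERIC ;;
    esac

    # kill -0 sends no signal, it only checks that the process exists
    # (the agent is assumed to run as root, as RAs normally do).
    if kill -0 "$pid" 2>/dev/null; then
        return $OCF_SUCCESS
    fi

    # Stale pidfile: the process is gone (assumption: treat as stopped).
    return $OCF_NOT_RUNNING
}
```

The point is that every path through the function ends in one of the three 
answers, so the probe at heartbeat startup always gets a usable result.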

This also applies to resource agents which are not meant to run on the machine 
because they need special HW which is installed on other cluster nodes but 
not on the node which is coming up (remember: the probe is run for ALL 
CONFIGURED resources).
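
For the missing hardware case, a sketch of a probe-safe monitor could look 
like the following. HW_DEVICE, STATEFILE and mycard_monitor are hypothetical 
names chosen for the example; the idea is only that a node without the 
hardware answers "not running" (OCF code 7) instead of failing.

```sh
#!/bin/sh
OCF_SUCCESS=0        # resource is running
OCF_NOT_RUNNING=7    # resource is cleanly not running

# Hypothetical device node that exists only where the special HW is installed.
: "${HW_DEVICE:=/dev/mycard0}"
# Hypothetical state file that this agent's start action would create.
: "${STATEFILE:=/var/run/mycard.active}"

mycard_monitor() {
    # Hardware absent: the resource cannot be running on this node,
    # so answer "not running" rather than returning an error.
    if [ ! -e "$HW_DEVICE" ]; then
        return $OCF_NOT_RUNNING
    fi

    # Hardware present: report the actual state.
    if [ -f "$STATEFILE" ]; then
        return $OCF_SUCCESS
    fi
    return $OCF_NOT_RUNNING
}
```

Keeping the resource away from such a node is still the job of the 
constraints in the CIB; the monitor only has to give an honest answer when it 
is probed there.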




>
> Thanks for your help!
>
> Matthieu F.
>
> This is my cib.xml :
>
> <cib admin_epoch="0" have_quorum="true" num_peers="2"
> cib_feature_revision="1.3" ccm_transition="2" generated="true"
> dc_uuid="3c2d7f03-e764-4e0f-a928-e1edd8ad9019" epoch="25"
> num_updates="1406" cib-last-written="Mon Sep 10 15:49:40 2007">
>   <configuration>
>     <crm_config>
>       <cluster_property_set id="cib-bootstrap-options">
>         <attributes>
>           <nvpair id="cib-bootstrap-options-symmetric_cluster"
> name="symmetric_cluster" value="true"/>
>           <nvpair id="cib-bootstrap-options-no_quorum_policy"
> name="no_quorum_policy" value="stop"/>
>           <nvpair id="cib-bootstrap-options-default_resource_stickiness"
> name="default_resource_stickiness" value="0"/>
>           <nvpair
> id="cib-bootstrap-options-default_resource_failure_stickiness"
> name="default_resource_failure_stickiness" value="0"/>
>           <nvpair id="cib-bootstrap-options-stonith_enabled"
> name="stonith_enabled" value="false"/>
>           <nvpair id="cib-bootstrap-options-stonith_action"
> name="stonith_action" value="reboot"/>
>           <nvpair id="cib-bootstrap-options-stop_orphan_resources"
> name="stop_orphan_resources" value="true"/>
>           <nvpair id="cib-bootstrap-options-stop_orphan_actions"
> name="stop_orphan_actions" value="true"/>
>           <nvpair id="cib-bootstrap-options-remove_after_stop"
> name="remove_after_stop" value="false"/>
>           <nvpair id="cib-bootstrap-options-short_resource_names"
> name="short_resource_names" value="true"/>
>           <nvpair id="cib-bootstrap-options-transition_idle_timeout"
> name="transition_idle_timeout" value="5min"/>
>           <nvpair id="cib-bootstrap-options-default_action_timeout"
> name="default_action_timeout" value="30s"/>
>           <nvpair id="cib-bootstrap-options-is_managed_default"
> name="is_managed_default" value="true"/>
>         </attributes>
>       </cluster_property_set>
>     </crm_config>
>     <nodes>
>       <node id="3c2d7f03-e764-4e0f-a928-e1edd8ad9019" uname="fw_b"
> type="normal"/>
>       <node id="93cf03c5-a511-46d0-964c-4544ac5b19b3" uname="fw_a"
> type="normal"/>
>     </nodes>
>     <resources>
>       <group id="group_1">
>         <primitive class="ocf" id="IPaddress" provider="heartbeat"
> type="IPaddr">
>           <operations>
>             <op id="IPaddress_start" name="start" timeout="30s"/>
>             <op id="IPaddress_mon" interval="60s" name="monitor"
> start_delay="20s" timeout="30s" on_fail="fence"/>
>           </operations>
>           <instance_attributes id="IPaddr_inst_attr">
>             <attributes>
>               <nvpair id="IPaddress_attr_0" name="ip" value="80.1.0.3"/>
>               <nvpair id="IPaddress_attr_1" name="netmask" value="24"/>
>               <nvpair id="IPaddress_attr_2" name="nic" value="eth0"/>
>             </attributes>
>           </instance_attributes>
>         </primitive>
>         <primitive class="ocf" id="IPSEC" provider="iwall" type="ipsec">
>           <operations>
>             <op id="IPSEC_mon" name="monitor" interval="60s"
> timeout="60s" on_fail="fence"/>
>           </operations>
>         </primitive>
>       </group>
>     </resources>
>     <constraints>
>       <rsc_location id="rsc_location_group_1" rsc="group_1">
>         <rule id="prefered_location_group_1" score="100">
>           <expression attribute="#uname"
> id="prefered_location_group_1_expr" operation="eq" value="fw_a"/>
>         </rule>
>       </rsc_location>
>     </constraints>
>   </configuration>
> </cib>



-- 
Max Hofer
APUS Software G.m.b.H.
A-8074 Raaba, Bahnhofstraße 1/1
T| +43 316 401629 11
F| +43 316 401629 9
W| www.apus.co.at
E| max.hofer at apus.co.at

