[Linux-ha-dev] [PATCH] Proposal SNMP subagent extention (revised)
Keisuke MORI
kskmori at intellilink.co.jp
Tue Dec 4 03:53:06 MST 2007
Hi,
I'm posting the updated version of the SNMP hbagent extension for V2.
This patch reflects the comments from Andrew and Dejan.
Thank you very much for them!
The differences from the previous patch are:
1) Add three new fields in the MIB object as below:
LHAResourceIsManaged: the managed status of a resource.
LHAResourceFailcount: the fail-count value of a resource.
LHAResourceParent: the name of the parent resource if present.
2) Add a simple check in SNMPAgentSanityCheck.in
for the functionality provided by this patch.
3) Revise the code implementation details as advised by Dejan.
4) Rebase onto the more recent -dev: f153a9be0bdf
I would appreciate any further comments and suggestions,
particularly about the MIB definition.
It is good enough for our usage, but I would like to know
if anyone still needs more information to be exposed
in the SNMP MIB object.
See the README in the attachment for details of what the current MIB looks like.
Regards,
Keisuke MORI
NTT DATA Intellilink Corporation
-------------- next part --------------
SNMP Subagent Extension for CRM Resources
1. Introduction
The purpose of this patch is to extend the SNMP subagent so that you can
query the CRM resource information provided by Heartbeat Version 2 and
receive a trap when that information changes.
This patch introduces two new SNMP MIB objects.
1) LHAResourceTable: each resource's name, type, status, and the node it is
running on. In other words, you can get the information that crm_mon
provides through the SNMP interface.
2) LHAResourceStatusUpdate: when a resource's status changes, you are
notified with this SNMP trap.
2. Added MIB
The following MIB objects are added by this patch.
--------------------------------------------------------------------------
| OID           | Object Name          | Value type    | Description     |
--------------------------------------------------------------------------
|4682.8 |       | LHAResourceTable     | table         |                 |
--------------------------------------------------------------------------
|       |.1     | LHAResourceEntry     |               |                 |
--------------------------------------------------------------------------
|       |.1.1   | LHAResourceIndex     | Integer32     |                 |
--------------------------------------------------------------------------
|       |.1.2   | LHAResourceName      | DisplayString |                 |
--------------------------------------------------------------------------
|       |.1.3   | LHAResourceType      | INTEGER       | unknown(0)      |
|       |       |                      |               | primitive(1)    |
|       |       |                      |               | group(2)        |
|       |       |                      |               | clone(3)        |
|       |       |                      |               | masterSlave(4)  |
--------------------------------------------------------------------------
|       |.1.4   | LHAResourceNode      | DisplayString |                 |
--------------------------------------------------------------------------
|       |.1.5   | LHAResourceStatus    | INTEGER       | unknown(0)      |
|       |       |                      |               | stopped(1)      |
|       |       |                      |               | started(2)      |
|       |       |                      |               | slave(3)        |
|       |       |                      |               | master(4)       |
--------------------------------------------------------------------------
|       |.1.6   | LHAResourceIsManaged | INTEGER       | unmanaged(0)    |
|       |       |                      |               | managed(1)      |
--------------------------------------------------------------------------
|       |.1.7   | LHAResourceFailcount | Integer32     |                 |
--------------------------------------------------------------------------
|       |.1.8   | LHAResourceParent    | DisplayString |                 |
--------------------------------------------------------------------------
NOTE : The "master" status means "promoted", and "slave" means "demoted".
All master/slave resources first start up as slaves, and until
they are explicitly promoted or demoted, heartbeat only knows
that they are "started".
LHAResourceStatus therefore reports the same value as the crm_mon output.
NOTE : For now, you can get information only about *running*
resources and resources whose fail-count value is
larger than 1, because it is difficult to determine which node
a resource *stopped* on.
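
On the manager side, the enumerated columns in the table above can be decoded with a small lookup. The following is only a sketch in Python: the class and function names are made up for illustration and are not part of the patch; only the numeric values come from the table.

```python
from enum import IntEnum


class ResourceType(IntEnum):
    """Values of LHAResourceType (.1.3) as listed in the MIB table."""
    UNKNOWN = 0
    PRIMITIVE = 1
    GROUP = 2
    CLONE = 3
    MASTER_SLAVE = 4


class ResourceStatus(IntEnum):
    """Values of LHAResourceStatus (.1.5) as listed in the MIB table."""
    UNKNOWN = 0
    STOPPED = 1
    STARTED = 2
    SLAVE = 3
    MASTER = 4


class ManagedState(IntEnum):
    """Values of LHAResourceIsManaged (.1.6) as listed in the MIB table."""
    UNMANAGED = 0
    MANAGED = 1


def describe(rtype: int, status: int, managed: int) -> str:
    """Hypothetical helper: turn three raw integers into a readable summary.

    Raises ValueError if a value is outside the ranges defined in the table.
    """
    return " / ".join(
        member.name.lower()
        for member in (
            ResourceType(rtype),
            ResourceStatus(status),
            ManagedState(managed),
        )
    )
```

For instance, describe(1, 2, 1) summarises a managed primitive that is started.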
3. Added Trap
The following trap is added by this patch.
-----------------------------------------------------------------------
| OID       | Object Name             | Value type    | Description   |
-----------------------------------------------------------------------
|4682.900.8 | LHAResourceStatusUpdate |               |               |
|           |----------------------------------------------------------
|           | LHAResourceName         | DisplayString |               |
|           |----------------------------------------------------------
|           | LHAResourceNode         | DisplayString |               |
|           |----------------------------------------------------------
|           | LHAResourceStatus       | INTEGER       | 0 : unknown   |
|           |                         |               | 1 : stopped   |
|           |                         |               | 2 : started   |
|           |                         |               | 3 : slave     |
|           |                         |               | 4 : master    |
-----------------------------------------------------------------------
NOTE : This trap is sent only when the resource operation succeeds.
Concretely, the extended hbagent fetches the CIB information whenever it
changes and parses it. If the rc_code of an operation (such as
CRMD_ACTION_START) is "0", the hbagent sends a trap.
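
The rule in the note can be sketched as follows. This is an illustration in Python, not the C code of the patch: the function name, the operation strings and the operation-to-status mapping are assumptions inferred from the status values above ("promoted" means master, "demoted" means slave). How the real hbagent treats other operations, such as monitor, is not stated here.

```python
from typing import Optional

# Assumed mapping from CRM operation names to LHAResourceStatus values.
# The numeric values come from the trap table; the pairing is a guess.
OPERATION_TO_STATUS = {
    "stop": 1,     # stopped
    "start": 2,    # started
    "demote": 3,   # slave
    "promote": 4,  # master
}


def trap_status(operation: str, rc_code: str) -> Optional[int]:
    """Return the LHAResourceStatus to report, or None when no trap is sent.

    A trap is only considered when the operation succeeded (rc_code "0").
    Operations without an entry in the assumed mapping yield None.
    """
    if rc_code != "0":
        return None
    return OPERATION_TO_STATUS.get(operation)
```

Under these assumptions a successful start followed by a successful promote would produce two traps, started(2) and then master(4), which is the sequence the sample traps in section 4 show for master_slave_Stateful:1.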
4. Demo Output
[root@u5node1 ~]# snmpwalk -v 1 \
-c public localhost LINUX-HA-MIB::LHAResourceTable
LINUX-HA-MIB::LHAResourceName.1 = STRING: group0
LINUX-HA-MIB::LHAResourceName.2 = STRING: prmIp
LINUX-HA-MIB::LHAResourceName.3 = STRING: prmApPostgreSQLDB
LINUX-HA-MIB::LHAResourceName.4 = STRING: clone0
LINUX-HA-MIB::LHAResourceName.5 = STRING: clone0
LINUX-HA-MIB::LHAResourceName.6 = STRING: clone0-dummy:0
LINUX-HA-MIB::LHAResourceName.7 = STRING: clone0-dummy:1
LINUX-HA-MIB::LHAResourceName.8 = STRING: ms-sf
LINUX-HA-MIB::LHAResourceName.9 = STRING: ms-sf
LINUX-HA-MIB::LHAResourceName.10 = STRING: master_slave_Stateful:0
LINUX-HA-MIB::LHAResourceName.11 = STRING: master_slave_Stateful:1
LINUX-HA-MIB::LHAResourceType.1 = INTEGER: group(2)
LINUX-HA-MIB::LHAResourceType.2 = INTEGER: primitive(1)
LINUX-HA-MIB::LHAResourceType.3 = INTEGER: primitive(1)
LINUX-HA-MIB::LHAResourceType.4 = INTEGER: clone(3)
LINUX-HA-MIB::LHAResourceType.5 = INTEGER: clone(3)
LINUX-HA-MIB::LHAResourceType.6 = INTEGER: primitive(1)
LINUX-HA-MIB::LHAResourceType.7 = INTEGER: primitive(1)
LINUX-HA-MIB::LHAResourceType.8 = INTEGER: masterSlave(4)
LINUX-HA-MIB::LHAResourceType.9 = INTEGER: masterSlave(4)
LINUX-HA-MIB::LHAResourceType.10 = INTEGER: primitive(1)
LINUX-HA-MIB::LHAResourceType.11 = INTEGER: primitive(1)
LINUX-HA-MIB::LHAResourceNode.1 = STRING: u5node1
LINUX-HA-MIB::LHAResourceNode.2 = STRING: u5node1
LINUX-HA-MIB::LHAResourceNode.3 = STRING: u5node1
LINUX-HA-MIB::LHAResourceNode.4 = STRING: u5node1
LINUX-HA-MIB::LHAResourceNode.5 = STRING: u5node2
LINUX-HA-MIB::LHAResourceNode.6 = STRING: u5node2
LINUX-HA-MIB::LHAResourceNode.7 = STRING: u5node1
LINUX-HA-MIB::LHAResourceNode.8 = STRING: u5node1
LINUX-HA-MIB::LHAResourceNode.9 = STRING: u5node2
LINUX-HA-MIB::LHAResourceNode.10 = STRING: u5node2
LINUX-HA-MIB::LHAResourceNode.11 = STRING: u5node1
LINUX-HA-MIB::LHAResourceStatus.1 = INTEGER: started(2)
LINUX-HA-MIB::LHAResourceStatus.2 = INTEGER: started(2)
LINUX-HA-MIB::LHAResourceStatus.3 = INTEGER: started(2)
LINUX-HA-MIB::LHAResourceStatus.4 = INTEGER: started(2)
LINUX-HA-MIB::LHAResourceStatus.5 = INTEGER: started(2)
LINUX-HA-MIB::LHAResourceStatus.6 = INTEGER: started(2)
LINUX-HA-MIB::LHAResourceStatus.7 = INTEGER: started(2)
LINUX-HA-MIB::LHAResourceStatus.8 = INTEGER: master(4)
LINUX-HA-MIB::LHAResourceStatus.9 = INTEGER: master(4)
LINUX-HA-MIB::LHAResourceStatus.10 = INTEGER: started(2)
LINUX-HA-MIB::LHAResourceStatus.11 = INTEGER: master(4)
LINUX-HA-MIB::LHAResourceIsManaged.1 = INTEGER: managed(1)
LINUX-HA-MIB::LHAResourceIsManaged.2 = INTEGER: managed(1)
LINUX-HA-MIB::LHAResourceIsManaged.3 = INTEGER: managed(1)
LINUX-HA-MIB::LHAResourceIsManaged.4 = INTEGER: managed(1)
LINUX-HA-MIB::LHAResourceIsManaged.5 = INTEGER: managed(1)
LINUX-HA-MIB::LHAResourceIsManaged.6 = INTEGER: managed(1)
LINUX-HA-MIB::LHAResourceIsManaged.7 = INTEGER: managed(1)
LINUX-HA-MIB::LHAResourceIsManaged.8 = INTEGER: managed(1)
LINUX-HA-MIB::LHAResourceIsManaged.9 = INTEGER: managed(1)
LINUX-HA-MIB::LHAResourceIsManaged.10 = INTEGER: managed(1)
LINUX-HA-MIB::LHAResourceIsManaged.11 = INTEGER: managed(1)
LINUX-HA-MIB::LHAResourceFailCount.1 = INTEGER: 0
LINUX-HA-MIB::LHAResourceFailCount.2 = INTEGER: 0
LINUX-HA-MIB::LHAResourceFailCount.3 = INTEGER: 0
LINUX-HA-MIB::LHAResourceFailCount.4 = INTEGER: 0
LINUX-HA-MIB::LHAResourceFailCount.5 = INTEGER: 0
LINUX-HA-MIB::LHAResourceFailCount.6 = INTEGER: 0
LINUX-HA-MIB::LHAResourceFailCount.7 = INTEGER: 0
LINUX-HA-MIB::LHAResourceFailCount.8 = INTEGER: 0
LINUX-HA-MIB::LHAResourceFailCount.9 = INTEGER: 0
LINUX-HA-MIB::LHAResourceFailCount.10 = INTEGER: 0
LINUX-HA-MIB::LHAResourceFailCount.11 = INTEGER: 0
LINUX-HA-MIB::LHAResourceParent.1 = STRING:
LINUX-HA-MIB::LHAResourceParent.2 = STRING: group0
LINUX-HA-MIB::LHAResourceParent.3 = STRING: group0
LINUX-HA-MIB::LHAResourceParent.4 = STRING:
LINUX-HA-MIB::LHAResourceParent.5 = STRING:
LINUX-HA-MIB::LHAResourceParent.6 = STRING: clone0
LINUX-HA-MIB::LHAResourceParent.7 = STRING: clone0
LINUX-HA-MIB::LHAResourceParent.8 = STRING:
LINUX-HA-MIB::LHAResourceParent.9 = STRING:
LINUX-HA-MIB::LHAResourceParent.10 = STRING: ms-sf
LINUX-HA-MIB::LHAResourceParent.11 = STRING: ms-sf
cf.) The crm_mon output at the same time is:
============
Last updated: Mon Dec 3 17:39:27 2007
Current DC: u5node2 (7e035df7-607d-42b4-a1e7-e8d9db108e6c)
2 Nodes configured.
3 Resources configured.
============
Node: u5node1 (77b7c7b1-68ba-4542-9f10-b75de73e9ffd): online
Node: u5node2 (7e035df7-607d-42b4-a1e7-e8d9db108e6c): online
Resource Group: group0
prmIp (heartbeat::ocf:IPaddr): Started u5node1
prmApPostgreSQLDB (heartbeat::ocf:pgsql): Started u5node1
Clone Set: clone0
clone0-dummy:0 (heartbeat::ocf:Dummy): Started u5node2
clone0-dummy:1 (heartbeat::ocf:Dummy): Started u5node1
Master/Slave Set: ms-sf
master_slave_Stateful:0 (heartbeat::ocf:Stateful): Started u5node2
master_slave_Stateful:1 (heartbeat::ocf:Stateful): Master u5node1
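
To compare the snmpwalk output with crm_mon more easily, the column-ordered walk can be regrouped into one record per table index. The following Python sketch assumes the walk output has been captured as text in exactly the format shown earlier in this section; the function name and the regular expression are illustrative only.

```python
import re
from collections import defaultdict

# Matches lines such as
#   LINUX-HA-MIB::LHAResourceName.2 = STRING: prmIp
# The value may be empty, as for LHAResourceParent on top-level resources.
LINE_RE = re.compile(
    r"^LINUX-HA-MIB::LHAResource(?P<column>\w+)\.(?P<index>\d+) = "
    r"(?:STRING|INTEGER):\s?(?P<value>.*)$"
)


def parse_walk(text: str) -> dict:
    """Group snmpwalk lines into {index: {column: value}}.

    Lines that do not look like LHAResourceTable columns are ignored.
    """
    rows = defaultdict(dict)
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            index = int(match["index"])
            rows[index][match["column"]] = match["value"].strip()
    return dict(rows)
```

Each record can then be checked against the matching crm_mon line; for example, index 2 above (prmIp on u5node1, parent group0) corresponds to the first member of "Resource Group: group0".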
Sample SNMP traps
Dec 3 17:38:31 manager_node snmptrapd[1343]: 2007-12-03 17:38:31 u5node2 [192.168.70.129]:
  DISMAN-EVENT-MIB::sysUpTimeInstance = Timeticks: (22408) 0:03:44.08
  SNMPv2-MIB::snmpTrapOID.0 = OID: LINUX-HA-MIB::LHAResourceStatusUpdate
  LINUX-HA-MIB::LHAResourceName = STRING: master_slave_Stateful:0
  LINUX-HA-MIB::LHAResourceNode = STRING: u5node2
  LINUX-HA-MIB::LHAResourceStatus = INTEGER: started(2)
Dec 3 17:38:31 manager_node snmptrapd[1343]: 2007-12-03 17:38:31 u5node2 [192.168.70.129]:
  DISMAN-EVENT-MIB::sysUpTimeInstance = Timeticks: (22420) 0:03:44.20
  SNMPv2-MIB::snmpTrapOID.0 = OID: LINUX-HA-MIB::LHAResourceStatusUpdate
  LINUX-HA-MIB::LHAResourceName = STRING: clone0-dummy:0
  LINUX-HA-MIB::LHAResourceNode = STRING: u5node2
  LINUX-HA-MIB::LHAResourceStatus = INTEGER: started(2)
Dec 3 17:38:32 manager_node snmptrapd[1343]: 2007-12-03 17:38:32 u5node1 [192.168.70.128]:
  DISMAN-EVENT-MIB::sysUpTimeInstance = Timeticks: (19126) 0:03:11.26
  SNMPv2-MIB::snmpTrapOID.0 = OID: LINUX-HA-MIB::LHAResourceStatusUpdate
  LINUX-HA-MIB::LHAResourceName = STRING: clone0-dummy:1
  LINUX-HA-MIB::LHAResourceNode = STRING: u5node1
  LINUX-HA-MIB::LHAResourceStatus = INTEGER: started(2)
Dec 3 17:38:32 manager_node snmptrapd[1343]: 2007-12-03 17:38:32 u5node1 [192.168.70.128]:
  DISMAN-EVENT-MIB::sysUpTimeInstance = Timeticks: (19129) 0:03:11.29
  SNMPv2-MIB::snmpTrapOID.0 = OID: LINUX-HA-MIB::LHAResourceStatusUpdate
  LINUX-HA-MIB::LHAResourceName = STRING: master_slave_Stateful:1
  LINUX-HA-MIB::LHAResourceNode = STRING: u5node1
  LINUX-HA-MIB::LHAResourceStatus = INTEGER: started(2)
Dec 3 17:38:34 manager_node snmptrapd[1343]: 2007-12-03 17:38:34 u5node1 [192.168.70.128]:
  DISMAN-EVENT-MIB::sysUpTimeInstance = Timeticks: (19314) 0:03:13.14
  SNMPv2-MIB::snmpTrapOID.0 = OID: LINUX-HA-MIB::LHAResourceStatusUpdate
  LINUX-HA-MIB::LHAResourceName = STRING: master_slave_Stateful:1
  LINUX-HA-MIB::LHAResourceNode = STRING: u5node1
  LINUX-HA-MIB::LHAResourceStatus = INTEGER: master(4)
Dec 3 17:38:36 manager_node snmptrapd[1343]: 2007-12-03 17:38:36 u5node1 [192.168.70.128]:
  DISMAN-EVENT-MIB::sysUpTimeInstance = Timeticks: (19516) 0:03:15.16
  SNMPv2-MIB::snmpTrapOID.0 = OID: LINUX-HA-MIB::LHAResourceStatusUpdate
  LINUX-HA-MIB::LHAResourceName = STRING: prmIp
  LINUX-HA-MIB::LHAResourceNode = STRING: u5node1
  LINUX-HA-MIB::LHAResourceStatus = INTEGER: started(2)
Dec 3 17:38:42 manager_node snmptrapd[1343]: 2007-12-03 17:38:42 u5node1 [192.168.70.128]:
  DISMAN-EVENT-MIB::sysUpTimeInstance = Timeticks: (20067) 0:03:20.67
  SNMPv2-MIB::snmpTrapOID.0 = OID: LINUX-HA-MIB::LHAResourceStatusUpdate
  LINUX-HA-MIB::LHAResourceName = STRING: prmApPostgreSQLDB
  LINUX-HA-MIB::LHAResourceNode = STRING: u5node1
  LINUX-HA-MIB::LHAResourceStatus = INTEGER: started(2)
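
A manager script could pull the three variable bindings out of such a log entry roughly as follows. This is a Python sketch that assumes each entry has first been re-joined into a single line (the entries above are wrapped for display); the function name and the regular expression are illustrative and not part of the patch.

```python
import re

# Matches the Name, Node and Status bindings of LHAResourceStatusUpdate.
# The trap OID itself (LHAResourceStatusUpdate) does not match because
# "Status" is not followed by " = " there.
VARBIND_RE = re.compile(
    r"LINUX-HA-MIB::LHAResource(Name|Node|Status) = (?:STRING|INTEGER): (\S+)"
)


def parse_trap(entry: str) -> dict:
    """Extract the LHAResourceStatusUpdate bindings from one log entry."""
    return {key: value for key, value in VARBIND_RE.findall(entry)}
```

The result is a dictionary with the keys "Name", "Node" and "Status", which is enough to drive a simple alerting rule such as reacting to a resource being promoted to master.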
5. Other changes
This patch also makes the following changes.
1) Make SNMP_CACHE_TIME_OUT configurable.
The value specified with the -r option is applied.
2) Fix some memory leaks.
They were debugged with valgrind.
3) Update SNMPAgentSanityCheck to keep up with the new functionality.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: hbagent_v2resource.patch
Type: text/x-patch
Size: 57650 bytes
Desc: hbagent_v2resource.patch
Url : http://lists.community.tummy.com/pipermail/linux-ha-dev/attachments/20071204/7b4d9c44/hbagent_v2resource-0001.bin