[Linux-HA] heartbeat v2 with serial cable
Guochun Shi
gshi at ncsa.uiuc.edu
Wed Aug 17 10:52:39 MDT 2005
Since everything works fine apart from the error messages from the serial link, I assume you have another communication medium (probably ethernet) configured.
The problem looks like the serial link cannot handle big messages, i.e. CIB messages.
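As a rough illustration of why big messages are awkward on a serial link, here is a back-of-the-envelope sketch of the transfer time. It assumes plain 8N1 framing (one start bit, eight data bits, one stop bit, so 10 bits on the wire per byte), which matches the cs8 / -parenb / -cstopb settings in the stty output quoted below. The function name and the message sizes are made up for the example; they are not measured CIB sizes, and this does not show that throughput is the actual cause of the errors.

```python
def serial_transfer_seconds(message_bytes, baud, bits_per_byte=10):
    """Estimate the time to push message_bytes over an async serial line.

    bits_per_byte=10 assumes 8N1 framing: 1 start + 8 data + 1 stop bit.
    Flow-control stalls and protocol overhead are ignored.
    """
    return message_bytes * bits_per_byte / baud


if __name__ == "__main__":
    # Baud rates in the range the original poster tried; sizes are hypothetical.
    for baud in (9600, 19200, 38400):
        for size in (256, 4096, 65536):
            secs = serial_transfer_seconds(size, baud)
            print(f"{baud:>6} baud, {size:>6} bytes: {secs:7.2f} s")
```

Even at 38400 baud the link moves at most 3840 bytes per second under this assumption, so anything much larger than a plain heartbeat packet ties up the line for a noticeable time.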
I think we should not try to send big messages through the serial port. Heartbeat needs to detect them and silently drop them for the serial port; only if the serial port is the sole medium configured should heartbeat give a warning/error.
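A minimal sketch of that idea, in Python rather than heartbeat's own C, and not taken from the heartbeat source: every name here (Medium, max_msg_bytes, broadcast) and the 1024-byte serial limit are assumptions made up for illustration. A message that is too big for a medium is skipped quietly as long as some other medium carried it, and a warning is logged only when no configured medium could take it.

```python
import logging
from dataclasses import dataclass, field
from typing import List, Optional

log = logging.getLogger("hb_sketch")


@dataclass
class Medium:
    """Hypothetical stand-in for one configured communication medium."""
    name: str
    max_msg_bytes: Optional[int]  # None means no size limit
    sent: List[bytes] = field(default_factory=list)

    def can_carry(self, payload: bytes) -> bool:
        return self.max_msg_bytes is None or len(payload) <= self.max_msg_bytes

    def send(self, payload: bytes) -> None:
        # A real medium would write to a tty or socket; here we just record it.
        self.sent.append(payload)


def broadcast(media: List[Medium], payload: bytes) -> List[str]:
    """Send payload on every medium that can carry it.

    Oversized messages are dropped silently per medium. A warning is
    logged only if no medium at all could carry the message.
    Returns the names of the media that carried it.
    """
    carried = []
    for medium in media:
        if medium.can_carry(payload):
            medium.send(payload)
            carried.append(medium.name)
    if not carried:
        log.warning("message of %d bytes too big for all configured media",
                    len(payload))
    return carried


if __name__ == "__main__":
    logging.basicConfig(level=logging.WARNING)
    serial = Medium("serial", max_msg_bytes=1024)  # limit is an assumption
    ether = Medium("ethernet", max_msg_bytes=None)
    print(broadcast([serial, ether], b"x" * 5000))  # big message, serial skipped
    print(broadcast([serial], b"x" * 5000))         # serial only: warning logged
```

With both media configured the big message goes out over ethernet only and nothing is logged; with the serial medium alone the same message produces the warning.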
-Guochun
At 08:21 AM 8/17/2005 +0200, you wrote:
>Hello,
>
>I've installed heartbeat v2 on two machines (HP ProLiant ML110). The nodes
>are connected via a 1Gb network link (crossover cable) and a null modem cable.
>Each node also has two further interfaces for other networks.
>
>I've configured heartbeat to use crm. Whenever I update the CIB I get
>error messages like these on the non-DC node:
>
>Aug 16 16:01:34 elmgt2 cibmon: [3448]: info: mask(cib_apply_diff): + </status>
>Aug 16 16:01:34 elmgt2 cibmon: [3448]: info: mask(cib_apply_diff): + </cib>
>Aug 16 16:01:36 elmgt2 heartbeat: [3427]: ERROR: NV failure (string2msg_ll):
>Aug 16 16:01:36 elmgt2 heartbeat: [3427]: ERROR: Input string: [>>> __name__=cib_fragment section=status (2)cib=>>>^U__name__=cib^U(2)configuration=>>>^V__name__=configuration^V__parent__=1^V(2)crm_config=>>>^W__name__=crm_config^W__parent__=1^W<<<^W^V(2)nodes=>>>^W__name__=nodes^W__parent__=1^W<<<^W^V(2)resources=>>>^W__name__=resources^W__parent__=1^W<<<^W^V(2)constraints=>>>]
>Aug 16 16:01:36 elmgt2 heartbeat: [3427]: ERROR: sp=(2)cib=>>>^U__name__=cib^U(2)configuration=>>>^V__name__=configuration^V__parent__=1^V(2)crm_config=>>>^W__name__=crm_config^W__parent__=1^W<<<^W^V(2)nodes=>>>^W__name__=nodes^W__parent__=1^W<<<^W^V(2)resources=>>>^W__name__=resources^W__parent__=1^W<<<^W^V(2)constraints=>>>
>Aug 16 16:01:36 elmgt2 heartbeat: [3427]: ERROR: string2struct(): string2msg_ll failed
>Aug 16 16:01:36 elmgt2 heartbeat: [3427]: ERROR: add_string_field: stringtofield failed
>Aug 16 16:01:36 elmgt2 heartbeat: [3427]: ERROR: ha_msg_addraw_ll: addfield failed
>Aug 16 16:01:36 elmgt2 heartbeat: [3427]: ERROR: ha_msg_addraw(): ha_msg_addraw_ll failed
>Aug 16 16:01:36 elmgt2 heartbeat: [3427]: ERROR: NV failure (string2msg_ll):
>Aug 16 16:01:36 elmgt2 heartbeat: [3427]: ERROR: Input string: [>>> origin=finalize_join_for t=crmd version=1.0 subt=request reference=join_ack_nack-dc-1124200892-19 crm_task=join_ack_nack crm_sys_to=crmd crm_sys_from=dc crm_host_to=elmgt1 join_id=2 join_ack_>>> origin=do_cl_join_result t=crmd version=1.0 subt=request reference=join_confirm-crmd-1124200892-22 crm_task=join_confirm crm_sys_to=dc crm_sys_from=crmd crm_host_to=elmgt1 (2)crm_xml=>>>^T__name__=cib_fragment^Tsection=status^T(2)cib=>>>^U__name__=cib^U(2)configuration=>>>^V__name__=configuration^V__par
>Aug 16 16:01:36 elmgt2 heartbeat: [3427]: ERROR: sp=(2)crm_xml=>>>^T__name__=cib_fragment^Tsection=status^T(2)cib=>>>^U__name__=cib^U(2)configuration=>>>^V__name__=configuration^V__parent__=1^V(2)crm_config=>>>^W__name__=crm_config^W__parent__=1^W<<<^W^V(2)nodes=>>>^W__name__=nodes^W__parent__=1^W<<<^W^V(2)resources=>>>^W__name__=resources^W__parent__=1^W<<<^W^V(2)constraints=>>> t=cib cib_clientid=28134ac3-b48b-4c32-a1aa-19786a06701 cib_callopt=1048576 cib_callid=20 cib_op=cib_apply_diff cib_update=true (2)cib_update_diff=>>>^T__name__=diff^T(2)diff-removed=>>>^U__name__=diff-re
>Aug 16 16:01:37 elmgt2 heartbeat: [3427]: ERROR: NV failure (string2msg_ll):
>Aug 16 16:01:37 elmgt2 heartbeat: [3427]: ERROR: Input string: [>>> __name__=diff (2)diff-removed=>>>^U__name__=diff-removed^U__parent__=1^U(2)cib=>>>^V__name__=cib^Vnum_updates=904^V(2)statu>>>]
>
>
>But everything works.
>If I remove the serial heartbeat from the configuration the error
>messages disappear.
>The serial cable has been tested and is wired according to the documentation.
>
>elmgt2:~# cat /proc/tty/driver/serial | head -n 3
>serinfo:1.0 driver revision:
>0: uart:16550A port:000003F8 irq:4 tx:302525 rx:344629 RTS|CTS|DTR|DSR|CD
>1: uart:16550A port:000002F8 irq:3 tx:0 rx:0
>
>elmgt2:~# stty -a </dev/ttyS0
>speed 38400 baud; rows 0; columns 0; line = 0;
>intr = ^C; quit = ^\; erase = ^?; kill = ^U; eof = ^D; eol = <undef>; eol2 = <undef>; start = ^Q; stop = ^S; susp = ^Z; rprnt = ^R; werase = ^W; lnext = ^V; flush = ^O;
>min = 1; time = 1;
>-parenb -parodd cs8 hupcl -cstopb cread clocal crtscts
>-ignbrk brkint -ignpar -parmrk inpck istrip -inlcr igncr -icrnl -ixon -ixoff -iuclc -ixany imaxbel
>-opost -olcuc -ocrnl onlcr -onocr -onlret -ofill -ofdel nl0 cr0 tab0 bs0 vt0 ff0
>-isig -icanon iexten -echo echoe echok -echonl -noflsh -xcase -tostop -echoprt echoctl echoke
>
>elmgt2:~# statserial /dev/ttyS0
>Device: /dev/ttyS0
>
>Signal  Pin   Pin  Direction   Status  Full
>Name    (25)  (9)  (computer)          Name
>-----   ---   ---  ---------   ------  -----
>FG       1     -   -           -       Frame Ground
>TxD      2     3   out         -       Transmit Data
>RxD      3     2   in          -       Receive Data
>RTS      4     7   out         1       Request To Send
>CTS      5     8   in          1       Clear To Send
>DSR      6     6   in          1       Data Set Ready
>GND      7     5   -           -       Signal Ground
>DCD      8     1   in          1       Data Carrier Detect
>DTR     20     4   out         1       Data Terminal Ready
>RI      22     9   in          0       Ring Indicator
>
>elmgt2:~# grep -i tty /var/log/syslog
>Aug 16 16:34:48 elmgt2 heartbeat: [3812]: info: glib: Starting serial heartbeat on tty /dev/ttyS0 (38400 baud)
>Aug 16 16:34:50 elmgt2 heartbeat: [3812]: info: Link elmgt1:/dev/ttyS0 up.
>
>
>I've tested speeds from 9600 to 38400 baud without any change.
>What is the reason for these error messages, and why do they appear when
>everything works fine? Or do I have to be worried about the serial heartbeat?
>
>
>Thanks
>
>Frank
>
>_______________________________________________
>Linux-HA mailing list
>Linux-HA at lists.linux-ha.org
>http://lists.linux-ha.org/mailman/listinfo/linux-ha
>See also: http://linux-ha.org/ReportingProblems