[Linux-HA] Fw: Streamline Highly Availability and Load Balancing

Guochun Shi gshi at ncsa.uiuc.edu
Sun Aug 28 21:17:51 MDT 2005


From the log, only mail1 started the resource group (172.16.51.170); mail2 reported no local resources to acquire.
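
One way to confirm this from the two logs is to scan them for the acquisition messages. The following is only a minimal sketch: the function name `resource_summary` and the sample text are made up for illustration, and the patterns are copied from the log lines quoted below (heartbeat 1.2.3), so they may not match other versions.

```python
import re

# Patterns taken from the log lines quoted in this thread; other heartbeat
# versions may word these messages differently (assumption).
ACQUIRE = re.compile(r"info: Acquiring resource group: (\S+) (.+?)\s*$")
NONE_LOCAL = re.compile(
    r"info: No local resources \[.* listkeys ([^\]\s]+)\] to acquire"
)


def resource_summary(log_text):
    """Return {node: [resources]} for nodes named in acquisition messages.

    A node that logged "No local resources ... to acquire" maps to [].
    """
    summary = {}
    for line in log_text.splitlines():
        line = line.lstrip("> ")  # drop email quote markers, if any
        match = ACQUIRE.search(line)
        if match:
            summary.setdefault(match.group(1), []).append(match.group(2))
            continue
        match = NONE_LOCAL.search(line)
        if match:
            summary.setdefault(match.group(1), [])
    return summary


# Two lines copied from the logs below, one per node.
sample = """\
>heartbeat: 2005/08/29_10:28:07 info: Acquiring resource group: mail1.twn.tuv.com 172.16.51.170
>heartbeat: 2005/08/29_10:30:03 info: No local resources [/usr/lib/heartbeat/ResourceManager listkeys mail2.twn.tuv.com] to acquire.
"""

print(resource_summary(sample))
```

On the two sample lines this maps mail1.twn.tuv.com to a list holding 172.16.51.170 and mail2.twn.tuv.com to an empty list, which matches what the logs show.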

Make sure all resources are stopped on both machines before you start heartbeat.

-Guochun

At 10:55 AM 8/29/2005 +0800, you wrote:

>Thanks. The log files from both machines follow. 
>
>mail1.twn.tuv.com 
>
>heartbeat: 2005/08/29_10:27:53 info: AUTH: i=2: key = 0x8a5300c, auth=0x24fc48, authname=sha1 
>heartbeat: 2005/08/29_10:27:53 info: ************************** 
>heartbeat: 2005/08/29_10:27:53 info: Configuration validated. Starting heartbeat 1.2.3.cvs.20050404 
>heartbeat: 2005/08/29_10:27:53 info: heartbeat: version 1.2.3.cvs.20050404 
>heartbeat: 2005/08/29_10:27:54 info: Heartbeat generation: 39 
>heartbeat: 2005/08/29_10:27:55 info: UDP Broadcast heartbeat started on port 694 (694) interface eth0 
>heartbeat: 2005/08/29_10:27:55 info: pid 2175 locked in memory. 
>heartbeat: 2005/08/29_10:27:55 info: Local status now set to: 'up' 
>heartbeat: 2005/08/29_10:27:56 info: pid 2194 locked in memory. 
>heartbeat: 2005/08/29_10:27:56 info: pid 2195 locked in memory. 
>heartbeat: 2005/08/29_10:27:56 info: pid 2196 locked in memory. 
>heartbeat: 2005/08/29_10:27:56 info: Link mail2.twn.tuv.com:eth0 up. 
>heartbeat: 2005/08/29_10:27:56 info: Status update for node mail2.twn.tuv.com: status up 
>heartbeat: 2005/08/29_10:27:56 info: Local status now set to: 'active' 
>heartbeat: 2005/08/29_10:27:56 WARN: string2msg_ll: node [tm-iwss] failed authentication 
>heartbeat: 2005/08/29_10:27:56 info: Status update for node mail2.twn.tuv.com: status active 
>heartbeat: 2005/08/29_10:27:56 info: AnnounceTakeover(local 0, foreign 1, reason 'HB_R_BOTHSTARTING' (0)) 
>heartbeat: 2005/08/29_10:27:56 info: AnnounceTakeover(local 0, foreign 1, reason 'T_RESOURCES' (0)) 
>heartbeat: 2005/08/29_10:27:56 info: STATE 1 => 3 
>heartbeat: 2005/08/29_10:27:56 info: STATE 3 => 2 
>heartbeat: 2005/08/29_10:27:56 info: AnnounceTakeover(local 0, foreign 1, reason 'T_RESOURCES' (0)) 
>heartbeat: 2005/08/29_10:27:56 info: other_holds_resources: 0 
>heartbeat: 2005/08/29_10:27:56 info: Running /etc/ha.d/rc.d/status status 
>heartbeat: 2005/08/29_10:27:56 info: STATE 2 => 3 
>heartbeat: 2005/08/29_10:27:56 info: Exiting status process 2250 returned rc 0. 
>heartbeat: 2005/08/29_10:27:56 info: Link mail1.twn.tuv.com:eth0 up. 
>heartbeat: 2005/08/29_10:27:56 info: Running /etc/ha.d/rc.d/status status 
>heartbeat: 2005/08/29_10:27:56 info: Exiting status process 2257 returned rc 0. 
>heartbeat: 2005/08/29_10:28:03 WARN: string2msg_ll: node [tm-iwss-2] failed authentication 
>heartbeat: 2005/08/29_10:28:06 WARN: string2msg_ll: node [tm-iwss] failed authentication 
>heartbeat: 2005/08/29_10:28:07 info: local resource transition completed. 
>heartbeat: 2005/08/29_10:28:07 info: AnnounceTakeover(local 1, foreign 1, reason 'T_RESOURCES(us)' (0)) 
>heartbeat: 2005/08/29_10:28:07 info: Initial resource acquisition complete (T_RESOURCES(us)) 
>heartbeat: 2005/08/29_10:28:07 info: remote resource transition completed. 
>heartbeat: 2005/08/29_10:28:07 info: AnnounceTakeover(local 1, foreign 1, reason 'T_RESOURCES(us)' (1)) 
>heartbeat: 2005/08/29_10:28:07 info: other_holds_resources: 1 
>heartbeat: 2005/08/29_10:28:07 info: other_holds_resources: 1 
>heartbeat: 2005/08/29_10:28:07 info: other_holds_resources: 1 
>heartbeat: 2005/08/29_10:28:07 info: 1 local resources from [/usr/lib/heartbeat/ResourceManager listkeys mail1.twn.tuv.com] 
>heartbeat: 2005/08/29_10:28:07 info: Local Resource acquisition completed. 
>heartbeat: 2005/08/29_10:28:07 info: AnnounceTakeover(local 1, foreign 1, reason 'T_RESOURCES(us)' (1)) 
>heartbeat: 2005/08/29_10:28:07 info: Exiting req_our_resources(ask) process 2365 returned rc 0. 
>heartbeat: 2005/08/29_10:28:07 info: Running /etc/ha.d/rc.d/ip-request-resp ip-request-resp 
>heartbeat: 2005/08/29_10:28:07 received ip-request-resp 172.16.51.170 OK yes 
>heartbeat: 2005/08/29_10:28:07 info: Acquiring resource group: mail1.twn.tuv.com 172.16.51.170 
>heartbeat: 2005/08/29_10:28:07 info: Running /etc/ha.d/resource.d/IPaddr 172.16.51.170 start 
>heartbeat: 2005/08/29_10:28:07 info: Removing conflicting loopback lo:0. 
>heartbeat: 2005/08/29_10:28:07 info: /sbin/ifconfig lo:0 down 
>heartbeat: 2005/08/29_10:28:07 info: /sbin/route -n del -host 172.16.51.170 
>heartbeat: 2005/08/29_10:28:07 info: /sbin/arptables -D IN -j DROP -d 172.16.51.170 
>heartbeat: 2005/08/29_10:28:07 info: /sbin/arptables -D OUT -j mangle -o eth0 -s 172.16.51.170 --mangle-ip-s 172.16.51.171 
>heartbeat: 2005/08/29_10:28:07 info: /usr/sbin/arptables-noarp-addr 172.16.51.170 stop: success 
>heartbeat: 2005/08/29_10:28:07 info: /sbin/ifconfig eth0:0 172.16.51.170 netmask 255.255.252.0  broadcast 172.16.51.255 
>heartbeat: 2005/08/29_10:28:07 info: Sending Gratuitous Arp for 172.16.51.170 on eth0:0 [eth0] 
>heartbeat: 2005/08/29_10:28:07 /usr/lib/heartbeat/send_arp -i 1010 -r 5 -p /var/lib/heartbeat/rsctmp/send_arp/send_arp-172.16.51.170 eth0 172.16.51.170 auto 172.16.51.170 ffffffffffff 
>heartbeat: 2005/08/29_10:28:07 info: Exiting ip-request-resp process 2397 returned rc 0. 
>heartbeat: 2005/08/29_10:28:07 info: AnnounceTakeover(local 1, foreign 1, reason 'ip-request-resp' (1)) 
>
>heartbeat: 2005/08/29_10:37:55 info: Daily informational memory statistics 
>heartbeat: 2005/08/29_10:37:55 info: MSG stats: 305/1045 ms age 410 [pid2175/MST_CONTROL] 
>heartbeat: 2005/08/29_10:37:55 info: ha_malloc stats: 7414/27651  508974/243856 [pid2175/MST_CONTROL] 
>heartbeat: 2005/08/29_10:37:55 info: RealMalloc stats: 510926 total malloc bytes. pid [2175/MST_CONTROL] 
>heartbeat: 2005/08/29_10:37:55 info: Current arena value: 659456 
>heartbeat: 2005/08/29_10:37:55 info: MSG stats: 0/2 ms age 587870 [pid2194/HBFIFO] 
>heartbeat: 2005/08/29_10:37:55 info: ha_malloc stats: 0/36  238/0 [pid2194/HBFIFO] 
>heartbeat: 2005/08/29_10:37:55 info: RealMalloc stats: 1390 total malloc bytes. pid [2194/HBFIFO] 
>heartbeat: 2005/08/29_10:37:55 info: Current arena value: 135168 
>heartbeat: 2005/08/29_10:37:55 info: MSG stats: 0/0 ms age 639030 [pid2195/HBWRITE] 
>heartbeat: 2005/08/29_10:37:55 info: ha_malloc stats: 0/0  0/0 [pid2195/HBWRITE] 
>heartbeat: 2005/08/29_10:37:55 info: RealMalloc stats: 0 total malloc bytes. pid [2195/HBWRITE] 
>heartbeat: 2005/08/29_10:37:55 info: Current arena value: 0 
>heartbeat: 2005/08/29_10:37:55 info: MSG stats: 0/0 ms age 639030 [pid2196/HBREAD] 
>heartbeat: 2005/08/29_10:37:55 info: ha_malloc stats: 0/1468  14/0 [pid2196/HBREAD] 
>heartbeat: 2005/08/29_10:37:55 info: RealMalloc stats: 270 total malloc bytes. pid [2196/HBREAD] 
>heartbeat: 2005/08/29_10:37:55 info: Current arena value: 135168 
>heartbeat: 2005/08/29_10:37:55 info: These are nothing to worry about. 
>
>mail2.twn.tuv.com 
>
>heartbeat: 2005/08/29_10:29:39 info: AUTH: i=2: key = 0x855400c, auth=0x74cc48, authname=sha1 
>heartbeat: 2005/08/29_10:29:39 info: ************************** 
>heartbeat: 2005/08/29_10:29:39 info: Configuration validated. Starting heartbeat 1.2.3.cvs.20050404 
>heartbeat: 2005/08/29_10:29:40 info: heartbeat: version 1.2.3.cvs.20050404 
>heartbeat: 2005/08/29_10:29:41 info: Heartbeat generation: 38 
>heartbeat: 2005/08/29_10:29:41 info: UDP Broadcast heartbeat started on port 694 (694) interface eth0 
>heartbeat: 2005/08/29_10:29:41 info: pid 2175 locked in memory. 
>heartbeat: 2005/08/29_10:29:41 info: Local status now set to: 'up' 
>heartbeat: 2005/08/29_10:29:42 info: pid 2194 locked in memory. 
>heartbeat: 2005/08/29_10:29:42 info: pid 2195 locked in memory. 
>heartbeat: 2005/08/29_10:29:42 info: pid 2196 locked in memory. 
>heartbeat: 2005/08/29_10:29:42 info: Link mail2.twn.tuv.com:eth0 up. 
>heartbeat: 2005/08/29_10:29:52 info: Link mail1.twn.tuv.com:eth0 up. 
>heartbeat: 2005/08/29_10:29:52 info: Status update for node mail1.twn.tuv.com: status up 
>heartbeat: 2005/08/29_10:29:52 info: Local status now set to: 'active' 
>heartbeat: 2005/08/29_10:29:52 info: Running /etc/ha.d/rc.d/status status 
>heartbeat: 2005/08/29_10:29:52 info: Exiting status process 2357 returned rc 0. 
>heartbeat: 2005/08/29_10:29:52 info: Status update for node mail1.twn.tuv.com: status active 
>heartbeat: 2005/08/29_10:29:52 info: AnnounceTakeover(local 0, foreign 1, reason 'HB_R_BOTHSTARTING' (0)) 
>heartbeat: 2005/08/29_10:29:52 info: AnnounceTakeover(local 0, foreign 1, reason 'T_RESOURCES' (0)) 
>heartbeat: 2005/08/29_10:29:52 info: STATE 1 => 3 
>heartbeat: 2005/08/29_10:29:52 info: STATE 3 => 2 
>heartbeat: 2005/08/29_10:29:52 info: Running /etc/ha.d/rc.d/status status 
>heartbeat: 2005/08/29_10:29:52 info: Exiting status process 2361 returned rc 0. 
>heartbeat: 2005/08/29_10:29:52 info: AnnounceTakeover(local 0, foreign 1, reason 'T_RESOURCES' (0)) 
>heartbeat: 2005/08/29_10:29:52 info: other_holds_resources: 0 
>heartbeat: 2005/08/29_10:29:52 info: STATE 2 => 3 
>heartbeat: 2005/08/29_10:30:03 info: local resource transition completed. 
>heartbeat: 2005/08/29_10:30:03 info: AnnounceTakeover(local 1, foreign 1, reason 'T_RESOURCES(us)' (0)) 
>heartbeat: 2005/08/29_10:30:03 info: Initial resource acquisition complete (T_RESOURCES(us)) 
>heartbeat: 2005/08/29_10:30:03 info: remote resource transition completed. 
>heartbeat: 2005/08/29_10:30:03 info: AnnounceTakeover(local 1, foreign 1, reason 'T_RESOURCES(us)' (1)) 
>heartbeat: 2005/08/29_10:30:03 info: other_holds_resources: 1 
>heartbeat: 2005/08/29_10:30:03 info: other_holds_resources: 1 
>heartbeat: 2005/08/29_10:30:03 info: No local resources [/usr/lib/heartbeat/ResourceManager listkeys mail2.twn.tuv.com] to acquire. 
>heartbeat: 2005/08/29_10:30:03 info: AnnounceTakeover(local 1, foreign 1, reason 'T_RESOURCES(us)' (1)) 
>heartbeat: 2005/08/29_10:30:03 info: Exiting req_our_resources(ask) process 2365 returned rc 0. 
>heartbeat: 2005/08/29_10:30:03 info: other_holds_resources: 1 
>
>heartbeat: 2005/08/29_10:39:41 info: Daily informational memory statistics 
>heartbeat: 2005/08/29_10:39:41 info: MSG stats: 305/1040 ms age 390 [pid2175/MST_CONTROL] 
>heartbeat: 2005/08/29_10:39:41 info: ha_malloc stats: 7384/27483  507050/242978 [pid2175/MST_CONTROL] 
>heartbeat: 2005/08/29_10:39:41 info: RealMalloc stats: 509002 total malloc bytes. pid [2175/MST_CONTROL] 
>heartbeat: 2005/08/29_10:39:41 info: Current arena value: 667648 
>heartbeat: 2005/08/29_10:39:41 info: MSG stats: 0/1 ms age 577890 [pid2194/HBFIFO] 
>heartbeat: 2005/08/29_10:39:41 info: ha_malloc stats: 0/18  238/0 [pid2194/HBFIFO] 
>heartbeat: 2005/08/29_10:39:41 info: RealMalloc stats: 1390 total malloc bytes. pid [2194/HBFIFO] 
>heartbeat: 2005/08/29_10:39:41 info: Current arena value: 135168 
>heartbeat: 2005/08/29_10:39:41 info: MSG stats: 0/0 ms age 633250 [pid2195/HBWRITE] 
>heartbeat: 2005/08/29_10:39:41 info: ha_malloc stats: 0/0  0/0 [pid2195/HBWRITE] 
>heartbeat: 2005/08/29_10:39:41 info: RealMalloc stats: 0 total malloc bytes. pid [2195/HBWRITE] 
>heartbeat: 2005/08/29_10:39:41 info: Current arena value: 0 
>heartbeat: 2005/08/29_10:39:41 info: MSG stats: 0/0 ms age 633250 [pid2196/HBREAD] 
>heartbeat: 2005/08/29_10:39:41 info: ha_malloc stats: 0/1460  14/0 [pid2196/HBREAD] 
>heartbeat: 2005/08/29_10:39:41 info: RealMalloc stats: 270 total malloc bytes. pid [2196/HBREAD] 
>heartbeat: 2005/08/29_10:39:41 info: Current arena value: 135168 
>heartbeat: 2005/08/29_10:39:41 info: These are nothing to worry about. 
>_______________________________________________
>Linux-HA mailing list
>Linux-HA at lists.linux-ha.org
>http://lists.linux-ha.org/mailman/listinfo/linux-ha
>See also: http://linux-ha.org/ReportingProblems 