[Linux-ha-dev] PATCH FOR: Re: AW: [Linux-HA] ERROR: 100 NULL cf-read() running heartbeat 2.0.0

Horms horms at verge.net.au
Mon Aug 8 23:38:30 MDT 2005


On Wed, Aug 03, 2005 at 08:25:56AM -0600, Alan Robertson wrote:
> Ulrich H. Thomas wrote:
> >Okay here are the log file an the configs.
> >
> >http://www.rz.unibw-muenchen.de/~j4tu0736/ha/
> >
> >Sorry, but my webmailer have a problem with the attachments. Hope that´s 
> >now
> >okay.
> >
> >Ulrich H. Thomas
> 
> Hi,
> 
> 
> In addition to the suggested patch of ignoring a lot more bad reads 
> before we exit, I've also made another patch to NOT return NULL when we 
> read a ping packet that doesn't belong to us.
> 
> I've attached this new patch for your critical examination.
> 
> Please comment.
> 
> 	Thanks!
> 
> -- 
>     Alan Robertson <alanr at unix.sh>
> 
> "Openness is the foundation and preservative of friendship...  Let me 
> claim from you at all times your undisguised opinions." - William 
> Wilberforce

> ? BEAM-complaints
> ? serial.s
> ? testme.c
> Index: ping.c
> ===================================================================
> RCS file: /home/cvs/linux-ha/linux-ha/lib/plugins/HBcomm/ping.c,v
> retrieving revision 1.42
> diff -u -r1.42 ping.c
> --- ping.c	8 Jul 2005 16:03:34 -0000	1.42
> +++ ping.c	3 Aug 2005 14:21:30 -0000
> @@ -314,6 +314,8 @@
>  	
>  	PINGASSERT(mp);
>  	ei = (struct ping_private *) mp->pd;
> +
> +ReRead:	/* We recv lots of packets that aren't ours */
>  	
>  	if ((numbytes=recvfrom(ei->sock, (void *) &buf.cbuf
>  	,	sizeof(buf.cbuf)-1, 0,	(struct sockaddr *)&their_addr
> @@ -343,7 +345,7 @@
>  	memcpy(&icp, (buf.cbuf + hlen), sizeof(icp));
>  	
>  	if (icp.icmp_type != ICMP_ECHOREPLY || icp.icmp_id != ei->ident) {
> -		return NULL;
> +		goto ReRead;	/* Not one of ours */
>  	}

Seems good, though a goto from alanr, wow! (joke)

>  	if (DEBUGPKT) {
> @@ -359,6 +361,7 @@
>  	pktlen = numbytes - hlen - ICMP_HDR_SZ;
>  
>  	if ((pkt = ha_malloc(pktlen + 1)) == NULL) {
> +		errno = ENOMEM;
>  		return NULL;
>  	}
>  	
> @@ -371,21 +374,19 @@
>  	msg = wirefmt2msg(msgstart, bufmax - msgstart, MSG_NEEDAUTH);
>  	if (msg == NULL) {
>                  ha_free(pkt);
> +		errno = EINVAL;
>  		return(NULL);
>  	}
>  	comment = ha_msg_value(msg, F_COMMENT);
>  	if (comment == NULL || strcmp(comment, PIL_PLUGIN_S)) {
>                  ha_free(pkt);
>  		ha_msg_del(msg);
> +		errno = EINVAL;
>  		return(NULL);
>  	}

What is the implication of setting errno?
I'm not sure I understand how this releates to the problem at hand.

>  	ha_msg_del(msg);
> -	/* return(msg); */
> -	
>  	return(pkt);
> -
> -
>  }

Mmm, spurious whitespace change.

>  
>  /*

[ snipped the ping_group portion of the patch, its much the same.
  Ref: http://www.gossamer-threads.com/lists/engine?do=post_attachment;postatt_id=362;list=linuxha ]


On Fri, Aug 05, 2005 at 10:16:05PM -0600, Alan Robertson wrote:

[snip]

> The fix filters out ANY ICMP messages which aren't ICMP Echo Replies 
> that belong to us.
> 
> There are other kinds of ICMP messages besides ICMP Echo Reply.  Maybe 
> the network problem caused a flurry of those other kinds of messages.

That sounds entirely likely to me.
I'd say its a good fix for a problem that was bound to show up
and I advocate puting it into HEAD and STABLE_1_2, comments
about not withstanding.

[snip]


-- 
Horms


More information about the Linux-HA mailing list