[Linux-HA] Problems on Solaris 10
t.d.lee at durham.ac.uk
Mon Jul 16 03:24:58 MDT 2007
On Fri, 13 Jul 2007, Francesco Furnari wrote:
> I'm trying to run heartbeat on two Solaris 10 boxes. I compiled and
> installed successfully but have a lot of messages on the log.
> Furthermore, issuing /etc/init.d/heartbeat stop command a lot of
> processes remain running.
> This is ps -ef after issuing the stop command
> root@nodo1 # ps -ef | grep hea
> nobody 26273 26266 0 16:19:08 ? 0:01
> root 26266 1 1 16:19:08 ? 0:22
I think I know what this might be. Could you confirm that you are running
a release of heartbeat from before May 2007?
In May 2007, we began to fix a problem with the way heartbeat spawns
its processes. It was using 'exec("/bin/sh", ..., "command")' to do this.
With the Linux "sh", the interim "sh" replaces itself with "command"; that
is, "command" becomes the child of the original process. With the Solaris
"sh", the interim "sh" stays, and itself spawns "command" as its own child;
that is, "command" becomes the grandchild (not child) of the original process.
This is being tracked as bugzilla 1576.
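As a rough illustration of the pattern described above, spawning through
the shell might be sketched in C as follows. This is not the actual
heartbeat source: the function names are made up, and the "sh", "-c"
arguments are an assumption about what the elided part of the call looks
like. The pid returned to the caller is always that of "sh"; whether
"command" then runs under that same pid or as a child of it is decided by
the shell, not by the caller.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper, not taken from heartbeat: start `command`
 * through /bin/sh and return the pid of the forked process. */
pid_t spawn_via_shell(const char *command)
{
    pid_t pid = fork();

    if (pid == 0) {
        /* Child: becomes "sh". If this sh execs `command` in place,
         * `command` keeps this pid; if sh forks instead, `command`
         * is a grandchild of the original caller. */
        execl("/bin/sh", "sh", "-c", command, (char *)NULL);
        _exit(127);     /* only reached if execl() failed */
    }
    return pid;         /* -1 if fork() failed */
}

/* Hypothetical helper: wait for `pid` and return its exit status,
 * or -1 if it could not be waited for or did not exit normally. */
int wait_for_exit(pid_t pid)
{
    int status;

    if (pid <= 0 || waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A caller would use it as 'wait_for_exit(spawn_via_shell("exit 7"))'. A
signal sent to the returned pid is delivered to "sh" only; in the
grandchild case "command" may keep running, which would be consistent
with processes being left behind after a stop.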
There were two instances of such code. I was able to test and fix one, by
using "wordexp()" instead of "exec()". That did the "proof of concept" and
has been in the development tree for some weeks. But I was not easily
able to test the other; that part of the bug remains open.
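The general idea behind a wordexp()-based fix can be sketched as below.
This is only a minimal sketch under assumptions, not the change that went
into the development tree: the helper name is hypothetical, and the choice
of the WRDE_NOCMD flag is an assumption made here. wordexp() splits the
command string into words inside the calling process, so the command can
be exec'd directly and no interim "sh" sits between parent and child.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <wordexp.h>

/* Hypothetical helper: split `command` with wordexp() and exec it
 * directly, so the command itself is the child of the caller.
 * Returns the child's pid, or -1 on failure. */
pid_t spawn_direct(const char *command)
{
    wordexp_t words;
    pid_t pid;
    int rc;

    /* WRDE_NOCMD (an assumption here) rejects command substitution,
     * so the expansion never needs to run a shell. */
    rc = wordexp(command, &words, WRDE_NOCMD);
    if (rc != 0) {
        if (rc == WRDE_NOSPACE)
            wordfree(&words);   /* may be partially allocated */
        return -1;
    }
    if (words.we_wordc == 0) {  /* empty command line */
        wordfree(&words);
        return -1;
    }

    pid = fork();
    if (pid == 0) {
        /* we_wordv is NULL-terminated, so it can serve as argv. */
        execvp(words.we_wordv[0], words.we_wordv);
        _exit(127);             /* only reached if execvp() failed */
    }
    wordfree(&words);           /* parent no longer needs the words */
    return pid;                 /* -1 if fork() failed */
}
```

The pid returned here belongs to the command itself on any platform, so
waitpid() and kill() act on it directly. The trade-off is that wordexp()
rejects unquoted shell operators such as '|', ';' and '&', so this is not
a drop-in replacement for command lines that rely on them.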
Coincidentally, just this weekend Alan has also visited this area because
of an issue with the fix, which we are now addressing.
Could I suggest that we look at bug 1576, please, for the 2.1.1 release?
: David Lee I.T. Service :
: Senior Systems Programmer Computer Centre :
: UNIX Team Leader Durham University :
: South Road :
: http://www.dur.ac.uk/t.d.lee/ Durham DH1 3LE :
: Phone: +44 191 334 2752 U.K. :