[Linux-ha-dev] pgsql RA improvements
Andrew Beekhof
beekhof at gmail.com
Mon Feb 26 03:31:28 MST 2007
I made some further improvements in:
http://hg.beekhof.net/lha/crm-dev/rev/2e9b22cfb7e1
On 2/26/07, Keisuke MORI <kskmori at intellilink.co.jp> wrote:
> "Serge Dubrouski" <sergeyfd at gmail.com> writes:
> >> "Serge Dubrouski" <sergeyfd at gmail.com> writes:
> >>
> >> > And I don't like the idea of removing the PID file in the "start" function. The
> >> > standard approach is to remove it after stopping the application. Otherwise
> >> > it could lead to an attempt to start a second copy of the application.
> >>
> >> This is necessary for recovery from, for example, a power failure of the
> >> primary node. There is no chance to clean up via stop
> >> in such cases.
> >>
> >> Duplicate starting is avoided by checking if the postmaster
> >> process exists beforehand, as the original script does.
> >
> > Yes, but in this case you remove the legitimate pid file of the
> > running instance. You remove it before checking for the
> > postmaster.
>
> Well, I think that the script checks for the postmaster first
> and removes the pid file second (only when no postmaster process exists).
>
> Here's the code snippet with my patch.
> pgsql_status checks for the postmaster and I think that should be good enough.
> ----8<--------8<--------8<--------8<--------8<--------8<--------8<--------8<----
> pgsql_start() {
>     if pgsql_status
>     then
>         ocf_log info "PostgreSQL is already running. PID=`cat $PIDFILE`"
>         return $OCF_SUCCESS
>     fi
>
>     if [ -x $PGCTL ]
>     then
>         # Remove postmaster.pid if it exists
>         rm -f $PIDFILE
> ----8<--------8<--------8<--------8<--------8<--------8<--------8<--------8<----
>
>
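The ordering argued for above (check for a live postmaster first, remove the PID file only if none exists) can be sketched on its own. This is a minimal illustration, not the code from the patch: `pidfile_is_live` and `cleanup_stale_pidfile` are hypothetical names, and it assumes the PID is on the first line of the file and that the caller is allowed to signal the process (as root would be).

```shell
#!/bin/sh
# Hypothetical helper (not part of the pgsql RA): succeed only if the
# PID file exists and names a process that is still alive.
pidfile_is_live() {
    pidfile=$1
    [ -f "$pidfile" ] || return 1
    # Assumption: the PID is on the first line of the file.
    pid=$(head -n 1 "$pidfile")
    case $pid in
        ''|*[!0-9]*) return 1 ;;   # empty or non-numeric: treat as stale
    esac
    # kill -0 sends no signal; it only tests that the process can be signalled.
    kill -0 "$pid" 2>/dev/null
}

# Hypothetical helper: remove the PID file only when no live process owns it.
# Returns 1 and leaves the file alone if the owner is still running.
cleanup_stale_pidfile() {
    pidfile=$1
    if pidfile_is_live "$pidfile"; then
        return 1
    fi
    rm -f "$pidfile"
}
```

With this ordering, a file left behind by a power failure is removed, while the file of a running instance is never touched.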
> > Let me think about it; I don't know which is worse in
> > such a case. Probably you are right and we have the right to assume that
> > PostgreSQL shouldn't be started outside of cluster control.
>
> If postmaster was already started outside of heartbeat control,
> then it should return OCF_SUCCESS and the postmaster should
> continue to run.
>
> Power failure is one of the most typical situations that we want
> to handle with HA software, so this 'cleanup in start' is
> important, I think.
>
> Maybe it would be nice if we put a WARN log before removing it.
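A rough sketch of the "warn before removing" idea, assuming the caller has already confirmed that no postmaster is running. `remove_leftover_pidfile` is a hypothetical name. The `ocf_log` defined here is only a stand-in so the snippet runs outside a resource agent; a real RA would use the one from the OCF shell functions.

```shell
#!/bin/sh
# Stand-in for ocf_log, defined only when the real one is not loaded.
if ! command -v ocf_log >/dev/null 2>&1; then
    ocf_log() {
        level=$1; shift
        echo "$level: $*" >&2
    }
fi

# Hypothetical helper: log a warning, then remove a leftover PID file.
# Assumes the caller has already verified that no postmaster is running.
remove_leftover_pidfile() {
    pidfile=$1
    if [ -f "$pidfile" ]; then
        ocf_log warn "Removing leftover $pidfile; no postmaster process found"
        rm -f "$pidfile"
    fi
}
```

The warning is logged only when a file is actually present, so a clean start stays quiet.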
>
> Thanks,
>
> >
> >>
> >>
> >> >
> >> > On 2/23/07, Serge Dubrouski <sergeyfd at gmail.com> wrote:
> >> >> I like the idea of the patch, but honestly I don't like how it's
> >> >> implemented. It should call (as Andrew suggested) the "monitor" function to
> >> >> check whether pgsql is up or down instead of spreading the same code all
> >> >> around the script. I'd like to review the idea and prepare another
> >> >> patch if everybody agrees.
> >>
> >> Yes, using the same monitor function would be better.
> >> I didn't do that only because it would write many log entries every
> >> second while the server takes time to start.
> >> It is OK if you don't mind that.
> >
> > I don't think that this is a problem. The log files are big even without
> > those records.
> >
> > Thanks for all these proposals.
> >
> >>
> >> Thanks,
> >> --
> >> Keisuke MORI
> >> NTT DATA Intellilink Corporation
> >> _______________________________________________________
> >> Linux-HA-Dev: Linux-HA-Dev at lists.linux-ha.org
> >> http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
> >> Home Page: http://linux-ha.org/
> >>
>
> --
> Keisuke MORI
> Open Source Business Division
> NTT DATA Intellilink Corporation
> Tel: +81-3-3534-4811 / Fax: +81-3-3534-4814
>