[reid.s@usa.net: mailing list...]
Reid Steiner
reid_steiner@hotmail.com
Mon, 30 Sep 2002 23:30:29 +0000
Hi Nick, thanks for the reply! I'm not sure if I'm supposed to post to the
list or reply to you.
No failover occurs if I HUP Apache myself. The only files that actually
rotate on Sunday night have to be a week old, and they are the system logs
and httpd-error.log. httpd-access.log is rotated any time it hits 600MB
(that's why I have the logrotate script in cron.hourly, to monitor it). The
only thing I can think of is that fos or pulse is getting a SIGTERM and
restarting, which makes it inaccessible to the backup server, which then
takes over. Here is the log entry from the last failover:
web2:
Sep 29 01:01:04 web2 syslogd 1.4.1: restart.
Sep 29 01:01:19 web2 nanny[1954]: READ returned error 104:Connection reset
by peer
Sep 29 01:01:19 web2 nanny[1954]: Exiting due to connection failure of
xx.xx.xxx.2:80
Sep 29 01:01:19 web2 fos[1926]: Monitor for service xxx.xxx.xxx.3:80 exited.
This is a failover condition!
Sep 29 01:01:19 web2 fos[1926]: will now exit to notify pulse...
Sep 29 01:01:21 web2 pulse[1924]: fos process exited -- performing service
failover
Sep 29 01:01:21 web2 pulse[1924]: running command "/usr/sbin/fos"
"--active" "-c" "/etc/sysconfig/ha/lvs.cf" "--nofork"
Sep 29 01:01:21 web2 pulse[5310]: DEBUG -- device = eth0:1
Sep 29 01:01:21 web2 pulse[5310]: DEBUG -- floatAddr = xxx.xxx.xxx.3
Sep 29 01:01:21 web2 pulse[5310]: DEBUG -- vip_nmask = 0.0.0.0
Sep 29 01:01:21 web2 pulse[5311]: running command "/sbin/ifconfig" "eth0:1"
"xxx.xxx.xxx.3" "up"
Sep 29 01:01:21 web2 pulse[5311]: DEBUG -- Executing '/sbin/ifconfig eth0:1
xxx.xxx.xxx.2 netmask 0.0.0.0 up'
Sep 29 01:01:21 web2 fos[5309]: SIOCGIFADDR failed: Cannot assign requested
address
Sep 29 01:01:21 web2 fos[5309]: Stopping local services (if any)
web1:
Sep 29 01:01:05 web1 syslogd 1.4.1: restart.
Sep 29 01:01:16 web1 pulse[1889]: PARTNER HAS TOLD US TO GO INACTIVE!
Sep 29 01:01:16 web1 fos[1934]: Shutting down due to signal 15
Sep 29 01:01:16 web1 fos[1934]: Shutting down local service 64.62.133.3:80
Sep 29 01:01:16 web1 fos[1934]: running command "/etc/rc.d/init.d/httpd"
"stop"
Sep 29 01:01:18 web1 httpd: httpd shutdown succeeded
Sep 29 01:01:18 web1 fos[1934]: will now exit to notify pulse...
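
One way to compare the two excerpts is to merge them into a single
timeline. The sketch below is a hypothetical helper, not part of piranha.
It uses a few lines copied from the logs above (wrapped lines rejoined) and
assumes the year 2002 from this message's date, since syslog does not
record one. The clocks on web1 and web2 may not agree, so the ordering
across hosts is approximate at best.

```python
from datetime import datetime

# Lines copied from the excerpts above, with wrapped lines rejoined.
WEB2 = [
    "Sep 29 01:01:19 web2 nanny[1954]: READ returned error 104:Connection reset by peer",
    "Sep 29 01:01:19 web2 fos[1926]: will now exit to notify pulse...",
    "Sep 29 01:01:21 web2 pulse[1924]: fos process exited -- performing service failover",
]
WEB1 = [
    "Sep 29 01:01:16 web1 pulse[1889]: PARTNER HAS TOLD US TO GO INACTIVE!",
    "Sep 29 01:01:16 web1 fos[1934]: Shutting down due to signal 15",
    "Sep 29 01:01:18 web1 httpd: httpd shutdown succeeded",
]


def parse(line, year=2002):
    """Split a syslog line into (timestamp, host, message).

    The year is an assumption; syslog timestamps do not carry one.
    """
    month, day, clock, host, message = line.split(None, 4)
    stamp = datetime.strptime(f"{year} {month} {day} {clock}", "%Y %b %d %H:%M:%S")
    return stamp, host, message


def merged_timeline(*logs):
    """Merge several per-host logs into one list ordered by timestamp."""
    events = [parse(line) for log in logs for line in log]
    # sorted() is stable, so lines with equal timestamps keep their input order.
    return sorted(events, key=lambda event: event[0])


for stamp, host, message in merged_timeline(WEB2, WEB1):
    print(stamp.strftime("%H:%M:%S"), host, message)
```

Read this way, the web1 entries (01:01:16 to 01:01:18) come before the
nanny error on web2 (01:01:19), which only means something if the two
clocks are in sync.
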
>From: Nicholas Garratt <nick@clickatell.com>
>To: linux-ha@muc.de
>Subject: Re: [reid.s@usa.net: mailing list...]
>Date: Mon, 30 Sep 2002 14:45:04 +0200
>
>hi
>
>the nasty thing with logrotate is its need to HUP or stop/start the daemon
>writing to the log it's rotating. apache has a very nice solution which I
>use for some daemons I never want to have to HUP, namely rotatelogs.
>Basically it reads from stdin and writes to a file with a timestamp
>(unixtime). rotatelogs then rotates the log it writes to at a configurable
>interval without ever having to close stdin which means the daemon
>generating the log data never has to reopen its log files.
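
The idea described above can be sketched in a few lines of Python. This is
not the rotatelogs source, only an illustration of the technique: the
function name rotating_copy, its parameters and the demo values are all
made up here. The input stream is never closed; only the output file
changes when the time moves into a new interval.

```python
import io
import os
import tempfile
import time


def rotating_copy(stream, basename, interval, clock=time.time):
    """Copy lines from stream into files named basename.<unixtime>.

    The suffix is the start of the current interval, so a new file is
    opened whenever the clock crosses into the next interval. The input
    stream itself is left open the whole time.
    """
    current_start = None
    out = None
    try:
        for line in stream:
            now = int(clock())
            start = now - (now % interval)  # beginning of this interval
            if start != current_start:
                if out is not None:
                    out.close()
                out = open(f"{basename}.{start}", "a")
                current_start = start
            out.write(line)
            out.flush()
    finally:
        if out is not None:
            out.close()


# Demo with a fake clock (hypothetical values): the first two lines fall
# into one interval, the third into the next.
demo_dir = tempfile.mkdtemp()
demo_base = os.path.join(demo_dir, "access_log")
ticks = iter([100, 150, 260])
rotating_copy(io.StringIO("a\nb\nc\n"), demo_base, 100, clock=lambda: next(ticks))
print(sorted(os.listdir(demo_dir)))
```

In real use the stream would be sys.stdin and the interval something like
86400 seconds, with the daemon's log output piped into the script.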
>
>the obvious problem with this solution is that the daemon needs to be able
>to log to a pipe or stdout. daemons that insist on opening a log file can
>be tricked with a named pipe, although I haven't played with this much...
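
A rough sketch of the named-pipe trick, assuming a Unix system where
os.mkfifo is available. The path daemon.log and the helper read_from_fifo
are hypothetical. The "daemon" here is just a few lines that open the path
as if it were an ordinary log file, while a reader on the other end of the
FIFO receives whatever is written.

```python
import os
import tempfile
import threading


def read_from_fifo(path, sink):
    """Open the FIFO for reading and pass every line to sink until EOF."""
    with open(path) as fifo:  # blocks until a writer opens the pipe
        for line in fifo:
            sink(line)


# Create the FIFO where the daemon expects its log file (hypothetical path).
workdir = tempfile.mkdtemp()
fifo_path = os.path.join(workdir, "daemon.log")
os.mkfifo(fifo_path)

received = []
reader = threading.Thread(target=read_from_fifo, args=(fifo_path, received.append))
reader.start()

# Stand-in for the daemon: it opens the path like a normal file and writes.
with open(fifo_path, "w") as log:
    log.write("first entry\n")
    log.write("second entry\n")

reader.join()  # the reader sees EOF once the writer has closed the pipe
print(received)
```

One caveat with this approach: opening a FIFO for writing blocks until
something has it open for reading, so the daemon would hang at startup if
the reader is not running.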
>
>nick
>
>
>>
>>Hi Armin, I'm trying to post to the mailing list and I'd like to post
>>this:
>>
>>I'm writing because I have similar issues as a few people who have posted
>>concerning piranha and logrotate.
>>
>>Some have posted that they "remedied" the problem by making logs rotate
>>only after an increased size as opposed to date/time/normal log file size.
>> I'm running RedHat 7.3 on two web servers using Piranha piranha-0.5.3-9
>>in fos mode. I have the httpd-access logs rotating at 600MB and logrotate
>>is in cron.hourly. All other files in logrotate.d are rotated weekly
>>(apache ftpd rpm syslog).
>>
>>I have no fail-overs during the week, however, when the weekly files
>>rotate on Sunday morning, piranha fails over and fails back. I only
>>really notice because I have httpd running on the backup web server (for a
>>bulletin board, to balance the load between the servers) and it goes down
>>after the re-failover.
>>
>>When I HUP apache, logrotate or syslogd, I can't get it to happen. However
>>when I HUP fos on the primary node, the same behavior occurs *almost*.
>>All that's missing in the logs when I HUP pulse myself is:
>>
>>Sep 29 01:01:19 web2 nanny[1954]: READ returned error 104:Connection reset
>>by peer
>>
>>
>>Here's what my lvs.cf looks like:
>>
>>
>>primary = xxx.xxx.xxx.1
>>service = fos
>>rsh_command = ssh
>>backup_active = 1
>>backup = xxx.xxx.xxx.2
>>heartbeat = 1
>>heartbeat_port = 539
>>keepalive = 18
>>deadtime = 36
>>network = direct
>>debug_level = NONE
>>failover failover {
>> address = xxx.xxx.xxx.3 eth1:1
>> active = 1
>> port = 80
>> timeout = 18
>> send = "GET / HTTP/1.0\r\n\r\n"
>> expect = "HTTP"
>> start_cmd = "/etc/rc.d/init.d/httpd start"
>> stop_cmd = "/etc/rc.d/init.d/httpd stop"
>>}
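
For reference, a send/expect check like the one configured above amounts to
roughly the following. This is a sketch, not nanny's actual code: the probe
function is hypothetical, and treating the timeout of 18 as seconds is an
assumption. A connection reset while reading (errno 104, ECONNRESET, on
Linux) counts as a failed check here, which is the error nanny logged.

```python
import socket
import threading


def probe(host, port, send, expect, timeout):
    """Return True if the service answers `send` with data starting with `expect`."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.sendall(send)
            reply = b""
            while len(reply) < len(expect):
                chunk = conn.recv(len(expect) - len(reply))
                if not chunk:  # peer closed before sending enough data
                    break
                reply += chunk
    except OSError:
        # Refused connections, timeouts and connection resets all end up here.
        return False
    return reply.startswith(expect)


def serve_once(listener, reply):
    """Tiny stand-in for a web server: answer one request, then hang up."""
    conn, _ = listener.accept()
    with conn:
        conn.recv(1024)
        conn.sendall(reply)


# Demo against a throwaway local listener instead of a real httpd.
listener = socket.socket()
listener.bind(("127.0.0.1", 0))
listener.listen(1)
port = listener.getsockname()[1]
server = threading.Thread(target=serve_once, args=(listener, b"HTTP/1.0 200 OK\r\n\r\n"))
server.start()

request = b"GET / HTTP/1.0\r\n\r\n"  # the send string from the config
up = probe("127.0.0.1", port, request, b"HTTP", timeout=18)
server.join()
listener.close()
down = probe("127.0.0.1", port, request, b"HTTP", timeout=18)  # nothing listening now
print(up, down)
```
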
>>
>>
>>
>>Thanks in advance for any thoughts/direction.
>>
>>
>>
>>Reid Steiner
>>
>>----- End forwarded message -----
>>
>>--
>>Armin Gruner
>>mailto:ag@muc.de
>>http://www.muc.de/~ag/
>>``Only those who change stay true to themselves'' - Wolf Biermann
>>PGP Key: 0x72DBE671 or finger -l ag@muc.de
>
>
>--
>--------------------------------
>www.clickatell.com
>Any message, anywhere
>Phone: +27 21 9487150