Apache failed to start in the gate

Bug #1340660 reported by Daniel Berrange
50
This bug affects 11 people
Affects Status Importance Assigned to Milestone
devstack
Invalid
Undecided
Unassigned

Bug Description

Devstack failed to start apache for horizon

http://logs.openstack.org/21/105121/3/check/gate-tempest-dsvm-neutron-large-ops/b8ce97e/logs/devstacklog.txt.gz

2014-07-11 07:52:23.074 | Enabling site horizon.
2014-07-11 07:52:23.079 | To activate the new configuration, you need to run:
2014-07-11 07:52:23.079 | service apache2 reload
2014-07-11 07:52:23.081 | + sudo rm -f '/var/log/apache2/horizon_*'
2014-07-11 07:52:23.088 | + start_horizon
2014-07-11 07:52:23.088 | + restart_apache_server
2014-07-11 07:52:23.088 | + restart_service apache2
2014-07-11 07:52:23.088 | + is_ubuntu
2014-07-11 07:52:23.088 | + [[ -z deb ]]
2014-07-11 07:52:23.088 | + '[' deb = deb ']'
2014-07-11 07:52:23.088 | + sudo /usr/sbin/service apache2 restart
2014-07-11 07:52:23.100 | * Restarting web server apache2
2014-07-11 07:52:23.389 | AH00558: apache2: Could not reliably determine the server's fully qualified domain name, using 127.0.0.1. Set the 'ServerName' directive globally to suppress this message
2014-07-11 07:52:23.389 | (98)Address already in use: AH00072: make_sock: could not bind to address [::]:80
2014-07-11 07:52:23.390 | (98)Address already in use: AH00072: make_sock: could not bind to address 0.0.0.0:80
2014-07-11 07:52:23.390 | no listening sockets available, shutting down
2014-07-11 07:52:23.390 | AH00015: Unable to open logs
2014-07-11 07:52:23.400 | Action 'start' failed.
2014-07-11 07:52:23.400 | The Apache error log may have more information.
2014-07-11 07:52:43.435 | ...fail!
2014-07-11 07:52:43.435 | * The apache2 instance did not start within 20 seconds. Please read the log files to discover problems

Revision history for this message
melanie witt (melwitt) wrote :
Revision history for this message
Attila Fazekas (afazekas) wrote :

These message not seen before 2014-07-10T23:14:07.269+00:00.

Revision history for this message
Davanum Srinivas (DIMS) (dims-v) wrote :

Added a review to capture Apache2 error log at the gate:
https://review.openstack.org/#/c/107129/

Revision history for this message
Morgan Fainberg (mdrnstm) wrote :

This is a case where the init-scripts (as far as i can tell) are a bit too fast. Basically, they think apache is stopped before all resources are really freed. I've seen this over many years working with Apache. According to the Apache wiki, apachectl restart should avoid this issue.

https://wiki.apache.org/httpd/CouldNotBindToAddress (look at the last case).

Most cases the CMS or other tool has enough delay you'll never run across this type of issue, but I've seen many cases where a stop/sleep1/start is sufficient.

I have a patch https://review.openstack.org/#/c/107131/ proposed to address this, but after conversing with Ian across comments I plan on circling back and make devstack use apachectl instead of the stop/sleep/start method.

It's not really a *bug* in the upstream init, but it's generally speaking annoying and a relative edge-case (happens more often when running an application under apache's control via mod_wsgi, etc).

If we want to add a query for Elastic Recheck for this, we probably want to also look for the "Address already in use" part, as that is a key bit to this issue.

This was likely introduced with Keystone gating on mod_wsgi since it introduces an *extra* apache restart that was not there before in the ds-g run.

Revision history for this message
Davanum Srinivas (DIMS) (dims-v) wrote :

Morgan's last patch seems to have fixed the gate.

Changed in devstack:
status: New → Fix Committed
Revision history for this message
ruhe (ruhe) wrote :

Seems like we've hit this issue in dsvm job. Link:
http://logs.openstack.org/32/113032/1/check/gate-murano-devstack-dsvm/5a40c55/logs/devstacklog.txt.gz

Trace matches the original one:
2014-08-08 22:14:22.923 | ++ restart_service apache2
2014-08-08 22:14:22.923 | ++ is_ubuntu
2014-08-08 22:14:22.923 | ++ [[ -z deb ]]
2014-08-08 22:14:22.923 | ++ '[' deb = deb ']'
2014-08-08 22:14:22.923 | ++ sudo /usr/sbin/service apache2 restart
2014-08-08 22:14:22.936 | * Restarting web server apache2
2014-08-08 22:14:23.252 | AH00558: apache2: Could not reliably determine the server's fully qualified domain name, using 127.0.0.1. Set the 'ServerName' directive globally to suppress this message
2014-08-08 22:14:23.252 | (98)Address already in use: AH00072: make_sock: could not bind to address [::]:80
2014-08-08 22:14:23.252 | (98)Address already in use: AH00072: make_sock: could not bind to address 0.0.0.0:80
2014-08-08 22:14:23.252 | no listening sockets available, shutting down
2014-08-08 22:14:23.252 | AH00015: Unable to open logs
2014-08-08 22:14:23.263 | Action 'start' failed.
2014-08-08 22:14:23.263 | The Apache error log may have more information.
2014-08-08 22:14:43.302 | ...fail!
2014-08-08 22:14:43.302 | * The apache2 instance did not start within 20 seconds. Please read the log files to discover problems
2014-08-08 22:14:43.303 | + exit_trap

Revision history for this message
Preston L. Bannister (preston-bannister) wrote :
Download full text (3.4 KiB)

Ran into this when running the current "stable" version of devstack, with "stable/icehouse" branches specified in "local.conf".

From local.conf:
==================
[[local|localrc]]

# Wanted versions
GLANCE_BRANCH=stable/icehouse
HORIZON_BRANCH=stable/icehouse
KEYSTONE_BRANCH=stable/icehouse
NOVA_BRANCH=stable/icehouse
GLANCE_BRANCH=stable/icehouse
NEUTRON_BRANCH=stable/icehouse
#SWIFT_BRANCH=1.10.0
==================

Output from ./stack.sh startup:
==================
2014-08-08 23:52:24.908 | + start_horizon
2014-08-08 23:52:24.908 | + restart_apache_server
2014-08-08 23:52:24.909 | + restart_service apache2
2014-08-08 23:52:24.909 | + is_ubuntu
2014-08-08 23:52:24.909 | + [[ -z deb ]]
2014-08-08 23:52:24.909 | + '[' deb = deb ']'
2014-08-08 23:52:24.910 | + sudo /usr/sbin/service apache2 restart
2014-08-08 23:52:24.957 | * Restarting web server apache2
2014-08-08 23:52:25.275 | (98)Address already in use: AH00072: make_sock: could not bind to address [::]:5000
2014-08-08 23:52:25.275 | (98)Address already in use: AH00072: make_sock: could not bind to address 0.0.0.0:5000
2014-08-08 23:52:25.275 | no listening sockets available, shutting down
2014-08-08 23:52:25.275 | AH00015: Unable to open logs
2014-08-08 23:52:25.290 | Action 'start' failed.
2014-08-08 23:52:25.291 | The Apache error log may have more information.
2014-08-08 23:52:45.500 | ...fail!
2014-08-08 23:52:45.501 | * The apache2 instance did not start within 20 seconds. Please read the log files to discover problems
==================

Seems there is a python process sitting on port 5000:
==================
preston@devstack:~/devstack$ sudo netstat -atp
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 *:33815 *:* LISTEN 29478/beam.smp
tcp 0 0 10.0.1.25:35357 *:* LISTEN 4268/python
tcp 0 0 *:5000 *:* LISTEN 4268/python
tcp 0 0 *:mysql *:* LISTEN 4036/mysqld
tcp 0 0 *:epmd *:* LISTEN 29441/epmd
tcp 0 0 192.168.122.1:domain *:* LISTEN 3646/dnsmasq
tcp 0 0 *:ssh *:* LISTEN 1397/sshd
tcp 0 0 localhost:45597 localhost:epmd ESTABLISHED 29478/beam.smp
tcp 0 0 10.0.1.25:ssh 10.0.1.8:57683 ESTABLISHED 27989/sshd: preston
tcp 0 0 localhost:epmd localhost:45597 ESTABLISHED 29441/epmd
tcp 0 0 localhost:56120 localhost:mysql ESTABLISHED 4268/python
tcp 0 0 10.0.1.25:ssh 10.0.1.8:54099 ESTABLISHED 12256/sshd: preston
tcp 0 0 localhost:mysql localhost:56120 ESTABLISHED 4036/mysqld
tcp 0 0 10.0.1.25:ssh 10.0.1.8:58251 ESTABLISHED 5259/sshd: preston
tcp6 0 0 [::]:amqp ...

Read more...

Revision history for this message
Preston L. Bannister (preston-bannister) wrote :

Ah. Found it. The process sitting on port 5000 is "keystone".

Not fluent with WSGI, but it appears the WSGI daemons for keystone are not stopped on "apache2 restart".

/etc/apache2/sites-available/keystone.conf
==================
Listen 5000
Listen 35357

<VirtualHost *:5000>
    WSGIDaemonProcess keystone-public processes=5 threads=1 user=preston
    WSGIProcessGroup keystone-public
    WSGIScriptAlias / /var/www/keystone/main
    WSGIApplicationGroup %{GLOBAL}
    ErrorLog /var/log/apache2/keystone
    LogLevel debug
    CustomLog /var/log/apache2/access.log combined
</VirtualHost>

<VirtualHost *:35357>
    WSGIDaemonProcess keystone-admin processes=5 threads=1 user=preston
    WSGIProcessGroup keystone-admin
    WSGIScriptAlias / /var/www/keystone/admin
    WSGIApplicationGroup %{GLOBAL}
    ErrorLog /var/log/apache2/keystone
    LogLevel debug
    CustomLog /var/log/apache2/access.log combined
</VirtualHost>

# Workaround for missing path on RHEL6, see
# https://bugzilla.redhat.com/show_bug.cgi?id=1121019
WSGISocketPrefix /var/run/apache2
==================

Running "unstack.sh" brings down the unwanted(?) processes, apparently by "screen_stop key" .

So ... the process camping on port 5000 is running in a "screen" window, but Apache wants to open the port as well. Does this mean a recent change to the Apache configuration that is not compatible with devstack usage?

Revision history for this message
Morgan Fainberg (mdrnstm) wrote :

@Preston,

Master devstack now uses apache to run keystone. What appears to have occurred for you is that you had an eventlet keystone running on port 5000 and tried to run the apache-based keystone at the same time.

My guess is that this was not a clean devstack build (I haven't seen this occur in other cases). There is a flag that can be set (ENV value) that will disable apache mod_wsgi for all services that use it by default (currently only keystone) or toggle it for keystone specifically (separate flag / ENV value).

KEYSTONE_USE_MOD_WSGI - Keystone Only
ENABLE_HTTPD_MOD_WSGI_SERVICES - All services that use mod_wsgi by default (currently only keystone, but may change in the future)

If these are set to any value except "True", it should revert the behavior back to using eventlet instead of mod_wsgi

Switching between eventlet and apache keystone should likely be done as a clean re-stack (you can stop the screen by hand, among other ways, but clean-restack ensures all the correct apache values)

Revision history for this message
Morgan Fainberg (mdrnstm) wrote :

@Ruslan Kamaldinov,

Looking over the logstash query, this has occured 3 times in 7 days. Unfortunately, this could be an issue where the specific node is being slow (having a lot going on and the process teardown for apache is a bit slow). While it might be worth reopening this ticket / keeping this on the radar, I think we can let this one sit for a little bit.

There is not exactly a simple fix for this, the next step is to start looking for processes and issuing explicit kill signals to them. This type of issue has existed with the system-startup /init scripts for a very long time and seems to keep cropping up no matter the system used.

Revision history for this message
Preston L. Bannister (preston-bannister) wrote :

@Morgan,

Thanks for the clue. Apparently I did not do an "unstack.sh" at the right moment. Bit new to playing with OpenStack.

That said, this seems likely to come up again. And again.

Seems like this should be in the devstack FAQ, at least:
    If you are switching between branches, run "unstack.sh" or "clean.sh" first.

What got me into this state...

  cd devstack
  git pull # on master
  ./stack.sh # errors in startup ... unstable master? Try stable/icehouse?
  screen -r # and ^A\ to terminate screen sessions

  git checkout stable/icehouse
  ./stack.sh # errors in startup
  screen -r # and ^A\ to terminate screen sessions
  sudo rm -rf /opt/stack/*

Perhaps "unstack.sh" should know to cleanup for other branches (or at least between "master" and "stable/icehouse"), to limit future reports of this kind.

Revision history for this message
Mark Goddard (mgoddard) wrote :

I have had this problem on CentOS 6.5 desktop running in a VirtualBox VM.

I too followed a similar workflow to @Preston, moving from devstack master to stable/icehouse.

It turned out that the keystone httpd WSGI module config file was not removed by devstack master when cleaning up. This module would then clash with the icehouse keystone process listening on port 5000, causing the error.

With keystone running (potentially still in your half baked devstack setup), running 'sudo service httpd restart' will reproduce the error.

I fixed the issue by removing the keystone httpd config file:

sudo rm /etc/httpd/conf.d/keystone.conf

Presumably this could be achieved similarly for apache2:

sudo rm /etc/apache2/sites-available/keystone.conf

Revision history for this message
Sean Dague (sdague) wrote :

This devstack bug was last updated over 180 days ago, as devstack
is a fast moving project and we'd like to get the tracker down to
currently actionable bugs, this is getting marked as Invalid. If the
issue still exists, please feel free to reopen it.

Changed in devstack:
status: Fix Committed → Invalid
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Duplicates of this bug

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.