apache failed to restart with keystone wsgi app

Bug #1484836 reported by Vasyl Saienko
This bug affects 1 person
Affects: OpenStack Identity (keystone)
Status: Expired
Importance: Undecided
Assigned to: Unassigned
Milestone: (none)

Bug Description

From time to time, Apache fails to restart via the init script when the keystone WSGI application is running.

Steps to reproduce:

Run Apache stop/start in a loop:
while :; do service apache2 stop; service apache2 start; done

After some time Apache fails to start because it cannot bind to a port that is still in use:
 * Starting web server apache2
 *
 * Stopping web server apache2
 *
 * Starting web server apache2
 *
 * Stopping web server apache2
 *
 * Starting web server apache2
(98)Address already in use: AH00072: make_sock: could not bind to address [::]:35357
(98)Address already in use: AH00072: make_sock: could not bind to address 0.0.0.0:35357
no listening sockets available, shutting down
AH00015: Unable to open logs
Action 'start' failed.
The Apache error log may have more information.
 *
 * The apache2 instance did not start within 20 seconds. Please read the log files to discover problems
 * Stopping web server apache2
 *
 * Starting web server apache2
 *
 * Stopping web server apache2
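
The loop above never terminates, so the failing iteration has to be spotted by eye in the output. A minimal sketch of a bounded variant that stops at the first failed start follows. `cycle_until_failure` and its arguments are hypothetical names, and the sketch assumes the init script exits with a non-zero status when Apache cannot bind its ports:

```shell
# Hypothetical helper: repeat stop/start until start fails or MAX cycles pass.
# Assumes the start command exits non-zero when Apache cannot bind its ports.
cycle_until_failure() {
    max=$1
    stop_cmd=$2
    start_cmd=$3
    i=1
    while [ "$i" -le "$max" ]; do
        # A failed stop is ignored; only a failed start ends the loop.
        $stop_cmd >/dev/null 2>&1 || true
        if ! $start_cmd >/dev/null 2>&1; then
            echo "start failed on iteration $i"
            return 1
        fi
        i=$((i + 1))
    done
    echo "no failure in $max iterations"
    return 0
}

# On the affected host (as root):
# cycle_until_failure 1000 "service apache2 stop" "service apache2 start"
```

Stopping at the first failure leaves the host in the broken state, which makes it possible to inspect sockets and processes before the next cycle hides the problem.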

Without the keystone WSGI application I could not reproduce the error in 12 hours, with the horizon and radosgw WSGI applications still enabled.

It looks like the root cause is in the keystone WSGI application itself.

narasimha18sv (narasimha18sv) wrote :

Stop the keystone service and then start apache2, so that keystone runs on top of Apache.

No two services can listen on the same port. Since keystone is already running on ports 35357 and 5000, apache2 cannot bind to those ports.

All the commands will then work.

Changed in keystone:
status: New → Invalid
Vasyl Saienko (vsaienko) wrote :

The keystone service is stopped, as you can see from the logs. Apache is able to stop and start successfully:

 * Starting web server apache2
 *
 * Stopping web server apache2
 *
 * Starting web server apache2
 *

But sometimes Apache tries to start before all resources from the previous stop have been freed.
This causes errors like:

(98)Address already in use: AH00072: make_sock: could not bind to address [::]:35357
(98)Address already in use: AH00072: make_sock: could not bind to address 0.0.0.0:35357
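
If the failure really is a race between the old listener going away and the new bind, one possible workaround is to wait until nothing is listening on the port before calling start. The sketch below is an illustration only, not part of the apache2 init script: `wait_port_free`, `port_has_listener`, and the `LIST_CMD` variable are hypothetical names, and it covers only the case where an old listening socket is still present.

```shell
# LIST_CMD is an assumption: any command printing "ss -ltn"-style output,
# where field 4 is the local address and port.
LIST_CMD=${LIST_CMD:-"ss -ltn"}

# Succeeds if some socket is listening on the given port.
port_has_listener() {
    $LIST_CMD 2>/dev/null |
        awk -v p="$1" '$4 ~ (":" p "$") { f = 1 } END { exit !f }'
}

# Poll once per second, up to TRIES times, until the port has no listener.
# Returns 0 as soon as the port is free, 1 if it is still busy at the end.
wait_port_free() {
    port=$1
    tries=$2
    while [ "$tries" -gt 0 ]; do
        port_has_listener "$port" || return 0
        sleep 1
        tries=$((tries - 1))
    done
    return 1
}

# On the affected host (as root):
# service apache2 stop; wait_port_free 35357 20 && service apache2 start
```

This only masks the symptom. If the port stays busy for the whole wait, the process still holding it is the thing to investigate.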

Changed in keystone:
status: Invalid → New
Sergii Golovatiuk (sgolovatiuk) wrote :

The problem is in keystone-admin: sometimes it hangs on the SIGTERM sent by the Apache process. This hangs Apache, so it does not release the TCP socket for a while.

Dolph Mathews (dolph) wrote :

Several solutions are documented in bug 1253482.

Bogdan Dobrelya (bogdando) wrote :

@Sergii, I tried to reproduce your case by disabling signal processing for keystone-admin (suspending the processes with SIGSTOP), but Apache still stops and starts:
# pgrep -fla admin
2982 keystone-admin -k start
2983 keystone-admin -k start
# kill -STOP 2982 2983
# while :; do service apache2 stop; service apache2 start; done
 * Stopping web server apache2
 *
 * Starting web server apache2

Bogdan Dobrelya (bogdando) wrote :

@Dolph, the bug you marked this as a duplicate of does not apply here.
The reported issue occurs in environments where 35357 is excluded from the ephemeral ports:
# sysctl -a | grep ip_local_reserved_ports
net.ipv4.ip_local_reserved_ports = 35357,41055,49000-49001,55572,58882
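
The sysctl value is a comma-separated list that may contain ranges such as 49000-49001. A small sketch for checking whether a given port is covered by such a list follows. `port_is_reserved` is a hypothetical helper written for this illustration, not an existing tool:

```shell
# Hypothetical helper: succeed if PORT is covered by LIST, where LIST uses
# the ip_local_reserved_ports syntax (comma-separated ports and lo-hi ranges).
port_is_reserved() {
    port=$1
    list=$2
    for entry in $(printf '%s' "$list" | tr ',' ' '); do
        case $entry in
            *-*)
                lo=${entry%-*}
                hi=${entry#*-}
                if [ "$port" -ge "$lo" ] && [ "$port" -le "$hi" ]; then
                    return 0
                fi
                ;;
            *)
                if [ "$port" -eq "$entry" ]; then
                    return 0
                fi
                ;;
        esac
    done
    return 1
}

# On the affected host:
# port_is_reserved 35357 "$(sysctl -n net.ipv4.ip_local_reserved_ports)" \
#     && echo "35357 is reserved"
```

A reserved port is not handed out as an ephemeral source port, which is why the ephemeral-port collision from the other bug would not explain the failure on such a host.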

Sergii Golovatiuk (sgolovatiuk) wrote :

In the Mirantis OpenStack deployment, 35357 is added to the list of reserved ports: https://github.com/stackforge/fuel-library/blob/master/deployment/puppet/openstack/manifests/reserved_ports.pp

Even with reserved ports, the keystone WSGI module has this issue.

David Stanek (dstanek) wrote :

It sounds like this bug is implying that Keystone is not releasing its ports. Can you post your Apache configuration somewhere? In a traditional setup Apache is opening/closing those ports, not Keystone, so I'm curious to see what you are actually doing.

Changed in keystone:
status: New → Incomplete
Vasyl Saienko (vsaienko) wrote :

Here is my Apache configuration: http://paste.openstack.org/show/427032/

David Stanek (dstanek) wrote :

So the Keystone WSGI app isn't listening on the port, which means it can't be the one failing to release it. This is either Apache not releasing the port or, as Dolph suggested, an issue with the port being in the ephemeral port range.

When this happens to you what does "netstat -ltp" show as running on that port?
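
To answer that question at the moment of failure, the listener lines for the port can be filtered out of the netstat output. The sketch below is an illustration: `listeners_on_port` is a hypothetical helper, and it assumes `netstat -ltnp` style output in which field 4 is the local address (the added `-n` keeps ports numeric).

```shell
# Hypothetical helper: print the lines from "netstat -ltnp"-style output on
# stdin whose local address (field 4) ends in the given port.
listeners_on_port() {
    awk -v p="$1" '$4 ~ (":" p "$")'
}

# On the affected host (root is needed to see the PID/Program name column):
# netstat -ltnp 2>/dev/null | listeners_on_port 35357
```

Running this right after a failed start shows whether the port is still held by a leftover apache2 process or by something else.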

Launchpad Janitor (janitor) wrote :

[Expired for OpenStack Identity (keystone) because there has been no activity for 60 days.]

Changed in keystone:
status: Incomplete → Expired
jcat (jcat) wrote :

Just in case anyone else hits this issue on Ubuntu Trusty, some additional info that may help - who knows!

I had the same error from Apache, regarding Keystone, after I added the Aodh API to Apache [it was working fine before].
After some digging, I noticed that the aodh-notifier and aodh-listener services [part of Aodh, not run in Apache] were segfaulting constantly against librabbitmq.so.1.1.1. I uninstalled librabbitmq1, as it didn't even look like it was needed. After that, everything worked fine.

Sounds strange, I know, but I reproduced it several times by reinstalling the lib [and the Python module for it, python-librabbitmq] and re-running the test. The behaviour was completely consistent every time. I'm not saying that the segfaulting aodh-notifier and aodh-listener services were necessarily to blame; it may have been some other interaction between the Aodh services under Apache and librabbitmq1 that stopped the Apache server from reloading properly.
