Fuel for OpenStack

Apache graceful restart leads to mod_wsgi segfault

Bug #1493353 reported by Vladimir Kuklin on 2015-09-08

Affects	Status	Importance	Assigned to	Milestone
Fuel for OpenStack	Won't Fix	Critical	Vladimir Kuklin	Fuel for OpenStack 8.0
6.1.x	Invalid	Undecided	Vladimir Kuklin	Fuel for OpenStack 6.1-updates
7.0.x	Won't Fix	High	Vladimir Kuklin	Fuel for OpenStack 7.0-updates

Bug Description

For details, see https://bugs.launchpad.net/fuel/7.0.x/+bug/1490523/comments/20

This needs to be worked around until an upstream fix for mod_wsgi is available.

This happens during cluster scale-up and scale-down operations: Apache gets reloaded, mod_wsgi starts misbehaving, and the deployment fails because we cannot operate on Keystone entities.

For further details, see:

https://github.com/GrahamDumpleton/mod_wsgi/issues/81

Aleksander Mogylchenko (amogylchenko) wrote on 2015-09-08:

The workaround is simple: do not reload mod_wsgi so often. As the GitHub comments suggest (and my comments in bug 1490523), there should be a pause of at least 1 minute between restarts. Alternatively, run Keystone as a separate process.
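
The pause suggested above can be enforced mechanically. This is a minimal sketch, assuming a wrapper script is acceptable in the deployment: it records the time of the last reload in a stamp file and sleeps until the minimum interval has passed before reloading again. The names MIN_INTERVAL, STAMP, RELOAD_CMD, seconds_to_wait and throttled_reload are illustrative and do not come from fuel-library; this is not the workaround that was later merged.

```shell
# Illustrative defaults (assumptions, not taken from fuel-library).
MIN_INTERVAL="${MIN_INTERVAL:-60}"              # minimum seconds between reloads
STAMP="${STAMP:-/var/run/apache-reload.stamp}"  # hypothetical stamp file path
RELOAD_CMD="${RELOAD_CMD:-apachectl graceful}"

# seconds_to_wait LAST NOW MIN
# Prints how long to sleep so that at least MIN seconds separate two reloads.
seconds_to_wait() {
    elapsed=$(( $2 - $1 ))
    if [ "$elapsed" -ge "$3" ]; then
        echo 0
    else
        echo $(( $3 - elapsed ))
    fi
}

# Reload Apache, waiting first if the previous reload was too recent.
throttled_reload() {
    now=$(date +%s)
    last=$(cat "$STAMP" 2>/dev/null || true)
    last=${last:-0}
    pause=$(seconds_to_wait "$last" "$now" "$MIN_INTERVAL")
    if [ "$pause" -gt 0 ]; then
        sleep "$pause"
    fi
    # Record the time only if the reload command succeeded.
    $RELOAD_CMD && date +%s > "$STAMP"
}
```

Calling `throttled_reload` in place of a bare `apachectl graceful` would delay any reload requested less than MIN_INTERVAL seconds after the previous one, instead of issuing it immediately.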

Igor Marnat (imarnat) wrote on 2015-09-08:

+amogylchenko: Sasha, I don't think we reload mod_wsgi more often than once a minute. What makes you think so?

Aleksander Mogylchenko (amogylchenko) wrote on 2015-09-08:

There was a somewhat similar investigation here:
https://bugs.launchpad.net/mos/+bug/1481671/comments/3

And the comments on the upstream issue say so:
https://github.com/GrahamDumpleton/mod_wsgi/issues/81#issue-94403134
> Note that this happens if there is less than 1 second in between reloads:

Aleksander Mogylchenko (amogylchenko) wrote on 2015-09-08:

Since it is not clear from this description how to reproduce the problem, please provide more detailed information:
1. The environment you are using
2. Apache and mod_wsgi versions
3. A shell command to reproduce the problem

If the steps are similar to those in the upstream GitHub issue, please do not restart mod_wsgi so often, or run Keystone as a separate process.

Changed in fuel:
status:	Confirmed → Incomplete
assignee:	MOS Linux (mos-linux) → Vladimir Kuklin (vkuklin)

Igor Marnat (imarnat) wrote on 2015-09-08:

@amogylchenko: Sasha, there are more details in bug https://bugs.launchpad.net/mos/+bug/1491576. Does this answer your questions?

Vladimir Kuklin (vkuklin) on 2015-09-10

Changed in fuel:
status:	Incomplete → Confirmed
assignee:	Vladimir Kuklin (vkuklin) → MOS Linux (mos-linux)

Aleksander Mogylchenko (amogylchenko) wrote on 2015-09-10:

Please provide the following information:
1. The environment you are using
2. Apache and mod_wsgi versions
3. A shell command to reproduce the problem

Changed in fuel:
status:	Confirmed → Incomplete
assignee:	MOS Linux (mos-linux) → Vladimir Kuklin (vkuklin)

Vladimir Kuklin (vkuklin) wrote on 2015-09-10:

1. apache2:
  Installed: 2.4.7-1ubuntu4.5
  Candidate: 2.4.7-1ubuntu4.5
  Version table:
*** 2.4.7-1ubuntu4.5 0
        500 http://mirrors.msk.mirantis.net/ubuntu/ trusty-updates/main amd64 Packages
        500 http://mirrors.msk.mirantis.net/ubuntu/ trusty-security/main amd64 Packages
        100 /var/lib/dpkg/status
     2.4.7-1ubuntu4 0
        500 http://mirrors.msk.mirantis.net/ubuntu/ trusty/main amd64 Packages
libapache2-mod-wsgi:
  Installed: 3.4-4ubuntu2.1.14.04.2
  Candidate: 3.4-4ubuntu2.1.14.04.2
  Version table:
*** 3.4-4ubuntu2.1.14.04.2 0
        500 http://mirrors.msk.mirantis.net/ubuntu/ trusty-updates/main amd64 Packages
        500 http://mirrors.msk.mirantis.net/ubuntu/ trusty-security/main amd64 Packages
        100 /var/lib/dpkg/status
     3.4-4ubuntu2 0
        500 http://mirrors.msk.mirantis.net/ubuntu/ trusty/main amd64 Packages

2. Any Fuel ISO with mod_wsgi enabled, e.g. #286.

3. Steps to reproduce:
a) In one shell, periodically run the `openstack token issue` command or a similar Rally scenario that generates some load.
b) In another shell, run:
while :; do let i=i+1; echo -e "`date`\n"; apachectl graceful 2>&1; sleep N; done

Then grep /var/log/syslog for Apache segfault messages.

Judging by the symptoms, N can be any value up to 60.
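
The reproduction steps above can be collected into a small helper. This is a sketch under stated assumptions: INTERVAL, ITERATIONS, LOG, RELOAD_CMD and both function names are illustrative, and the grep pattern is a guess at how the crash appears in syslog rather than a format taken from this report.

```shell
# Illustrative settings (assumptions): INTERVAL is the N from the steps above.
INTERVAL="${INTERVAL:-30}"
ITERATIONS="${ITERATIONS:-120}"
LOG="${LOG:-/var/log/syslog}"
RELOAD_CMD="${RELOAD_CMD:-apachectl graceful}"

# Count log lines that mention both apache and a segfault.
# grep -c exits non-zero when nothing matches; "|| true" hides that exit code.
count_segfaults() {
    grep -ci 'apache.*segfault' "$1" || true
}

# Reload Apache ITERATIONS times, pausing INTERVAL seconds after each reload.
reload_loop() {
    i=0
    while [ "$i" -lt "$ITERATIONS" ]; do
        i=$((i + 1))
        printf '%s reload #%d\n' "$(date)" "$i"
        $RELOAD_CMD 2>&1
        sleep "$INTERVAL"
    done
}
```

With `openstack token issue` running in another shell, one would record `count_segfaults "$LOG"` before and after calling `reload_loop` and compare the two counts.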

Changed in fuel:
status:	Incomplete → Confirmed
assignee:	Vladimir Kuklin (vkuklin) → Aleksander Mogylchenko (amogylchenko)

Aleksander Mogylchenko (amogylchenko) wrote on 2015-09-10:

Unable to reproduce after an hour-long run with N=30, following the provided steps on ISO #187. It has mod_wsgi enabled with the following parameters:
WSGIDaemonProcess keystone_main display-name=keystone-main group=keystone processes=2 threads=1 user=keystone

I was running 'openstack token issue' in one console (endless loop without sleeps):
while true; do openstack token issue; done

And the exact loop as provided in the other console with N=30:
while :; do let i=i+1; echo -e "`date`\n"; apachectl graceful 2>&1; sleep 30; done

Changed in fuel:
status:	Confirmed → Incomplete
assignee:	Aleksander Mogylchenko (amogylchenko) → Vladimir Kuklin (vkuklin)

Vladimir Kuklin (vkuklin) wrote on 2015-09-10:

According to the discussion, this bug is as-designed behavior of apache2, which cannot handle cold/warm restarts properly. For now we are closing this bug, and we will switch to uwsgi and apparently nginx in the upcoming releases.

Changed in fuel:
status:	Incomplete → Won't Fix
tags:	added: release-notes

OpenStack Infra (hudson-openstack) wrote on 2015-09-10: Related fix proposed to fuel-library (master)

#10

Related fix proposed to branch: master
Review: https://review.openstack.org/222360

OpenStack Infra (hudson-openstack) wrote on 2015-09-10: Related fix proposed to fuel-library (stable/7.0)

#11

Related fix proposed to branch: stable/7.0
Review: https://review.openstack.org/222362

OpenStack Infra (hudson-openstack) wrote on 2015-09-11: Related fix merged to fuel-library (master)

#12

Reviewed: https://review.openstack.org/222360
Committed: https://git.openstack.org/cgit/stackforge/fuel-library/commit/?id=92a4ed5bb811c62263d72cf76b8cbd7a719fed3f
Submitter: Jenkins
Branch: master

commit 92a4ed5bb811c62263d72cf76b8cbd7a719fed3f
Author: Vladimir Kuklin <email address hidden>
Date: Thu Sep 10 23:44:48 2015 +0300

Add workaround for apache restart during deployment

    As figured out in the following launcpad bug
    https://bugs.launchpad.net/fuel/+bug/1493353
    Apache2 is not very friendly to any types of
    restarts. This may lead to the issue when we
    restart apache to often during deployment and
    it gets into Byzantine unresponsive state
    along with keystone which breaks keystone
    providers operations.

    Change-Id: I11c52089e9598fc6d088c3478c90de3aa853652a
    Closes-bug: #1493372
    Related-bug: #1493353

OpenStack Infra (hudson-openstack) wrote on 2015-09-11: Related fix merged to fuel-library (stable/7.0)

#13

Reviewed: https://review.openstack.org/222362
Committed: https://git.openstack.org/cgit/stackforge/fuel-library/commit/?id=d4723a87bee9ab1a958bdec174b222a0fcd05d5c
Submitter: Jenkins
Branch: stable/7.0

commit d4723a87bee9ab1a958bdec174b222a0fcd05d5c
Author: Vladimir Kuklin <email address hidden>
Date: Thu Sep 10 23:44:48 2015 +0300

Add workaround for apache restart during deployment

    Change-Id: I11c52089e9598fc6d088c3478c90de3aa853652a
    Closes-bug: #1493372
    Related-bug: #1493353

Kyrylo Galanov (kgalanov) wrote on 2015-11-02:

#14

Hello,

Unfortunately, the fix does not help. The bug is still present in 8.0 (mentioned in https://bugs.launchpad.net/fuel/+bug/1506449).
According to information available on the Internet, Apache segfaults if two restart commands are issued in rapid succession.
So, the fix might look like:
apachectl graceful || sleep 15 && apachectl restart
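
One caveat about the one-liner above: in POSIX shell, `||` and `&&` have equal precedence and associate left to right, so it parses as `(apachectl graceful || sleep 15) && apachectl restart`, and the hard restart also runs immediately after a successful graceful reload. Below is a sketch of the fallback with explicit grouping, assuming the intent was to restart only when the graceful reload fails. The function name and its arguments are illustrative.

```shell
# graceful_with_fallback GRACEFUL_CMD DELAY RESTART_CMD
# Runs RESTART_CMD (after DELAY seconds) only if GRACEFUL_CMD fails.
graceful_with_fallback() {
    $1 || { sleep "$2"; $3; }
}

# Intended use on a controller node:
#   graceful_with_fallback "apachectl graceful" 15 "apachectl restart"
```

If the intent was instead to always follow the graceful reload with a delayed restart, the plain sequence `apachectl graceful; sleep 15; apachectl restart` expresses that more directly.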

--
VERSION:
  feature_groups:
    - mirantis
  production: "docker"
  release: "8.0"
  openstack_version: "2015.1.0-8.0"
  api: "1.0"
  build_number: "204"
  build_id: "204"
  fuel-nailgun_sha: "5a3b8907ae9ebd56c354436a9e8c9a47edf459ad"
  python-fuelclient_sha: "2a1b048cc439986e222ece43a290b5cc68e92a77"
  fuel-agent_sha: "d2103bee6e216396eb8e308ec5448328c9ee4261"
  fuel-nailgun-agent_sha: "00b4b11553c250f22c0079fb74c8b782dcb7b740"
  astute_sha: "cfd5d6b916a17ad2f73e6c567a0365845155b0e3"
  fuel-library_sha: "7794da76fd5797c4c4242fb4e70e3757d37c4a01"
  fuel-ostf_sha: "1ab201cb8c3bba04522bf56ce72e863a03ff09b3"
  fuel-createmirror_sha: "6e1b82b2059a20f1fa9a4d794b976edaad156b85"
  fuelmenu_sha: "e68335c88feca803c97d75ae5a6e7de1e3f330dc"
  shotgun_sha: "bbbfccff9eb90895b13fae3fac398e65efe646f4"
  fuelmain_sha: "058e07386350bfa0a8365818cf75893949e0d863"

Nastya Urlapova (aurlapova) wrote on 2015-11-02:

#15

Kyrylo, if the issue is present in 8.0, please add the proper milestone to it instead of reopening the old one.

Dmitry Klenov (dklenov) wrote on 2015-11-02:

#16

I see that the fix was merged to the stable/7.0 branch. Vladimir, can you please confirm that no more fixes to stable/7.0 are expected? If so, please also move the bug to 'fix committed'.

Bogdan Dobrelya (bogdando) wrote on 2015-11-02:

#17

Please note, all bugs involving a cluster scale-up/down should have this tag

tags:

added: life-cycle-management

Vladimir Kuklin (vkuklin) wrote on 2015-11-05:

#18

Kiril, please create another bug, as the root cause of this bug was identified, fixed, and tested many times. Please create a new bug and add the test case and all the required details. In the worst case we will mark it as a duplicate of this one, but I suspect you have an outdated mod_wsgi version in your installation.

Kyrylo Galanov (kgalanov) wrote on 2015-11-05:

#19

Hello,

The new bug was originally filed as https://bugs.launchpad.net/fuel/+bug/1506449

--
Kyrylo

Dmitry Pyzhov (dpyzhov) on 2015-11-05

tags:

added: area-library

Dmitry Pyzhov (dpyzhov) on 2015-11-18

no longer affects:

fuel/8.0.x

Dmitry Pyzhov (dpyzhov) on 2015-11-18

Changed in fuel:
status:	Incomplete → Won't Fix
status:	Won't Fix → Invalid
status:	Invalid → Won't Fix

Olga Gusarenko (ogusarenko) on 2016-02-25

tags:

added: 8.0 release-notes-done
removed: release-notes
