Apache restart leads to issues with deployment. Could not stop Service[httpd]: Execution of 'service apache2 stop || sleep 60 && service apache2 stop' returned 1

Bug #1472675 reported by Anastasia Palkina
This bug affects 2 people
Affects              Status         Importance   Assigned to         Milestone
Fuel for OpenStack   Fix Released   Critical     Aleksandr Didenko
Mirantis OpenStack   Invalid        Critical     MOS Keystone

Bug Description

1. Create new environment (Ubuntu)
2. Choose Neutron, VLAN
3. On the 'Settings' tab, select only Ceph for ephemeral volumes
4. Add 1 controller+ceph, 1 ceph
5. Start deployment. It was successful
6. Start OSTF tests. They passed, but most of the tests were skipped because there was no compute node in the environment
7. There is an error on the controller (node-15):

2015-07-08 14:43:01 ERR (/Stage[main]/Apache::Service/Service[httpd]) Failed to call refresh: Could not stop Service[httpd]: Execution of 'service apache2 stop || sleep 60 && service apache2 stop' returned 1:

"build_id": "2015-07-06_18-08-24", "build_number": "26", "release_versions": {"2014.2.2-7.0": {"VERSION": {"build_id": "2015-07-06_18-08-24", "build_number": "26", "api": "1.0", "fuel-library_sha": "251c54e8de2f41aacd260751e7a891e9fbffc45d", "nailgun_sha": "d040c5cebc9cdd24ef20cb7ecf0a337039baddec", "feature_groups": ["mirantis"], "openstack_version": "2014.2.2-7.0", "production": "docker", "python-fuelclient_sha": "315d8bf991fbe7e2ab91abfc1f59b2f24fd92f45", "astute_sha": "9cbb8ae5adbe6e758b24b3c1021aac1b662344e8", "fuel-ostf_sha": "a752c857deafd2629baf646b1b3188f02ff38084", "release": "7.0", "fuelmain_sha": "4f2dff3bdc327858fa45bcc2853cfbceae68a40c"}}}, "auth_required": true, "api": "1.0", "fuel-library_sha": "251c54e8de2f41aacd260751e7a891e9fbffc45d", "nailgun_sha": "d040c5cebc9cdd24ef20cb7ecf0a337039baddec", "feature_groups": ["mirantis"], "openstack_version": "2014.2.2-7.0", "production": "docker", "python-fuelclient_sha": "315d8bf991fbe7e2ab91abfc1f59b2f24fd92f45", "astute_sha": "9cbb8ae5adbe6e758b24b3c1021aac1b662344e8", "fuel-ostf_sha": "a752c857deafd2629baf646b1b3188f02ff38084", "release": "7.0", "fuelmain_sha": "4f2dff3bdc327858fa45bcc2853cfbceae68a40c"

Tags: fuel-to-mos
Anastasia Palkina (apalkina) wrote :
Changed in fuel:
assignee: Fuel Library Team (fuel-library) → Oleksiy Molchanov (omolchanov)
status: New → Confirmed
Oleksiy Molchanov (omolchanov) wrote :

Moving to Incomplete; I could not reproduce this on the same ISO.

Also, the Apache logs state that Apache was stopped, but Puppet somehow got a non-zero exit code.

Changed in fuel:
status: Confirmed → Incomplete
Oleksiy Molchanov (omolchanov) wrote :

I am closing this as Invalid because it has not been reproduced and has not been updated for 3 weeks.

Changed in fuel:
status: Incomplete → Invalid
Dennis Dmitriev (ddmitriev) wrote :

Reproduced on CI: http://jenkins-product.srt.mirantis.net:8080/job/7.0.system_test.ubuntu.services_ha/44/

Task horizon.pp failed with the following error: http://paste.openstack.org/show/406568/

        Scenario:
            1. Create a Fuel cluster. Set the option for Sahara installation. Choose Neutron GRE
            2. Add 3 nodes with "controller" role
            3. Add 1 node with "compute" role
            4. Deploy the Fuel cluster

ISO version: {u'build_id': u'2015-07-30_16-01-07', u'build_number': u'113', u'auth_required': True, u'fuel-ostf_sha': u'92cdab6c6829be0d2d0c561fe56346dac8708d95', u'fuel-library_sha': u'd1291ae75680818e715608814422075049a10ce8', u'nailgun_sha': u'21ba6e2606a056883734392187845c172ecf99aa', u'openstack_version': u'2015.1.0-7.0', u'fuel-nailgun-agent_sha': u'1512b9af6b41cc95c4d891c593aeebe0faca5a63', u'fuel-agent_sha': u'dee9f2eb7e2822e89f6253f500f0c2e376a5b824', u'api': u'1.0', u'python-fuelclient_sha': u'71bb8fa87ee25f0c1bb84317884da7c917902a63', u'astute_sha': u'488db988a1f2e18f99decf417371c50b2a7fb794', u'fuelmain_sha': u'de5b333815f8541224c6726dc8446ffc7fb18b5b', u'feature_groups': [u'mirantis'], u'release': u'7.0', u'release_versions': {u'2015.1.0-7.0': {u'VERSION': {u'build_id': u'2015-07-30_16-01-07', u'build_number': u'113', u'fuel-library_sha': u'd1291ae75680818e715608814422075049a10ce8', u'nailgun_sha': u'21ba6e2606a056883734392187845c172ecf99aa', u'fuel-ostf_sha': u'92cdab6c6829be0d2d0c561fe56346dac8708d95', u'fuel-nailgun-agent_sha': u'1512b9af6b41cc95c4d891c593aeebe0faca5a63', u'fuel-agent_sha': u'dee9f2eb7e2822e89f6253f500f0c2e376a5b824', u'api': u'1.0', u'python-fuelclient_sha': u'71bb8fa87ee25f0c1bb84317884da7c917902a63', u'astute_sha': u'488db988a1f2e18f99decf417371c50b2a7fb794', u'fuelmain_sha': u'de5b333815f8541224c6726dc8446ffc7fb18b5b', u'feature_groups': [u'mirantis'], u'release': u'7.0', u'openstack_version': u'2015.1.0-7.0', u'production': u'docker'}}}, u'production': u'docker'}

Changed in fuel:
status: Invalid → Confirmed
Mike Scherbakov (mihgen)
Changed in fuel:
assignee: Oleksiy Molchanov (omolchanov) → MOS Ceph (mos-ceph)
assignee: MOS Ceph (mos-ceph) → Fuel Library Team (fuel-library)
Changed in fuel:
assignee: Fuel Library Team (fuel-library) → Oleksiy Molchanov (omolchanov)
Oleksiy Molchanov (omolchanov) wrote :

It seems that the issue described in the bug description is not a bug in itself. In the first case the cluster was deployed successfully. In the second case (Dennis Dmitriev (ddmitriev), 2015-07-31) the Keystone backend was not ready during Sahara deployment: #1480875

Changed in fuel:
importance: High → Medium
assignee: Oleksiy Molchanov (omolchanov) → Fuel Library Team (fuel-library)
summary: - Ubuntu with ephemeral ceph: Could not stop Service[httpd]: Execution of
- 'service apache2 stop || sleep 60 && service apache2 stop' returned 1
+ Apache restart leads to issues with deployment. Could not stop
+ Service[httpd]: Execution of 'service apache2 stop || sleep 60 &&
+ service apache2 stop' returned 1
Changed in fuel:
importance: Medium → High
Vasyl Saienko (vsaienko)
Changed in fuel:
assignee: Fuel Library Team (fuel-library) → Vasyl Saienko (vsaienko)
Vasyl Saienko (vsaienko) wrote :

It seems that Puppet can't find the command and fails in the same second it is called.
Otherwise, if the first attempt had failed, there would have been a 60-second sleep before the second one:

'service apache2 stop || sleep 60 && service apache2 stop'

2015-08-05 02:08:24 +0000 Puppet (debug): Executing '/etc/init.d/apache2 status'
2015-08-05 02:08:24 +0000 Puppet (debug): Executing '/etc/init.d/apache2 status'
2015-08-05 02:08:24 +0000 Puppet (debug): Executing 'service apache2 stop || sleep 60 && service apache2 stop'
2015-08-05 02:08:24 +0000 /Stage[main]/Apache::Service/Service[httpd] (err): Failed to call refresh: Could not stop Service[httpd]: Execution of 'service apache2 stop || sleep 60 && service apache2 stop' returned 1:
2015-08-05 02:08:24 +0000 /Stage[main]/Apache::Service/Service[httpd] (err): Could not stop Service[httpd]: Execution of 'service apache2 stop || sleep 60 && service apache2 stop' returned 1:

[Wed Aug 05 02:08:17.843851 2015] [core:notice] [pid 3108:tid 139876821706624] AH00094: Command line: '/usr/sbin/apache2'
[Wed Aug 05 02:08:24.625869 2015] [mpm_worker:notice] [pid 3108:tid 139876821706624] AH00295: caught SIGTERM, shutting down
[Wed Aug 05 02:08:43.956296 2015] [mpm_worker:notice] [pid 7642:tid 140534070359936] AH00292: Apache/2.4.7 (Ubuntu) mod_wsgi/3.4 Python/2.7.6 configured -- resuming normal operations

https://paste.mirantis.net/show/863/
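
A note on how the shell parses the stop command: in POSIX shell, `&&` and `||` have equal precedence and associate left to right, so the command groups as `(stop || sleep 60) && stop`. When the first stop succeeds, the sleep is skipped but the second stop still runs immediately, which would be consistent with a failure logged in the same second. Whether that second stop is what returned 1 here is an assumption; nothing in this thread confirms it. A minimal sketch, using stand-in functions (hypothetical names) in place of the real service calls:

```shell
#!/bin/sh
# Stand-ins for 'service apache2 stop' and 'sleep 60'. They only record
# that they were called; all names here are hypothetical.
stop_ok()   { calls="${calls}stop ";  return 0; }
stop_fail() { calls="${calls}stop ";  return 1; }
pause()     { calls="${calls}sleep "; return 0; }

# Case 1: first stop succeeds. '||' skips the pause, but '&&' applies to
# the whole 'stop || pause' list, so a second stop runs right away.
calls=""
stop_ok || pause && stop_ok
case1="$calls"

# Case 2: first stop fails. The pause runs, then the second stop.
calls=""
stop_fail || pause && stop_ok
case2="$calls"

# Case 3: explicit grouping. The retry only happens after a failed stop.
calls=""
stop_ok || { pause && stop_ok; }
case3="$calls"

printf 'as written, first stop ok:    %s\n' "$case1"   # stop stop
printf 'as written, first stop fails: %s\n' "$case2"   # stop sleep stop
printf 'grouped, first stop ok:       %s\n' "$case3"   # stop
```

If the intent was "retry once after 60 seconds only when the first stop fails", the grouped form in case 3 expresses that; the ungrouped form always issues two stops.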

OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-library (master)

Fix proposed to branch: master
Review: https://review.openstack.org/209674

Changed in fuel:
status: Confirmed → In Progress
Vasyl Saienko (vsaienko) wrote :

I added debugging to Apache, built a custom ISO [1], ran BVT on it approximately 15 times, and could not reproduce this error.

[1] http://jenkins-product.srt.mirantis.net:8080/view/custom_iso/job/custom_7.0_iso/878/

OpenStack Infra (hudson-openstack) wrote : Change abandoned on fuel-library (master)

Change abandoned by Vasyl Saienko (<email address hidden>) on branch: master
Review: https://review.openstack.org/209674

OpenStack Infra (hudson-openstack) wrote :

Change abandoned by Vasyl Saienko (<email address hidden>) on branch: master
Review: https://review.openstack.org/209995

tags: added: fuel-to-mos
Vasyl Saienko (vsaienko) wrote :

Here are the steps to reproduce it:

Run Apache stop/start in a loop:
while :; do service apache2 stop; service apache2 start; done

After some time Apache fails to start because it can't bind to a socket that is still open:
 * Starting web server apache2
 *
 * Stopping web server apache2
 *
 * Starting web server apache2
 *
 * Stopping web server apache2
 *
 * Starting web server apache2
(98)Address already in use: AH00072: make_sock: could not bind to address [::]:35357
(98)Address already in use: AH00072: make_sock: could not bind to address 0.0.0.0:35357
no listening sockets available, shutting down
AH00015: Unable to open logs
Action 'start' failed.
The Apache error log may have more information.
 *
 * The apache2 instance did not start within 20 seconds. Please read the log files to discover problems
 * Stopping web server apache2
 *
 * Starting web server apache2
 *
 * Stopping web server apache2

I tried removing Keystone from Apache and running the loop with only the Horizon and radosgw WSGI apps.
In that setup I could not reproduce the error in 12 hours, while with Keystone it reproduces in 5-10 minutes.

@keystone: please check this from the Python side.
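
The reproduction loop in this comment runs forever and keeps cycling after a failed start. A bounded variant that stops at the first failed start may make the failing cycle easier to capture. This is only a sketch: the function name `cycle_until_failure` and the variable `CYCLES_DONE` are hypothetical, and the stop/start commands are passed in as arguments so the same loop can be tried against other services.

```shell
#!/bin/sh
# Run stop/start cycles until a start fails or MAX_CYCLES is reached.
# Usage: cycle_until_failure STOP_CMD START_CMD MAX_CYCLES
# Sets CYCLES_DONE to the number of cycles attempted.
# Returns 1 if a start failed, 0 if every cycle completed.
cycle_until_failure() {
    stop_cmd=$1
    start_cmd=$2
    max_cycles=$3
    CYCLES_DONE=0
    while [ "$CYCLES_DONE" -lt "$max_cycles" ]; do
        CYCLES_DONE=$((CYCLES_DONE + 1))
        # A failing stop is ignored, as in the original one-liner.
        $stop_cmd || true
        if ! $start_cmd; then
            echo "start failed on cycle $CYCLES_DONE" >&2
            return 1
        fi
    done
    return 0
}

# On an affected controller (same commands as in the comment above):
#   cycle_until_failure "service apache2 stop" "service apache2 start" 1000
```

The commands are expanded unquoted on purpose so that a string such as "service apache2 stop" is split into a command and its arguments.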

Changed in fuel:
assignee: Vasyl Saienko (vsaienko) → MOS Keystone (mos-keystone)
status: In Progress → Confirmed
Boris Bobrov (bbobrov) wrote :

There is nothing that can be done on the Keystone side. Please read the discussion at https://review.openstack.org/#/c/107131/1/lib/apache, https://review.openstack.org/#/c/107131/, bug 1342256 and bug 1340660.

The fix was accepted in upstream devstack, so why don't we take it from there and apply to our init scripts?

Vasyl Saienko (vsaienko) wrote :

@Boris we know about that bug, and we already have a similar fix applied in our manifests. But we are still able to reproduce this error from time to time on smoke/swarm/bvt. And according to Morgan Fainberg's comments in https://review.openstack.org/#/c/107131/1/lib/apache he agreed that it is not a permanent fix and helps only in some ~95% of cases.

I've opened a new bug against upstream Keystone: https://bugs.launchpad.net/keystone/+bug/1484836

Boris Bobrov (bbobrov) wrote :

Have you read the link he provided in comments? https://wiki.apache.org/httpd/CouldNotBindToAddress

In comments he also says that the proper fix is to "change the apache restart from *init* to apachectl which should also resolve the issue without requiring an explicit sleep". Have you done that?

Davanum Srinivas (DIMS) (dims-v) wrote :

+1 to Boris' suggestion. Waiting for feedback once that suggestion has been tried.

Changed in fuel:
status: Confirmed → Incomplete
Bogdan Dobrelya (bogdando) wrote :

MOS team, please investigate https://bugs.launchpad.net/keystone/+bug/1484836/comments/3
According to Sergii's comment, this is not only a CM issue but also an issue in Keystone itself.

Changed in mos:
assignee: nobody → MOS Keystone (mos-keystone)
Changed in fuel:
assignee: MOS Keystone (mos-keystone) → MOS Puppet Team (mos-puppet)
Changed in mos:
milestone: none → 7.0
importance: Undecided → High
Bogdan Dobrelya (bogdando) wrote :

MOS Puppet team, please also investigate apachectl as an alternative to "service apache2 restart" in the Puppet manifests.

Changed in fuel:
status: Incomplete → New
status: New → Confirmed
Bogdan Dobrelya (bogdando) wrote :

I split the bug across two projects so that the CM team is not blocked on the MOS team and vice versa.

Changed in mos:
importance: High → Critical
Changed in fuel:
importance: High → Critical
Bogdan Dobrelya (bogdando) wrote :

Raised to critical as there is another critical bug related to this one https://bugs.launchpad.net/fuel/+bug/1484066

Changed in mos:
status: New → Confirmed
Vasyl Saienko (vsaienko) wrote :

@Bogdan: apachectl doesn't solve the problem, at least on Ubuntu:
https://paste.mirantis.net/show/929/

Boris Bobrov (bbobrov) wrote :

Vasyl, have you tried apachectl *restart*?

Sergii Golovatiuk (sgolovatiuk) wrote :

This bug is valid. It's not related to ephemeral ports, as the port has been excluded from the ephemeral range: https://github.com/stackforge/fuel-library/blob/master/deployment/puppet/openstack/manifests/reserved_ports.pp

Boris Bobrov (bbobrov) wrote :

I also don't understand why you don't want to insert sleep(5) (or even sleep(10)) somewhere before start and stop. It is a totally legitimate solution, and it is proposed everywhere. It was accepted and used in Murano ( https://review.openstack.org/#/c/114969/1 ) and in upstream devstack ( https://github.com/openstack-dev/devstack/blob/144dbc62f8aa6a62cdca403a69bb883cb8552142/lib/apache#L179 ). It is proposed around the web ( http://serverfault.com/questions/555345/apache2-starting-then-port-fails-to-bind ) and in the official Apache2 wiki ( https://wiki.apache.org/httpd/CouldNotBindToAddress ).
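
The pause suggested in this comment could be wrapped roughly as follows. This is a minimal sketch under stated assumptions: `restart_with_pause` and its arguments are hypothetical names, not code from fuel-library, Murano, or devstack, and the pause length is left to the caller (5 or 10 seconds are the values mentioned above).

```shell
#!/bin/sh
# Stop a service, wait, then start it again.
# Usage: restart_with_pause STOP_CMD START_CMD PAUSE_SECONDS
# Returns 1 if the stop fails (start is not attempted); otherwise
# returns the exit status of the start command.
restart_with_pause() {
    stop_cmd=$1
    start_cmd=$2
    pause_seconds=$3
    $stop_cmd || return 1
    # Give the old process time to release its listening sockets.
    sleep "$pause_seconds"
    $start_cmd
}

# Example on a controller (assumption: the same init commands used
# elsewhere in this report):
#   restart_with_pause "service apache2 stop" "service apache2 start" 5
```

As noted earlier in the thread, a fixed pause reduces how often the bind failure occurs but is not a guaranteed fix.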

OpenStack Infra (hudson-openstack) wrote : Related fix merged to fuel-library (master)

Reviewed: https://review.openstack.org/209924
Committed: https://git.openstack.org/cgit/stackforge/fuel-library/commit/?id=ff2274e44ab732d943976d1b21c8997dc90a7d94
Submitter: Jenkins
Branch: master

commit ff2274e44ab732d943976d1b21c8997dc90a7d94
Author: vsaienko <email address hidden>
Date: Fri Aug 14 11:45:12 2015 +0300

    Tune tweak::apache_wrappers module

    - Sometimes apache fails to start after stop, due to unclosed
      resources. The problem frequently reproduced with keystone wsgi
      module, and didn't reproduced with horizon or radosgw.
      'apachectl restart' is recommended if doing start/stop rapidly
      https://wiki.apache.org/httpd/CouldNotBindToAddress
    - Redefine restart => 'apachectl graceful' for apache service
    - Remove disabling of GarbageCollector

    Related-Bug: #1472675
    Related-Bug: #1484066
    Related-Bug: #1457893
    Related-Bug: #1459357

    Change-Id: I34843639eacc9bcb6d451d3376440c8bfe9014f7

Bogdan Dobrelya (bogdando) wrote :

Marked as Invalid for MOS, as this bug has nothing to do with OpenStack. The merged patch https://review.openstack.org/209924 has actually resolved this bug, which appears to be purely configuration-related.

Changed in fuel:
status: Confirmed → Fix Committed
assignee: MOS Puppet Team (mos-puppet) → Vasyl Saienko (vsaienko)
Changed in mos:
status: Confirmed → Invalid
Artem Panchenko (apanchenko-8) wrote :

Reproduced again on BVT:

node-1 2015-08-20T23:51:20.014281 err: Could not prefetch keystone_endpoint provider 'openstack': Execution of '/usr/bin/openstack endpoint list --quiet --format csv --long' returned 1: ERROR: openstack Service Unavailable (HTTP 503)
node-1 2015-08-20T23:51:20.014609 err:
node-1 2015-08-20T23:51:20.015448 err: /usr/lib/ruby/vendor_ruby/puppet/util/execution.rb:188:in `execute'
node-1 2015-08-20T23:51:20.015448 err: /usr/lib/ruby/vendor_ruby/puppet/provider/command.rb:23:in `execute'
node-1 2015-08-20T23:51:20.015448 err: /usr/lib/ruby/vendor_ruby/puppet/provider.rb:237:in `block in has_command'
node-1 2015-08-20T23:51:20.015448 err: /etc/puppet/modules/openstacklib/lib/puppet/provider/openstack.rb:26:in `block (2 levels) in request'
node-1 2015-08-20T23:51:20.015448 err: /etc/puppet/modules/openstacklib/lib/puppet/provider/openstack.rb:23:in `loop'
node-1 2015-08-20T23:51:20.015448 err: /etc/puppet/modules/openstacklib/lib/puppet/provider/openstack.rb:23:in `block in request'
node-1 2015-08-20T23:51:20.015448 err: /usr/lib/ruby/vendor_ruby/puppet/util.rb:43:in `withenv'
node-1 2015-08-20T23:51:20.015448 err: /etc/puppet/modules/openstacklib/lib/puppet/provider/openstack.rb:19:in `request'
node-1 2015-08-20T23:51:20.015448 err: /etc/puppet/modules/openstacklib/lib/puppet/provider/openstack/auth.rb:44:in `request'
node-1 2015-08-20T23:51:20.015448 err: /etc/puppet/modules/keystone/lib/puppet/provider/keystone.rb:110:in `request'
node-1 2015-08-20T23:51:20.015448 err: /etc/puppet/modules/keystone/lib/puppet/provider/keystone_endpoint/openstack.rb:82:in `instances'
node-1 2015-08-20T23:51:20.015448 err: /etc/puppet/modules/keystone/lib/puppet/provider/keystone_endpoint/openstack.rb:97:in `prefetch'
node-1 2015-08-20T23:51:20.015448 err: /usr/lib/ruby/vendor_ruby/puppet/transaction.rb:277:in `prefetch'
node-1 2015-08-20T23:51:20.015448 err: /usr/lib/ruby/vendor_ruby/puppet/transaction.rb:167:in `prefetch_if_necessary'
node-1 2015-08-20T23:51:20.015448 err: /usr/lib/ruby/vendor_ruby/puppet/transaction.rb:67:in `block in evaluate'
node-1 2015-08-20T23:51:20.015448 err: /usr/lib/ruby/vendor_ruby/puppet/graph/relationship_graph.rb:116:in `call'
node-1 2015-08-20T23:51:20.015448 err: /usr/lib/ruby/vendor_ruby/puppet/graph/relationship_graph.rb:116:in `traverse'
node-1 2015-08-20T23:51:20.015448 err: /usr/lib/ruby/vendor_ruby/puppet/transaction.rb:108:in `evaluate'
node-1 2015-08-20T23:51:20.015448 err: /usr/lib/ruby/vendor_ruby/puppet/resource/catalog.rb:164:in `block in apply'
node-1 2015-08-20T23:51:20.015448 err: /usr/lib/ruby/vendor_ruby/puppet/util/log.rb:149:in `with_destination'
node-1 2015-08-20T23:51:20.015448 err: /usr/lib/ruby/vendor_ruby/puppet/transaction/report.rb:108:in `as_logging_destination'
node-1 2015-08-20T23:51:20.015448 err: /usr/lib/ruby/vendor_ruby/puppet/resource/catalog.rb:163:in `apply'
node-1 2015-08-20T23:51:20.015448 err: /usr/lib/ruby/vendor_ruby/puppet/configurer.rb:125:in `block in apply_catalog'
node-1 2015-08-20T23:51:20.015448 err: /usr/lib/ruby/vendor_ruby/puppet/util.rb:161:in `block in benchmark'
node-1 2015-08-20T23:51:20.015448 err: /usr/lib/ruby/1.9.1/benchmark.rb:295:in `realtime'
no...

Changed in fuel:
status: Fix Committed → Confirmed
Changed in fuel:
assignee: Vasyl Saienko (vsaienko) → Aleksandr Didenko (adidenko)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-library (master)

Fix proposed to branch: master
Review: https://review.openstack.org/215545

Changed in fuel:
status: Confirmed → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-library (master)

Reviewed: https://review.openstack.org/215545
Committed: https://git.openstack.org/cgit/stackforge/fuel-library/commit/?id=475e8a33ef6dab7f9b1db9561e2a62c1534c2226
Submitter: Jenkins
Branch: master

commit 475e8a33ef6dab7f9b1db9561e2a62c1534c2226
Author: Aleksandr Didenko <email address hidden>
Date: Fri Aug 21 12:11:24 2015 +0300

    Add greaceful apache restart to radosgw task

    We need to avoid apache stop/start during radosgw configuration.
    Also added noop-rspec tests for all tasks where we use apache
    class and need to reload apache service.

    Closes-bug: #1472675
    Change-Id: I164998ed7f3042dc82d03fbf78812df4f10fdec1

Changed in fuel:
status: In Progress → Fix Committed
tags: added: on-verification
Anastasia Palkina (apalkina) wrote :

Verified on ISO #256. Also, I didn't see this issue on the latest ISOs across different test cases.

"build_id": "265", "build_number": "265", "release_versions": {"2015.1.0-7.0": {"VERSION": {"build_id": "265", "build_number": "265", "api": "1.0", "fuel-library_sha": "4fdf3d6b070204366593012428395d173698678a", "nailgun_sha": "0dfcf73deb8ae99654f3da2ea95b7b68b9ee7273", "feature_groups": ["mirantis"], "fuel-nailgun-agent_sha": "d7027952870a35db8dc52f185bb1158cdd3d1ebd", "openstack_version": "2015.1.0-7.0", "fuel-agent_sha": "082a47bf014002e515001be05f99040437281a2d", "production": "docker", "python-fuelclient_sha": "9643fa07f1290071511066804f962f62fe27b512", "astute_sha": "e63709d16bd4c1949bef820ac336c9393c040d25", "fuel-ostf_sha": "582a81ccaa1e439a3aec4b8b8f6994735de840f4", "release": "7.0", "fuelmain_sha": "9ab01caf960013dc882825dc9b0e11ccf0b81cb0"}}}, "auth_required": true, "api": "1.0", "fuel-library_sha": "4fdf3d6b070204366593012428395d173698678a", "nailgun_sha": "0dfcf73deb8ae99654f3da2ea95b7b68b9ee7273", "feature_groups": ["mirantis"], "fuel-nailgun-agent_sha": "d7027952870a35db8dc52f185bb1158cdd3d1ebd", "openstack_version": "2015.1.0-7.0", "fuel-agent_sha": "082a47bf014002e515001be05f99040437281a2d", "production": "docker", "python-fuelclient_sha": "9643fa07f1290071511066804f962f62fe27b512", "astute_sha": "e63709d16bd4c1949bef820ac336c9393c040d25", "fuel-ostf_sha": "582a81ccaa1e439a3aec4b8b8f6994735de840f4", "release": "7.0", "fuelmain_sha": "9ab01caf960013dc882825dc9b0e11ccf0b81cb0"

Changed in fuel:
status: Fix Committed → Fix Released
tags: removed: on-verification