Upgrade "Pike BM" -> "Pike Containers": httpd issue

Bug #1761093 reported by Cédric Jeanneret deactivated
This bug affects 1 person
Affects    Status     Importance    Assigned to    Milestone
tripleo    Expired    Low           Unassigned

Bug Description

Dear Stackers,

I am still trying to complete this upgrade, which we need because bare-metal support will be dropped shortly and Queens does not support this kind of migration. I have hit a new issue:

Apparently, a step in the migration (resources.ControllerDeployment_Step3, though I'm not 100% sure it's this one) involves starting the httpd service, and this fails on at least one controller.

Here are the relevant logs from `openstack stack event list --nested-depth 3 --follow overcloud`:
2018-04-04 07:33:35Z [0]: SIGNAL_IN_PROGRESS Signal: deployment 53b94496-f243-40d0-8e9a-92fd52f8a1c2 failed (2)
2018-04-04 07:33:36Z [0]: CREATE_FAILED Error: resources[0]: Deployment to server failed: deploy_status_code : Deployment exited with non-zero status code: 2
2018-04-04 07:33:36Z [overcloud.AllNodesDeploySteps.AllNodesPostUpgradeSteps.ControllerDeployment_Step3]: CREATE_FAILED Resource CREATE failed: Error: resources[0]: Deployment to server failed: deploy_status_code : Deployment exited with non-zero status code: 2
2018-04-04 07:33:36Z [overcloud.AllNodesDeploySteps.AllNodesPostUpgradeSteps.ControllerDeployment_Step3]: CREATE_FAILED Error: resources.ControllerDeployment_Step3.resources[0]: Deployment to server failed: deploy_status_code: Deployment exited with non-zero status code: 2
2018-04-04 07:33:36Z [overcloud.AllNodesDeploySteps.AllNodesPostUpgradeSteps]: CREATE_FAILED Resource CREATE failed: Error: resources.ControllerDeployment_Step3.resources[0]: Deployment to server failed: deploy_status_code: Deployment exited with non-zero status code: 2
2018-04-04 07:33:37Z [overcloud.AllNodesDeploySteps.AllNodesPostUpgradeSteps]: CREATE_FAILED Error: resources.AllNodesPostUpgradeSteps.resources.ControllerDeployment_Step3.resources[0]: Deployment to server failed: deploy_status_code: Deployment exited with non-zero status code: 2
2018-04-04 07:33:37Z [overcloud.AllNodesDeploySteps]: UPDATE_FAILED Error: resources.AllNodesPostUpgradeSteps.resources.ControllerDeployment_Step3.resources[0]: Deployment to server failed: deploy_status_code: Deployment exited with non-zero status code: 2
2018-04-04 07:33:38Z [overcloud.AllNodesDeploySteps]: UPDATE_FAILED resources.AllNodesDeploySteps: Error: resources.AllNodesPostUpgradeSteps.resources.ControllerDeployment_Step3.resources[0]: Deployment to server failed: deploy_status_code: Deployment exited with non-zero status code: 2
2018-04-04 07:33:39Z [overcloud]: UPDATE_FAILED resources.AllNodesDeploySteps: Error: resources.AllNodesPostUpgradeSteps.resources.ControllerDeployment_Step3.resources[0]: Deployment to server failed: deploy_status_code: Deployment exited with non-zero status code: 2
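
For anyone triaging a similar failure, an event list like the one above can be reduced to its failed resources with a few lines of Python. This is only a minimal sketch and not part of the original report: the line format is inferred from the output shown here, the helper names `failed_events` and `deepest_failed_resource` are hypothetical, and choosing the longest dotted resource path is just a heuristic for finding the most specific failing step.

```python
import re

# Assumed line shape, inferred from the event list output in this report:
# "<date> <time>Z [<resource path>]: <STATUS> <message>"
EVENT_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})Z "
    r"\[(?P<resource>[^\]]+)\]: "
    r"(?P<status>[A-Z_]+)\s+(?P<message>.*)$"
)


def failed_events(lines):
    """Return (timestamp, resource, status) for every *_FAILED event line."""
    failures = []
    for line in lines:
        match = EVENT_RE.match(line.strip())
        if match and match.group("status").endswith("_FAILED"):
            failures.append(
                (match.group("ts"), match.group("resource"), match.group("status"))
            )
    return failures


def deepest_failed_resource(lines):
    """Heuristic: the failed resource with the most dots in its path."""
    candidates = [resource for _, resource, _ in failed_events(lines)]
    return max(candidates, key=lambda r: r.count("."), default=None)


# A few lines copied from the event list in this report.
sample = [
    "2018-04-04 07:33:35Z [0]: SIGNAL_IN_PROGRESS Signal: deployment "
    "53b94496-f243-40d0-8e9a-92fd52f8a1c2 failed (2)",
    "2018-04-04 07:33:36Z [0]: CREATE_FAILED Error: resources[0]: Deployment to "
    "server failed: deploy_status_code : Deployment exited with non-zero status code: 2",
    "2018-04-04 07:33:36Z [overcloud.AllNodesDeploySteps.AllNodesPostUpgradeSteps."
    "ControllerDeployment_Step3]: CREATE_FAILED Resource CREATE failed: Error: "
    "resources[0]: Deployment to server failed: deploy_status_code : Deployment "
    "exited with non-zero status code: 2",
    "2018-04-04 07:33:39Z [overcloud]: UPDATE_FAILED resources.AllNodesDeploySteps: "
    "Error: resources.AllNodesPostUpgradeSteps.resources.ControllerDeployment_Step3."
    "resources[0]: Deployment to server failed: deploy_status_code: Deployment "
    "exited with non-zero status code: 2",
]

print(deepest_failed_resource(sample))
```

On these sample lines the heuristic points at `ControllerDeployment_Step3`, which matches the step suspected in the description.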

A (partial) output of `openstack stack failures list --long overcloud` shows this at the end:
            "Error: Systemd start for httpd failed!",
            "journalctl log for httpd:",
            "-- Logs begin at Thu 2018-03-29 08:26:07 UTC, end at Wed 2018-04-04 07:33:01 UTC. --",
            "Apr 04 07:32:19 lab-controller-0.cloud.camptocamp.com systemd[1]: Starting The Apache HTTP Server...",
            "Apr 04 07:32:21 lab-controller-0.cloud.camptocamp.com python[262846]: WARNING:root:\"dashboards\" and \"default_dashboard\" in (local_)settings is DEPRECATED now and may be unsupported in some future release. The preferred way to specify the order of dashboards and the default dashboard is the pluggable dashboard mechanism (in /usr/share/openstack-dashboard/openstack_dashboard/enabled, /usr/share/openstack-dashboard/openstack_dashboard/local/enabled).",
            "Apr 04 07:32:27 lab-controller-0.cloud.camptocamp.com python[263467]: WARNING:root:\"dashboards\" and \"default_dashboard\" in (local_)settings is DEPRECATED now and may be unsupported in some future release. The preferred way to specify the order of dashboards and the default dashboard is the pluggable dashboard mechanism (in /usr/share/openstack-dashboard/openstack_dashboard/enabled, /usr/share/openstack-dashboard/openstack_dashboard/local/enabled).",
            "Apr 04 07:32:31 lab-controller-0.cloud.camptocamp.com python[263467]: ERROR:scss.ast:Function not found: twbs-font-path:1",
            "Apr 04 07:32:32 lab-controller-0.cloud.camptocamp.com python[263467]: ERROR:scss.ast:Function not found: twbs-font-path:1",
            "Apr 04 07:32:41 lab-controller-0.cloud.camptocamp.com python[263467]: ERROR:scss.ast:Function not found: twbs-font-path:1",
            "Apr 04 07:32:54 lab-controller-0.cloud.camptocamp.com python[263467]: ERROR:scss.ast:Function not found: function-exists:1",
            "Error: /Stage[main]/Apache::Service/Service[httpd]/ensure: change from stopped to running failed: Systemd start for httpd failed!",
            "Warning: /Firewall[998 log all ipv4]: Skipping because of failed dependencies",
            "Warning: /Firewall[998 log all ipv6]: Skipping because of failed dependencies",
            "Warning: /Firewall[999 drop all ipv4]: Skipping because of failed dependencies",
            "Warning: /Firewall[999 drop all ipv6]: Skipping because of failed dependencies",
            "Warning: /Stage[main]/Firewall::Linux::Redhat/File[/etc/sysconfig/iptables]: Skipping because of failed dependencies",
            "Warning: /Stage[main]/Firewall::Linux::Redhat/File[/etc/sysconfig/ip6tables]: Skipping because of failed dependencies",
            "Warning: /Stage[main]/Tripleo::Firewall/Exec[nonpersistent_v4_rules_cleanup]: Skipping because of failed dependencies",
            "Warning: /Stage[main]/Tripleo::Firewall/Exec[reload_iptables]: Skipping because of failed dependencies",
            "Warning: /Stage[main]/Tripleo::Firewall/Exec[nonpersistent_v6_rules_cleanup]: Skipping because of failed dependencies",
            "Warning: /Stage[main]/Tripleo::Firewall/Exec[reload_ip6tables]: Skipping because of failed dependencies"

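In output like the above, only the `Error:` lines point at the actual failure; the `Warning: ... Skipping because of failed dependencies` lines are knock-on effects of it. A small Python sketch for separating the two is shown below. It is not part of the original report: the function name `split_puppet_output` is hypothetical, and the quote-and-comma stripping assumes the JSON-like list formatting seen in this paste.

```python
def split_puppet_output(lines):
    """Split pasted Puppet output into real errors and cascading skip warnings."""
    errors, skipped = [], []
    for raw in lines:
        # Drop indentation plus the surrounding quotes and trailing comma
        # that come from the list formatting of the pasted output.
        line = raw.strip().strip('",')
        if line.startswith("Error:"):
            errors.append(line)
        elif line.startswith("Warning:") and line.endswith(
            "Skipping because of failed dependencies"
        ):
            skipped.append(line)
    return errors, skipped


# Lines copied from the failures list output in this report.
sample = [
    '            "Error: Systemd start for httpd failed!",',
    '            "journalctl log for httpd:",',
    '            "Error: /Stage[main]/Apache::Service/Service[httpd]/ensure: '
    'change from stopped to running failed: Systemd start for httpd failed!",',
    '            "Warning: /Firewall[998 log all ipv4]: Skipping because of failed dependencies",',
]

errors, skipped = split_puppet_output(sample)
for line in errors:
    print(line)
```
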
As it failed on only one node, it might be a temporary glitch. Moreover, httpd IS actually running, but its unit state is reported as:
Loaded: loaded (/usr/lib/systemd/system/httpd.service; disabled; vendor preset: disabled)

This suggests httpd shouldn't be running directly on the bare-metal host. Moreover, I don't see why it should, as all services should be running in containers, each embedding its own httpd service when needed…
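
One point worth keeping in mind when reading that line: the `Loaded:` field reports whether the unit is enabled at boot, not whether it is currently running (the running state is on the separate `Active:` line of `systemctl status`). The following Python sketch pulls the fields out of a `Loaded:` line so the two can be compared. It is an illustration only: the regular expression is written against the single line quoted in this report, and the name `parse_loaded_line` and the field names are hypothetical.

```python
import re

# Matches e.g. "Loaded: loaded (<unit file>; <enablement>; vendor preset: <preset>)".
# Written against the one line quoted in this report; other systemd versions
# may format the line differently.
LOADED_RE = re.compile(
    r"Loaded:\s+(?P<load_state>\S+)\s+"
    r"\((?P<unit_file>[^;]+);\s*(?P<enablement>[^;)]+)"
    r"(?:;\s*vendor preset:\s*(?P<vendor_preset>[^)]+))?\)"
)


def parse_loaded_line(line):
    """Return the fields of a systemctl 'Loaded:' line as a dict, or None."""
    match = LOADED_RE.search(line)
    return match.groupdict() if match else None


info = parse_loaded_line(
    "Loaded: loaded (/usr/lib/systemd/system/httpd.service; disabled; "
    "vendor preset: disabled)"
)
print(info["enablement"])
```

Here `enablement` comes back as `disabled`, which is consistent with a service that was started by hand or by a deployment step rather than at boot.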

Any clue?

Thank you!

Changed in tripleo:
status: New → Triaged
importance: Undecided → Medium
importance: Medium → Low
milestone: none → rocky-2
Cédric Jeanneret deactivated (cjeanneret-c2c-deactivated) wrote :

Well, apparently it was a glitch… The migration just finished. Like… yes…

2018-04-05 08:30:17Z [overcloud]: UPDATE_COMPLETE Stack UPDATE completed successfully

 Stack overcloud UPDATE_COMPLETE

Overcloud Endpoint: https://x.y.z:13000/v2.0
Overcloud Deployed

But I don't really feel confident running that in production :D.

Changed in tripleo:
milestone: rocky-2 → rocky-3
Changed in tripleo:
milestone: rocky-3 → rocky-rc1
Changed in tripleo:
milestone: rocky-rc1 → stein-1
Changed in tripleo:
milestone: stein-1 → stein-2
Changed in tripleo:
milestone: stein-2 → stein-3
Changed in tripleo:
milestone: stein-3 → train-1
Changed in tripleo:
milestone: train-1 → train-2
Changed in tripleo:
milestone: train-2 → train-3
Changed in tripleo:
milestone: train-3 → ussuri-1
Changed in tripleo:
milestone: ussuri-1 → ussuri-2
wes hayutin (weshayutin)
Changed in tripleo:
milestone: ussuri-2 → ussuri-3
wes hayutin (weshayutin)
Changed in tripleo:
milestone: ussuri-3 → ussuri-rc3
wes hayutin (weshayutin)
Changed in tripleo:
milestone: ussuri-rc3 → victoria-1
Changed in tripleo:
milestone: victoria-1 → victoria-3
Changed in tripleo:
milestone: victoria-3 → wallaby-1
Changed in tripleo:
milestone: wallaby-1 → wallaby-2
Changed in tripleo:
milestone: wallaby-2 → wallaby-3
Marios Andreou (marios-b) wrote :

This is an automated action. Bug status has been set to 'Incomplete' and target milestone has been removed due to inactivity. If you disagree please re-set these values and reach out to us on freenode #tripleo

Changed in tripleo:
milestone: wallaby-3 → none
status: Triaged → Incomplete
Launchpad Janitor (janitor) wrote :

[Expired for tripleo because there has been no activity for 60 days.]

Changed in tripleo:
status: Incomplete → Expired