N->O upgrade on IPv6 deployment gets stuck during major-upgrade-composable-steps

Bug #1675782 reported by Sofer Athlan-Guyot
Affects: tripleo
Status: Fix Released
Importance: High
Assigned to: Sofer Athlan-Guyot

Bug Description

Originally reported here:
https://bugzilla.redhat.com/show_bug.cgi?id=1430384

Running the Newton->Ocata upgrade workflow on an IPv6 deployment with 3
controllers, 2 compute nodes and 3 Ceph nodes gets stuck. The nested
stacks remain in progress:

[stack@undercloud-0 ~]$ openstack stack list --nested | grep PROGRESS
| 52ca5c6b-6353-450f-9ec3-6ffb3df5da96 | overcloud-AllNodesDeploySteps-n7gn2tdu4r7m-AllNodesPostUpgradeSteps-cob3l24fahnm-ControllerDeployment_Step1-gl6bzubujvwb | CREATE_IN_PROGRESS | 2017-03-08T13:28:05Z | None | 175caee2-c83d-4f70-b7f5-e2de6373a70b |
| 175caee2-c83d-4f70-b7f5-e2de6373a70b | overcloud-AllNodesDeploySteps-n7gn2tdu4r7m-AllNodesPostUpgradeSteps-cob3l24fahnm | CREATE_IN_PROGRESS | 2017-03-08T13:27:25Z | None | 5d78c2c4-fa2f-4560-aa09-40939044b9bb |
| 5d78c2c4-fa2f-4560-aa09-40939044b9bb | overcloud-AllNodesDeploySteps-n7gn2tdu4r7m | UPDATE_IN_PROGRESS | 2017-03-08T11:52:03Z | 2017-03-08T13:09:57Z | efe081d8-de20-4fef-98d8-12c23c578e6c |
| efe081d8-de20-4fef-98d8-12c23c578e6c | overcloud | UPDATE_IN_PROGRESS | 2017-03-08T11:41:47Z | 2017-03-08T13:02:34Z | None |
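To find the innermost stack that is still in progress without reading the table by eye, the listing can be parsed. This is a minimal sketch, not part of the original report. It assumes the six columns are ID, name, status, creation time, update time and parent ID, as the rows above suggest; the function names are hypothetical.

```python
# Two rows copied from the listing above, used as sample input.
SAMPLE = """
| 52ca5c6b-6353-450f-9ec3-6ffb3df5da96 | overcloud-AllNodesDeploySteps-n7gn2tdu4r7m-AllNodesPostUpgradeSteps-cob3l24fahnm-ControllerDeployment_Step1-gl6bzubujvwb | CREATE_IN_PROGRESS | 2017-03-08T13:28:05Z | None | 175caee2-c83d-4f70-b7f5-e2de6373a70b |
| 175caee2-c83d-4f70-b7f5-e2de6373a70b | overcloud-AllNodesDeploySteps-n7gn2tdu4r7m-AllNodesPostUpgradeSteps-cob3l24fahnm | CREATE_IN_PROGRESS | 2017-03-08T13:27:25Z | None | 5d78c2c4-fa2f-4560-aa09-40939044b9bb |
"""


def in_progress_stacks(table_text):
    """Parse 'openstack stack list --nested' table text.

    Returns one dict per row whose status ends in _IN_PROGRESS.
    Assumed column order: id, name, status, created, updated, parent.
    """
    rows = []
    for line in table_text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip blank lines and +---+ borders
        cells = [c.strip() for c in line.strip("|").split("|")]
        if len(cells) != 6:
            continue
        if cells[2].endswith("_IN_PROGRESS"):
            rows.append({"id": cells[0], "name": cells[1],
                         "status": cells[2], "parent": cells[5]})
    return rows


def innermost(rows):
    """Return rows that are not the parent of any other in-progress row."""
    parents = {r["parent"] for r in rows}
    return [r for r in rows if r["id"] not in parents]


for stack in innermost(in_progress_stacks(SAMPLE)):
    print(stack["status"], stack["name"])
```

In the listing above the innermost stack is ControllerDeployment_Step1, which points at the controller nodes as the place to look next.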

All the controller nodes show the following in the os-collect-config log:

[root@overcloud-controller-2 heat-admin]# journalctl -fl -u os-collect-config
-- Logs begin at Wed 2017-03-08 11:47:14 UTC. --
Mar 08 13:28:17 overcloud-controller-2.localdomain os-collect-config[4244]: [2017-03-08 13:28:17,211] (heat-config) [WARNING] To force-deploy, rm /var/lib/heat-config/deployed/45ec9401-3381-4ed3-8066-5bf0b0a1442e.json
Mar 08 13:28:17 overcloud-controller-2.localdomain os-collect-config[4244]: [2017-03-08 13:28:17,212] (heat-config) [WARNING] Skipping config d4a79a71-3f6c-4ad0-be65-53ee87d38a18, already deployed
Mar 08 13:28:17 overcloud-controller-2.localdomain os-collect-config[4244]: [2017-03-08 13:28:17,212] (heat-config) [WARNING] To force-deploy, rm /var/lib/heat-config/deployed/d4a79a71-3f6c-4ad0-be65-53ee87d38a18.json
Mar 08 13:28:17 overcloud-controller-2.localdomain os-collect-config[4244]: [2017-03-08 13:28:17,212] (heat-config) [DEBUG] Running /usr/libexec/heat-config/hooks/puppet < /var/lib/heat-config/deployed/306ad840-29b4-4dfe-825d-0659cce43de8.json
Mar 08 13:28:23 overcloud-controller-2.localdomain su[510690]: (to rabbitmq) root on none
Mar 08 13:28:33 overcloud-controller-2.localdomain su[511048]: (to rabbitmq) root on none
Mar 08 13:28:34 overcloud-controller-2.localdomain su[511217]: (to rabbitmq) root on none
Mar 08 13:28:35 overcloud-controller-2.localdomain su[511396]: (to rabbitmq) root on none
Mar 08 13:28:36 overcloud-controller-2.localdomain su[511564]: (to rabbitmq) root on none
Mar 08 13:28:38 overcloud-controller-2.localdomain usermod[511879]: change user 'hacluster' password

The nodes do not seem to be able to join the cluster:

http://paste.openstack.org/show/601938/

ip6tables rules:
http://paste.openstack.org/show/601939/

It looks like the firewall rules are blocking the nodes from joining
the cluster. After running 'ip6tables -F' the deployment was unblocked
and the nodes were able to join the cluster.
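Before flushing, it can help to confirm that a catch-all rule is what blocks the traffic. The sketch below is an illustration only, not taken from the report or from the pasted rules. It assumes the input is text in the format printed by 'ip6tables -S' and flags rules in a chain that DROP or REJECT with no match criteria; the function name is hypothetical.

```python
import shlex


def blanket_rejects(rules_text, chain="INPUT"):
    """Return rules in `chain` that jump straight to DROP or REJECT.

    A rule is considered "blanket" when '-j DROP' or '-j REJECT'
    immediately follows '-A <chain>', i.e. it has no match criteria.
    """
    hits = []
    for line in rules_text.splitlines():
        tokens = shlex.split(line)
        # Expect at least: -A CHAIN -j TARGET
        if len(tokens) < 4 or tokens[0] != "-A" or tokens[1] != chain:
            continue
        if tokens[2] == "-j" and tokens[3] in ("DROP", "REJECT"):
            hits.append(line.strip())
    return hits
```

A possible use, run as root on a controller, is to pass the output of subprocess.run(["ip6tables", "-S"], capture_output=True, text=True).stdout to blanket_rejects(). A non-empty result means any traffic not accepted by an earlier rule in that chain is rejected, which would be consistent with the cluster traffic being blocked.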

Changed in tripleo:
assignee: nobody → Sofer Athlan-Guyot (sofer-athlan-guyot)
status: Confirmed → In Progress
Changed in tripleo:
milestone: ongoing → pike-1
importance: Critical → High
OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-heat-templates (stable/ocata)

Fix proposed to branch: stable/ocata
Review: https://review.openstack.org/450144

OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-heat-templates (master)

Reviewed: https://review.openstack.org/449613
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=670399a2caeecd9259bea454e9518ab6c92cff49
Submitter: Jenkins
Branch: master

commit 670399a2caeecd9259bea454e9518ab6c92cff49
Author: Sofer Athlan-Guyot <email address hidden>
Date: Fri Mar 24 13:45:10 2017 +0100

    N->O upgrade, blanks ipv6 rules before activating it.

    When the firewall is enabled with ipv6, the default rules set is
    taken as not ipv6 firewall was present for Newton. This make
    communication impossible until puppet is run again.

    This ensures that no rules are loaded when the firewall is enabled.

    This mimic this patch[1]

    [1] https://github.com/openstack/tripleo-heat-templates/commit/ae8aac36143d5dadb08af0d275f513678909dcc7

    Change-Id: Id878b5caae666a799c89c8466ce46b9ecb86d9f7
    Closes-Bug: #1675782

Changed in tripleo:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-heat-templates (stable/ocata)

Reviewed: https://review.openstack.org/450144
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=440901b5026d0927ce74ab358fbe3d430f91b38a
Submitter: Jenkins
Branch: stable/ocata

commit 440901b5026d0927ce74ab358fbe3d430f91b38a
Author: Sofer Athlan-Guyot <email address hidden>
Date: Fri Mar 24 13:45:10 2017 +0100

    N->O upgrade, blanks ipv6 rules before activating it.

    When the firewall is enabled with ipv6, the default rules set is
    taken as not ipv6 firewall was present for Newton. This make
    communication impossible until puppet is run again.

    This ensures that no rules are loaded when the firewall is enabled.

    This mimic this patch[1]

    [1] https://github.com/openstack/tripleo-heat-templates/commit/ae8aac36143d5dadb08af0d275f513678909dcc7

    Change-Id: Id878b5caae666a799c89c8466ce46b9ecb86d9f7
    Closes-Bug: #1675782
    (cherry picked from commit 670399a2caeecd9259bea454e9518ab6c92cff49)

tags: added: in-stable-ocata
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-heat-templates 7.0.0.0b1

This issue was fixed in the openstack/tripleo-heat-templates 7.0.0.0b1 development milestone.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-heat-templates 6.1.0

This issue was fixed in the openstack/tripleo-heat-templates 6.1.0 release.
