featureset022 on stable/pike fails on ping test (iptables restart unloads OVS kernel module)

Bug #1752441 reported by Alex Schultz
This bug affects 2 people
Affects: tripleo
Status: Fix Released
Importance: Critical
Assigned to: Alex Schultz

Bug Description

https://review.rdoproject.org/jenkins/job/gate-tripleo-ci-centos-7-ovb-1ctlr_1comp-featureset022-pike/642/
https://review.rdoproject.org/jenkins/job/gate-tripleo-ci-centos-7-ovb-1ctlr_1comp-featureset022-pike/643/
https://review.rdoproject.org/jenkins/job/gate-tripleo-ci-centos-7-ovb-1ctlr_1comp-featureset022-pike/

Ping test is failing
https://logs.rdoproject.org/16/548616/1/openstack-check/gate-tripleo-ci-centos-7-ovb-1ctlr_1comp-featureset022-pike/Z0f8d1afd63344426a064405a98d997fe/undercloud/home/jenkins/overcloud_validate.log.txt.gz#_2018-02-28_21_08_00

Looks like neutron is failing to bind a port for the node.
https://logs.rdoproject.org/16/548616/1/openstack-check/gate-tripleo-ci-centos-7-ovb-1ctlr_1comp-featureset022-pike/Z0f8d1afd63344426a064405a98d997fe/overcloud-controller-foo-0/var/log/containers/nova/nova-conductor.log.txt.gz#_2018-02-28_21_11_13_171
2018-02-28 21:11:13.171 6 ERROR nova.scheduler.utils [req-cae410e3-f492-4319-88b9-5a8fee8f759e 72df4856209c4df886a345a404d130b2 ff2ac32a731944ae97547a541bf821d7 - default default] [instance: fdec0b5e-7395-459a-8d1d-645ccfa92f48] Error from last host: overcloud-novacompute-bar-0.localdomain (node overcloud-novacompute-bar-0.localdomain): [u'Traceback (most recent call last):\n', u' File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 1855, in _do_build_and_run_instance\n filter_properties)\n', u' File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2085, in _build_and_run_instance\n instance_uuid=instance.uuid, reason=six.text_type(e))\n', u'RescheduledException: Build of instance fdec0b5e-7395-459a-8d1d-645ccfa92f48 was re-scheduled: Binding failed for port 9f8386ca-9201-48c4-9b7a-8d5f1a676c9b, please check neutron logs for more information.\n']

https://logs.rdoproject.org/16/548616/1/openstack-check/gate-tripleo-ci-centos-7-ovb-1ctlr_1comp-featureset022-pike/Z0f8d1afd63344426a064405a98d997fe/overcloud-controller-foo-0/var/log/containers/neutron/neutron-server.log.txt.gz#_2018-02-28_21_11_34_438
2018-02-28 21:11:34.438 27 ERROR neutron.plugins.ml2.managers [req-42ea3f47-d990-4e80-b2e2-f61b7b045764 99f487b1077d464fa02c59bf642270af e2746f291486402ba7ba9c1924033e20 - default default] Failed to bind port 9f8386ca-9201-48c4-9b7a-8d5f1a676c9b on host overcloud-novacompute-bar-0.localdomain for vnic_type normal using segments [{'network_id': 'd70ac7a9-1cd4-4243-ae0c-1f413940108b', 'segmentation_id': 89, 'physical_network': None, 'id': 'f235f8bb-bead-4d3f-9e97-a51bfd5c4135', 'network_type': u'vxlan'}]
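When scanning a large neutron-server.log for this failure, it can help to pull out just the port and host from each "Failed to bind port" message. The following is a minimal sketch, assuming the message format matches the line quoted above; the names `BIND_FAILURE` and `find_bind_failures` are made up for illustration and are not part of neutron.

```python
import re

# Matches the ML2 "Failed to bind port" message as it appears in the log
# line quoted in this report. The pattern is an assumption based on that
# single sample, not on neutron's source.
BIND_FAILURE = re.compile(
    r"Failed to bind port (?P<port>[0-9a-f-]{36}) "
    r"on host (?P<host>\S+) "
    r"for vnic_type (?P<vnic_type>\S+)"
)


def find_bind_failures(lines):
    """Return a list of (port_id, host, vnic_type) tuples, one per matching line."""
    failures = []
    for line in lines:
        match = BIND_FAILURE.search(line)
        if match:
            failures.append(
                (match.group("port"), match.group("host"), match.group("vnic_type"))
            )
    return failures
```

Passing an open log file (or any iterable of lines) to `find_bind_failures` gives the affected ports grouped by host, which makes it easier to see whether every failure points at the same compute node.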

Matt Young (halcyondude)
tags: added: promotion-blocker
Assaf Muller (amuller) wrote :

neutron-server is saying that there's no active OVS agent on compute-0.

Assaf Muller (amuller) wrote :
Assaf Muller (amuller) wrote :
Changed in tripleo:
milestone: queens-rc1 → rocky-1
Alex Schultz (alex-schultz) wrote :

Looking at the ovs-vswitchd.log for a failing job we see:

2018-03-02T19:00:17.382Z|00009|dpif_netlink|WARN|Generic Netlink family 'ovs_datapath' does not exist. The Open vSwitch kernel module is probably not loaded.

https://logs.rdoproject.org/02/548102/2/openstack-check/gate-tripleo-ci-centos-7-ovb-1ctlr_1comp-featureset022-pike/Zd3d16a6d18194b398fa4fd823ead2ef0/overcloud-novacompute-bar-0/var/log/openvswitch/ovs-vswitchd.log.txt.gz
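To confirm on a node whether the module is actually loaded, the contents of /proc/modules can be checked for an `openvswitch` entry. This is a minimal sketch; the helper names `loaded_modules` and `ovs_module_loaded` are hypothetical, and the functions take the file contents as text so they can be run against a saved copy as well as a live system.

```python
def loaded_modules(proc_modules_text):
    """Return the set of module names listed in /proc/modules-style text.

    Each line of /proc/modules starts with the module name, followed by
    its size, reference count, dependents, state and load address.
    """
    names = set()
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields:
            names.add(fields[0])
    return names


def ovs_module_loaded(proc_modules_text):
    """Return True if the openvswitch kernel module appears as loaded."""
    return "openvswitch" in loaded_modules(proc_modules_text)
```

On a live node the input would come from `open("/proc/modules").read()`. Only the first field of each line is compared, so a module that merely lists openvswitch among its dependents is not mistaken for the module itself.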

OpenStack Infra (hudson-openstack) wrote : Fix proposed to puppet-tripleo (master)

Fix proposed to branch: master
Review: https://review.openstack.org/549838

Changed in tripleo:
assignee: nobody → Alex Schultz (alex-schultz)
status: Triaged → In Progress
Alex Schultz (alex-schultz) wrote : Re: featureset022 on stable/pike fails on ping test

The root cause is that iptables is being restarted, which causes the openvswitch kernel module to be unloaded; that in turn triggers seemingly random failures related to the interfaces.

tags: added: newton-backport-potential ocata-backport-potential pike-backport-potential queens-backport-potential
Assaf Muller (amuller)
summary: - featureset022 on stable/pike fails on ping test
+ featureset022 on stable/pike fails on ping test (iptables restart
+ unloads OVS kernel module)
OpenStack Infra (hudson-openstack) wrote : Fix merged to puppet-tripleo (master)

Reviewed: https://review.openstack.org/549838
Committed: https://git.openstack.org/cgit/openstack/puppet-tripleo/commit/?id=bb5013920ac658c99d9ae2ab7f81847b274aa177
Submitter: Zuul
Branch: master

commit bb5013920ac658c99d9ae2ab7f81847b274aa177
Author: Alex Schultz <email address hidden>
Date: Mon Mar 5 11:06:52 2018 -0700

    Reload iptables instead of restart

    Due to bz#1520534, restarting iptables may cause unrelated kernel
    modules to be unloaded. In order to not trigger this condition we should
    reload iptables from the configuration rather than restart the whole
    process.

    Change-Id: Ifc625eb51f6cc2a0a4cf4f83ac7a4978db641d75
    Closes-Bug: #1752441
    Closes-Bug: #1753492
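For illustration, one way to reload rules from the saved configuration without restarting the service is to feed the rules file to `iptables-restore` on standard input. This is only a sketch of the general idea and not necessarily what the merged change does, since the patch contents are not shown in this report. The path /etc/sysconfig/iptables is assumed (the usual location on CentOS 7), and `reload_rules` and its `runner` parameter are hypothetical names.

```python
import subprocess

# Assumed location of the saved rules on CentOS 7; adjust if it differs.
RULES_FILE = "/etc/sysconfig/iptables"


def reload_rules(rules_file=RULES_FILE, runner=subprocess.run):
    """Load the saved rules with iptables-restore instead of restarting iptables.

    The rules file is passed on stdin, so no service stop/start takes
    place. The runner argument exists only so the call can be replaced
    in tests; by default it is subprocess.run.
    """
    with open(rules_file, "rb") as handle:
        return runner(["iptables-restore"], stdin=handle, check=True)
```

Calling `reload_rules()` needs root privileges. With `check=True`, a non-zero exit from `iptables-restore` raises `subprocess.CalledProcessError` rather than failing silently.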

Changed in tripleo:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to puppet-tripleo (stable/queens)

Fix proposed to branch: stable/queens
Review: https://review.openstack.org/549939

OpenStack Infra (hudson-openstack) wrote : Fix proposed to puppet-tripleo (stable/pike)

Fix proposed to branch: stable/pike
Review: https://review.openstack.org/549940

OpenStack Infra (hudson-openstack) wrote : Fix proposed to puppet-tripleo (stable/ocata)

Fix proposed to branch: stable/ocata
Review: https://review.openstack.org/549941

OpenStack Infra (hudson-openstack) wrote : Fix proposed to puppet-tripleo (stable/newton)

Fix proposed to branch: stable/newton
Review: https://review.openstack.org/549942

OpenStack Infra (hudson-openstack) wrote : Fix merged to puppet-tripleo (stable/queens)

Reviewed: https://review.openstack.org/549939
Committed: https://git.openstack.org/cgit/openstack/puppet-tripleo/commit/?id=9868c0395b50a6c5e2c21c765bd1cb5345ab0ea4
Submitter: Zuul
Branch: stable/queens

commit 9868c0395b50a6c5e2c21c765bd1cb5345ab0ea4
Author: Alex Schultz <email address hidden>
Date: Mon Mar 5 11:06:52 2018 -0700

    Reload iptables instead of restart

    Due to bz#1520534, restarting iptables may cause unrelated kernel
    modules to be unloaded. In order to not trigger this condition we should
    reload iptables from the configuration rather than restart the whole
    process.

    Change-Id: Ifc625eb51f6cc2a0a4cf4f83ac7a4978db641d75
    Closes-Bug: #1752441
    Closes-Bug: #1753492
    (cherry picked from commit bb5013920ac658c99d9ae2ab7f81847b274aa177)

tags: added: in-stable-queens
tags: added: in-stable-ocata
OpenStack Infra (hudson-openstack) wrote : Fix merged to puppet-tripleo (stable/ocata)

Reviewed: https://review.openstack.org/549941
Committed: https://git.openstack.org/cgit/openstack/puppet-tripleo/commit/?id=8dee1557874272126dbb8bc41affbd7ec0063097
Submitter: Zuul
Branch: stable/ocata

commit 8dee1557874272126dbb8bc41affbd7ec0063097
Author: Alex Schultz <email address hidden>
Date: Mon Mar 5 11:06:52 2018 -0700

    Reload iptables instead of restart

    Due to bz#1520534, restarting iptables may cause unrelated kernel
    modules to be unloaded. In order to not trigger this condition we should
    reload iptables from the configuration rather than restart the whole
    process.

    Change-Id: Ifc625eb51f6cc2a0a4cf4f83ac7a4978db641d75
    Closes-Bug: #1752441
    Closes-Bug: #1753492
    (cherry picked from commit bb5013920ac658c99d9ae2ab7f81847b274aa177)

tags: added: in-stable-newton
OpenStack Infra (hudson-openstack) wrote : Fix merged to puppet-tripleo (stable/newton)

Reviewed: https://review.openstack.org/549942
Committed: https://git.openstack.org/cgit/openstack/puppet-tripleo/commit/?id=fbab9255944d1dac59fd9e1f271f69b24fd90cf1
Submitter: Zuul
Branch: stable/newton

commit fbab9255944d1dac59fd9e1f271f69b24fd90cf1
Author: Alex Schultz <email address hidden>
Date: Mon Mar 5 11:06:52 2018 -0700

    Reload iptables instead of restart

    Due to bz#1520534, restarting iptables may cause unrelated kernel
    modules to be unloaded. In order to not trigger this condition we should
    reload iptables from the configuration rather than restart the whole
    process.

    Change-Id: Ifc625eb51f6cc2a0a4cf4f83ac7a4978db641d75
    Closes-Bug: #1752441
    Closes-Bug: #1753492
    (cherry picked from commit bb5013920ac658c99d9ae2ab7f81847b274aa177)

OpenStack Infra (hudson-openstack) wrote : Fix merged to puppet-tripleo (stable/pike)

Reviewed: https://review.openstack.org/549940
Committed: https://git.openstack.org/cgit/openstack/puppet-tripleo/commit/?id=58e99c7f0af25ed2aed7d46b7d219abcabd6c1d2
Submitter: Zuul
Branch: stable/pike

commit 58e99c7f0af25ed2aed7d46b7d219abcabd6c1d2
Author: Alex Schultz <email address hidden>
Date: Mon Mar 5 11:06:52 2018 -0700

    Reload iptables instead of restart

    Due to bz#1520534, restarting iptables may cause unrelated kernel
    modules to be unloaded. In order to not trigger this condition we should
    reload iptables from the configuration rather than restart the whole
    process.

    Change-Id: Ifc625eb51f6cc2a0a4cf4f83ac7a4978db641d75
    Closes-Bug: #1752441
    Closes-Bug: #1753492
    (cherry picked from commit bb5013920ac658c99d9ae2ab7f81847b274aa177)

tags: added: in-stable-pike
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/puppet-tripleo 7.4.10

This issue was fixed in the openstack/puppet-tripleo 7.4.10 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/puppet-tripleo 6.5.10

This issue was fixed in the openstack/puppet-tripleo 6.5.10 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/puppet-tripleo 5.6.8

This issue was fixed in the openstack/puppet-tripleo 5.6.8 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/puppet-tripleo 8.3.1

This issue was fixed in the openstack/puppet-tripleo 8.3.1 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/puppet-tripleo 9.0.0

This issue was fixed in the openstack/puppet-tripleo 9.0.0 release.
