Docker CE can cause ironic provisioning to fail

Bug #1823044 reported by Mark Goddard
This bug affects 1 person
Affects        Status        Importance  Assigned to   Milestone
kolla-ansible  Fix Released  High        Mark Goddard
Rocky          New           Medium      Mark Goddard
Stein          Fix Released  High        Mark Goddard
Train          Fix Released  High        Mark Goddard

Bug Description

Recently, Kolla Ansible switched to installing Docker CE by default in the 'kolla-ansible bootstrap-servers' command. This moves us forward from the old 1.12 legacy package we were using previously. These changes will be included in the Stein release.

When using Ironic with Docker CE and the Ironic Inspector iptables PXE filter, provisioning of bare metal nodes fails. This has so far only been seen in Kayobe CI, and not yet reproduced on real hardware.

The cause of the issue appears to be that Docker has set the default policy of the FORWARD chain to DROP since version 1.13 [1]. In the nova role we set the following sysctl, which means that bridged L2 traffic is processed by iptables:

net.bridge.bridge-nf-call-iptables=1

When ironic inspector is used with the iptables PXE filter (ironic_inspector_pxe_filter = iptables), it sets up iptables rules to block DHCP packets from being accepted by the ironic_dnsmasq container's DHCP server, unless a node is being inspected. During provisioning and for instance networking, neutron provides DHCP services.

For some reason, the combination of the default DROP policy on the FORWARD chain and the inspector iptables PXE filter causes DHCP packets to get dropped before they get to the neutron OVS bridges. I suspect this depends on the network topology, and might not affect systems with a separate inspection/provisioning VLAN.
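To check whether a host matches the conditions described above, something like the following might help. It is only a diagnostic sketch: the function names are made up for illustration, it assumes the usual `iptables-save` output format and the standard /proc path for the sysctl, and it does not explain why the packets are dropped.

```python
import re
from pathlib import Path
from typing import Optional

# Standard procfs location of the sysctl set by the nova role.
SYSCTL_PATH = Path("/proc/sys/net/bridge/bridge-nf-call-iptables")


def forward_policy(iptables_save_output: str) -> Optional[str]:
    """Return the default policy of the filter table's FORWARD chain, or None."""
    in_filter = False
    for line in iptables_save_output.splitlines():
        line = line.strip()
        if line.startswith("*"):
            # Table headers look like "*filter", "*nat", "*mangle".
            in_filter = line == "*filter"
        elif in_filter:
            # Chain declarations look like ":FORWARD DROP [0:0]".
            match = re.match(r":FORWARD\s+(\S+)", line)
            if match:
                return match.group(1)
    return None


def bridge_nf_call_iptables(path: Path = SYSCTL_PATH) -> Optional[bool]:
    """Return the sysctl as a bool, or None if the file cannot be read."""
    try:
        return path.read_text().strip() == "1"
    except OSError:
        return None


def matches_reported_conditions(policy, bridged_filtering, pxe_filter) -> bool:
    """True when all three ingredients from this report are present."""
    return (
        policy == "DROP"
        and bridged_filtering is True
        and pxe_filter == "iptables"
    )
```

The text passed to `forward_policy` would typically come from running `iptables-save` as root; the PXE filter value is whatever ironic_inspector_pxe_filter is set to.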

A few things I've tried that avoid this issue:

* Use a different PXE filter, e.g. 'dnsmasq'. This is actually recommended over iptables.
* Set the net.bridge.bridge-nf-call-iptables sysctl to 0. This could have adverse consequences for nova (although I'm not sure what exactly).

Some things that might work that I haven't tried:

* Set the default iptables policy on the FORWARD chain to ACCEPT. Docker will probably try to revert this change if it is restarted.
* Set the iptables Docker config option to false to prevent it from configuring iptables. This is fine for Kolla Ansible, but if it is used for any other containers that don't use host networking, then it will cause problems.
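For the second untried option, the Docker daemon reads an `iptables` key from its daemon.json file (normally /etc/docker/daemon.json). A rough sketch of merging that setting into an existing config is below. The helper name is hypothetical, and like the option itself this has not been tried against the failure described here.

```python
import json


def disable_docker_iptables(daemon_json_text: str = "") -> str:
    """Return daemon.json content with "iptables" set to false.

    Existing keys are preserved; an empty input is treated as an empty config.
    """
    config = json.loads(daemon_json_text) if daemon_json_text.strip() else {}
    config["iptables"] = False
    return json.dumps(config, indent=2, sort_keys=True) + "\n"
```

The Docker daemon would need to be restarted for the change to take effect, and rules it has already created are not necessarily cleaned up.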

I think the sensible thing to do here is to change the default value of ironic_inspector_pxe_filter from iptables to dnsmasq.

[1] https://docs.docker.com/v17.09/engine/userguide/networking/default_network/container-communication/

Mark Goddard (mgoddard) wrote :

Note: support for ironic_inspector_pxe_filter was only added in the Stein release. We only use Docker CE in Rocky by default for Ubuntu, but it is possible to change docker_legacy_packages to false on other distros in Rocky to use Docker CE.

Mark Goddard (mgoddard) wrote :

Example iptables config for inspector:

*filter
-A INPUT -i breth1 -p udp -m udp --dport 67 -j ironic-inspector
-A ironic-inspector -m mac --mac-source 52:54:00:23:91:75 -j DROP
-A ironic-inspector -m mac --mac-source 52:54:00:3A:66:72 -j DROP
-A ironic-inspector -j ACCEPT
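
As the example reads, the ironic-inspector chain acts as a denylist: DHCP requests from the listed MAC addresses are dropped, and everything else is accepted. A simplified model of that first-match evaluation is sketched below. It only understands `--mac-source` matches and is not how iptables or inspector is implemented.

```python
EXAMPLE_RULES = """\
-A INPUT -i breth1 -p udp -m udp --dport 67 -j ironic-inspector
-A ironic-inspector -m mac --mac-source 52:54:00:23:91:75 -j DROP
-A ironic-inspector -m mac --mac-source 52:54:00:3A:66:72 -j DROP
-A ironic-inspector -j ACCEPT
"""


def inspector_verdict(rules: str, mac: str, chain: str = "ironic-inspector") -> str:
    """Return the target of the first rule in `chain` matching a packet from `mac`."""
    for line in rules.splitlines():
        tokens = line.split()
        # Only consider rules appended to the chain of interest.
        if tokens[:2] != ["-A", chain] or "-j" not in tokens:
            continue
        if "--mac-source" in tokens:
            rule_mac = tokens[tokens.index("--mac-source") + 1]
            if rule_mac.lower() != mac.lower():
                continue  # MAC does not match, try the next rule
        return tokens[tokens.index("-j") + 1]
    # Falling off the end of a user-defined chain returns to the calling chain.
    return "RETURN"
```

For instance, `inspector_verdict(EXAMPLE_RULES, "52:54:00:23:91:75")` gives "DROP", while an unlisted MAC reaches the final rule and gives "ACCEPT".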

Mark Goddard (mgoddard) wrote :

'Fix' proposed in master: https://review.openstack.org/649673

OpenStack Infra (hudson-openstack) wrote : Fix merged to kolla-ansible (master)

Reviewed: https://review.openstack.org/649673
Committed: https://git.openstack.org/cgit/openstack/kolla-ansible/commit/?id=86e83faeb1fd088d44c5108a5ec835eba6316b2d
Submitter: Zuul
Branch: master

commit 86e83faeb1fd088d44c5108a5ec835eba6316b2d
Author: Mark Goddard <email address hidden>
Date: Wed Apr 3 17:33:04 2019 +0100

    Use ironic inspector 'dnsmasq' PXE filter by default

    With Docker CE, the daemon sets the default policy of the iptables
    FORWARD chain to DROP. This causes problems for provisioning bare metal
    servers when ironic inspector is used with the 'iptables' PXE filter.
    It's not entirely clear why these two things interact in this way,
    but switching to the 'dnsmasq' filter works around the issue, and is
    probably a good move anyway because it is more efficient.

    We have added a migration task here to flush and remove the ironic-inspector
    iptables chain since inspector does not do this itself currently.

    Change-Id: Iceed5a096819203eb2b92466d39575d3adf8e218
    Closes-Bug: #1823044
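
The kind of cleanup the commit message describes could look roughly like the sketch below. This is not the actual task from the change: the helper is hypothetical, and it assumes the jump rule has the same form as the example iptables config earlier in this report. It builds the commands rather than running them.

```python
from typing import List


def inspector_chain_cleanup(
    interface: str, chain: str = "ironic-inspector"
) -> List[List[str]]:
    """Build iptables commands that detach, flush and delete the chain."""
    return [
        # Remove the jump from INPUT first; a referenced chain cannot be deleted.
        ["iptables", "-D", "INPUT", "-i", interface,
         "-p", "udp", "--dport", "67", "-j", chain],
        # Flush all rules from the chain.
        ["iptables", "-F", chain],
        # Delete the now empty, unreferenced chain.
        ["iptables", "-X", chain],
    ]
```

Each command list could be passed to `subprocess.run(cmd, check=True)` as root; the interface would be `breth1` in the example above.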

Changed in kolla-ansible:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to kolla-ansible (stable/stein)

Fix proposed to branch: stable/stein
Review: https://review.openstack.org/651474

Mark Goddard (mgoddard) wrote :

We don't support the dnsmasq inspector filter in rocky, so it's not obvious what to do there.

OpenStack Infra (hudson-openstack) wrote : Fix merged to kolla-ansible (stable/stein)

Reviewed: https://review.openstack.org/651474
Committed: https://git.openstack.org/cgit/openstack/kolla-ansible/commit/?id=16e982401191602efaf1c034b8fe645d62e7181e
Submitter: Zuul
Branch: stable/stein

commit 16e982401191602efaf1c034b8fe645d62e7181e
Author: Mark Goddard <email address hidden>
Date: Wed Apr 3 17:33:04 2019 +0100

    Use ironic inspector 'dnsmasq' PXE filter by default

    With Docker CE, the daemon sets the default policy of the iptables
    FORWARD chain to DROP. This causes problems for provisioning bare metal
    servers when ironic inspector is used with the 'iptables' PXE filter.
    It's not entirely clear why these two things interact in this way,
    but switching to the 'dnsmasq' filter works around the issue, and is
    probably a good move anyway because it is more efficient.

    We have added a migration task here to flush and remove the ironic-inspector
    iptables chain since inspector does not do this itself currently.

    Change-Id: Iceed5a096819203eb2b92466d39575d3adf8e218
    Closes-Bug: #1823044
    (cherry picked from commit 86e83faeb1fd088d44c5108a5ec835eba6316b2d)

tags: added: in-stable-stein
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/kolla-ansible 8.0.0.0rc2

This issue was fixed in the openstack/kolla-ansible 8.0.0.0rc2 release candidate.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/kolla-ansible 9.0.0.0rc1

This issue was fixed in the openstack/kolla-ansible 9.0.0.0rc1 release candidate.
