agent traces about bridge-nf-call sysctl values missing

Bug #1622914 reported by Kevin Benton
This bug affects 3 people
Affects    Status         Importance   Assigned to       Milestone
devstack   Fix Released   Undecided    Ihar Hrachyshka
neutron    Fix Released   Low          Ihar Hrachyshka
tripleo    Expired        Undecided    Unassigned

Bug Description

Spotted in the gate:

2016-09-13 07:37:33.437 13401 ERROR neutron.plugins.ml2.drivers.agent._common_agent Traceback (most recent call last):
2016-09-13 07:37:33.437 13401 ERROR neutron.plugins.ml2.drivers.agent._common_agent File "/opt/stack/new/neutron/neutron/plugins/ml2/drivers/agent/_common_agent.py", line 450, in daemon_loop
2016-09-13 07:37:33.437 13401 ERROR neutron.plugins.ml2.drivers.agent._common_agent sync = self.process_network_devices(device_info)
2016-09-13 07:37:33.437 13401 ERROR neutron.plugins.ml2.drivers.agent._common_agent File "/usr/local/lib/python2.7/dist-packages/osprofiler/profiler.py", line 154, in wrapper
2016-09-13 07:37:33.437 13401 ERROR neutron.plugins.ml2.drivers.agent._common_agent return f(*args, **kwargs)
2016-09-13 07:37:33.437 13401 ERROR neutron.plugins.ml2.drivers.agent._common_agent File "/opt/stack/new/neutron/neutron/plugins/ml2/drivers/agent/_common_agent.py", line 200, in process_network_devices
2016-09-13 07:37:33.437 13401 ERROR neutron.plugins.ml2.drivers.agent._common_agent device_info.get('updated'))
2016-09-13 07:37:33.437 13401 ERROR neutron.plugins.ml2.drivers.agent._common_agent File "/opt/stack/new/neutron/neutron/agent/securitygroups_rpc.py", line 265, in setup_port_filters
2016-09-13 07:37:33.437 13401 ERROR neutron.plugins.ml2.drivers.agent._common_agent self.prepare_devices_filter(new_devices)
2016-09-13 07:37:33.437 13401 ERROR neutron.plugins.ml2.drivers.agent._common_agent File "/opt/stack/new/neutron/neutron/agent/securitygroups_rpc.py", line 130, in decorated_function
2016-09-13 07:37:33.437 13401 ERROR neutron.plugins.ml2.drivers.agent._common_agent *args, **kwargs)
2016-09-13 07:37:33.437 13401 ERROR neutron.plugins.ml2.drivers.agent._common_agent File "/opt/stack/new/neutron/neutron/agent/securitygroups_rpc.py", line 138, in prepare_devices_filter
2016-09-13 07:37:33.437 13401 ERROR neutron.plugins.ml2.drivers.agent._common_agent self._apply_port_filter(device_ids)
2016-09-13 07:37:33.437 13401 ERROR neutron.plugins.ml2.drivers.agent._common_agent File "/opt/stack/new/neutron/neutron/agent/securitygroups_rpc.py", line 163, in _apply_port_filter
2016-09-13 07:37:33.437 13401 ERROR neutron.plugins.ml2.drivers.agent._common_agent self.firewall.prepare_port_filter(device)
2016-09-13 07:37:33.437 13401 ERROR neutron.plugins.ml2.drivers.agent._common_agent File "/opt/stack/new/neutron/neutron/agent/linux/iptables_firewall.py", line 170, in prepare_port_filter
2016-09-13 07:37:33.437 13401 ERROR neutron.plugins.ml2.drivers.agent._common_agent self._enable_netfilter_for_bridges()
2016-09-13 07:37:33.437 13401 ERROR neutron.plugins.ml2.drivers.agent._common_agent File "/opt/stack/new/neutron/neutron/agent/linux/iptables_firewall.py", line 114, in _enable_netfilter_for_bridges
2016-09-13 07:37:33.437 13401 ERROR neutron.plugins.ml2.drivers.agent._common_agent run_as_root=True)
2016-09-13 07:37:33.437 13401 ERROR neutron.plugins.ml2.drivers.agent._common_agent File "/opt/stack/new/neutron/neutron/agent/linux/utils.py", line 138, in execute
2016-09-13 07:37:33.437 13401 ERROR neutron.plugins.ml2.drivers.agent._common_agent raise RuntimeError(msg)
2016-09-13 07:37:33.437 13401 ERROR neutron.plugins.ml2.drivers.agent._common_agent RuntimeError: Exit code: 255; Stdin: ; Stdout: ; Stderr: sysctl: cannot stat /proc/sys/net/bridge/bridge-nf-call-arptables: No such file or directory
2016-09-13 07:37:33.437 13401 ERROR neutron.plugins.ml2.drivers.agent._common_agent
2016-09-13 07:37:33.437 13401 ERROR neutron.plugins.ml2.drivers.agent._common_agent
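
The error above is sysctl failing to find the /proc/sys entry behind the knob it was asked to set. As a minimal sketch (not Neutron code; both helper names are hypothetical), a dotted sysctl key maps to a path under /proc/sys, so the presence of a knob can be tested before any attempt to write it. The sketch assumes Python 3 and keys whose components contain no dots of their own:

```python
import os


def sysctl_key_to_path(key, proc_sys="/proc/sys"):
    """Map a dotted sysctl key to its file under /proc/sys (hypothetical helper)."""
    return os.path.join(proc_sys, *key.split("."))


def knob_available(key, proc_sys="/proc/sys"):
    """Return True if the kernel currently exposes the given sysctl key."""
    return os.path.exists(sysctl_key_to_path(key, proc_sys))
```

On a newer kernel where br_netfilter has not been loaded, knob_available("net.bridge.bridge-nf-call-arptables") would be expected to return False, which matches the "No such file or directory" in the trace.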

Changed in neutron:
assignee: nobody → Kevin Benton (kevinbenton)
milestone: none → newton-rc1
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.openstack.org/369250

Changed in neutron:
status: New → In Progress
Armando Migliaccio (armando-migliaccio) wrote :

Do you have a link?

tags: added: logging
Kevin Benton (kevinbenton) wrote :
Changed in neutron:
importance: Undecided → Low
Changed in neutron:
milestone: newton-rc1 → ocata-1
Assaf Muller (amuller) wrote :

Added TripleO: the br_netfilter kernel module should be loaded by installers.

Brent Eagles (beagles) wrote :

With respect to TripleO: this won't really be an issue until such time as a supported OS has a kernel with a separate br_netfilter module.

Assaf Muller (amuller) wrote :

Is Fedora supported? It already has the separate module.

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to devstack (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/371504

Changed in devstack:
assignee: nobody → Ihar Hrachyshka (ihar-hrachyshka)
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to neutron (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/371523

Brent Eagles (beagles) wrote :

I've accepted the bz for the tripleo part. I've set the importance to "medium" since the seriousness is dependent on newer versions of CentOS and RHEL than are available right now. I'm aiming to fix this at the puppet level so it will at least be exercised before tripleo needs it.

Changed in tripleo:
assignee: nobody → Brent Eagles (beagles)
status: New → Triaged
importance: Undecided → Medium
Changed in neutron:
assignee: Kevin Benton (kevinbenton) → Ihar Hrachyshka (ihar-hrachyshka)
OpenStack Infra (hudson-openstack) wrote : Change abandoned on neutron (master)

Change abandoned by Kevin Benton (<email address hidden>) on branch: master
Review: https://review.openstack.org/369250
Reason: Ihar's approach: I9137ea017624ac92a05f73863b77f9ee4681bbe7

OpenStack Infra (hudson-openstack) wrote : Change abandoned on devstack (master)

Change abandoned by Ihar Hrachyshka (<email address hidden>) on branch: master
Review: https://review.openstack.org/370918

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to neutron (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/379468

Ihar Hrachyshka (ihar-hrachyshka) wrote :

Removed the 'logging' tag from the bug because it suggests that this is a logging-only issue. It is not: on newer kernels the firewall may misbehave.

tags: added: deprecation
tags: removed: logging
OpenStack Infra (hudson-openstack) wrote : Related fix merged to devstack (master)

Reviewed: https://review.openstack.org/371504
Committed: https://git.openstack.org/cgit/openstack-dev/devstack/commit/?id=b3a210f643989603d192b32a40b2001664f8ed73
Submitter: Jenkins
Branch: master

commit b3a210f643989603d192b32a40b2001664f8ed73
Author: Ihar Hrachyshka <email address hidden>
Date: Thu Sep 29 13:26:30 2016 +0000

    Enable bridge firewalling if iptables are used

    With the plan [1] to stop enabling it by Neutron iptables firewall
    driver itself, deployment tools should catch up and enable the firewall
    themselves.

    This is needed for distributions that decided to disable the kernel
    firewall by default (upstream kernel has it enabled). This is also
    needed for distributions that ship newer kernels but don't load the
    br_netfilter module before starting nova-network or Neutron iptables
    firewall driver. In the latter case, firewall may not work, depending on
    the order of operations executed by the driver.

    To isolate devstack setups from the difference in distribution
    kernel configuration and version, the following steps are done:

    - we load bridge kernel module, and br_netfilter if present, to get
      access to sysctl knobs controlling the firewall;
    - once knobs are available, we unconditionally set them to 1, to make
      sure the firewall is in effect.

    More details at:
    http://wiki.libvirt.org/page/Net.bridge.bridge-nf-call_and_sysctl.conf

    [1] I9137ea017624ac92a05f73863b77f9ee4681bbe7

    Change-Id: Id6bfd9595f0772a63d1096ef83ebbb6cd630fafd
    Related-Bug: #1622914
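
The two steps listed in the commit message above (load the modules, then force the knobs to 1) can be sketched roughly as follows. This is not the devstack implementation, which is a shell function; it is an illustrative Python 3 equivalent. The function name, the injectable `run` callable, and the `proc_root` parameter are assumptions made so the logic can be exercised without root:

```python
import os
import subprocess

# Suffixes of the three bridge-nf-call-* knobs named in this bug.
KNOBS = ("arptables", "ip6tables", "iptables")


def enable_bridge_firewall(run=subprocess.check_call,
                           proc_root="/proc/sys/net/bridge"):
    """Load the bridge modules, then set every available knob to 1.

    Needs root when called with the defaults. Returns the paths written.
    """
    run(["modprobe", "bridge"])
    try:
        # Tolerate failure on systems with no separate br_netfilter module.
        run(["modprobe", "br_netfilter"])
    except subprocess.CalledProcessError:
        pass

    written = []
    for name in KNOBS:
        path = os.path.join(proc_root, "bridge-nf-call-" + name)
        if os.path.exists(path):
            with open(path, "w") as knob:
                knob.write("1\n")
            written.append(path)
    return written
```

Writing to the /proc/sys files directly is only one option; invoking sysctl -w for each key would have the same effect.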

OpenStack Infra (hudson-openstack) wrote : Related fix merged to neutron (master)

Reviewed: https://review.openstack.org/379468
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=80eb375ba1d58a41be1fcb6e11163f78cce8b65d
Submitter: Jenkins
Branch: master

commit 80eb375ba1d58a41be1fcb6e11163f78cce8b65d
Author: Ihar Hrachyshka <email address hidden>
Date: Thu Sep 29 13:36:07 2016 +0000

    Fixed functional iptables firewall tests for newer kernels

    Iptables functional tests fail on Xenial and other newer kernels if
    br_netfilter kernel module is not loaded, in which case sysctl knobs to
    enable bridge firewalling are not available, and attempt to set them
    with _enable_netfilter_for_bridges fails.

    We should load the kernel module before running those tests. Luckily,
    devstack has a function for just that (plus more).

    Change-Id: I602d8cd02c73b18e9d719b19998e36059ae28cd8
    Depends-On: Id6bfd9595f0772a63d1096ef83ebbb6cd630fafd
    Related-Bug: #1622914

OpenStack Infra (hudson-openstack) wrote :

Reviewed: https://review.openstack.org/371523
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=e83a44b96a8e3cd81b7cc684ac90486b283a3507
Submitter: Jenkins
Branch: master

commit e83a44b96a8e3cd81b7cc684ac90486b283a3507
Author: Ihar Hrachyshka <email address hidden>
Date: Thu Sep 15 21:48:10 2016 +0000

    iptables: fail to start ovs/linuxbridge agents on missing sysctl knobs

    For new kernels (3.18+), bridge module is split into two pieces: bridge
    and br_netfilter. The latter provides firewall support for bridged
    traffic, as well as the following sysctl knobs:

    * net.bridge.bridge-nf-call-arptables
    * net.bridge.bridge-nf-call-ip6tables
    * net.bridge.bridge-nf-call-iptables

    Before kernel 3.18, any brctl command was loading the 'bridge' module
    with the knobs, so at the moment where we reached iptables setup, they
    were always available.

    With new 3.18+ kernels, brctl still loads 'bridge' module, but not
    br_netfilter. So bridge existence no longer guarantees us knobs'
    presence. If we reach _enable_netfilter_for_bridges before the new
    module is loaded, then the code will fail, triggering agent resync. It
    will also fail to enable bridge firewalling on systems where it's
    disabled by default (examples of those systems are most if not all Red
    Hat/Fedora based systems), making security groups completely
    ineffective.

    Systems that don't override default settings for those knobs would work
    fine except for this exception in the log file and agent resync. This is
    because the first attempt to add an iptables rule using 'physdev' module
    (-m physdev) will trigger the kernel module loading. In theory, we could
    silently swallow missing knobs, and still operate correctly. But on
    second thought, it's quite fragile to rely on that implicit module
    loading. In the case where we can't detect whether firewall is enabled,
    it's better to fail than hope for the best.

    An alternative to the proposed path could be trying
    to fix broken deployment, meaning we would need to load the missing
    kernel module on agent startup. It's not even clear whether we can
    assume the operation would be available to us. Even with that, adding a
    rootwrap filter to allow loading code in the kernel sounds quite scary.
    If we would follow the path, we would also hit an issue of
    distinguishing between cases of built-in kernel module vs. modular one.
    A complexity that is probably beyond what Neutron should fix.

    The patch introduces a sanity check that would fail on missing
    configuration knobs.

    DocImpact: document the new deployment requirement in operations guide
    UpgradeImpact: deployers relying on agents fixing wrong sysctl defaults
                   will need to make sure bridge firewalling is enabled.
                   Also, the kernel module providing sysctl knobs must be
                   loaded before starting the agent, otherwise it will fail
                   to start.

    Depends-On: Id6bfd9595f0772a63d1096ef83ebbb6cd630fafd
    Change-Id: I9137ea017624ac92a05f73863b77f9ee4681bb...
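
A sanity check of the kind the commit message describes might look like the sketch below. It is not the check that was merged into Neutron; the function name, the `proc_root` parameter, and the choice to return (rather than fail on) knobs that exist but are not set to 1 are all assumptions for illustration. Python 3 is assumed:

```python
import os

KNOBS = (
    "bridge-nf-call-arptables",
    "bridge-nf-call-ip6tables",
    "bridge-nf-call-iptables",
)


def check_bridge_firewall(proc_root="/proc/sys/net/bridge"):
    """Fail if any knob is missing; return the knobs that are not set to 1."""
    missing = []
    disabled = []
    for knob in KNOBS:
        path = os.path.join(proc_root, knob)
        try:
            with open(path) as handle:
                value = handle.read().strip()
        except OSError:
            # The file does not exist or is unreadable: the knob is unavailable.
            missing.append(knob)
            continue
        if value != "1":
            disabled.append(knob)
    if missing:
        raise RuntimeError(
            "missing sysctl knobs (is br_netfilter loaded?): "
            + ", ".join(missing)
        )
    return disabled
```

Failing early on missing knobs follows the reasoning in the commit message: when it cannot be determined whether bridge firewalling is enabled, stopping is safer than relying on implicit module loading.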


Changed in neutron:
milestone: ocata-1 → ocata-2
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to neutron (stable/newton)

Related fix proposed to branch: stable/newton
Review: https://review.openstack.org/398817

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to devstack (stable/newton)

Related fix proposed to branch: stable/newton
Review: https://review.openstack.org/399142

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to neutron (stable/newton)

Related fix proposed to branch: stable/newton
Review: https://review.openstack.org/399661

OpenStack Infra (hudson-openstack) wrote : Related fix merged to devstack (stable/newton)

Reviewed: https://review.openstack.org/399142
Committed: https://git.openstack.org/cgit/openstack-dev/devstack/commit/?id=88dbdefc7ed0a074c473cf9eeaff31a1f8390ca4
Submitter: Jenkins
Branch: stable/newton

commit 88dbdefc7ed0a074c473cf9eeaff31a1f8390ca4
Author: Ihar Hrachyshka <email address hidden>
Date: Thu Sep 29 13:26:30 2016 +0000

    Enable bridge firewalling if iptables are used

    With the plan [1] to stop enabling it by Neutron iptables firewall
    driver itself, deployment tools should catch up and enable the firewall
    themselves.

    This is needed for distributions that decided to disable the kernel
    firewall by default (upstream kernel has it enabled). This is also
    needed for distributions that ship newer kernels but don't load the
    br_netfilter module before starting nova-network or Neutron iptables
    firewall driver. In the latter case, firewall may not work, depending on
    the order of operations executed by the driver.

    To isolate devstack setups from the difference in distribution
    kernel configuration and version, the following steps are done:

    - we load bridge kernel module, and br_netfilter if present, to get
      access to sysctl knobs controlling the firewall;
    - once knobs are available, we unconditionally set them to 1, to make
      sure the firewall is in effect.

    More details at:
    http://wiki.libvirt.org/page/Net.bridge.bridge-nf-call_and_sysctl.conf

    [1] I9137ea017624ac92a05f73863b77f9ee4681bbe7

    Change-Id: Id6bfd9595f0772a63d1096ef83ebbb6cd630fafd
    Related-Bug: #1622914
    (cherry picked from commit b3a210f643989603d192b32a40b2001664f8ed73)

tags: added: in-stable-newton
OpenStack Infra (hudson-openstack) wrote : Related fix merged to neutron (stable/newton)

Reviewed: https://review.openstack.org/398817
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=4371a4f5cdc6559955af9158c4c28851e77914da
Submitter: Jenkins
Branch: stable/newton

commit 4371a4f5cdc6559955af9158c4c28851e77914da
Author: Ihar Hrachyshka <email address hidden>
Date: Thu Sep 15 21:48:10 2016 +0000

    iptables: fail to start ovs/linuxbridge agents on missing sysctl knobs

    For new kernels (3.18+), bridge module is split into two pieces: bridge
    and br_netfilter. The latter provides firewall support for bridged
    traffic, as well as the following sysctl knobs:

    * net.bridge.bridge-nf-call-arptables
    * net.bridge.bridge-nf-call-ip6tables
    * net.bridge.bridge-nf-call-iptables

    Before kernel 3.18, any brctl command was loading the 'bridge' module
    with the knobs, so at the moment where we reached iptables setup, they
    were always available.

    With new 3.18+ kernels, brctl still loads 'bridge' module, but not
    br_netfilter. So bridge existence no longer guarantees us knobs'
    presence. If we reach _enable_netfilter_for_bridges before the new
    module is loaded, then the code will fail, triggering agent resync. It
    will also fail to enable bridge firewalling on systems where it's
    disabled by default (examples of those systems are most if not all Red
    Hat/Fedora based systems), making security groups completely
    ineffective.

    Systems that don't override default settings for those knobs would work
    fine except for this exception in the log file and agent resync. This is
    because the first attempt to add an iptables rule using 'physdev' module
    (-m physdev) will trigger the kernel module loading. In theory, we could
    silently swallow missing knobs, and still operate correctly. But on
    second thought, it's quite fragile to rely on that implicit module
    loading. In the case where we can't detect whether firewall is enabled,
    it's better to fail than hope for the best.

    An alternative to the proposed path could be trying
    to fix broken deployment, meaning we would need to load the missing
    kernel module on agent startup. It's not even clear whether we can
    assume the operation would be available to us. Even with that, adding a
    rootwrap filter to allow loading code in the kernel sounds quite scary.
    If we would follow the path, we would also hit an issue of
    distinguishing between cases of built-in kernel module vs. modular one.
    A complexity that is probably beyond what Neutron should fix.

    The patch introduces a sanity check that would fail on missing
    configuration knobs.

    DocImpact: document the new deployment requirement in operations guide
    UpgradeImpact: deployers relying on agents fixing wrong sysctl defaults
                   will need to make sure bridge firewalling is enabled.
                   Also, the kernel module providing sysctl knobs must be
                   loaded before starting the agent, otherwise it will fail
                   to start.

    Changes made to this backport:
       neutron/agent/linux/iptables_firewall.py
           - removed d...


OpenStack Infra (hudson-openstack) wrote :

Reviewed: https://review.openstack.org/399661
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=69f94f45dc5df1095bf94a1944e565a7a792a6a1
Submitter: Jenkins
Branch: stable/newton

commit 69f94f45dc5df1095bf94a1944e565a7a792a6a1
Author: Ihar Hrachyshka <email address hidden>
Date: Thu Sep 29 13:36:07 2016 +0000

    Fixed functional iptables firewall tests for newer kernels

    Iptables functional tests fail on Xenial and other newer kernels if
    br_netfilter kernel module is not loaded, in which case sysctl knobs to
    enable bridge firewalling are not available, and attempt to set them
    with _enable_netfilter_for_bridges fails.

    We should load the kernel module before running those tests. Luckily,
    devstack has a function for just that (plus more).

    Change-Id: I602d8cd02c73b18e9d719b19998e36059ae28cd8
    Depends-On: Id6bfd9595f0772a63d1096ef83ebbb6cd630fafd
    Related-Bug: #1622914
    (cherry picked from commit 80eb375ba1d58a41be1fcb6e11163f78cce8b65d)

Changed in neutron:
status: In Progress → Fix Released
Changed in devstack:
status: In Progress → Fix Released
Changed in tripleo:
milestone: none → ocata-2
Changed in tripleo:
milestone: ocata-2 → ocata-3
Changed in tripleo:
milestone: ocata-3 → pike-1
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to neutron (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/436315

OpenStack Infra (hudson-openstack) wrote : Related fix merged to neutron (master)

Reviewed: https://review.openstack.org/436315
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=c1dfb53bf1db1fe65ba6a8ef64a0b30151ee5c03
Submitter: Jenkins
Branch: master

commit c1dfb53bf1db1fe65ba6a8ef64a0b30151ee5c03
Author: Ihar Hrachyshka <email address hidden>
Date: Sat Feb 11 12:50:04 2017 +0000

    iptables: stop 'fixing' kernel sysctl bridge firewalling knobs

    Those are different on different kernel versions, and have reasonable
    default values on all newer kernel versions, including RHEL. We
    nevertheless made devstack set those in the past; now I propose to
    clean the code from neutron tree and leave it up to deployment tools to
    fix in an unlikely case the system has broken default values.

    Now that iptables firewall code does not trigger sysctl, we can also
    remove this filter from the corresponding rootwrap .filters file.

    DocImpact make sure deployment docs mention the expected sysctl knob
              values.

    Change-Id: Iabf61021c90b0536be274463d48fb5a572ecc023
    Related-Bug: #1622914
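
With the agent no longer touching these knobs, a deployment tool that wants them pinned could persist the values itself, for example through a sysctl.d drop-in. The sketch below only renders the file content; the function name is hypothetical, and the value of 1 reflects the setting devstack applies per this bug. Such a file is assumed to take effect only if the module providing the knobs is already loaded when sysctl processes it:

```python
# Suffixes of the three bridge-nf-call-* knobs named in this bug.
KNOBS = ("arptables", "ip6tables", "iptables")


def render_sysctl_conf(value=1):
    """Render sysctl.d-style lines enabling bridge firewalling."""
    lines = [
        "net.bridge.bridge-nf-call-%s = %d" % (knob, value)
        for knob in KNOBS
    ]
    return "\n".join(lines) + "\n"
```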

Changed in tripleo:
milestone: pike-1 → pike-2
Changed in tripleo:
milestone: pike-2 → pike-3
Emilien Macchi (emilienm) wrote :

There are currently no open reviews on this bug, so I am changing the status back to the previous state and unassigning. If there are active reviews related to this bug, please include links in comments.

Changed in tripleo:
assignee: Brent Eagles (beagles) → nobody
Changed in tripleo:
milestone: pike-3 → pike-rc1
Changed in tripleo:
milestone: pike-rc1 → queens-1
Changed in tripleo:
milestone: queens-1 → queens-2
Changed in tripleo:
milestone: queens-2 → queens-3
Changed in tripleo:
milestone: queens-3 → queens-rc1
Changed in tripleo:
milestone: queens-rc1 → rocky-1
Ihar Hrachyshka (ihar-hrachyshka) wrote :

Is this really an issue for tripleo, given that the neutron package already loads the needed kernel module?

Changed in tripleo:
milestone: rocky-1 → rocky-2
Changed in tripleo:
milestone: rocky-2 → rocky-3
Changed in tripleo:
milestone: rocky-3 → rocky-rc1
Changed in tripleo:
milestone: rocky-rc1 → stein-1
Changed in tripleo:
milestone: stein-1 → stein-2
Emilien Macchi (emilienm) wrote : Cleanup EOL bug report

This is an automated cleanup. This bug report has been closed because it
is older than 18 months and there is no open code change to fix this.
After this time it is unlikely that the circumstances which led to
the observed issue can be reproduced.

If you can reproduce the bug, please:
* reopen the bug report (set to status "New")
* AND add the detailed steps to reproduce the issue (if applicable)
* AND leave a comment "CONFIRMED FOR: <RELEASE_NAME>"
  Only still supported release names are valid (FUTURE, PIKE, QUEENS, ROCKY, STEIN).
  Valid example: CONFIRMED FOR: FUTURE

Changed in tripleo:
importance: Medium → Undecided
status: Triaged → Expired