[OSSA-2019-004] Ageing time of 0 disables linuxbridge MAC learning (CVE-2019-15753)

Bug #1837252 reported by James Denton on 2019-07-19
This bug affects 3 people
Affects                        Importance   Assigned to
OpenStack Compute (nova)       Undecided    Unassigned
OpenStack Security Advisory    High         Jeremy Stanley
neutron                        Undecided    Unassigned
os-vif (status tracked in Trunk)
  Stein                        High         sean mooney
  Trunk                        High         sean mooney

Bug Description

Release: OpenStack Stein
Driver: LinuxBridge

Using Stein w/ the LinuxBridge mech driver/agent, we have found that traffic is being flooded across bridges. Using tcpdump inside an instance, you can see unicast traffic for other instances.

We have confirmed the macs table shows the aging timer set to 0 for permanent entries, and the bridge is NOT learning new MACs:

root@lab-compute01:~# brctl showmacs brqd0084ac0-f7
port no mac addr is local? ageing timer
  5 24:be:05:a3:1f:e1 yes 0.00
  5 24:be:05:a3:1f:e1 yes 0.00
  1 fe:16:3e:02:62:18 yes 0.00
  1 fe:16:3e:02:62:18 yes 0.00
  7 fe:16:3e:07:65:47 yes 0.00
  7 fe:16:3e:07:65:47 yes 0.00
  4 fe:16:3e:1d:d6:33 yes 0.00
  4 fe:16:3e:1d:d6:33 yes 0.00
  9 fe:16:3e:2b:2f:f0 yes 0.00
  9 fe:16:3e:2b:2f:f0 yes 0.00
  8 fe:16:3e:3c:42:64 yes 0.00
  8 fe:16:3e:3c:42:64 yes 0.00
 10 fe:16:3e:5c:a6:6c yes 0.00
 10 fe:16:3e:5c:a6:6c yes 0.00
  2 fe:16:3e:86:9c:dd yes 0.00
  2 fe:16:3e:86:9c:dd yes 0.00
  6 fe:16:3e:91:9b:45 yes 0.00
  6 fe:16:3e:91:9b:45 yes 0.00
 11 fe:16:3e:b3:30:00 yes 0.00
 11 fe:16:3e:b3:30:00 yes 0.00
  3 fe:16:3e:dc:c3:3e yes 0.00
  3 fe:16:3e:dc:c3:3e yes 0.00

root@lab-compute01:~# bridge fdb show | grep brqd0084ac0-f7
01:00:5e:00:00:01 dev brqd0084ac0-f7 self permanent
fe:16:3e:02:62:18 dev tap74af38f9-2e master brqd0084ac0-f7 permanent
fe:16:3e:02:62:18 dev tap74af38f9-2e vlan 1 master brqd0084ac0-f7 permanent
fe:16:3e:86:9c:dd dev tapb00b3c18-b3 master brqd0084ac0-f7 permanent
fe:16:3e:86:9c:dd dev tapb00b3c18-b3 vlan 1 master brqd0084ac0-f7 permanent
fe:16:3e:dc:c3:3e dev tap7284d235-2b master brqd0084ac0-f7 permanent
fe:16:3e:dc:c3:3e dev tap7284d235-2b vlan 1 master brqd0084ac0-f7 permanent
fe:16:3e:1d:d6:33 dev tapbeb9441a-99 vlan 1 master brqd0084ac0-f7 permanent
fe:16:3e:1d:d6:33 dev tapbeb9441a-99 master brqd0084ac0-f7 permanent
24:be:05:a3:1f:e1 dev eno1.102 vlan 1 master brqd0084ac0-f7 permanent
24:be:05:a3:1f:e1 dev eno1.102 master brqd0084ac0-f7 permanent
fe:16:3e:91:9b:45 dev tapc8ad2cec-90 master brqd0084ac0-f7 permanent
fe:16:3e:91:9b:45 dev tapc8ad2cec-90 vlan 1 master brqd0084ac0-f7 permanent
fe:16:3e:07:65:47 dev tap86e2c412-24 master brqd0084ac0-f7 permanent
fe:16:3e:07:65:47 dev tap86e2c412-24 vlan 1 master brqd0084ac0-f7 permanent
fe:16:3e:3c:42:64 dev tap37bcb70e-9e master brqd0084ac0-f7 permanent
fe:16:3e:3c:42:64 dev tap37bcb70e-9e vlan 1 master brqd0084ac0-f7 permanent
fe:16:3e:2b:2f:f0 dev tap40f6be7c-2d vlan 1 master brqd0084ac0-f7 permanent
fe:16:3e:2b:2f:f0 dev tap40f6be7c-2d master brqd0084ac0-f7 permanent
fe:16:3e:b3:30:00 dev tap6548bacb-c0 vlan 1 master brqd0084ac0-f7 permanent
fe:16:3e:b3:30:00 dev tap6548bacb-c0 master brqd0084ac0-f7 permanent
fe:16:3e:5c:a6:6c dev tap61107236-1e vlan 1 master brqd0084ac0-f7 permanent
fe:16:3e:5c:a6:6c dev tap61107236-1e master brqd0084ac0-f7 permanent
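
A quick way to confirm that a bridge has learned nothing is to filter the `bridge fdb show` text for entries that are neither permanent nor static. The following is a minimal sketch, not part of the original report; the function name `learned_fdb_entries` is hypothetical, and it assumes the whitespace-separated layout shown in the output above, where a dynamically learned entry carries no `permanent` or `static` flag.

```python
def learned_fdb_entries(fdb_output, bridge):
    """Return MACs that appear to have been learned dynamically on `bridge`.

    `fdb_output` is the text printed by `bridge fdb show`. Entries flagged
    "permanent" or "static" are skipped, as are entries for other bridges.
    """
    learned = []
    for line in fdb_output.splitlines():
        fields = line.split()
        try:
            # The token after "master" names the bridge the port belongs to.
            master = fields[fields.index("master") + 1]
        except (ValueError, IndexError):
            continue  # no "master" token, or nothing after it
        if master != bridge:
            continue
        if "permanent" in fields or "static" in fields:
            continue
        learned.append(fields[0])
    return learned
```

Run against the output above, this would return an empty list, which matches the observation that the bridge is not learning new MACs.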

The ageing time for the bridge is set to 0:

root@lab-compute01:~# brctl showstp brqd0084ac0-f7
brqd0084ac0-f7
 bridge id 8000.24be05a31fe1
 designated root 8000.24be05a31fe1
 root port 0 path cost 0
 max age 20.00 bridge max age 20.00
 hello time 2.00 bridge hello time 2.00
 forward delay 0.00 bridge forward delay 0.00
 ageing time 0.00
 hello timer 0.00 tcn timer 0.00
 topology change timer 0.00 gc timer 0.00
 flags
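
For checking many hosts, the bridge-level ageing time can be pulled out of the `brctl showstp` text with a small parser. This is a sketch only, not part of the original report; the function name `ageing_time_seconds` is hypothetical, and it assumes the "ageing time <seconds>" line format shown in the output above.

```python
import re


def ageing_time_seconds(showstp_output):
    """Return the bridge 'ageing time' in seconds, or None if it is absent.

    `showstp_output` is the text printed by `brctl showstp <bridge>`.
    """
    match = re.search(
        r"^\s*ageing time\s+([0-9]+(?:\.[0-9]+)?)",
        showstp_output,
        re.MULTILINE,
    )
    if match is None:
        return None
    return float(match.group(1))
```

A result of 0.0 corresponds to the affected state described in this report; 300.0 corresponds to the default mentioned below.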

The default ageing time of 300 is being overridden by the value set here:

Stein: https://github.com/openstack/os-vif/blob/stable/stein/os_vif/internal/command/ip/linux/impl_pyroute2.py#L89

Master: https://github.com/openstack/os-vif/blob/master/os_vif/internal/ip/linux/impl_pyroute2.py#L89

I am not sure of the behavior in OVS environments using the iptables firewall, but I have confirmed the 'qbr' bridges also have an ageing time of 0 (formerly 300).

Please let me know if you have any questions.

Hongbin Lu (hongbin.lu) wrote :

This doesn't seem to be a neutron bug. Awaiting feedback from the os-vif team.

Changed in neutron:
status: New → Incomplete
sean mooney (sean-k-mooney) wrote :

Triaging as high, as flooding could lead to network disruption to guests on multiple hosts.

I have root-caused this as a result of combining the code into a single shared code path between the OVS and Linux bridge plugins.

For OVS hybrid plug we set the ageing to 0 to prevent packet loss during live migration:

https://github.com/openstack/os-vif/commit/fa4ff64b86e6e1b6399f7250eadbee9775c22d32#diff-f55bc78ffb4c10000bbf81b88bf68673

However, this is not valid for Linux bridge in general.

https://github.com/openstack/os-vif/commit/1f6fed6a69e9fd386e421f3cacae97c11cdd7c75#diff-010d1833da7ca175fffc8c41a38497c2

which replaced the use of brctl in the Linux bridge driver, reused the common code I introduced in

https://github.com/openstack/os-vif/commit/5027ce833c6fccaa80b5ddc8544d262c0bf99dbd#diff-cec1a2ac6413663c344b607129c39fab

and as a result it picked up the OVS ageing code, which was not intentional.

I'll fix this shortly and backport it.

Changed in os-vif:
assignee: nobody → sean mooney (sean-k-mooney)
importance: Undecided → High
status: New → Confirmed
Changed in nova:
status: New → Invalid
Changed in neutron:
status: Incomplete → Invalid
Jeremy Stanley (fungi) wrote :

Since this report concerns a possible security risk, an incomplete security advisory task has been added while the core security reviewers for the affected project or projects confirm the bug and discuss the scope of any vulnerability along with potential solutions.

information type: Public → Public Security
Changed in ossa:
status: New → Incomplete
Jeremy Stanley (fungi) wrote :

This sounds remarkably similar to the symptoms described in bug 1732067 (and its marked duplicates bug 1813439 and bug 1825147).

sean mooney (sean-k-mooney) wrote :

It is the same in the context of the linuxbridge backend.

In the case of OVS with hybrid plug it's different but similar.

The expected behaviour of any MAC learning bridge is that if it does not have a MAC entry for a unicast destination MAC, it should flood the frame to all ports in the same L2 broadcast domain except its origin port (split-horizon switching).

That by itself is not a security risk. In a typical case the MAC learning table will have been populated by the initial ARP request and response that the source would have performed to determine the destination MAC from its IP.

If we are doing ARP suppression using the l2 pop driver, it should also have installed flows to prevent flooding.

In any case, I'm currently working on a fix for this.

I'll take a look at the other bug later, as I have only skimmed it.

Fix proposed to branch: master
Review: https://review.opendev.org/672834

Changed in os-vif:
status: Confirmed → In Progress

It is still not clear to me how this is different from #1732067

sean mooney (sean-k-mooney) wrote :

Well, for one, this bug applies solely to deployments with ml2/linuxbridge, and
the other applies only to deployments with ml2/ovs with the OVS conntrack
firewall driver, so they involve two completely different network backends.

Reviewed: https://review.opendev.org/672834
Committed: https://git.openstack.org/cgit/openstack/os-vif/commit/?id=655c83d706f5de8a8cf23430782e065219297aef
Submitter: Zuul
Branch: master

commit 655c83d706f5de8a8cf23430782e065219297aef
Author: Sean Mooney <email address hidden>
Date: Thu Jul 25 22:16:42 2019 +0000

    only disable mac ageing for ovs hybrid plug

    The mac ageing configuration on linux bridges is now
    conditional and caller controlled. By default mac ageing
    is unspecified and will use the kernel's default of 300
    seconds. For ovs with hybrid plug we override this to
    0 to prevent packet loss issue during some migration
    edgecases. This change reverts disabling mac ageing
    for the linux bridge plugin which was accidentally
    introduced during the brctl removal via inheriting the
    ovs plugin's default behavior when the bridge create
    code became shared.

    Change-Id: I95612352de6cdb47de98eb80c208dd1a74499d41
    Closes-bug: #1837252

Changed in os-vif:
status: In Progress → Fix Released
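
The commit message above describes making the ageing value conditional and caller controlled. A minimal sketch of that idea follows; it is not the os-vif source. The function name `bridge_create_attrs`, the constant `OVS_HYBRID_AGEING`, and the dictionary layout are all hypothetical, and the `IFLA_BR_AGEING_TIME` key is used here only as a label for the netlink attribute named elsewhere in this report.

```python
# Hypothetical constant: the value the commit message says is used for
# ovs hybrid plug bridges.
OVS_HYBRID_AGEING = 0


def bridge_create_attrs(name, ageing=None):
    """Build the attributes for creating a bridge (illustrative only).

    When `ageing` is None the ageing attribute is left out entirely, so the
    kernel default (300 seconds, per the commit message) applies.
    """
    attrs = {"ifname": name, "kind": "bridge"}
    if ageing is not None:
        attrs["IFLA_BR_AGEING_TIME"] = ageing
    return attrs


# linuxbridge plugin: do not specify ageing, keep the kernel default.
linuxbridge_attrs = bridge_create_attrs("brq-example")

# ovs hybrid plug: the caller explicitly asks for ageing of 0.
ovs_hybrid_attrs = bridge_create_attrs("qbr-example", ageing=OVS_HYBRID_AGEING)
```

The point of the design is that the shared bridge-creation helper no longer carries a backend-specific default; each plugin states what it needs.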

I see there's a series bugtask confirmed for Stein. Does this affect other branches presently under stable maintenance?

Also, as openstack/os-vif is not tagged vulnerability:managed in governance and the Nova bugtask was invalidated, I'm marking our Advisory task Won't Fix but am still happy to assist the maintainers with any advisory they consider relevant.

Changed in ossa:
status: Incomplete → Won't Fix
sean mooney (sean-k-mooney) wrote :

No, this bug was introduced in Stein. I'll be working on the backport, so we will aim to include it in the next stable/stein release.

older branches are not affected.

> Also, as openstack/os-vif is not tagged vulnerability:managed in governance and the Nova bugtask was invalidated, I'm marking our Advisory task Won't Fix but am still happy to assist the maintainers with any advisory they consider relevant.

Based on what mnaser said in #openstack-nova today [1], this bug involves VMs being able to see other VMs' network traffic, which is a serious security issue worthy of an advisory for operators, IMHO. Does anyone else agree?

And if I'm not missing something about the severity of this issue, should openstack/os-vif be tagged vulnerability:managed, to ensure we get proper security handling of bugs in the future?

[1] http://eavesdrop.openstack.org/irclogs/%23openstack-nova/%23openstack-nova.2019-08-22.log.html#t2019-08-22T19:48:38

Jeremy Stanley (fungi) wrote :

For a while I've been meaning to raise the topic of dropping requirement #5 from https://governance.openstack.org/tc/reference/tags/vulnerability_managed.html#requirements since it was a high bar to clear and even projects which were previously under vulnerability management before the tag existed did not retroactively undergo threat analysis. While I still think it would be swell to have architectural info on critical OpenStack components, the volume of vulnerability reports we've received in recent years is low enough that I think we could cover more projects even without that. I did bring this up with the other members of the OpenStack VMT and there was no disagreement, so I'll start a thread about that on the ML.

I'll go ahead and draft an impact description since it looks like the stable/stein change is passing and likely to merge, and then request a CVE assignment and prepare to issue an advisory.

Changed in ossa:
status: Won't Fix → Confirmed
Jeremy Stanley (fungi) wrote :

Below is a proposed impact description. Please review and let me know if it needs to be adjusted before I request a CVE assignment with it. In particular, I was unsure which versions were affected so I went with the earliest release from the Stein cycle (per comment #11 above). James, also let me know if there is an employer or other organization you would like credited along with your name.

Title: Ageing time of 0 fills linuxbridge MAC tables
Reporter: James Denton
Products: os-vif
Affects: >=1.12.0<1.15.2, 1.16.0

Description:
James Denton reported a vulnerability in os-vif, the Nova/Neutron
network integration library. The hard-coded MAC ageing time of 0
causes rapid filling of linuxbridge tables, often resulting in
Ethernet flooding which both slows network performance significantly
and allows users to possibly view the content of packets for
instances belonging to other tenants sharing the same network.
Only deployments using the linuxbridge backend are affected.

Logan V (loganv) wrote :

The statement “The hard-coded MAC ageing time of 0 causes rapid filling of linuxbridge tables” is incorrect. Setting the ageing time to 0 disables MAC learning, which causes all instances' traffic to be flooded due to the lack of a bridge FDB entry.

Jeremy Stanley (fungi) wrote :

Thanks Logan! So how about this:

Title: Ageing time of 0 disables linuxbridge MAC learning
Reporter: James Denton
Products: os-vif
Affects: >=1.12.0<1.15.2, 1.16.0

Description:
James Denton reported a vulnerability in os-vif, the Nova/Neutron
network integration library. The hard-coded MAC ageing time of 0
disables MAC learning in linuxbridge, forcing obligatory
Ethernet flooding which both slows network performance significantly
and allows users to possibly view the content of packets for
instances belonging to other tenants sharing the same network.
Only deployments using the linuxbridge backend are affected.

Logan V (loganv) wrote :

Thanks fungi. Looks good to me!

Impact description of #15 looks good to me too, thanks fungi.

Jeremy Stanley (fungi) wrote :

After discussing this with Sean in #openstack-nova on Freenode, it seems this bug was introduced in commit 1f6fed6 which first appeared in os-vif 1.15.0, making the corrected version list...

Affects: >=1.15.0<1.15.2, 1.16.0

Also, seeking clarification on the exact nature of the flooding, it seems that IFLA_BR_AGEING_TIME=0 disabling learning results in the bridge table containing only local MACs (those which are added explicitly). Flooding then happens when a frame transiting the bridge, most likely one emitted from a local instance, is addressed for a non-local MAC and is then forwarded to all local ports as well as the uplinks to the host-external network, because linuxbridge doesn't know where that MAC resides. Does this summary accurately characterize the scenario?
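
The scenario summarised above can be illustrated with a toy model of a learning bridge. This is a simplified sketch under stated assumptions, not a model of the kernel bridge code: the class `ToyBridge`, the function `demo`, and the port and MAC names are all hypothetical, and "learning disabled" stands in for the effect of an ageing time of 0 as described in this thread.

```python
class ToyBridge:
    """A toy learning bridge: permanent entries plus optional learning."""

    def __init__(self, ports, learning=True):
        self.ports = set(ports)
        self.learning = learning
        self.permanent = {}  # MAC -> port, added explicitly
        self.learned = {}    # MAC -> port, filled in from source addresses

    def add_permanent(self, mac, port):
        self.permanent[mac] = port

    def forward(self, src_mac, dst_mac, in_port):
        """Return the sorted list of ports the frame is sent out of."""
        if self.learning and src_mac not in self.permanent:
            self.learned[src_mac] = in_port
        out_port = self.permanent.get(dst_mac, self.learned.get(dst_mac))
        if out_port is None:
            # Unknown destination: flood to every port except the ingress.
            return sorted(self.ports - {in_port})
        return [] if out_port == in_port else [out_port]


def demo(learning):
    bridge = ToyBridge(["tap-a", "tap-b", "uplink"], learning=learning)
    bridge.add_permanent("mac-a", "tap-a")
    bridge.add_permanent("mac-b", "tap-b")
    # A frame from a remote host arrives on the uplink first.
    bridge.forward("mac-remote", "mac-a", "uplink")
    # The local instance then sends unicast traffic to the remote host.
    return bridge.forward("mac-a", "mac-remote", "tap-a")
```

With learning enabled, `demo(True)` sends the second frame only to the uplink. With learning disabled, `demo(False)` floods it to both `tap-b` and `uplink`, so the neighbouring instance on `tap-b` receives traffic that was not addressed to it.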

Gavin Grabias (gavingrabe) wrote :

Fungi, That seems accurate. I was the one who originally noticed this behavior.

James Denton (james-denton) wrote :

@fungi Feel free to list Rackspace as the employer/organization, if there's still time. Otherwise, don't worry about it. Thanks!

Jeremy Stanley (fungi) wrote :

Thanks Gavin, James, Sean, Logan et al! Here's a final draft of the impact description; if there are no further objections I'll use it to request a CVE assignment from MITRE tomorrow:

Title: Ageing time of 0 disables linuxbridge MAC learning
Reporter: James Denton (Rackspace)
Products: os-vif
Affects: >=1.15.0<1.15.2, 1.16.0

Description:
James Denton with Rackspace reported a vulnerability in os-vif, the
Nova/Neutron network integration library. A hard-coded MAC ageing
time of 0 disables MAC learning in linuxbridge, forcing obligatory
Ethernet flooding non-local destinations which both impedes network
performance and allows users to possibly view the content of packets
for instances belonging to other tenants sharing the same network.
Only deployments using the linuxbridge backend are affected.

Jeremy Stanley (fungi) wrote :

Er, that should be "forcing obligatory Ethernet flooding *for* non-local destinations" (I'll be sure to incorporate that edit in the official version).

Jeremy Stanley (fungi) on 2019-08-28
Changed in ossa:
status: Confirmed → In Progress
Jeremy Stanley (fungi) on 2019-08-28
summary: - IFLA_BR_AGEING_TIME of 0 causes flooding across bridges
+ Ageing time of 0 disables linuxbridge MAC learning (CVE-2019-15753)
Jeremy Stanley (fungi) on 2019-08-28
summary: - Ageing time of 0 disables linuxbridge MAC learning (CVE-2019-15753)
+ [OSSA-2019-004] Ageing time of 0 disables linuxbridge MAC learning
+ (CVE-2019-15753)

Reviewed: https://review.opendev.org/678098
Committed: https://git.openstack.org/cgit/openstack/os-vif/commit/?id=ec9d5430300c908ea9a1c64151eee7af522a44e7
Submitter: Zuul
Branch: stable/stein

commit ec9d5430300c908ea9a1c64151eee7af522a44e7
Author: Sean Mooney <email address hidden>
Date: Thu Jul 25 22:16:42 2019 +0000

    only disable mac ageing for ovs hybrid plug

    The mac ageing configuration on linux bridges is now
    conditional and caller controlled. By default mac ageing
    is unspecified and will use the kernel's default of 300
    seconds. For ovs with hybrid plug we override this to
    0 to prevent packet loss issue during some migration
    edgecases. This change reverts disabling mac ageing
    for the linux bridge plugin which was accidentally
    introduced during the brctl removal via inheriting the
    ovs plugin's default behavior when the bridge create
    code became shared.

    Backport Changes:
      In the train cycle we removed the os_vif.internal.command
      module in Id8b71172fb06b435cf169a7e55c11233f22fa65b to eliminate
      one layer of indirection. As a result we need to addtionally
      update the add method in os_vif/internal/command/ip/__init__.py
      which was not required in the train patch.

    Change-Id: I95612352de6cdb47de98eb80c208dd1a74499d41
    Closes-bug: #1837252
    (cherry picked from commit 655c83d706f5de8a8cf23430782e065219297aef)

Reviewed: https://review.opendev.org/679112
Committed: https://git.openstack.org/cgit/openstack/ossa/commit/?id=59342fd8cfd161d9ca80a3ff06fe472da5eb95ef
Submitter: Zuul
Branch: master

commit 59342fd8cfd161d9ca80a3ff06fe472da5eb95ef
Author: Jeremy Stanley <email address hidden>
Date: Wed Aug 28 18:38:29 2019 +0000

    Add OSSA-2019-004 ($CVE)

    Change-Id: I915b0d74577dd9badee6f60300a67b88dc539e03
    Related-Bug: #1837252

Jeremy Stanley (fungi) on 2019-08-29
Changed in ossa:
status: In Progress → Fix Released
importance: Undecided → High
assignee: nobody → Jeremy Stanley (fungi)

This issue was fixed in the openstack/os-vif 1.17.0 release.

This issue was fixed in the openstack/os-vif 1.15.2 release.
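
For operators who want to check an installed os-vif version against the "Affects" range in the final impact description (>=1.15.0<1.15.2, 1.16.0), a small comparison like the following could be used. It is a sketch only; the function names are hypothetical, and it assumes plain three-part numeric version strings.

```python
def parse_version(text):
    """Turn a version string such as '1.15.1' into a tuple of integers."""
    return tuple(int(part) for part in text.split("."))


def is_affected(version):
    """Return True if `version` falls in the advisory's affected range."""
    v = parse_version(version)
    in_stein_range = (1, 15, 0) <= v < (1, 15, 2)
    return in_stein_range or v == (1, 16, 0)
```

Under this check, 1.15.0, 1.15.1 and 1.16.0 are reported as affected, while 1.15.2 and 1.17.0, the two fixed releases noted above, are not.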
