Eavesdropping private traffic

Bug #1734320 reported by Paul Peereboom on 2017-11-24
Affects                        Importance   Assigned to
OpenStack Compute (nova)       Undecided    sean mooney
OpenStack Security Advisory    Undecided    Unassigned
neutron                        High         sean mooney
os-vif                         Undecided    sean mooney

Bug Description

Eavesdropping private traffic
=============================

Abstract
--------

We've discovered a security issue that allows end users within their own private network to receive traffic from, and send traffic to, other private networks on the same compute node.

Description
-----------

During live migration there is a small time window in which the ports of instances are untagged. An untagged port is effectively trunked to the integration bridge, so the instance receives 802.1Q-tagged private traffic from other tenants.

If the port is administratively down during live migration, the port will remain in trunk mode indefinitely.

Traffic is possible between ports that are administratively down, even between tenants' self-service networks.

Conditions
----------

The following conditions are necessary.

* Open vSwitch self-service networks
* An OpenStack administrator or an automated process needs to schedule a live migration

We tested this on Newton.

Issues
------

This outcome is the result of multiple independent issues. We will list the most important first, and follow with bugs that create a fragile situation.

Issue #1 Initially creating a trunk port

When the port is initially created, it is in trunk mode. This creates a fail-open situation.
See: https://github.com/openstack/os-vif/blob/newton-eol/vif_plug_ovs/linux_net.py#L52
Recommendation: create ports in the port_dead state; don't leave them dangling in trunk mode. Add a drop flow initially.

Issue #2 Order of creation.

The instance is actually migrated before the (networking) configuration is completed.

Recommendation: do not finish the live migration until the underlying network configuration has been applied completely.
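A minimal sketch of that recommendation, with all names hypothetical (nova's real mechanism for this is its external-event API, e.g. the network-vif-plugged notification): the migration path blocks until the network backend confirms the port is wired, and fails closed otherwise.

    import threading

    class PortWiringBarrier:
        """Hypothetical barrier: resume the guest on the destination only
        once neutron has confirmed the port is fully wired up."""

        def __init__(self, timeout=300):
            self._wired = threading.Event()
            self._timeout = timeout

        def on_vif_plugged(self, port_id):
            # Would be driven by neutron's network-vif-plugged notification.
            self._wired.set()

        def wait_before_resume(self):
            # Fail closed: abort rather than resume on an unwired port.
            if not self._wired.wait(self._timeout):
                raise RuntimeError("port wiring not confirmed; aborting resume")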

Issue #3 Not closing the port when it is down.

Neutron calls the port_dead function to ensure the port is down. It sets the tag to 4095 and adds a "drop" flow if (and only if) there is already another tag on the port. The port_dead function will keep untagged ports untagged.

https://github.com/openstack/neutron/blob/stable/newton/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py#L995

Recommendation: Make port_dead also shut the port if no tag is found. Log a warning if this happens.

Issue #4 Putting the port administratively down actually puts the port on a compute node shared vlan

Instances from different projects on different private networks can talk to each other if they put their ports down. The code does install an OpenFlow "drop" rule, but it has a lower priority (2) than the allow rules, and OpenFlow always applies the highest-priority matching rule, so the allow rules win.

Recommendation: Increase the port_dead openflow drop rule priority to MAX

Timeline
--------

2017-09-14  Discovered eavesdropping issue
2017-09-15  Verified workaround
2017-10-04  Discovered port-down-traffic issue
2017-11-24  Vendor disclosure to OpenStack

Steps to reproduce
------------------

1. Attach an instance to two networks:

admin$ openstack server create --nic net-id=<net-uuid1> --nic net-id=<net-uuid2> --image <image_id> --flavor <flavor_id> instance_temp

2. Attach a FIP to the instance to be able to log in to this instance

3. Verify:

admin$ openstack server show -c name -c addresses fe28a2ee-098f-4425-9d3c-8e2cd383547d

+-----------+-----------------------------------------------------------------------------+
| Field | Value |
+-----------+-----------------------------------------------------------------------------+
| addresses | network1=192.168.99.8, <FIP>; network2=192.168.80.14 |
| name | instance_temp |
+-----------+-----------------------------------------------------------------------------+

4. SSH to the instance using network1 and run a tcpdump on the other port, network2

[root@instance_temp]$ tcpdump -eeenni eth1

5. Get port-id of network2

admin$ nova interface-list fe28a2ee-098f-4425-9d3c-8e2cd383547d
+------------+--------------------------------------+--------------------------------------+---------------+-------------------+
| Port State | Port ID | Net ID | IP addresses | MAC Addr |
+------------+--------------------------------------+--------------------------------------+---------------+-------------------+
| ACTIVE | a848520b-0814-4030-bb48-49e4b5cf8160 | d69028f7-9558-4f14-8ce6-29cb8f1c19cd | 192.168.80.14 | fa:16:3e:2d:8b:7b |
| ACTIVE | fad148ca-cf7a-4839-aac3-a2cd8d1d2260 | d22c22ae-0a42-4e3b-8144-f28534c3439a | 192.168.99.8 | fa:16:3e:60:2c:fa |
+------------+--------------------------------------+--------------------------------------+---------------+-------------------+

6. Force port down on network 2

admin$ neutron port-update a848520b-0814-4030-bb48-49e4b5cf8160 --admin-state-up False

7. Port gets tagged with vlan 4095, the dead vlan tag, which is normal:

compute1# grep a848520b-0814-4030-bb48-49e4b5cf8160 /var/log/neutron/neutron-openvswitch-agent.log | tail -1
INFO neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [req-e008feb3-8a35-4c97-adac-b48ff88165b2 - - - - -] VIF port: a848520b-0814-4030-bb48-49e4b5cf8160 admin state up disabled, putting on the dead VLAN

8. Verify the port is tagged with vlan 4095

compute1# ovs-vsctl show | grep -A3 qvoa848520b-08
      Port "qvoa848520b-08"
          tag: 4095
          Interface "qvoa848520b-08"

9. Now live-migrate the instance:

admin# nova live-migration fe28a2ee-098f-4425-9d3c-8e2cd383547d

10. Verify the tag is gone on compute2, and take a deep breath

compute2# ovs-vsctl show | grep -A3 qvoa848520b-08
      Port "qvoa848520b-08"
          Interface "qvoa848520b-08"
      Port...
compute2# echo "Wut!"

11. Now traffic of all other self-service networks present on compute2 can be sniffed from instance_temp

[root@instance_temp] tcpdump -eenni eth1
13:14:31.748266 fa:16:3e:6a:17:38 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 10, p 0, ethertype ARP, Request who-has 10.103.12.160 tell 10.103.12.152, length 28
13:14:31.804573 fa:16:3e:e8:a2:d2 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 33, p 0, ethertype ARP, Request who-has 10.0.1.9 tell 10.0.1.70, length 28
13:14:31.810482 fa:16:3e:95:ca:3a > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 33, p 0, ethertype ARP, Request who-has 10.0.1.9 tell 10.0.1.154, length 28
13:14:31.977820 fa:16:3e:6f:f4:9b > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 33, p 0, ethertype ARP, Request who-has 10.0.1.9 tell 10.0.1.150, length 28
13:14:31.979590 fa:16:3e:0f:3d:cc > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 9, p 0, ethertype ARP, Request who-has 10.103.9.163 tell 10.103.9.1, length 28
13:14:32.048082 fa:16:3e:65:64:38 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 33, p 0, ethertype ARP, Request who-has 10.0.1.9 tell 10.0.1.101, length 28
13:14:32.127400 fa:16:3e:30:cb:b5 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 10, p 0, ethertype ARP, Request who-has 10.103.12.160 tell 10.103.12.165, length 28
13:14:32.141982 fa:16:3e:96:cd:b0 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 33, p 0, ethertype ARP, Request who-has 10.0.1.9 tell 10.0.1.100, length 28
13:14:32.205327 fa:16:3e:a2:0b:76 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 33, p 0, ethertype ARP, Request who-has 10.0.1.9 tell 10.0.1.153, length 28
13:14:32.444142 fa:16:3e:1f:db:ed > 01:00:5e:00:00:12, ethertype 802.1Q (0x8100), length 58: vlan 72, p 0, ethertype IPv4, 192.168.99.212 > 224.0.0.18: VRRPv2, Advertisement, vrid 50, prio 103, authtype none, intvl 1s, length 20
13:14:32.449497 fa:16:3e:1c:24:c0 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 33, p 0, ethertype ARP, Request who-has 10.0.1.9 tell 10.0.1.20, length 28
13:14:32.476015 fa:16:3e:f2:3b:97 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 33, p 0, ethertype ARP, Request who-has 10.0.1.9 tell 10.0.1.22, length 28
13:14:32.575034 fa:16:3e:44:fe:35 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 10, p 0, ethertype ARP, Request who-has 10.103.12.160 tell 10.103.12.163, length 28
13:14:32.676185 fa:16:3e:1e:92:d7 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 10, p 0, ethertype ARP, Request who-has 10.103.12.160 tell 10.103.12.150, length 28
13:14:32.711755 fa:16:3e:99:6c:c8 > 01:00:5e:00:00:12, ethertype 802.1Q (0x8100), length 62: vlan 10, p 0, ethertype IPv4, 10.103.12.154 > 224.0.0.18: VRRPv2, Advertisement, vrid 2, prio 49, authtype simple, intvl 1s, length 24
13:14:32.711773 fa:16:3e:f5:23:d5 > 01:00:5e:00:00:12, ethertype 802.1Q (0x8100), length 58: vlan 12, p 0, ethertype IPv4, 10.103.15.154 > 224.0.0.18: VRRPv2, Advertisement, vrid 1, prio 49, authtype simple, intvl 1s, length 20

Workaround
----------

We temporarily fixed this issue by forcing the dead VLAN tag at port creation on the compute nodes:

/usr/lib/python2.7/site-packages/vif_plug_ovs/linux_net.py:

 def _create_ovs_vif_cmd(bridge, dev, iface_id, mac,
                         instance_id, interface_type=None,
                         vhost_server_path=None):
+    # ODCN: initialize port as dead
+    # ODCN: TODO: set drop flow
     cmd = ['--', '--if-exists', 'del-port', dev, '--',
            'add-port', bridge, dev,
+           'tag=4095',
            '--', 'set', 'Interface', dev,
            'external-ids:iface-id=%s' % iface_id,
            'external-ids:iface-status=active',
            'external-ids:attached-mac=%s' % mac,
            'external-ids:vm-uuid=%s' % instance_id]
     if interface_type:
         cmd += ['type=%s' % interface_type]
     if vhost_server_path:
         cmd += ['options:vhost-server-path=%s' % vhost_server_path]
     return cmd
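To check that the workaround took effect, a freshly plugged port should carry the dead tag immediately. A small verification helper (hypothetical, shelling out to ovs-vsctl):

    import subprocess

    def ovs_port_tag(dev):
        # Read the Port's tag column from OVSDB; returns e.g. '4095' for
        # a dead port, or '[]' when the port is (dangerously) untagged.
        out = subprocess.check_output(
            ["ovs-vsctl", "get", "Port", dev, "tag"])
        return out.decode().strip()

    # Example, using the port name from the reproduction steps above:
    assert ovs_port_tag("qvoa848520b-08") == "4095"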

https://github.com/openstack/neutron/blob/stable/newton/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py#L995

     def port_dead(self, port, log_errors=True):
         '''Once a port has no binding, put it on the "dead vlan".

         :param port: an ovs_lib.VifPort object.
         '''
         # Don't kill a port if it's already dead
         cur_tag = self.int_br.db_get_val("Port", port.port_name, "tag",
                                          log_errors=log_errors)
+        # ODCN GM 20170915
+        if not cur_tag:
+            LOG.error('port_dead(): port %s has no tag', port.port_name)
+        # ODCN AJS 20170915
-        if cur_tag and cur_tag != constants.DEAD_VLAN_TAG:
+        if not cur_tag or cur_tag != constants.DEAD_VLAN_TAG:
+            LOG.info('port_dead(): put port %s on dead vlan', port.port_name)
             self.int_br.set_db_attribute("Port", port.port_name, "tag",
                                          constants.DEAD_VLAN_TAG,
                                          log_errors=log_errors)
             self.int_br.drop_port(in_port=port.ofport)

plugins/ml2/drivers/openvswitch/agent/openflow/ovs_ofctl/ovs_bridge.py

     def drop_port(self, in_port):
+        # ODCN AJS 20171004:
-        self.install_drop(priority=2, in_port=in_port)
+        self.install_drop(priority=65535, in_port=in_port)

Regards,

ODC Noord.
Gerhard Muntingh
Albert Siersema
Paul Peereboom

CVE References

CVE-2018-14636

Since this report concerns a possible security risk, an incomplete security advisory task has been added while the core security reviewers for the affected project or projects confirm the bug and discuss the scope of any vulnerability along with potential solutions.

Changed in ossa:
status: New → Incomplete
description: updated

Looking.

My fear is that this vulnerability might be present on releases newer than Newton.

Kevin Benton (kevinbenton) wrote :

The other thing that surprised me about this is that there are rules forwarding traffic to the port when it doesn't have a VLAN tag. Is it the behavior of the NORMAL action to flood to all untagged ports as if they were trunks carrying all VLANs?

The other suggested fixes all seem to be working under the assumption that traffic will flow to ports if no rules are installed. I think it would be safest to ensure that a newly added port on br-int doesn't receive any traffic until flows are set up. Going that route would also avoid having to ban all earlier versions of os-vif, because a VLAN tag would not need to be set on plugging.

Should we subscribe the Nova project for issue #2 (Order of creation)?

Also, isn't instance migration an admin-only operation? If so, can a regular user abuse this bug without an admin's cooperation?

@Tristan: perhaps looping in the nova project is not a bad idea, and yes, you're right that migration can only be performed by an admin according to the default policy.json.

Today I was looking at this a bit to see if 'resizing' would expose the same behavior, as that is available to regular users. I shall be able to find more time tomorrow to assess what's going on in Pike and Ocata.

Thanks Armando for the prompt feedback. I've subscribed nova-coresec to discuss the scope of this vulnerability, in particular issue #2.

@nova-coresec, is it correct to assume migration operations are restricted to admins? If a deployment does authorize regular users to do migration operations, wouldn't it be vulnerable to other unexpected issues anyway?

Paul Peereboom (peereb) wrote :

@Tristan: yes, nova live-migration is by default an admin-only operation. But forcing the port down can be done by a user as well; then it's just a matter of waiting for an admin to live-migrate the instance. We live-migrate instances regularly to empty a compute node and patch it.

Block migration would also be affected by the same behavior.

@Paul, well I tend to agree this is a vulnerability (class A according to the VMT taxonomy: http://security.openstack.org/vmt-process.html#incident-report-taxonomy ). However, since a malicious user can't control the conditions of exploitation, other VMT members may disagree with issuing an advisory for this issue.

I guess it boils down to how likely it is that a user can snoop sensitive traffic.

That's why I am trying to verify that the resize operation has the same behavior, because if that's the case, the user is very much in control!

Paul Peereboom (peereb) wrote :

Agreed, let's see if we can reproduce without admin credentials.

I used a Pike-based deployment as a testing platform and I can confirm that the resize operation exposes the same behavior as live/block migration, i.e. the port connected to the VM shows up without a local VLAN tag on the target host.

I still have to look at the extent of the damage that the user can inflict, but it looks like a malicious user can be in full control of the conditions of exploitation.

Thanks Armando,

so the next question is: can this be fixed across all the supported stable releases (e.g. Pike and Ocata)?
And is there something to fix in the os-vif and nova projects too?

Hi Tristan,

I wanted to look a bit deeper at what happens on the datapath, but I keep getting sidetracked. The report and analysis from Paul have been excellent, and I welcome that level of detail very much, but I want to see this through a bit more before formulating a recommendation.

Thanks to everyone involved!

OK, I finally had some uninterrupted time in which I could look into this more carefully. The TL;DR summary is: the behavior is exposed in live-migrate/block-migrate/resize operations. That said, I don't believe this vulnerability is serious, at least under default conditions. Here's why:

1) live/block migration is an admin operation: if the admin avoids turning a port down, the vulnerability is not exposed, and turning a port down is not strictly necessary. Furthermore, block migration is disruptive, in that users lose console and network connectivity to the instance.

2) resize is a user-allowed operation, but the operation leads to traffic disruption. Even though a user can explicitly turn a port DOWN and resize the instance without admin intervention, there's no way she can keep connectivity to the instance while the operation is in progress. Turning the port back UP will reinstate the local VLAN tag and restore connectivity. The loophole at that point is closed.

For these reasons, I would cautiously say that this vulnerability is not easily exploitable, but we do want to warn the admin of the potential loophole (OSSN B2?), and eventually address the neutron OVS agent codebase to ensure that a port is always tagged even when it's in ADMIN_DOWN after a migration/resize operation.

I'd recommend against changing os-vif to start a port on the dead vlan for two reasons:

a) neutron use cases go beyond just spinning up VMs. So if possible we should find a fix confined within neutron alone.

b) I feel that the os-vif fix as proposed is ineffective in closing any timing window, because while the port is untagged, it's practically unlikely that a user can get hold of the console remotely.

I am open to feedback on my recommendation, but in the meantime I would like to thank the reporter for the very detailed, well-thought-out and challenging report. I had fun cracking this one. Keep them coming!

Needless to say, open to further discussion.

@Tristan: what are the next steps? Is this OSSA then marked CONFIRMED? I would like to involve Ihar and Miguel in what comes next. Unless my rationale is flawed, I consider my triage complete and I'd like to push the ball into someone else's court.

Cheers,
Armando

Changed in neutron:
status: New → Triaged
Changed in nova:
status: New → Confirmed
Changed in neutron:
importance: Undecided → Low

Thank you Armando for this thorough investigation.

You mention console access as a requirement, but couldn't the user run a nohup task in the background to capture the traffic, and then retrieve the data later?

To answer your last question, if this is triaged as B2, then we would subscribe the ossg-coresec group so that an OSSN could be prepared and possibly sent to the downstream stakeholders through the embargo-notice mailing list before disclosing this bug. Though the live migration case may be bad enough (e.g. for operators going through that process often) that we may want to issue an advisory instead.

Please feel free to subscribe more neutron or nova developers to this bug, at least to check for unforeseen scenarios.

Gerhard Muntingh (gerhard-1) wrote :

Hi Armando, I'm also impressed by the thorough investigation. Especially the resize insight.

Please carefully read point two below. I think the issue is more serious because of it.

1) Turning the port down (as a user) is not necessary, but it extends the trunk-mode window from a couple of seconds to indefinite.

2) There's no need to put the port (eth0) back up if the attacker creates a secondary network interface (eth1) to access the instance. No console access is needed. This is the scenario described in the original report.

Thanks,
Gerhard.

I didn't mean that console access would be a requirement for the exploit, and indeed I overlooked the two-NIC scenario. Thanks for your feedback... I feared I was missing something ;)

That said, I have been wondering whether ensuring a tag on the port after resize/migration operations suffices to mitigate the exploitation.

Ihar identified the snippet of code [1] as potentially contentious, and I wonder whether that, along with location [2], would be the only two areas that need to be hardened.

Thoughts?
Thanks,
Armando

[1] https://github.com/openstack/neutron/blob/master/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py#L1426-L1436
[2] https://github.com/openstack/neutron/blob/newton-eol/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py#L995

Gerhard Muntingh (gerhard-1) wrote :

Hardening both locations doesn't fix the issue. It brings the time window back to a couple of seconds.

Out of curiosity, can't nova-compute skip the os-vif step, and have neutron configure the port entirely?

Looping will also work.

Also, don't forget the less serious issue #4.

I agree with Gerhard that hardening on the neutron side won't completely eliminate the issue in a nova/neutron setup. We should also have a fix on the os-vif side where we enforce the dead tag on newly created ports. AFAIU os-vif may not have been used before recent releases of Nova, so the fix may need to land in both os-vif and nova stable branches.

As for #4, I think it will not be an issue if we enforce the dead tag when putting a port down. (The problem is that currently we don't do it for unbound ports.)

If I try to capture all the things we may need to patch, I get the following:
- (neutron) port_dead to always set dead tag on new unbound ports;
- (os-vif/nova) enforce dead tag for new ovs ports.

The first patch fixes the issue on the neutron side for setups that are not using Nova / os-vif. In those setups, components calling Neutron should make sure newly created ports are dead; otherwise they are still exposed to a (short) vulnerable window. (Should we include this info in a release note / security report? Should we reach out to Neutron API consumers that may be affected?)

The second fix will completely close the short vulnerable window in neutron/nova setups discussed above.

(I actually feel that those two issues are independent and should be treated as two separate CVEs targeting different components.)

Then additional hardening patches could also be:
- (neutron) use drop rule for normal action on br-int; (may be invasive / breaking external code);
(and/or)
- (neutron) bump priority for drop flow rule set for dead ports.

The hardening may happen in public, since neither of those issues should expose anything as long as the os-vif/neutron patches described before are in place.

Does it make sense?

Is anyone from Nova team aware of the issue?

Ihar, I agree with your assessment, thanks for pushing this forward; I am a bit uneasy about the implications of adding a drop rule for normal actions on br-int for the reasons you've stated (maybe that breaks Tap-as-a-Service or SFC or other stuff?). We should also make sure that the change on the os-vif side will not break things like OVN/ODL or other drivers that rely on OVS, because I don't believe there's any local VLAN tagging involved for those. Have you considered these scenarios?

To answer Gerhard's question about os-vif, IIRC the rationale for having nova do the plugging is to enable scenarios where there's no network agent involved on the neutron side and/or to abstract the networking service from hypervisor internals.

Gerhard Muntingh (gerhard-1) wrote :

Thanks, that makes sense.

I suspect adding a drop rule for normal action on br-int will require quite some explicit flows to deal with all the other bridges (br-ex), but we'll see.

On neutron hardening: Bumping priority for drop flow rule set for dead ports will fix issue #4.

Miguel Lavalle (minsel) wrote :

Thanks to everybody who has helped in diagnosing this issue. It seems we have two sets of actions:

1) Fixing: that involves making port_dead always set the dead tag on new unbound ports in Neutron, and enforcing the dead tag for new ovs ports in nova / os-vif

2) Hardening of Neutron code

I suggest that as the very next step we concentrate on the fixing part. For this, the Neutron side seems to be uncontroversial. The only sticking point left is the potential breakage of other back-ends with the os-vif fix. Can we involve someone from OVN to see what the implications might be?

Jeremy Stanley (fungi) wrote :

Please subscribe anyone you think may be able to help identify a safe solution.

I subscribed Miguel Angel Ajo from the networking-ovn team to assess whether the proposed os-vif fix would break them.

(something we discussed with Armando yesterday)

If for some reason the suggested change to os-vif (marking all new ports with the dead tag) breaks other projects, we may roll it in as opt-in, where the consumer passes a flag through the binding profile dictionary that instructs nova to also mark a new port as dead; the flag would then be proxied into the os-vif binding code, which would enact the request. If the flag was not set or passed disabled, then nova would not request port death, and os-vif would do the same thing it does right now.

There are several compatibility issues to be concerned about here.

1. We effectively change the API payload for the binding extension; but because we already treat the profile dict as a large box of random crap to pass around between neutron and nova, unpatched nova should already be prepared to receive a dict with unknown keys.

2. It may be that os-vif is not patched while nova is, which would potentially lead to the binding call into os-vif failing because of the unknown flag passed by nova. I don't think we can bump the minimal supported os-vif version in requirements.txt, so nova will need to find a way to deal with older releases. If nothing else, we can always catch the TypeError: ... expected at least X arguments, got Y error and fall back to a flag-less call to os-vif (plus a warning); see the sketch after this comment.

This ^ would of course imply that instead of having one/two places to patch (os-vif + maybe neutron), we will have three (neutron -> nova -> os-vif).
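A minimal sketch of that fallback, assuming a hypothetical mark_dead keyword that no released os-vif actually accepts:

    import logging

    LOG = logging.getLogger(__name__)

    def plug_vif(plugin, vif, instance_info):
        """Plug a VIF, tolerating an os-vif release without the new flag."""
        try:
            # 'mark_dead' is the hypothetical opt-in flag discussed above;
            # an older os-vif rejects the unknown keyword with a TypeError.
            plugin.plug(vif, instance_info, mark_dead=True)
        except TypeError:
            LOG.warning("os-vif does not support mark_dead; plugging the "
                        "port without dead-VLAN initialization")
            plugin.plug(vif, instance_info)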

Miguel Angel Ajo (mangelajo) wrote :

Hi,

Thank you very much for looping me in,

I'm not sure whether networking-ovn, or other plugins that don't use tags + NORMAL rules in OVS, would work with the os-vif change (which, as far as I understood, creates the ports right away on the dead vlan).

Could we also loop in Numan Siddique to verify how OVN behaves in this case? Tomorrow is a bank holiday in Spain, and I will be physically away from my computer until late.

The question is whether the "vlan tag" will just be ignored given the OpenFlow rules in use, or what happens exactly.

Jeremy Stanley (fungi) wrote :

Miguel: my request from comment #26 definitely applies to you as well. Please subscribe anyone you think may be able to help identify a safe solution.

Miguel Angel Ajo (mangelajo) wrote :

Sorry, I didn't know that I had permissions to add more people. I did it with Numan, and I will verify now that I'm back from PTO.

Miguel Angel Ajo (mangelajo) wrote :

[root@primary ~]# ovs-vsctl show
a6b77a09-2647-47d7-8815-ef4d4f689ce8
    Bridge br-int
        fail_mode: secure
        Port "tap5a27427b-22"
            Interface "tap5a27427b-22"
        Port br-int
            Interface br-int
                type: internal
        Port "patch-br-int-to-provnet-d556080a-799f-4621-bb2d-d4ac9b8bb32e"
            Interface "patch-br-int-to-provnet-d556080a-799f-4621-bb2d-d4ac9b8bb32e"
                type: patch
                options: {peer="patch-provnet-d556080a-799f-4621-bb2d-d4ac9b8bb32e-to-br-int"}
        Port "tapa4e1ef4d-40"
            tag: 4095
            Interface "tapa4e1ef4d-40"
        Port "tap4b56611b-99"
            tag: 4095
            Interface "tap4b56611b-99"
        Port "tap473919fe-31"
            tag: 4095
            Interface "tap473919fe-31"
    Bridge br-ex
        Port br-ex
            Interface br-ex
                type: internal
        Port "patch-provnet-d556080a-799f-4621-bb2d-d4ac9b8bb32e-to-br-int"
            Interface "patch-provnet-d556080a-799f-4621-bb2d-d4ac9b8bb32e-to-br-int"
                type: patch
                options: {peer="patch-br-int-to-provnet-d556080a-799f-4621-bb2d-d4ac9b8bb32e"}
[root@primary ~]# ip netns exec ovnmeta-a4e1ef4d-47ce-4b92-8043-70d88237eff1 ssh cirros@10.0.0.4
cirros@10.0.0.4's password:

[root@primary ~]# ip netns exec ovnmeta-a4e1ef4d-47ce-4b92-8043-70d88237eff1 ssh cirros@10.0.0.6
The authenticity of host '10.0.0.6 (10.0.0.6)' can't be established.
RSA key fingerprint is SHA256:cDLkQEB0LfxZfIvpd084MucUa4uohUd0COf3ArPa1A0.
RSA key fingerprint is MD5:cd:4e:f7:f5:e1:bb:61:ea:a7:4d:46:f8:67:43:20:00.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '10.0.0.6' (RSA) to the list of known hosts.
cirros@10.0.0.6's password:

[root@primary ~]# ip netns exec ovnmeta-a4e1ef4d-47ce-4b92-8043-70d88237eff1 ssh cirros@10.0.0.8
The authenticity of host '10.0.0.8 (10.0.0.8)' can't be established.
RSA key fingerprint is SHA256:XUX8EfLF2oRJLqDChEEw3smHGeHm7zcQapdpayZcb0Y.
RSA key fingerprint is MD5:33:00:41:fb:96:25:a4:79:b5:2e:63:c8:03:8f:2e:be.
Are you sure you want to continue connecting (yes/no)? ^C
[root@primary ~]#
[root@primary ~]#

Sorry for the delay. I've verified that the solution would be fine for OVN (and I suspect also for other OpenFlow-based solutions which don't use the "NORMAL" rule).

Miguel Lavalle (minsel) wrote :

@Tocayo Miguel Ajo,

Thanks for the testing. Much appreciated

Based on this I think we can move ahead with the "fixing" part, which involves:

1) port_dead to always set dead tag on new unbound ports in Neutron
2) Enforce dead tag for new ovs ports in nova / os-vif

What are the mechanics of creating, reviewing and merging these two patches under the embargo conditions of this CVE? I am available to work on this.

I'll still loop in someone from ODL to make sure they are fine with the proposed approach.

Jeremy Stanley (fungi) wrote :

If the issue is considered widespread, easy enough to exploit and able to do a sufficient amount of damage or obtain sensitive information with some certainty, then we would want to keep this report privately embargoed while fixes are developed, attached to this bug and tested/reviewed locally by the bug's subscribers. Once working fixes for master are identified and backports are created and attached to this bug for all supported stable branches, we will get a CVE identifier assigned, propose a disclosure schedule and provide advance notice and copies of the patches privately to downstream stakeholders.

If the risk from this bug is of questionable severity, we should subscribe the ossg-coresec team to get their opinion on the possibility of switching to our (simpler) public workflow for this report instead.

As a reminder, our process for both of these options is described here: https://security.openstack.org/vmt-process.html

I tried to reach out to Matt (Nova PTL) via email, so far no luck. I think their involvement is crucial here.

This is the neutron piece of the puzzle to make the ovs agent mark all untagged ports as dead. I am considering adding a new functional or fullstack test for the scenario too, but I'm posting what I have for now to start review.

In nova, there is also some code in nova/network/linux_net.py (specifically, the LinuxOVSInterfaceDriver class) that has a plug() method that calls add-port and would benefit from dead-tag setting too, but I failed to find a place where this code is hooked into actual services, so I wonder if this code is a dead end that will be cleaned up in due course after the os-vif switch. I will leave it up to the nova team to decide.

Backporting for neutron seems simple: the patch applies cleanly to stable/ocata which is the oldest version upstream supports.

Since I also have an interest in earlier releases that may still be supported by Red Hat, I also checked that the patch applies, with some adjustments, as far back as Liberty. Earlier releases are probably not affected because they don't include https://review.openstack.org/#/q/I5ef9665770df3a9bbaf79049b219fadd73e20309 which made the neutron ovs agent skip tagging ports as dead if they don't have a tag in the first place.

Which makes me think that we should make sure the CVE fix doesn't break the assumptions that the patch linked above made. The concern there, as far as I understand, is that drop flows are left on the bridge even after the port is gone, which may hinder performance, etc.
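A sketch of the cleanup that concern implies, using a neutron OVSBridge-style call (method name assumed, not a tested patch): when the port disappears, its per-port flows, including the port_dead() drop flow, should go with it.

    def cleanup_dead_port(br_int, ofport):
        # Delete all flows matching the vanished port's OpenFlow port
        # number, including the drop flow added by port_dead(), so stale
        # entries don't accumulate on br-int.
        br_int.delete_flows(in_port=ofport)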

As for nova / os-vif backporting, someone from the nova team should assess that too. I believe that, because of the os-vif switch in recent cycles, there may be some work needed to adapt the change I posted to older branches.

Changed in neutron:
importance: Low → High
Matt Riedemann (mriedem) wrote :

Sorry for being late, I just read up on all of this.

Just to clarify from the original description:

> Issue #2 Order of creation.
>
> The instance is actually migrated before the (networking) configuration is completed.
>
> Recommendation: do not finish the live migration until the underlying network configuration has been applied completely.

We don't need to try something here on the nova side ^? And what exactly does "networking configuration is completed" mean? Does it mean waiting for the vif-plugged event from neutron when nova plugs the vifs on the destination compute before starting the live migration in the hypervisor? If so, that's something we've been talking about doing for a while anyway, and it is part of this spec for using the neutron port binding API during live migration to reduce network downtime:

https://specs.openstack.org/openstack/nova-specs/specs/queens/approved/neutron-new-port-binding-api.html

Waiting for vif-plugged events during live migration would be a backportable change, that entire spec would not.

The vif plugging dance is a bit more fuzzy to me wrt cold migrate / resize. I see that the libvirt driver doesn't wait for the vif-plugged event on the destination host:

https://github.com/openstack/nova/blob/bd3da5d763d2dcb0426e9acaa419e9e9574569ca/nova/virt/libvirt/driver.py#L8088-L8098

I haven't seen anyone really touch on this part of the issue since the original description though so if it's not critical to the fix here, ignore my questions.

--

Regarding the changes to os-vif, I'm definitely no expert in that code. I'll add Sean Mooney, Jay Pipes and Dan Smith to this review. I'm not sure what the backport impacts might be, but nova has been using os-vif for several releases now (since Newton: https://review.openstack.org/#/c/269672/ ).

As Ihar noted earlier, we probably can't bump the minimum required version of os-vif in global-requirements on stable branches, so we might have to make the nova code deal with a scenario where the change to os-vif doesn't exist (not sure if that would show up as a TypeError or AttributeError or what from os-vif; it would need to be tested).

sean mooney (sean-k-mooney) wrote :

reading through the recommendations i am not sure that using a drop rule and the null vlan (4095) is the correct approach.

referring to http://openvswitch.org/support/dist-docs/ovs-ofctl.8.txt, using

ovs-ofctl mod-port <bridge> <port> down|up

would appear to be more correct, as it disables the tx and rx queues on the interface.

a drop rule would only prevent the vm from transmitting packets; it would not prevent the vm from receiving traffic, as openflow does not allow you to match on output port.

my only concern with ovs-ofctl mod-port down is that odl/ovn and any other ml2 driver that uses vif_ovs would also need this change, to run ovs-ofctl mod-port up when they are finished.
granted, they would likely also need to be modified to fix the 4095 vlan and remove the drop rule too.

so to resolve Issue #1 (initially creating a trunk port), i would suggest that when we create the port we initially set it down and have the ovs agent set it up as part of wiring up the port.

Issue #2 (order of creation) should be resolved by the multiple port binding spec, i believe.

Issue #3 (not closing the port when it is down):

i think this can be resolved by updating neutron to use ovs-ofctl mod-port down|up to actually disable the port in ovs, instead of using the null vlan.

Issue #4 (putting the port administratively down actually puts the port on a compute-node shared vlan):
issue 4 would be resolved by moving to ovs-ofctl mod-port down|up.
however, if you are currently able to transmit/receive packets in a vm that is attached to vlan 4095, i think you have an ovs bug: vlan 4095 is reserved as the null/dead vlan and ovs should drop all packets to/from interfaces with a tag of 4095.

my main concern though is that solving this issue in os-vif in this way will not solve it for all use cases. when the neutron vif_type is ovs but hybrid_plug is set to false, os-vif is not used to plug the port. this configuration is used when kernel ovs is installed on the host and neutron is configured to use either the noop security group driver or the openvswitch conntrack driver.

when vif_type=ovs and hybrid_plug=false, libvirt is used to plug the vif into ovs directly.
libvirt does not set the state of the port or the mtu (that's a different bug). if we want to cover all edge cases we need to finally convert this codepath to use os-vif to handle the vif plugging when vif_type=ovs and hybrid_plug=false.
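A hedged sketch of the mod-port approach described above, shelling out to ovs-ofctl (note that Miguel Lavalle reports further down that a downed port still received traffic in his test, so this alone may not suffice):

    import subprocess

    def set_ovs_port_admin_state(bridge, port, up):
        # 'ovs-ofctl mod-port <bridge> <port> up|down' toggles the
        # interface's rx/tx queues, per ovs-ofctl(8), rather than merely
        # stopping transmitted packets the way a drop flow does.
        state = "up" if up else "down"
        subprocess.check_call(["ovs-ofctl", "mod-port", bridge, port, state])

    # e.g. set_ovs_port_admin_state("br-int", "qvoa848520b-08", up=False)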

Gerhard Muntingh (gerhard-1) wrote :

any news on this issue?

Jeremy Stanley (fungi) wrote :

Given this discussion has been going on for 5 months already, I've lost much hope of seeing consensus achieved in private across three different involved codebases plus coordinating agreeable backports with stable branch maintainers. I feel like switching to our public workflow and getting more eyes on it is going to be the only way forward.

Gerhard Muntingh (gerhard-1) wrote :

Switching to the public workflow sounds good, but let's make sure the workarounds are good enough and properly documented, so OpenStack admins get a fair chance of closing the vulnerability.

Miguel Angel Ajo (mangelajo) wrote :

Let's move to the public workflow.

There's also a bug in openvswitch that sometimes makes it "peg" the CPU when untagged ports exist on the system...

Jeremy Stanley (fungi) wrote :

Given a week with no objections from the (numerous) subscribers to this bug report, as well as the other bug linked by Miguel which sort of makes this issue public already, I'm going to switch it to public, triaging the vulnerability report as class B2 for now. If it's determined that sufficient fixes can be implemented in the stable branches of the services/libraries/drivers involved, we can revisit it as a possible class A vulnerability with an accompanying advisory.

description: updated
Changed in ossa:
status: Incomplete → Won't Fix
information type: Private Security → Public
tags: added: security
Miguel Lavalle (minsel) wrote :

@Jeremy,

I just went through the entire series of comments in this bug. I also looked at the bug that Miguel Ajo filed a few days ago (https://bugs.launchpad.net/neutron/+bug/1767422) and the associated information in https://bugzilla.redhat.com/show_bug.cgi?id=1558336. My comments / questions are the following:

1) The work that Miguel Ajo is doing as a consequence of https://bugs.launchpad.net/neutron/+bug/1767422 has to do with the agents and performance issues. Although it is related to what we have been discussing in this bug, it doesn't address it

2) Reading Sean's comment in #43 above carefully, I notice that he states the following: "if you are currently able to transmit/receive packets in a vm that is attached to vlan 4095, i think you have an ovs bug: vlan 4095 is reserved as the null/dead vlan and ovs should drop all packets to/from interfaces with a tag of 4095...". So he seems to have written his comments (specifically the recommendation about ovs-ofctl mod-port down|up) under the assumption that one of the problems to be solved is that VMs can receive traffic even when their interfaces have been put on the dead VLAN (4095). As far as I can tell, nobody has reported that here, which means that the patches proposed by Ihar in #37 and #38 are worth trying. Ihar also reports in #40 that the Neutron patch backports as-is to Ocata, and all the way to Liberty with some adjustments. So even though, according to Sean, there are use cases that might not be covered, the patches in #37 and #38 start us in the right direction of mitigating the problem. Once they are in place, we can start nibbling at os-vif to work on the use cases that, according to Sean, are not fixed. I'd be happy to take a stab at that.

3) Am I correct in my understanding that at this point we can submit the patches in #37 and #38 under the public / normal process in Gerrit? As soon as I get confirmation of this, I will submit the patches for review

Miguel Angel Ajo (mangelajo) wrote :

I agree with @Miguel's line of thought here, and in the related patch I'm working on I will add a table 0 rule to drop all VLAN 4095 traffic, although I could keep that as a separate patch if I see that the whole patch doesn't backport far

Since I posted some patches here, to everyone: feel free to take them over and post them to gerrit, with or without attribution. I won't have time to polish and post them for merge. (I am open to doing some reviews, though, if needed.)

Miguel Angel Ajo (mangelajo) wrote :

> I will add a table 0 rule to drop all VLAN 4095 traffic <-- this doesn't work for traffic handled by NORMAL, yikes (thanks Jakub for the pointer).

thanks @Ihar! ;)

Miguel Lavalle (minsel) wrote :

I tested:

sudo ovs-ofctl mod-port br-int tap903894b4-1f down

and it still receives traffic. So I think the route suggested by Miguel Ajo in #53 is the way to go

Gerhard Muntingh (gerhard-1) wrote :

Any progress on this (now public) security issue?

Fix proposed to branch: master
Review: https://review.openstack.org/594118

Changed in os-vif:
assignee: nobody → Slawek Kaplonski (slaweq)
status: New → In Progress

Change abandoned by Slawek Kaplonski (<email address hidden>) on branch: master
Review: https://review.openstack.org/594118
Reason: That isn't a good approach for sure. We will have to find something better.

Slawek Kaplonski (slaweq) wrote :

Yesterday I talked with Brian about this issue and we thought about a possible solution confined to Neutron.
Maybe a change to the openflows in br-int can fix this issue.
We can try something like the following (see the sketch after this comment):
1. Set the default action in br-int to DROP.
2. For each lvm id (the local vlan id used for each network on the host), set an OF rule with ACTION=normal to process such packets the same way all rules are processed now.

What do you think about such a solution?
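A rough sketch of that idea with neutron OVSBridge-style flow calls (method names assumed; not a tested patch):

    def install_default_drop(br_int, local_vlan_ids):
        # 1. Lowest-priority catch-all in table 0: drop anything that no
        #    other rule explicitly allows.
        br_int.add_flow(table=0, priority=0, actions="drop")
        # 2. Restore NORMAL switching only for the local VLANs actually
        #    provisioned on this host.
        for lvid in local_vlan_ids:
            br_int.add_flow(table=0, priority=1, dl_vlan=lvid,
                            actions="normal")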

Hi Slawek,

You kind of read my mind, although I was thinking of a somewhat broader solution... If you read the notes in the bug, Sean Mooney left a couple of extensive ones. In a sense, they complicated solving the issue, because he shone a light on the entire problem, i.e., Nova and os-vif. Why don't we have a conversation with him in Denver and try to put together a plan to fix the entire thing?

Best regards

Miguel


Fix proposed to branch: master
Review: https://review.openstack.org/602384

Changed in os-vif:
assignee: Slawek Kaplonski (slaweq) → sean mooney (sean-k-mooney)

Fix proposed to branch: master
Review: https://review.openstack.org/602432

Changed in nova:
assignee: nobody → sean mooney (sean-k-mooney)
status: Confirmed → In Progress

Change abandoned by sean mooney (<email address hidden>) on branch: master
Review: https://review.openstack.org/602432
Reason: this should not be needed

sean mooney (sean-k-mooney) wrote :

Note: a CVE was issued for this bug on 2018-07-27: http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2018-14636

as a quick status update on this bug:

first, i want to mention that a second variant of this issue has also been highlighted to me. i will be filing a separate private bug for that variant, as it will require changes outside of os-vif to address and will be more involved. the second variant is also not covered by the existing cve description, so i want to track it separately.

regarding the state of this bug, i have proposed https://review.openstack.org/602432 to os-vif master and a stable/rocky backport https://review.openstack.org/609850. i am in the process of rebasing that backport and expect both to be viable to merge later today.

There are some limitations to this fix, however:

- First, for it to work correctly it requires the multiple port binding support added to nova in Rocky. As such, on stable/queens and older branches the backport is only a partial mitigation, as nova will not wait for neutron to signal that it has finished wiring up the port before resuming the guest on the destination node.

- Second, if the neutron ovs ml2 agent crashes after the ml2 driver binds the port on the destination node but before the ml2 agent wires up the port added to ovs by os-vif, then we cannot detect this from nova/os-vif and the mitigation will not be effective. As the neutron control plane would be in an undefined state if its agents crashed/exited on the compute node, i feel it's fair to declare this limitation out of scope of this bug.

It may be possible to address one or both of these limitations as part of the second variant's mitigation; however, that will require more extensive modifications to nova, neutron and os-vif which are unlikely to be easily backportable, as they may require a minor backwards-compatible api extension.

Change abandoned by sean mooney (<email address hidden>) on branch: stable/queens
Review: https://review.openstack.org/609851
Reason: i'll backport a separate review, as master has changed significantly from this version.

Fix proposed to branch: master
Review: https://review.openstack.org/616609

Changed in neutron:
assignee: nobody → sean mooney (sean-k-mooney)
status: Triaged → In Progress