pci: Can't get DHCP address after resize with port type 'direct-physical'

Bug #1617429 reported by Ludovic Beliveau
This bug affects 1 person
Affects: OpenStack Compute (nova)
Status: Confirmed
Importance: Low
Assigned to: Unassigned
Milestone: (none)

Bug Description

Description:
============

Booting a guest with a neutron port of type 'direct-physical' will cause nova to allocate a PCI passthrough device for the port. The MAC address of the PCI passthrough device in the guest is not a virtual MAC address (fa:16:...) but the MAC address of the physical device since the full device is allocated to the guest (compared to SR-IOV where a virtual MAC address is arbitrarily chosen for the port).

When the guest is resized (to another flavor), nova allocates a new PCI device for it. After the resize, the guest is bound to a different PCI device, which has a different MAC address. However, the MAC address on the neutron port is not updated, so DHCP fails: the DHCP server only has a host entry for the old MAC address and does not recognize the guest's new one.

The same issue can be observed when migrating a guest to another host.
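
The failure mode described above can be illustrated with a small, self-contained model. This is only a sketch and is not nova or neutron code: `PciDevice`, `Port`, `rebind` and `dhcp_recognizes` are hypothetical names, the PCI addresses are placeholders, and the `update_port_mac=True` path is just one possible way a fix could behave. The MAC addresses are the ones from the reproduction steps in this report.

```python
from dataclasses import dataclass


@dataclass
class PciDevice:
    address: str  # host PCI address (placeholder values in the demo)
    mac: str      # MAC address of the physical NIC


@dataclass
class Port:
    mac_address: str  # MAC recorded on the neutron port


def rebind(port: Port, new_device: PciDevice, update_port_mac: bool) -> str:
    """Bind the guest to new_device and return the MAC the guest will present.

    update_port_mac=False models the behaviour in this report;
    update_port_mac=True models one possible fix (an assumption).
    """
    if update_port_mac:
        port.mac_address = new_device.mac
    return new_device.mac


def dhcp_recognizes(port: Port, guest_mac: str) -> bool:
    """DHCP can only answer if the port MAC matches the guest's MAC."""
    return port.mac_address.lower() == guest_mac.lower()


if __name__ == "__main__":
    before = PciDevice("0000:00:00.0", "90:e2:ba:48:27:ed")  # placeholder PCI address
    after = PciDevice("0000:00:00.1", "00:1e:67:51:36:71")   # placeholder PCI address

    port = Port(mac_address=before.mac)

    # Resize without touching the port: the guest's MAC is now unknown to DHCP.
    guest_mac = rebind(port, after, update_port_mac=False)
    print(dhcp_recognizes(port, guest_mac))  # False

    # Resize that also refreshes the port MAC: DHCP can match the guest again.
    guest_mac = rebind(port, after, update_port_mac=True)
    print(dhcp_recognizes(port, guest_mac))  # True
```

With a virtual MAC (as for SR-IOV 'direct' ports) the guest's MAC would not depend on which device is allocated, which is why the mismatch only shows up for 'direct-physical' ports.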

Steps to reproduce:
===================

1- Configure a compute node with two NICs set up as PCI passthrough devices

2- Create a neutron port of type 'direct-physical':
PORTID=`neutron port-create $NETID --binding:vnic_type direct-physical | grep "\ id\ " | awk '{ print $4 }'`

3- Boot a guest with the port-id:
nova boot guest --image=ubuntu --nic port-id=$PORTID --flavor=m1.medium

4- Note the MAC address of the neutron port:

[centos@IronPass-2 devstack]$ neutron port-show 657990ce-3446-40a2-bc33-8040a32cb72b
+-----------------------+-----------------------------------------------------------------------------------------------------------+
| Field | Value |
+-----------------------+-----------------------------------------------------------------------------------------------------------+
| admin_state_up | True |
| allowed_address_pairs | |
| binding:vnic_type | direct-physical |
| created_at | 2016-08-26T14:57:07 |
| description | |
| device_id | f1005ec2-1875-4345-af01-58afaee7e68d |
| device_owner | compute:None |
| extra_dhcp_opts | |
| fixed_ips | {"subnet_id": "8e6aaf8e-78e8-42d7-9b59-f96789728abc", "ip_address": "10.0.0.12"} |
| | {"subnet_id": "4210fe1c-b156-4ecb-a896-922c87f85c3f", "ip_address": "fdab:2588:f4:0:f816:3eff:fe37:acba"} |
| id | 657990ce-3446-40a2-bc33-8040a32cb72b |
| mac_address | 90:e2:ba:48:27:ed |
| name | sriov_port |
| network_id | dfdff739-a27b-4160-9e40-c4824e8a351d |
| port_security_enabled | True |
| revision | 57 |
| security_groups | 8c7b8173-f050-480d-968a-33e0b75332fc |
| status | ACTIVE |
| tenant_id | a11cb2aece1943c2a86ee0a55e1bd8f7 |
| updated_at | 2016-08-26T17:33:12 |
+-----------------------+-----------------------------------------------------------------------------------------------------------+

5- Log in to the guest and get a DHCP address for the interface.

ip link add link eth1 name eth1.451 type vlan id 451
ip link set eth1.451 up
dhclient eth1.451

Note the MAC address of the interface:

4: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
    link/ether 90:e2:ba:48:27:ed brd ff:ff:ff:ff:ff:ff
    inet6 fe80::92e2:baff:fe48:27ed/64 scope link
       valid_lft forever preferred_lft forever
5: eth1.451@eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
    link/ether 90:e2:ba:48:27:ed brd ff:ff:ff:ff:ff:ff
    inet 10.0.0.12/24 brd 10.0.0.255 scope global eth1.451
       valid_lft forever preferred_lft forever
    inet6 fdab:2588:f4:0:92e2:baff:fe48:27ed/64 scope global dynamic
       valid_lft 86396sec preferred_lft 14396sec
    inet6 fe80::92e2:baff:fe48:27ed/64 scope link
       valid_lft forever preferred_lft forever

6- Resize the guest

7- Note the MAC address of the neutron port (after resize):

[centos@IronPass-2 devstack]$ neutron port-show 657990ce-3446-40a2-bc33-8040a32cb72b
+-----------------------+-----------------------------------------------------------------------------------------------------------+
| Field | Value |
+-----------------------+-----------------------------------------------------------------------------------------------------------+
| admin_state_up | True |
| allowed_address_pairs | |
| binding:vnic_type | direct-physical |
| created_at | 2016-08-26T14:57:07 |
| description | |
| device_id | f1005ec2-1875-4345-af01-58afaee7e68d |
| device_owner | compute:None |
| extra_dhcp_opts | |
| fixed_ips | {"subnet_id": "8e6aaf8e-78e8-42d7-9b59-f96789728abc", "ip_address": "10.0.0.12"} |
| | {"subnet_id": "4210fe1c-b156-4ecb-a896-922c87f85c3f", "ip_address": "fdab:2588:f4:0:f816:3eff:fe37:acba"} |
| id | 657990ce-3446-40a2-bc33-8040a32cb72b |
| mac_address | 90:e2:ba:48:27:ed |
| name | sriov_port |
| network_id | dfdff739-a27b-4160-9e40-c4824e8a351d |
| port_security_enabled | True |
| revision | 59 |
| security_groups | 8c7b8173-f050-480d-968a-33e0b75332fc |
| status | ACTIVE |
| tenant_id | a11cb2aece1943c2a86ee0a55e1bd8f7 |
| updated_at | 2016-08-26T17:38:30 |
+-----------------------+-----------------------------------------------------------------------------------------------------------+

Notice that it still hasn't changed: 90:e2:ba:48:27:ed

8- Log in to the guest and get a DHCP address for the interface.

ip link add link eth1 name eth1.451 type vlan id 451
ip link set eth1.451 up
dhclient eth1.451

Note that the MAC address of the interface has changed (the interface is also now named eth2):

4: eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
    link/ether 00:1e:67:51:36:71 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::21e:67ff:fe51:3671/64 scope link
       valid_lft forever preferred_lft forever
5: eth2.451@eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
    link/ether 00:1e:67:51:36:71 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::21e:67ff:fe51:3671/64 scope link
       valid_lft forever preferred_lft forever

Also note that the DHCP client wasn't successful in getting a DHCP address.

9- From the dnsmasq host file, we can also see that it hasn't been updated:

cat /opt/stack/data/neutron/dhcp/dfdff739-a27b-4160-9e40-c4824e8a351d/host
fa:16:3e:e5:e0:c9,host-10-0-0-2.openstacklocal,10.0.0.2
90:e2:ba:48:27:ed,host-10-0-0-12.openstacklocal,10.0.0.12
fa:16:3e:99:c8:b9,host-10-0-0-5.openstacklocal,10.0.0.5
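
The comparison made by hand in steps 5, 8 and 9 can be sketched as a small script. This is a minimal sketch, not part of nova or neutron: the function names are made up for illustration, and the parser only handles the three-field `mac,hostname,ip` lines shown above (real dnsmasq host files may contain other forms). It takes the text of `ip addr` output from the guest and the text of the dnsmasq host file, and reports guest MAC addresses that have no host entry.

```python
import re

# Matches the MAC on a "link/ether <mac> brd ..." line of `ip addr` output.
LINK_ETHER = re.compile(r"link/ether\s+([0-9a-fA-F]{2}(?::[0-9a-fA-F]{2}){5})")


def parse_dnsmasq_hosts(text):
    """Map lower-cased MAC -> (hostname, ip) for 'mac,hostname,ip' lines."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(",")
        if len(parts) != 3:
            continue  # only the three-field form from step 9 is handled
        mac, hostname, ip = parts
        entries[mac.lower()] = (hostname, ip)
    return entries


def guest_macs(ip_output):
    """Return the set of MACs found on 'link/ether' lines."""
    return {mac.lower() for mac in LINK_ETHER.findall(ip_output)}


def unknown_macs(ip_output, hosts_text):
    """Return guest MACs that have no entry in the dnsmasq host file."""
    known = parse_dnsmasq_hosts(hosts_text)
    return sorted(guest_macs(ip_output) - set(known))


if __name__ == "__main__":
    hosts = (
        "fa:16:3e:e5:e0:c9,host-10-0-0-2.openstacklocal,10.0.0.2\n"
        "90:e2:ba:48:27:ed,host-10-0-0-12.openstacklocal,10.0.0.12\n"
        "fa:16:3e:99:c8:b9,host-10-0-0-5.openstacklocal,10.0.0.5\n"
    )
    before_resize = "    link/ether 90:e2:ba:48:27:ed brd ff:ff:ff:ff:ff:ff\n"
    after_resize = "    link/ether 00:1e:67:51:36:71 brd ff:ff:ff:ff:ff:ff\n"

    print(unknown_macs(before_resize, hosts))  # []
    print(unknown_macs(after_resize, hosts))   # ['00:1e:67:51:36:71']
```

An empty list means every guest MAC has a DHCP host entry; after the resize, the new MAC is reported as unknown, which matches the failed dhclient run in step 8.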

Environment:
============

Latest master

commit 7500bef94f526a82392400415f07f744700324a9
Merge: 8d5aff7 39fb302
Author: Jenkins <email address hidden>
Date: Fri Aug 26 05:03:48 2016 +0000

    Merge "Revert "Optional separate database for placement API""

Tags: pci
Ludovic Beliveau (ludovic-beliveau) wrote :

Fix proposed to branch: master
Review: https://review.openstack.org/361438

Matt Riedemann (mriedem)
tags: added: pci
Changed in nova:
status: New → In Progress
assignee: nobody → Ludovic Beliveau (ludovic-beliveau)
importance: Undecided → Medium
OpenStack Infra (hudson-openstack) wrote : Change abandoned on nova (master)

Change abandoned by Matt Riedemann (<email address hidden>) on branch: master
Review: https://review.opendev.org/361438
Reason: I haven't seen any responses from anyone to work on this in over a month. I would think starlingx and/or mellanox devs would care about this, but I'm going to abandon. If someone wants me to restore it please just ask.

Matt Riedemann (mriedem)
Changed in nova:
assignee: Ludovic Beliveau (ludovic-beliveau) → nobody
status: In Progress → Confirmed
Matt Riedemann (mriedem) wrote :

(10:04:56 AM) adrianc_: mriedem: regarding https://review.opendev.org/#/c/361468, can be abandoned IMO.
(10:05:25 AM) mriedem: adrianc_: i'm assuming you mean the nova change right?
(10:05:28 AM) mriedem: https://review.opendev.org/#/c/361438
(10:06:05 AM) adrianc_: yea
(10:06:29 AM) mriedem: adrianc_: ok any particular reason why? is it no longer valid?
(10:06:40 AM) mriedem: b/c the bug is still open
(10:08:05 AM) adrianc_: mriedem: well, i would expect SR-IOV to be used and not a direct physical port.
(10:09:33 AM) adrianc_: mriedem: i.e direct port and not direct-physical. while the bug is still valid, im not too sure its really stepping on too many toes. (at least not form mellanox side :) )

Changed in nova:
importance: Medium → Low