Bug #1771662 “[bionic] libvirtError: Node device not found: no n...” : Bugs : OpenStack Nova Compute Charm

Jason Hobbs (jason-hobbs) wrote on 2018-05-16:

#1

juju-crashdump-arm64.tar.gz (21.4 MiB, application/x-tar)

description:

updated

Jason Hobbs (jason-hobbs) wrote on 2018-05-16:

#2

Subscribed to field high as this is blocking bionic queens testing and is 100% reproducible.

Jason Hobbs (jason-hobbs) on 2018-05-16

description:

updated

Jason Hobbs (jason-hobbs) on 2018-05-17

description:

updated

Christian Ehrhardt (paelzer) wrote on 2018-05-17:

#3

What puzzles me is Xenial-Queens working and Bionic showing issues.
It seems like libvirt is unable to cope with this type of HW, but it works on one release and not the other ...
Yet versions are:
- xenial-queens
    libvirt 4.0.0-1ubuntu7~cloud0
    qemu 1:2.11+dfsg-1ubuntu7~cloud0
- bionic
    libvirt 4.0.0-1ubuntu8
    qemu 1:2.11+dfsg-1ubuntu7.1

These are the same except for a minor bump, which UCA will sync in a bit.
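
To make the "same except a minor bump" comparison concrete, here is a minimal Python sketch that splits the version strings quoted above into epoch, upstream version and packaging revision. The helper name `split_debian_version` is made up for illustration. It only splits the strings and does not implement full dpkg ordering rules.

```python
def split_debian_version(version: str) -> tuple[str, str, str]:
    """Split a Debian-style version into (epoch, upstream, revision)."""
    epoch, sep, rest = version.partition(":")
    if not sep:  # no epoch present, dpkg treats that as epoch 0
        epoch, rest = "0", version
    upstream, sep, revision = rest.rpartition("-")
    if not sep:  # no packaging revision at all
        upstream, revision = rest, ""
    return epoch, upstream, revision


# Version strings copied from the comment above.
pairs = {
    "libvirt": ("4.0.0-1ubuntu7~cloud0", "4.0.0-1ubuntu8"),
    "qemu": ("1:2.11+dfsg-1ubuntu7~cloud0", "1:2.11+dfsg-1ubuntu7.1"),
}
for name, (xenial_queens, bionic) in pairs.items():
    x = split_debian_version(xenial_queens)
    b = split_debian_version(bionic)
    # Same epoch and upstream version means only the packaging revision differs.
    print(name, "same upstream:", x[:2] == b[:2], "| revisions:", x[2], "vs", b[2])
```

For both packages the epoch and upstream parts match, so only the Ubuntu packaging revision differs.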

And jhobbs reports even the kernels are the same (Xenial with HWE).
So for now, ?!?

Jason Hobbs (jason-hobbs) on 2018-05-17

description:

updated

Ryan Beisner (1chb1n) on 2018-05-17

Changed in charm-nova-compute:
status:	New → Invalid

Ryan Beisner (1chb1n) wrote on 2018-05-17:

#4

We think this is an issue in libvirt, related to how it handles the SR-IOV hardware in these machines.

Andrew McLeod (admcleod) wrote on 2018-05-17:

#5

Further information: Using juju 2.4 beta2 I was able to deploy magpie on bionic in lxd and baremetal via MAAS.

Jason Hobbs (jason-hobbs) wrote on 2018-05-17:

#6

The deploy works fine with juju 2.4 beta 2 and xenial/queens.

package versions: http://paste.ubuntu.com/p/PF7Jb7gxnX/

we do see this in nova-compute.log, but it's not fatal:
http://paste.ubuntu.com/p/Dh4ZGVTtH8/

Jason Hobbs (jason-hobbs) wrote on 2018-05-17:

#7

This looks like it is specific to this hardware and the way it does VFs and PFs, so I'm removing field-high.

Jason Hobbs (jason-hobbs) wrote on 2018-05-17:

#8

Given that it works with the same libvirt and kernel on 16.04 but not on 18.04, I'm suspicious of netplan here.

Steve Langasek (vorlon) wrote on 2018-05-17:

#9

> I'm suspicious of netplan here.

netplan is only the messenger here, between cloud-init+juju and networkd. Can you show the complete netplan yaml as it's been laid down on the system in question?

Jason Hobbs (jason-hobbs) wrote on 2018-05-17:

#10

steve captured what I meant in #8 better than I did: 17:46 < slangasek> one could as accurately say "I'm suspicious this is related to us replacing the whole networking stack in Ubuntu" ;-)

Ryan Harper (raharper) wrote on 2018-05-17:

#11

Please capture:

1) cloud-init collect-logs (writes cloud-init.tar to $CWD)
2) the journal /var/log/journal
3) /etc/netplan and /run/systemd
4) /etc/udev/rules.d
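
Items 2 to 4 of the list above could be bundled with something like the following Python sketch. The function name `bundle_paths` and the archive name are hypothetical. Item 1 (`cloud-init collect-logs`) is run separately, and reading these directories will normally need root.

```python
import tarfile
from pathlib import Path

# Paths requested in the list above. /var/log/journal only exists when
# persistent journald storage is enabled, so missing paths are reported
# rather than treated as errors.
REQUESTED = ["/var/log/journal", "/etc/netplan", "/run/systemd", "/etc/udev/rules.d"]


def bundle_paths(paths, archive):
    """Add every existing path to a gzip tarball; return the paths that were missing."""
    missing = []
    with tarfile.open(archive, "w:gz") as tar:
        for p in map(Path, paths):
            if p.exists():
                # Store paths relative to / so the archive unpacks cleanly.
                tar.add(str(p), arcname=str(p).lstrip("/"))
            else:
                missing.append(str(p))
    return missing


# Example (as root): bundle_paths(REQUESTED, "bionic-debug.tar.gz")
```

The return value lists anything that was not present, which is worth mentioning when attaching the tarball.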

Ryan Harper (raharper) wrote on 2018-05-17:

#12

And for the xenial deployment version, can we get what's in /etc/network/interfaces* (including the .d)?

I'm generally curious w.r.t what interfaces are managed by the OS, and which ones are being delegated to the guests.

Ryan Harper (raharper) wrote on 2018-05-17:

#13

To make it clearer: the hardware SRIOV device is different than normal:

<cpaelzer> TL;DR this special device has VFs that have NO PF associated
<cpaelzer> software doesn't understand this

Though, per comment #3, it seems odd that Xenial/Queens with the same kernel (HWE) works OK. So some tracing in libvirt/nova is needed to confirm different code paths, I think.

Jason Hobbs (jason-hobbs) wrote on 2018-05-17:

#14

@rharper still working on getting the other stuff you've asked for, but here is the uname -a output from xenial vs bionic:
http://paste.ubuntu.com/p/rJDpK5SyW9/

Ryan Harper (raharper) wrote on 2018-05-17:

#15

Some package level deltas that may be relevant:

ii linux-firmware 1.173
ii linux-firmware 1.157.18

ii pciutils 1:3.3.1-1.1ubuntu1.2
ii pciutils 1:3.5.2-1ubuntu

libvirt0:arm64 4.0.0-1ubuntu7~cloud0
libvirt0:arm64 4.0.0-1ubuntu8

Less likely to have an impact, but nonetheless a delta, is the guest firmware:

qemu-efi-aarch64 0~20180205.c0d9813c-2
qemu-efi 0~20160408.ffea0a2c-2
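
A delta list like the one above can be produced mechanically from two `dpkg -l` listings. This is a minimal Python sketch with made-up helper names (`parse_dpkg_list`, `version_deltas`). It assumes the usual `ii <name> <version> ...` line layout and only reports packages present on both hosts, so a renamed package such as qemu-efi vs qemu-efi-aarch64 would need a separate check.

```python
def parse_dpkg_list(text):
    """Map package name -> version from `dpkg -l` style lines that start with 'ii'."""
    pkgs = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[0] == "ii":
            pkgs[fields[1]] = fields[2]
    return pkgs


def version_deltas(a, b):
    """Packages present in both mappings whose versions differ."""
    return {n: (a[n], b[n]) for n in sorted(a.keys() & b.keys()) if a[n] != b[n]}


# Two one-line listings using the linux-firmware versions quoted above.
first_host = parse_dpkg_list("ii linux-firmware 1.173")
second_host = parse_dpkg_list("ii linux-firmware 1.157.18")
print(version_deltas(first_host, second_host))
```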

Jason Hobbs (jason-hobbs) wrote on 2018-05-17:

#16

bionic-logs.tgz (905.0 KiB, application/x-tar)

@rharper here are the logs you asked for from the bionic deploy

Jason Hobbs (jason-hobbs) wrote on 2018-05-17:

#17

bionic-var-log-and-etc.tgz (2.0 MiB, application/x-tar)

all of /var/log and /etc from the bionic deploy.

Ryan Harper (raharper) wrote on 2018-05-17:

#18

Thanks for the logs.

I generally don't see anything *fatal* to libvirt. In the nova logs, I can see that virsh capabilities returns host information. It certainly is failing to find the VFs on the SRIOV device. It's not clear whether that's because the device is misbehaving (we can see kernel events indicating the driver is being reset: enP2p1s0f1 renamed to eth0, then eth0 renamed to enP2p1s0f1, which can only happen if the driver has been reset) or because probing the device's PCI address space is triggering a reset.

Note that netplan has no skin in this game: it applies a DHCP and DNS config to enP2p1s0f3, which stays up the whole time; juju even bridges en..f3 etc. The other interfaces found during boot are set to "manual" config; that is, netplan writes a .link file for setting the name, but note that the name is the predictable name it gets from the default udev policy anyhow.

At this point, we can compare the logs to Xenial, but I think the next step is back to the charms/nova-compute to determine how a node reports back to openstack that a compute node is ready.

Christian Ehrhardt (paelzer) wrote on 2018-05-18:

#19

I newly deployed a Cavium system with 18.04 to get my own view of this (without openstack/charms in the way).

1. start a basic guest
   $ sudo apt install uvtool-libvirt qemu-efi-aarch64
   $ uvt-simplestreams-libvirt --verbose sync --source http://cloud-images.ubuntu.com/daily arch=arm64 label=daily release=bionic
   $ uvt-kvm create --password=ubuntu b1 release=bionic arch=arm64 label=daily

=> Just works, nothing special in logs
Since it was stated that the special VF/PF are not used, this already breaks the argument made in the bug report - my guest just works on this system.

2. check the odd PF/VF situation

Please note that I had only the initial renames to the new naming scheme, but no others:
dmesg | grep renamed
[ 10.450002] thunder-nicvf 0002:01:00.2 enP2p1s0f2: renamed from eth1
[ 10.489989] thunder-nicvf 0002:01:00.1 enP2p1s0f1: renamed from eth0
[ 10.629936] thunder-nicvf 0002:01:00.4 enP2p1s0f4: renamed from eth3
[ 10.877936] thunder-nicvf 0002:01:00.3 enP2p1s0f3: renamed from eth2
[ 10.957933] thunder-nicvf 0002:01:00.5 enP2p1s0f5: renamed from eth4
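
To check for rename flapping of the kind discussed in comment #18, the rename lines can be grouped per PCI address. A minimal Python sketch follows. The regex and the function name `renames_by_device` are assumptions written against the line format shown above, not a general dmesg parser.

```python
import re
from collections import defaultdict

# Matches lines such as:
# [ 10.450002] thunder-nicvf 0002:01:00.2 enP2p1s0f2: renamed from eth1
RENAME = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+(?P<driver>\S+)\s+(?P<pci>[0-9a-f:.]+)\s+"
    r"(?P<new>\S+): renamed from (?P<old>\S+)"
)


def renames_by_device(dmesg_text):
    """Group (old, new) interface renames by PCI address, in log order."""
    seen = defaultdict(list)
    for line in dmesg_text.splitlines():
        m = RENAME.search(line)
        if m:
            seen[m["pci"]].append((m["old"], m["new"]))
    return dict(seen)


sample = """\
[ 10.450002] thunder-nicvf 0002:01:00.2 enP2p1s0f2: renamed from eth1
[ 10.489989] thunder-nicvf 0002:01:00.1 enP2p1s0f1: renamed from eth0
"""
for pci, events in renames_by_device(sample).items():
    # More than one rename for the same address hints at a driver reset.
    note = "possible driver reset" if len(events) > 1 else "initial rename only"
    print(pci, events, note)
```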

None of the devices has a phys_port_id, but that is not fatal.
On other platforms I found the same, e.g. on ppc64el some have it and some don't:
/sys/devices/pci0003:00/0003:00:00.0/0003:01:00.0/0003:02:09.0/0003:09:00.0/net/enP3p9s0f0/phys_port_id': Operation not supported
/sys/devices/pci0005:00/0005:00:00.0/0005:01:00.3/net/enP5p1s0f3/phys_port_id 0400000000334233343130363730453131

It will just use NULL, which essentially means there is just one phys port, and that is fine.

It is more interesting that it later checks physfn which exists on Cavium (but not on ppc64 for example)
ll /sys/devices/pci0002:00/0002:00:02.0/0002:01:01.4/physfn
lrwxrwxrwx 1 root root 0 May 18 06:23 /sys/devices/pci0002:00/0002:00:02.0/0002:01:01.4/physfn -> ../0002:01:00.0/

If this did NOT exist, it would give up here.
But it does exist, so it tries to go on with it and then fails as it doesn't find anything.
That would match what we read in the reported upstream mail discussion.

But none of this matters as per jhobbs it should not use those devices at all.

FYI code in libvirt around that:
virNetDevGetPhysicalFunction
-> virNetDevGetPhysPortID
   -> virNetDevSysfsFile
   This gives you something like
   /sys/devices/pci0002:00/0002:00:02.0/0002:01:00.4/net/enP2p1s0f4/phys_port_id
-> virNetDevSysfsDeviceFile
-> virPCIGetNetName
If none of these functions failed BUT returned no path then the reported message appears.
On other HW it either works OR just doesn't find the paths and gives up before the error message.
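
The decision described above can be mimicked against sysfs with a short Python sketch. This is not libvirt's C code. The function names are made up, and it assumes that "doesn't find anything" means the PF directory behind physfn has no net/ entry.

```python
import os


def read_phys_port_id(netdev_dir):
    """Return the phys_port_id contents, or None if missing or unsupported."""
    try:
        with open(os.path.join(netdev_dir, "phys_port_id")) as fh:
            return fh.read().strip() or None
    except OSError:  # e.g. "Operation not supported" on some drivers
        return None


def classify_vf(pci_dev_dir):
    """Classify a PCI device directory by what its physfn link leads to."""
    physfn = os.path.join(pci_dev_dir, "physfn")
    if not os.path.exists(physfn):
        return "no physfn: lookup gives up quietly"
    net_dir = os.path.join(physfn, "net")
    names = sorted(os.listdir(net_dir)) if os.path.isdir(net_dir) else []
    if not names:
        # Assumed to correspond to the "no path returned" case described above.
        return "physfn present but PF has no netdev: error case"
    return "PF netdev: " + names[0]


# Example on a real host: classify_vf("/sys/bus/pci/devices/0002:01:01.4")
```

Running `classify_vf` over every entry in /sys/bus/pci/devices would show which VFs fall into the error case on a given machine.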

3. check libvirt capabilities and status
As I asked before, we would need to know the libvirt action that fails, as all I tried just works.

Also general probing like one would expect on an initial nova node setup:
  $ virsh capabilities
  $ virsh domcapabilities
  $ virsh sysinfo
  $ virsh nodeinfo
works just fine without the reported errors.
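
The four probes above can be run in one go and checked for errors with a sketch like this. The helper names `run_probes` and `failures` are hypothetical. It relies on virsh exiting non-zero and printing an "error:" line on stderr when a command fails.

```python
import subprocess

# The virsh subcommands listed above.
PROBES = ["capabilities", "domcapabilities", "sysinfo", "nodeinfo"]


def run_probes(probes=PROBES, run=subprocess.run):
    """Run each virsh subcommand; return {subcommand: (returncode, stderr)}."""
    results = {}
    for sub in probes:
        proc = run(["virsh", sub], capture_output=True, text=True)
        results[sub] = (proc.returncode, proc.stderr.strip())
    return results


def failures(results):
    """Subcommands that exited non-zero or printed an 'error:' line on stderr."""
    return {s: r for s, r in results.items() if r[0] != 0 or "error:" in r[1]}


# Example on a host with libvirt installed: print(failures(run_probes()))
```

An empty result from `failures` matches the "works just fine" observation. The `run` parameter exists only so the helper can be exercised without virsh installed.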

4. Let's even use those devices now
The host uses enP2p1s0f1, that is:
0002:01:00.1 Ethernet controller: Cavium, Inc. THUNDERX Network Interface Controller virtual function (rev 09)
So let's use its siblings
As passthrough host-interface
0002...
