Comment 26 for bug 1771662

Jason Hobbs (jason-hobbs) wrote : Re: [Bug 1771662] Re: libvirtError: Node device not found: no node device with matching name

cd /sys/bus/pci/devices && grep -nr . *

xenial:
http://paste.ubuntu.com/p/F5qyvN2Qrr/
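
For anyone who wants the same kind of dump in a form that is easier to diff between the xenial and bionic hosts, here is a rough Python approximation of the grep above. It is only a sketch: the names `read_lines` and `dump_tree` are made up for illustration, it does not reproduce grep's exact output, and it silently skips files that cannot be read (which is common under sysfs).

```python
import os


def read_lines(path):
    """Return the lines of a file, or an empty list if it cannot be read."""
    try:
        with open(path, "r", errors="replace") as handle:
            return handle.read().splitlines()
    except OSError:
        # Many sysfs attributes are write-only or fail on read; skip them.
        return []


def dump_tree(root):
    """Return 'relative/path:lineno:text' for every non-empty line under root."""
    results = []
    for entry in sorted(os.listdir(root)):
        top = os.path.join(root, entry)
        if os.path.isdir(top):
            # Top-level entries may be symlinks (they are under
            # /sys/bus/pci/devices); os.walk enters them but does not follow
            # directory symlinks found deeper in the tree.
            files = []
            for dirpath, dirnames, filenames in os.walk(top):
                dirnames.sort()
                files.extend(os.path.join(dirpath, name)
                             for name in sorted(filenames))
        else:
            files = [top]
        for path in files:
            for lineno, text in enumerate(read_lines(path), start=1):
                if text:
                    rel = os.path.relpath(path, root)
                    results.append("%s:%d:%s" % (rel, lineno, text))
    return results
```

Printing each item of `dump_tree("/sys/bus/pci/devices")` on both hosts and diffing the two outputs should show which attributes differ.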

On Tue, May 22, 2018 at 5:27 PM, Jason Hobbs <email address hidden> wrote:
> Do you really want a tar? How about ls -alR? xenial:
>
> http://paste.ubuntu.com/p/wyQ3kTsyBB/
>
> On Tue, May 22, 2018 at 5:14 PM, Jason Hobbs <email address hidden> wrote:
>> OK; it looks like 4.15.0-22-generic was only just released and wasn't what I
>> used in the first reproduction, so I doubt that's it.
>>
>> On Tue, May 22, 2018 at 4:58 PM, Ryan Harper <email address hidden> wrote:
>>> Comparing the kernel logs, on Xenial, the second nic comes up:
>>>
>>> May 22 15:00:27 aurorus kernel: [ 24.840500] IPv6:
>>> ADDRCONF(NETDEV_UP): enP2p1s0f2: link is not ready
>>> May 22 15:00:27 aurorus kernel: [ 25.472391] thunder-nicvf
>>> 0002:01:00.2 enP2p1s0f2: Link is Up 10000 Mbps Full duplex
>>>
>>> But on bionic, we only ever have f3 up. Note this isn't a network
>>> configuration, but rather the state of the NIC and the switch.
>>> It doesn't appear to matter; 0f3 is what gets bridged by juju anyhow.
>>> But it does suggest that something is different.
>>>
>>> There is a slight kernel version variance as well:
>>>
>>> Xenial:
>>> May 22 15:00:27 aurorus kernel: [ 0.000000] Linux version
>>> 4.15.0-22-generic (buildd@bos02-arm64-038) (gcc version 5.4.0 20160609
>>> (Ubuntu/Lin
>>>
>>> Bionic:
>>> May 17 18:03:47 aurorus kernel: [ 0.000000] Linux version
>>> 4.15.0-20-generic (buildd@bos02-arm64-029) (gcc version 7.3.0
>>> (Ubuntu/Linaro 7.3.
>>>
>>> Looks like Xenial does not use unified cgroup namespaces; not sure
>>> what effect this may have on what's running in those lxd juju
>>> containers.
>>>
>>> % grep DENIED *.log
>>> bionic.log:May 17 18:19:33 aurorus kernel: [ 983.592228] audit:
>>> type=1400 audit(1526581173.043:70): apparmor="DENIED"
>>> operation="mount" info="failed flags match" error=-13
>>> profile="lxd-juju-657fe9-1-lxd-1_</var/lib/lxd>"
>>> name="/sys/fs/cgroup/unified/" pid=24143 comm="systemd"
>>> fstype="cgroup2" srcname="cgroup" flags="rw, nosuid, nodev, noexec"
>>> bionic.log:May 17 18:19:33 aurorus kernel: [ 983.592476] audit:
>>> type=1400 audit(1526581173.043:71): apparmor="DENIED"
>>> operation="mount" info="failed flags match" error=-13
>>> profile="lxd-juju-657fe9-1-lxd-1_</var/lib/lxd>"
>>> name="/sys/fs/cgroup/unified/" pid=24143 comm="systemd"
>>> fstype="cgroup2" srcname="cgroup" flags="rw, nosuid, nodev, noexec"
>>> bionic.log:May 17 18:19:41 aurorus kernel: [ 991.818402] audit:
>>> type=1400 audit(1526581181.267:88): apparmor="DENIED"
>>> operation="mount" info="failed flags match" error=-13
>>> profile="lxd-juju-657fe9-1-lxd-1_</var/lib/lxd>"
>>> name="/run/systemd/unit-root/var/lib/lxcfs/" pid=24757
>>> comm="(networkd)" flags="ro, nosuid, nodev, remount, bind"
>>> bionic.log:May 17 18:19:46 aurorus kernel: [ 997.271203] audit:
>>> type=1400 audit(1526581186.719:90): apparmor="DENIED"
>>> operation="mount" info="failed flags match" error=-13
>>> profile="lxd-juju-657fe9-1-lxd-2_</var/lib/lxd>"
>>> name="/sys/fs/cgroup/unified/" pid=25227 comm="systemd"
>>> fstype="cgroup2" srcname="cgroup" flags="rw, nosuid, nodev, noexec"
>>> bionic.log:May 17 18:19:46 aurorus kernel: [ 997.271425] audit:
>>> type=1400 audit(1526581186.723:91): apparmor="DENIED"
>>> operation="mount" info="failed flags match" error=-13
>>> profile="lxd-juju-657fe9-1-lxd-2_</var/lib/lxd>"
>>> name="/sys/fs/cgroup/unified/" pid=25227 comm="systemd"
>>> fstype="cgroup2" srcname="cgroup" flags="rw, nosuid, nodev, noexec"
>>> bionic.log:May 17 18:19:55 aurorus kernel: [ 1006.285863] audit:
>>> type=1400 audit(1526581195.735:108): apparmor="DENIED"
>>> operation="mount" info="failed flags match" error=-13
>>> profile="lxd-juju-657fe9-1-lxd-2_</var/lib/lxd>"
>>> name="/run/systemd/unit-root/" pid=26209 comm="(networkd)" flags="ro,
>>> remount, bind"
>>> bionic.log:May 17 18:20:12 aurorus kernel: [ 1022.760512] audit:
>>> type=1400 audit(1526581212.211:110): apparmor="DENIED"
>>> operation="mount" info="failed flags match" error=-13
>>> profile="lxd-juju-657fe9-1-lxd-0_</var/lib/lxd>"
>>> name="/sys/fs/cgroup/unified/" pid=28344 comm="systemd"
>>> fstype="cgroup2" srcname="cgroup" flags="rw, nosuid, nodev, noexec"
>>> bionic.log:May 17 18:20:12 aurorus kernel: [ 1022.760713] audit:
>>> type=1400 audit(1526581212.211:111): apparmor="DENIED"
>>> operation="mount" info="failed flags match" error=-13
>>> profile="lxd-juju-657fe9-1-lxd-0_</var/lib/lxd>"
>>> name="/sys/fs/cgroup/unified/" pid=28344 comm="systemd"
>>> fstype="cgroup2" srcname="cgroup" flags="rw, nosuid, nodev, noexec"
>>> bionic.log:May 17 18:20:20 aurorus kernel: [ 1031.256448] audit:
>>> type=1400 audit(1526581220.707:128): apparmor="DENIED"
>>> operation="mount" info="failed flags match" error=-13
>>> profile="lxd-juju-657fe9-1-lxd-0_</var/lib/lxd>"
>>> name="/run/systemd/unit-root/" pid=29205 comm="(networkd)" flags="ro,
>>> remount, bind"
>>> bionic.log:May 17 18:30:03 aurorus kernel: [ 1613.787782] audit:
>>> type=1400 audit(1526581803.277:151): apparmor="DENIED"
>>> operation="mount" info="failed flags match" error=-13
>>> profile="lxd-juju-657fe9-1-lxd-0_</var/lib/lxd>" name="/bin/"
>>> pid=91926 comm="(arter.sh)" flags="ro, remount, bind"
>>> bionic.log:May 17 18:30:03 aurorus kernel: [ 1613.832621] audit:
>>> type=1400 audit(1526581803.321:152): apparmor="DENIED"
>>> operation="mount" info="failed flags match" error=-13
>>> profile="lxd-juju-657fe9-1-lxd-0_</var/lib/lxd>" name="/bin/"
>>> pid=91949 comm="(y-helper)" flags="ro, remount, bind"
>>>
>>>
>>> xenial.log:May 22 15:15:10 aurorus kernel: [ 918.311740] audit:
>>> type=1400 audit(1527002110.131:109): apparmor="DENIED"
>>> operation="file_mmap"
>>> namespace="root//lxd-juju-878ab5-1-lxd-1_<var-lib-lxd>"
>>> profile="/usr/lib/lxd/lxd-bridge-proxy"
>>> name="/usr/lib/lxd/lxd-bridge-proxy" pid=40973 comm="lxd-bridge-prox"
>>> requested_mask="m" denied_mask="m" fsuid=100000 ouid=100000
>>> xenial.log:May 22 15:15:11 aurorus kernel: [ 919.605481] audit:
>>> type=1400 audit(1527002111.427:115): apparmor="DENIED"
>>> operation="file_mmap"
>>> namespace="root//lxd-juju-878ab5-1-lxd-2_<var-lib-lxd>"
>>> profile="/usr/lib/lxd/lxd-bridge-proxy"
>>> name="/usr/lib/lxd/lxd-bridge-proxy" pid=41233 comm="lxd-bridge-prox"
>>> requested_mask="m" denied_mask="m" fsuid=100000 ouid=100000
>>>
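The denials above are easier to compare once they are grouped. The following is a minimal sketch that tallies AppArmor denial lines by operation and target name, assuming each audit record sits on one line (the lines quoted above were wrapped by the mail client). The function names `parse_denial` and `summarize` are hypothetical, and only quoted `key="value"` fields are extracted; unquoted ones such as `pid` and `error` are ignored.

```python
import re
from collections import Counter

# Matches quoted audit fields such as operation="mount" or name="/bin/".
FIELD = re.compile(r'(\w+)="([^"]*)"')


def parse_denial(line):
    """Return the quoted fields of one audit line, or None if not a denial."""
    if 'apparmor="DENIED"' not in line:
        return None
    return dict(FIELD.findall(line))


def summarize(lines):
    """Count denials per (operation, name) pair."""
    counts = Counter()
    for line in lines:
        fields = parse_denial(line)
        if fields:
            counts[(fields.get("operation"), fields.get("name"))] += 1
    return counts
```

Feeding the lines of each log file to `summarize` gives one count per (operation, name) pair, which makes the bionic "mount" denials and the xenial "file_mmap" denials stand out as separate groups.
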
>>> Looking at the nova.pci.utils code, the different errors seem to be
>>> related to sysfs entries:
>>>
>>> https://git.openstack.org/cgit/openstack/nova/tree/nova/pci/utils.py?id=e919720e08fae5c07cecda00ac2d51b0a09f533e#n196
>>>
>>> If the sysfs path exists, then we go "further down" the hole and get
>>> an error like in bionic, but if the sysfs path does not exist, then we
>>> get the exception we see in Xenial.
>>>
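To illustrate the kind of branch being described, here is a small sketch. It is not nova's code: `netdev_names` is a hypothetical helper, and it only assumes the usual sysfs layout in which a PCI network device exposes its interface names under `/sys/bus/pci/devices/<address>/net/`. Which exact path nova checks at the linked line is not restated here.

```python
import os


def netdev_names(pci_addr, sysfs_root="/sys"):
    """Return the interface names sysfs lists for a PCI address.

    Returns None when the device has no 'net' directory at all (the
    "path does not exist" case), and a sorted, possibly empty, list
    when the directory is present (the "go further down" case).
    """
    net_dir = os.path.join(sysfs_root, "bus", "pci", "devices",
                           pci_addr, "net")
    if not os.path.isdir(net_dir):
        return None
    return sorted(os.listdir(net_dir))
```

Running something like `netdev_names("0002:01:00.2")` on both hosts would show whether the two kernels expose the same sysfs entries for the same function, which is what the /sys dump is meant to confirm.
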
>>> Can we get a tar of /sys for both to see if this confirms the
>>> suspicion that we're taking different paths due to differing kernels?
>>>
>>>
>>> On Tue, May 22, 2018 at 3:27 PM, Jason Hobbs <email address hidden> wrote:
>>>> marked new on nova-compute-charm due to rharper's comment #18, and new
>>>> on libvirt because I've posted all the requested logs now.
>>>>
>>>> --
>>>> You received this bug notification because you are subscribed to the bug
>>>> report.
>>>> https://bugs.launchpad.net/bugs/1771662
>>>>
>>>> Title:
>>>> libvirtError: Node device not found: no node device with matching name
>>>>
>>>> To manage notifications about this bug go to:
>>>> https://bugs.launchpad.net/charm-nova-compute/+bug/1771662/+subscriptions
>>>
>>> Status in OpenStack nova-compute charm:
>>> New
>>> Status in libvirt package in Ubuntu:
>>> New
>>>
>>> Bug description:
>>> After deploying openstack on arm64 using bionic and queens, no
>>> hypervisors show up. On my compute nodes, I have an error like:
>>>
>>> 2018-05-16 19:23:08.165 282170 ERROR nova.compute.manager
>>> libvirtError: Node device not found: no node device with matching name
>>> 'net_enP2p1s0f1_40_8d_5c_ba_b8_d2'
>>>
>>> In my /var/log/nova/nova-compute.log
>>>
>>> I'm not sure why this is happening - I don't use enP2p1s0f1 for
>>> anything.
>>>
>>> There are a lot of interesting messages about that interface in syslog:
>>> http://paste.ubuntu.com/p/8WT8NqCbCf/
>>>
>>> Here is my bundle: http://paste.ubuntu.com/p/fWWs6r8Nr5/
>>>
>>> The same bundle works fine for xenial-queens, with the source changed
>>> to the cloud-archive, and using stable charms rather than -next. I hit
>>> this same issue on bionic queens using either stable or next charms.
>>>
>>> This thread has some related info, I think:
>>> https://www.spinics.net/linux/fedora/libvir/msg160975.html
>>>
>>> This is with juju 2.4 beta 2.
>>>
>>> Package versions on affected system:
>>> http://paste.ubuntu.com/p/yfQH3KJzng/