libvirt doesn't show all interfaces

Bug #1764314 reported by Nicolas Jungers on 2018-04-16
This bug affects 2 people
Affects            Status                     Importance  Assigned to  Milestone
libvirt (Ubuntu)   Status tracked in Cosmic
  Bionic                                      Undecided   Unassigned
  Cosmic                                      Medium      Unassigned
netcf (Debian)     New                        Unknown
netcf (Ubuntu)     Status tracked in Cosmic
  Bionic                                      Undecided   Unassigned
  Cosmic                                      Medium      Unassigned

Bug Description

virt-manager and virsh iface-list --all don't show all interfaces available in the system.
Specifically, I have two bridges defined (br0 and br1) and only br0 shows up in the selectable menu or in the listing. On 16.04, all bridges show up.

On the 18.04 box, ip link gives:

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: enp14s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP mode DEFAULT group default qlen 1000
    link/ether 52:54:00:e7:95:47 brd ff:ff:ff:ff:ff:ff
3: enp11s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master br0 state UP mode DEFAULT group default qlen 1000
    link/ether 70:85:c2:42:e9:2a brd ff:ff:ff:ff:ff:ff
4: enp14s0f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel master br1 state UP mode DEFAULT group default qlen 1000
    link/ether 68:05:ca:05:c1:5b brd ff:ff:ff:ff:ff:ff
5: wlp9s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master br0 state UP mode DEFAULT group default qlen 1000
    link/ether 10:f0:05:8b:44:49 brd ff:ff:ff:ff:ff:ff
6: br1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 22:22:76:64:28:f9 brd ff:ff:ff:ff:ff:ff
7: br0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether ee:5d:78:0f:f2:e5 brd ff:ff:ff:ff:ff:ff
8: vnet0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel master br0 state UNKNOWN mode DEFAULT group default qlen 1000
    link/ether fe:54:00:e6:0a:dc brd ff:ff:ff:ff:ff:ff
9: macvtap0@enp14s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP mode DEFAULT group default qlen 500
    link/ether 52:54:00:e7:95:47 brd ff:ff:ff:ff:ff:ff

On the same box, virsh iface-list --all gives:

 Name State MAC Address
---------------------------------------------------
 br0 active ee:5d:78:0f:f2:e5
 lo active 00:00:00:00:00:00

On a different box running 16.04, I get:
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master br0 state UP mode DEFAULT group default qlen 1000
    link/ether 00:25:22:9f:28:6e brd ff:ff:ff:ff:ff:ff
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master br1 state UP mode DEFAULT group default qlen 1000
    link/ether 00:1b:21:8d:87:9c brd ff:ff:ff:ff:ff:ff
4: eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master br2 state UP mode DEFAULT group default qlen 1000
    link/ether 00:1b:21:8d:87:9d brd ff:ff:ff:ff:ff:ff
5: br0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 00:25:22:9f:28:6e brd ff:ff:ff:ff:ff:ff
6: br1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 00:1b:21:8d:87:9c brd ff:ff:ff:ff:ff:ff
7: br2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 00:1b:21:8d:87:9d brd ff:ff:ff:ff:ff:ff
11: vnet0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master br0 state UNKNOWN mode DEFAULT group default qlen 1000
    link/ether fe:54:00:2e:e6:7a brd ff:ff:ff:ff:ff:ff
12: vnet1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master br1 state UNKNOWN mode DEFAULT group default qlen 1000
    link/ether fe:54:00:48:7c:d2 brd ff:ff:ff:ff:ff:ff
13: vnet2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master br2 state UNKNOWN mode DEFAULT group default qlen 1000
    link/ether fe:54:00:02:38:08 brd ff:ff:ff:ff:ff:ff

 Name State MAC Address
---------------------------------------------------
 br0 active 00:25:22:9f:28:6e
 br1 active 00:1b:21:8d:87:9c
 br2 active 00:1b:21:8d:87:9d
 lo active 00:00:00:00:00:00

On the practical side, on the 18.04 box, entering br1 manually as the interface for a VM definition works (only tested in virt-manager).

ProblemType: Bug
DistroRelease: Ubuntu 18.04
Package: libvirt-bin 4.0.0-1ubuntu7
Uname: Linux 4.16.0-041600rc7-generic x86_64
ApportVersion: 2.20.9-0ubuntu4
Architecture: amd64
Date: Mon Apr 16 09:42:46 2018
InstallationDate: Installed on 2018-03-27 (19 days ago)
InstallationMedia: Ubuntu-Server 18.04 LTS "Bionic Beaver" - Alpha amd64 (20180311)
SourcePackage: libvirt
UpgradeStatus: No upgrade log present (probably fresh install)

Andreas Hasenack (ahasenack) wrote :

For what it's worth, I get nothing listed on my 18.04 laptop, where I use libvirt extensively:

$ virsh iface-list --all
 Name State MAC Address
---------------------------------------------------

$

Changed in libvirt (Ubuntu):
status: New → Confirmed

It is sometimes surprising how many extra features libvirt has. I haven't touched this one in all the time I've been working with it; it seems to be a feature from 2009 :-)

First of all, yes, I also can't see anything on 18.04 boxes (not that I'd have got a lot on 16.04, only "lo" actually).

None of the common issues apply:
- no AppArmor denial
- no error in a full-debug libvirtd log

Internally this goes through virshInterfaceListCollect -> virConnectListAllInterfaces -> conn->interfaceDriver->connectListAllInterfaces

That call to the libvirt daemon succeeds but returns:
(gdb) p *(virshInterfaceListPtr)ifaces
$15 = {ifaces = 0x55555580f370, nifaces = 0}

So that is an empty list from the daemon; let's check what happens over there.
The backend uses netcf through ncf_list_interfaces and gets a count of zero.

The implementation of this is in netcf.
It does a few checks and then goes to drv_list_interfaces, which for us is implemented in drv_debian.c.
This then calls list_interface_ids with arguments to fill the list (one can set arg 2/3 to 0 to just get a number).

This then uses list_interfaces to generate a list via uniq_device_names and checks against some filters.

But uniq_device_names already gets an argument with "how many" interfaces are to be expected and that is zero.

This number comes from:
  aug_fmt_match(ncf, &devs, "%s/iface", network_interfaces_path);

This implementation is based on ENI as network_interfaces_path is essentially
  static const char *const network_interfaces_path = "/files/etc/network/interfaces"

So I think we found a case of ENI -> (pure) networkd transition causing some issues.
That also explains why on my 16.04 I don't see all interfaces, I have not all (none but lo) configured via ENI in /etc/network/interfaces.
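To illustrate why the count ends up at zero, here is a minimal sketch in plain C that mimics the effect of matching iface stanzas in /etc/network/interfaces. It uses neither Augeas nor netcf: count_eni_ifaces is a hypothetical helper, and the parsing is deliberately simplified (no "source" directives, no line continuations), so treat it as an approximation of the idea rather than the driver's actual logic.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of netcf: count the "iface <name> ..."
 * stanzas in ENI-style text. Simplified on purpose: comment lines never
 * match, but "source" directives and line continuations are ignored. */
int count_eni_ifaces(const char *text)
{
    int count = 0;
    const char *line = text;

    assert(text != NULL);

    while (*line != '\0') {
        const char *p = line;

        /* Skip leading blanks on this line. */
        while (*p == ' ' || *p == '\t')
            p++;

        /* A stanza starts with the keyword "iface" followed by whitespace. */
        if (strncmp(p, "iface", 5) == 0 && (p[5] == ' ' || p[5] == '\t'))
            count++;

        /* Advance to the start of the next line, if there is one. */
        line = strchr(line, '\n');
        if (line == NULL)
            break;
        line++;
    }
    return count;
}
```

Fed a file that only defines lo, such a helper returns 1; fed a file without any iface stanza (as on a system configured purely through networkd), it returns 0, no matter how many links ip link shows.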

This is not so much a libvirt issue as a netcf issue.
IMHO this was all forgotten in the netplan change :-/ and needs implementation.
Maybe it needs more than that, since netcf seems to have just been hanging around for quite a while.

A testcase for now looks like this:
$ apt source netcf
$ sudo apt build-dep netcf
$ cd netcf-0.2.8
$ ./configure --with-driver=debian
$ make -j12
$ make check
$ cd tests
$ make test-debian
$ make check

That actually should have tested the function we found breaking - odd.
Instead one can for now use a very simplified netcfConnectListInterfacesImpl:

$ sudo apt install libnetcf1

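$ cat > test.c << EOF
#include <netcf.h>
#include <stdio.h>

int main(void)
{
   int count = 0;
   /* Ask for both active and inactive interfaces. */
   int status = NETCF_IFACE_ACTIVE | NETCF_IFACE_INACTIVE;
   struct netcf *netcf = NULL;

   /* A NULL root selects the running system's configuration. */
   if (ncf_init(&netcf, NULL) != 0) {
       printf("Init failed\n");
       return 1;
   }

   /* Number of interfaces netcf knows about for the given status flags. */
   count = ncf_num_of_interfaces(netcf, status);
   printf("Count is %d\n", count);
   ncf_close(netcf);
   return 0;
}
EOF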

$ gcc -Wall -o test test.c -lnetcf; ./test

That should return a number matching your /etc/network/interfaces on 16.04, but zero on 18.04, where that configuration is missing.

The only good thing for now: it seems only libvirt uses netcf, and while iface-list and the like fail, they are not strictly required (as we can see, it only now shows up for virt-manager centric usage).
Nevertheless, a big issue :-/

There is a udev backend, maybe we should consider switching instead of im...


Changed in libvirt (Ubuntu):
status: Confirmed → Triaged
importance: Undecided → High
Changed in netcf (Ubuntu):
status: New → Triaged
importance: Undecided → High
Changed in libvirt (Ubuntu):
importance: High → Medium
status: Triaged → Confirmed

This didn't fully work even before netplan; in fact, that is a separate matter.
On my partially E/N/I, partially NetworkManager system, only the E/N/I-configured interfaces are accessible to netcf.
So from libvirt's POV this functionality has been reduced for quite a while.

Meanwhile, netcf itself can be considered really broken atm.

That is why I set the severity differently (low/high) for the two affected packages.

Changed in libvirt (Ubuntu):
importance: Medium → Low

Hi,
after sleeping on this to sort my thoughts, I revisited most of the code this morning.

## Usage and state of ncf ##

libvirt is currently its only user, so I checked that usage via:
 $ git log src/interface/
There were only structural changes (how to allocate, make interface accessible, global renames), but no new features "through netcf" or such post 2015 which matches when major activity there stopped.

Once again, I checked former versions: this has been more or less broken for a long time, since NetworkManager-configured devices as well as networkd-configured devices are not shown.

The other drivers only check for their old-style configuration as well: ifcfg- files (suse) and scripts in /etc/sysconfig/network-scripts/ (redhat). None of them handles any other configuration scheme (e.g. through wicked) either.

I come back to thinking that those iface-* actions in libvirt are not important for many use cases, given that this has not broken for anyone before :-/

I was also trying other interface API calls of libvirt through virsh, they are all affected in a similar way (e.g. ifup not found, ...).

## Alternatives for Libvirt ##

There is also a udev-based interface backend in libvirt, meant for distributions/releases with no netcf support. In some way, unless we implement support for the newer configuration schemes in netcf, that is exactly what we are now.
Switching to it would need to be tested as well and would effectively make the interface API read-only.
It supports:
    .connectNumOfInterfaces = udevConnectNumOfInterfaces, /* 1.0.0 */
    .connectListInterfaces = udevConnectListInterfaces, /* 1.0.0 */
    .connectNumOfDefinedInterfaces = udevConnectNumOfDefinedInterfaces, /* 1.0.0 */
    .connectListDefinedInterfaces = udevConnectListDefinedInterfaces, /* 1.0.0 */
    .connectListAllInterfaces = udevConnectListAllInterfaces, /* 1.0.0 */
    .interfaceLookupByName = udevInterfaceLookupByName, /* 1.0.0 */
    .interfaceLookupByMACString = udevInterfaceLookupByMACString, /* 1.0.0 */
    .interfaceIsActive = udevInterfaceIsActive, /* 1.0.0 */
    .interfaceGetXMLDesc = udevInterfaceGetXMLDesc, /* 1.0.0 */
But drops transactions and:
     .interfaceDefineXML = netcfInterfaceDefineXML, /* 0.7.0 */
     .interfaceUndefine = netcfInterfaceUndefine, /* 0.7.0 */
     .interfaceCreate = netcfInterfaceCreate, /* 0.7.0 */
     .interfaceDestroy = netcfInterfaceDestroy, /* 0.7.0 */

So, if tests confirm it is OK, one option is to switch libvirt to the udev backend and remove netcf completely, accepting the reduced functionality of no write support (still better than broken write support).
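As a rough illustration of what "read-only" means for such a backend, here is a small self-contained C sketch of a driver table whose write entry is simply left unset. All names here (struct iface_driver, readonly_driver, iface_define, ERR_NO_SUPPORT) are hypothetical and much simpler than libvirt's real structures; the sketch only assumes that a missing entry is turned into a "not supported" error by the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define ERR_NO_SUPPORT (-2) /* hypothetical error code for this sketch */

/* Illustrative stand-in for a driver table; the real libvirt struct
 * has many more members and different signatures. */
struct iface_driver {
    int (*list_all)(void);              /* read path  */
    int (*define_xml)(const char *xml); /* write path */
};

/* Pretend the backend found three interfaces. */
static int demo_list_all(void) { return 3; }

/* Read-only backend: the write entry is left unset (NULL). */
const struct iface_driver readonly_driver = {
    .list_all = demo_list_all,
    .define_xml = NULL,
};

/* Dispatch a write request; refuse cleanly if the backend lacks it. */
int iface_define(const struct iface_driver *drv, const char *xml)
{
    assert(drv != NULL);
    if (drv->define_xml == NULL) {
        fprintf(stderr,
                "this function is not supported by the connection driver\n");
        return ERR_NO_SUPPORT;
    }
    return drv->define_xml(xml);
}

/* Dispatch a read request the same way. */
int iface_list_all(const struct iface_driver *drv)
{
    assert(drv != NULL);
    if (drv->list_all == NULL)
        return ERR_NO_SUPPORT;
    return drv->list_all();
}
```

With this shape, listing keeps working while every define/undefine/create/destroy style call fails early with a clear message instead of half-applying a change.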

## netcf usage and implementation ##

Of the netcf API the currently used set is:
ncf_change_begin
ncf_change_commit
ncf_change_rollback
ncf_close
ncf_define
ncf_error
ncf_if_down
ncf_if_free
ncf_if_mac_string
ncf_if_name
ncf_if_status
ncf_if_undefine
ncf_if_up
ncf_if_xml_desc
ncf_if_xml_state
ncf_init
ncf_list_interfaces
ncf_lookup_by_mac_string
ncf_lookup_by_name
ncf_num_o...


Note: easiest test without compiling is likely
$ apt install netcf
$ netcftool list
...

Related to all of that is bug 1688345.
I contacted cyphermox who filed and worked on that to get his opinion.

Networks are configured through net-* commands (and respective APIs) and they all work.
This is what is commonly used by libvirt users and solutions depending on libvirt.

The scope of iface-* commands is to manage the Host network interfaces.
This is not really what we'd want libvirt to do.
You can use networkd/NetworkManager directly, and if you want a great one-for-all solution use https://netplan.io/ - but we don't really want libvirt "for that task".

I see that the virt-manager tab "Connection Details" -> "Network Interfaces" would become a read-only pane then. But considering how long it had partial content and set up things not 100% as they should be set up maybe it is even a fix to stop it from doing so.

I'll investigate the feasibility of the udev backend and if we can set that as the default.
With some luck we can keep netcf around for those who want to opt into the former behavior, but that has to be checked - and actually I'd vote for getting rid of it if that finds approval.

@Nicolas - did you have a specific use case in mind that fully relies on those things? In our discussion we can't find one. Or did you just find it by checking out virt-manager in 18.04, without relying on it to be able to write interface configurations?

Nicolas Jungers (unbug) wrote :

I tried to select an interface in virt-manager and the desired interface didn't show in the pop-up menu. But it's true that the name can be entered in the free-form dialog. But I think that for virt-manager it makes sense to offer a complete view of the network _or_ a statement informing of the shortcoming.

Thanks for your reply Nicolas, so even for you a working udev backend would be preferred (full view of the network).

We are evaluating if udev works well for the read-only part and I discussed with upstream how to enable/disable as well as on the approach in general.

Nicolas Jungers (unbug) wrote :

For me the main problem is the fact that the dialog in virt-manager is misleading.


Thanks Cyphermox for co-testing this with me.
Here is an example of his better interface overview now:

virsh # iface-list
 Name State MAC Address
---------------------------------------------------
 enp3s0 active c8:60:00:6d:8c:07
 lxdbr0 active fe:34:73:b4:77:ab
 maas active c8:60:00:6d:8c:07
 staging active c8:60:00:6d:8c:07

Vlans are not always listed, but that still is much better than before.

FYI: there is a test build in https://launchpad.net/~ci-train-ppa-service/+archive/ubuntu/3251

I did my own tests on this as well. On an upgrade from Xenial, all that most users have in their E/N/I is just "lo", so that is what they see (but nobody wants/needs lo - it is just there because it is listed in E/N/I).

$ virsh iface-list
 Name State MAC Address
---------------------------------------------------
 lo active 00:00:00:00:00:00

After the update to the proposed version I got much better output.
$ virsh iface-list
 Name State MAC Address
---------------------------------------------------
 enp0s25 active 54:ee:75:61:c1:97
 lxdbr0 active fe:24:e7:8e:21:8d
 wlp4s0 active 94:65:9c:0e:35:12

Btw those are just the active interfaces; you can still list the inactive ones as well (e.g. virt-manager shows these greyed out).

$ virsh iface-list --all
 Name State MAC Address
---------------------------------------------------
 conjureup0 inactive 12:ab:dc:f1:41:bd
 enp0s25 active 54:ee:75:61:c1:97
 lo inactive 00:00:00:00:00:00
 lxdbr0 active fe:24:e7:8e:21:8d
 strswanbr1 inactive 52:54:00:fc:52:4d
 strswanbr2 inactive 52:54:00:14:9e:81
 virbr0 inactive 52:54:00:f4:ea:12
 wlp4s0 active 94:65:9c:0e:35:12
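The difference between the plain listing and --all can be pictured as a status bitmask filter, similar in spirit to the NETCF_IFACE_ACTIVE | NETCF_IFACE_INACTIVE flags used in the test program earlier in this report. The following C sketch is only an illustration: DEMO_ACTIVE, DEMO_INACTIVE, struct demo_iface and count_matching are made-up names, not libvirt or netcf API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flags for illustration; they mirror the idea of an
 * active/inactive status mask but are not the real library values. */
enum { DEMO_ACTIVE = 1 << 0, DEMO_INACTIVE = 1 << 1 };

struct demo_iface {
    const char *name;
    int active; /* non-zero if the interface is currently active */
};

/* Count the entries whose state is selected by the flag mask. */
size_t count_matching(const struct demo_iface *ifs, size_t n, int flags)
{
    size_t hits = 0;

    assert(ifs != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        /* Map each interface to exactly one status bit. */
        int bit = ifs[i].active ? DEMO_ACTIVE : DEMO_INACTIVE;
        if (flags & bit)
            hits++;
    }
    return hits;
}
```

Passing only the active flag corresponds to the default listing, while passing both flags corresponds to --all.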

I even see slave devices on bridges in virt-manager now.

Further functions of the iface- API space are good as well, like name<->mac translation.
$ virsh iface-name 54:ee:75:61:c1:97
enp0s25
$ virsh iface-mac enp0s25
54:ee:75:61:c1:97

Even XML generation works if you want to use the snippet for other defines (e.g. guest forwards).
$ virsh iface-dumpxml enp0s25
<interface type='ethernet' name='enp0s25'>
  <mtu size='1500'/>
  <link speed='1000' state='up'/>
  <mac address='54:ee:75:61:c1:97'/>
</interface>

Only the define/destroy/edit actions are now (intentionally) blocked.
On the console the refusal message is fine, just like in virt-manager.
$ virsh iface-edit enp0s25
error: this function is not supported by the connection driver: virInterfaceDefineXML
Failed. Try again? [y,n,f,?]:

From a "virt-manager" and similar tools perspective making changes now looks reasonably guarded.
If you try to do so you get "Error setting ... this function is not supported by the connection driver: virInterfaceDefineXML"

I also wanted to check a few more corner cases, so I did:
- s390x system (for having odd device types) with some defined in E/N/I (was a Xenial upgrade)
- a lxd container for udev/containers can be od...


Robie Basak (racb) wrote :

> This implementation is based on ENI as network_interfaces_path is essentially
> static const char *const network_interfaces_path = "/files/etc/network/interfaces"

Does it look at /etc/network/interfaces.d/ at all?

From the perspective of stable releases, my understanding is that:

1) The only report we know about is that virt-manager misses out interfaces that are not directly defined in /etc/network/interfaces from its dropdown selection box, but does allow the user to specify further interface names manually.

2) There may be other users out there who are using /etc/network/interfaces and the libvirt network interface API and do not experience any problems (because it works fine when using only /etc/network/interfaces).

3) There may be other users out there who are using both case 2 and have other interfaces defined in other places that libvirt misses (/etc/network/interfaces.d, interfaces defined elsewhere and any dynamic interfaces not specified in /etc anywhere) but are using the libvirt network interface API without a problem because the missing interfaces happen to not affect their cases. These users might face a regression if these previously missing additional interfaces suddenly appear in an SRU.

Is my understanding accurate?

If it is, then I'm tempted to suggest that leaving the situation as-is for stable releases is appropriate since use case 1 above is far less serious than a potential regression in use case 3.

For the development release, using udev read-only sounds appropriate because then the API would reflect reality directly rather than configuration items that may or may not be active. It sounds like the API was written in a world where there was only one way (per distro and release) to configure network interfaces. Now that there are many, it doesn't seem practical to me for this kind of thing to work read-write by default.

Hi rbasak,
almost right - just missing a slight twist.
Yes to all you said until "leaving the situation as-is for stable releases is appropriate since use case 1 above is far less serious than a potential regression in use case 3."

I think we all ack selecting the udev backend on >=18.10 and being good (which means netcf gets demoted btw).

But for the SRU "leave as is" is not the right option IMHO.
Even if I fully buy into the reasons you have given for why we can't take away the netcf backend from an SRU perspective, netcf will then at least need a dependency on ifupdown, because it calls ifup/ifdown and relies on them to work - if we can't take it away we have to make it work at least.

My personal suggestion to this would therefore be:
Cosmic: switch libvirt to udev
Cosmic: add the ifupdown dependency to netcf, but also demote it
Bionic: add ifupdown dependency to netcf

That will remove the issue for >=18.10 the right way and, on the other hand, not violate SRU policy while fixing netcf where it is still used.

But I see from IRC discussions that we need to talk, I'll invite to share arguments and eventually decide on this, to then start fixing it (we have all the options now, just need to agree how to proceed).

Note: all verification tests with the udev backend are good btw.

We had a discussion on this:
I have to beg your pardon on behalf of the Ubuntu community, but we never intended to support write-managing host interfaces through libvirt in Bionic. Libvirt iface-* never supported networkd or NetworkManager. To manage networks with different backends there is netplan.io.

Adding a dependency to ifupdown from netcf could hurt new installs of Bionic by pulling in ifupdown again - there are known issues and races around that that we want to avoid.
Upgraders have ifupdown around from their past anyway and new users “insisting” on the old use case can still install ifupdown on their own.

It is bad to not have visibility of all devices in virsh iface-list, virt-manager and such, but this has been the case since at least Xenial and it was never reported as an issue.

Compared to the alternatives - we didn’t come up with a way to fix it yet which would not carry a regression risk (or even regression fact) that would be too high. On balance it is not reasonable to be fixed at the moment. If that importance goes up what might come to my mind is a config option in libvirt where users can switch to udev - but that feature isn’t implemented in libvirt at all - so it would be a major effort not matching the severity of this IMHO.

Going forward, for Cosmic and beyond, we will switch to the udev backend to fix the visibility.

For all of the above I’ll have to set Bionic to Won’t Fix unless there is a change of severity due to, for example, more important use-cases we haven’t seen.

/me feels bad as Nicolas made a great and valid bug report, but for now this is the right choice

Changed in libvirt (Ubuntu Cosmic):
status: Confirmed → Triaged
Changed in libvirt (Ubuntu Bionic):
status: New → Won't Fix
Changed in netcf (Ubuntu Bionic):
status: New → Won't Fix
Changed in libvirt (Ubuntu Cosmic):
importance: Low → Medium
Changed in netcf (Ubuntu Cosmic):
importance: High → Medium

Filed a bug in Debian for the netcf Dependency going forward, linked at the bug tasks.

no longer affects: libvirt (Debian)
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package libvirt - 4.0.0-1ubuntu10

---------------
libvirt (4.0.0-1ubuntu10) cosmic; urgency=medium

  * Fix nwfilters that set CTRL_IP_LEARNING set to dhcp failing with "An error
    occurred, but the cause is unknown" due to a buffer being too small
    for pcap with TPACKET_V3 enabled (LP: #1758037)
    - debian/patches/ubuntu/lp-1758037-nwfilter-increase-pcap-buffer-size.patch

 -- Christian Ehrhardt <email address hidden> Wed, 09 May 2018 17:07:59 +0200

Changed in libvirt (Ubuntu Cosmic):
status: Triaged → Fix Released
Changed in netcf (Debian):
status: Unknown → New

There already was an upload ongoing, but I need to group bug 1758037 on the same run and need to set new branches. ... Integrated that now.
Regression- and Case-Tested once more from a ppa and being good.
Uploaded to Cosmic - and it already completed.
Also pushed to ubuntu libvirt-maintainers git.
