[SRU] Cannot create instance with multiqueue image and vif_type=tap (calico)
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack Compute (nova) | | Medium | Rodrigo Barbieri |
Stein | | Undecided | Unassigned |
Train | | Undecided | Unassigned |
Ussuri | | Undecided | Unassigned |
Ubuntu Cloud Archive | | Undecided | Unassigned |
Stein | | Undecided | Unassigned |
Train | | Undecided | Unassigned |
Ussuri | | Undecided | Unassigned |
nova (Ubuntu) | | Undecided | Unassigned |
Focal | | Undecided | Unassigned |
Bug Description
When using calico, the vif_type is tap, therefore when the instance is being created, the method plug_tap() is invoked, which creates the tap device prior to launching the instance.
That tap device is currently always created without multiqueue as per [1]. When libvirt creates the instance, the XML definition "queues=<x>" clashes with the fact that the pre-existing tap interface doesn't have multiqueue enabled, and therefore errors out with the exception below. The code at [2] already handles multiqueue, but it is never invoked with multiqueue=True.
Alternatively, as a current workaround, if the instance is shut down through virsh or rebooted through nova, the tap device is removed and then recreated by libvirt instead. This allows the tap device to be set up with multiqueue, provided the instance XML has been manually edited. This raises the question of why plug_tap() needs to pre-create the interface at all, given that libvirt creates it anyway when the VM is rebooted, regardless of plug_tap().
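To make the mismatch concrete, here is a minimal sketch (not nova's actual code) of how the tap creation command differs with and without multiqueue. The helper name `build_tap_command` is hypothetical. The `ip tuntap add <dev> mode tap multi_queue` syntax is standard iproute2. The function only builds the argument list and does not execute it.

```python
from typing import List


def build_tap_command(dev: str, multiqueue: bool = False) -> List[str]:
    """Return the iproute2 argv that would create tap device `dev`.

    Hypothetical helper for illustration only; nothing is executed.
    """
    cmd = ["ip", "tuntap", "add", dev, "mode", "tap"]
    if multiqueue:
        # Without this flag the device is created single-queue, which is
        # the state that clashes with a libvirt XML requesting queues=<x>.
        cmd.append("multi_queue")
    return cmd
```

Calling `build_tap_command("tapcc353751-13")` corresponds to the behavior described in this report, while passing `multiqueue=True` corresponds to what an image requesting multiqueue would need.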
Steps to reproduce:
1) Ubuntu bionic + devstack master + follow instructions at [3]
2) wget https:/
3) openstack image create bionic-mq --file bionic-
4) openstack image create bionic --file bionic-
5) ssh-keygen
6) openstack keypair create key1 --public-key ~/.ssh/id_rsa.pub
7) openstack flavor create --vcpu 2 --ram 1024 --disk 10 --public --id 10 test_flavor
8) openstack server create --network calico --flavor test_flavor --image bionic --key-name key1 no-mq
instance is created successfully
9) ip a
6: tapcc353751-13: <BROADCAST,
10) sudo virsh edit 1
add "<driver name='vhost' queues='2'/>" to the interface section
11) openstack server reboot no-mq
wait a few secs
12) ip a
7: tapcc353751-13: <BROADCAST,
13) ssh to the instance and run "sudo ethtool -l <interface>"
Combined: 2
14) openstack server delete no-mq
15) openstack server create --network calico --flavor test_flavor --image bionic-mq --key-name key1 mq
instance fails to be created, log shows the below stack trace.
[1] https:/
[2] https:/
[3] https:/
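As a complement to the `ip a` checks in steps 9 and 12, the following sketch shows one way to tell from the host whether a tap device was created with multiqueue. It assumes the kernel exposes the device flags in `/sys/class/net/<dev>/tun_flags` as a hexadecimal string, and that `IFF_MULTI_QUEUE` is `0x0100` as defined in `linux/if_tun.h`. The function names are hypothetical.

```python
IFF_MULTI_QUEUE = 0x0100  # assumed value, taken from linux/if_tun.h


def tap_is_multiqueue(tun_flags_text: str) -> bool:
    """Interpret the text read from /sys/class/net/<dev>/tun_flags."""
    flags = int(tun_flags_text.strip(), 16)
    return bool(flags & IFF_MULTI_QUEUE)


def read_tun_flags(dev: str) -> str:
    """Read the raw flags for `dev`; only works on a Linux host with that device."""
    with open(f"/sys/class/net/{dev}/tun_flags") as f:
        return f.read()
```

For example, `tap_is_multiqueue(read_tun_flags("tapcc353751-13"))` would be expected to return False for the device pre-created by plug_tap() and True after libvirt recreates it with queues configured.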
Aug 27 18:58:38 devstack nova-compute[7968]: ERROR nova.compute.
Aug 27 18:58:38 devstack nova-compute[7968]: INFO nova.compute.
=======
[Impact]
Users of the calico plugin cannot use multiqueue in Nova: the VM fails to boot. The workaround is to edit the XML manually and reboot the instance through nova, so that the tap interface is recreated by libvirt while the vif.plug() method is not re-run by Nova, allowing libvirt to set up multiqueue properly. This workaround does not scale well.
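The manual XML edit in the workaround amounts to adding a `<driver name='vhost' queues='N'/>` element to each interface. The sketch below illustrates that transformation on an XML string using only the Python standard library. It is an illustration of the edit, not a replacement for `virsh edit`; the function name `add_vhost_queues` and the simplified sample XML are assumptions.

```python
import xml.etree.ElementTree as ET

# Simplified, made-up domain XML for illustration only.
SAMPLE_DOMAIN_XML = (
    "<domain type='kvm'><devices>"
    "<interface type='ethernet'>"
    "<target dev='tapcc353751-13'/><model type='virtio'/>"
    "</interface>"
    "</devices></domain>"
)


def add_vhost_queues(domain_xml: str, queues: int) -> str:
    """Return domain XML with a vhost driver element requesting `queues` queues."""
    root = ET.fromstring(domain_xml)
    for iface in root.iter("interface"):
        driver = iface.find("driver")
        if driver is None:
            driver = ET.SubElement(iface, "driver")
        driver.set("name", "vhost")
        driver.set("queues", str(queues))
    return ET.tostring(root, encoding="unicode")
```

Having to apply this per instance, and again whenever the definition is regenerated, is why the workaround does not scale.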
[Test case]
1. Setting up env
1a. Deploy environment
1b. Install calico plugin as per [0]
1c. Setup SSH
ssh-keygen
1d. Create keypair for testing
openstack keypair create key1 --public-key ~/.ssh/id_rsa.pub
1e. Create test flavor
openstack flavor create --vcpu 2 --ram 1024 --disk 10 --public --id 10 test_flavor
1f. Download an example image
wget https:/
1g. Create image in glance with multiqueue metadata
openstack image create bionic-mq --file bionic-
1h. Create same image in glance without multiqueue metadata
openstack image create bionic --file bionic-
1i. Create instance without multiqueue. Make sure instance creation and connectivity succeed.
openstack server create --network calico --flavor test_flavor --image bionic --key-name key1 no-mq
2. Reproducing the bug
2a. Create instance with multiqueue
openstack server create --network calico --flavor test_flavor --image bionic-mq --key-name key1 mq
Instance creation will fail
2b. Check logs for error
egrep "libvirt.
3. Cleanup
3a. Delete instances "mq" and "no-mq"
4. Install package that contains the fixed code
5. Repeat step 2a. 2a should now succeed.
[Regression Potential]
The new code path is not triggered if the image metadata is not used. For all other use cases, the previous behavior is maintained.
[Other Info]
None
[0] https:/
Changed in nova: | |
assignee: | nobody → Rodrigo Barbieri (rodrigo-barbieri2010) |
status: | New → In Progress |
tags: | added: sts |
sean mooney (sean-k-mooney) wrote : Re: Cannot create instance with multiqueue image and vif_type=tap (calico) | #2 |
Triaging this as medium. This affects all releases where vnic_type=tap and multiqueue are supported.
Changed in nova: | |
importance: | Undecided → Medium |
tags: | added: libvirt network |
sean mooney (sean-k-mooney) wrote : | #3 |
sts is not a tag we use, what is it for?
@sean, regarding "sts" tag: I have notification rules associated with this tag for me and my team mates
Reviewed: https:/
Committed: https:/
Submitter: Zuul
Branch: master
commit 84cfc8e9ab1396e
Author: Rodrigo Barbieri <email address hidden>
Date: Thu Aug 27 17:20:19 2020 -0300
Allow tap interface with multiqueue
When vif_type="tap" (such as when using calico),
attempting to create an instance using an image that has
the property hw_vif_
the interface is always being created without multiqueue
flags.
This change checks if the property is defined and passes
the multiqueue parameter to create the tap interface
accordingly.
In case the multiqueue parameter is passed but the
vif_model is not virtio (or unspecified), the old
behavior is maintained.
Change-Id: I0307c43dcd0cac
Closes-bug: #1893263
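The decision described in this commit message can be sketched roughly as follows. This is not the committed code. It assumes the truncated image property above is `hw_vif_multiqueue_enabled`, models image metadata as a plain dict, and uses a hypothetical function name.

```python
from typing import Mapping, Optional


def tap_needs_multiqueue(image_props: Mapping[str, str],
                         vif_model: Optional[str]) -> bool:
    """Decide whether the tap device should be created with multiqueue.

    Assumption: the property name is hw_vif_multiqueue_enabled.
    """
    requested = str(
        image_props.get("hw_vif_multiqueue_enabled", "false")
    ).lower() == "true"
    # Old behavior (no multiqueue) is kept unless the model is virtio.
    return requested and vif_model == "virtio"
```

With this shape, an image without the property, or a VIF model other than virtio (including an unspecified one), falls through to the previous single-queue path.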
Changed in nova: | |
status: | In Progress → Fix Released |
Fix proposed to branch: stable/ussuri
Review: https:/
Reviewed: https:/
Committed: https:/
Submitter: Zuul
Branch: stable/ussuri
commit a69845f3732843e
Author: Rodrigo Barbieri <email address hidden>
Date: Thu Aug 27 17:20:19 2020 -0300
Allow tap interface with multiqueue
When vif_type="tap" (such as when using calico),
attempting to create an instance using an image that has
the property hw_vif_
the interface is always being created without multiqueue
flags.
This change checks if the property is defined and passes
the multiqueue parameter to create the tap interface
accordingly.
In case the multiqueue parameter is passed but the
vif_model is not virtio (or unspecified), the old
behavior is maintained.
Conflicts:
NOTE: The conflict is due to not having patch
Iefa6009874
Change-Id: I0307c43dcd0cac
Closes-bug: #1893263
(cherry picked from commit 84cfc8e9ab1396e
tags: | added: in-stable-ussuri |
Fix proposed to branch: stable/train
Review: https:/
Fix proposed to branch: stable/stein
Review: https:/
Reviewed: https:/
Committed: https:/
Submitter: Zuul
Branch: stable/train
commit 750655c19daf4e8
Author: Rodrigo Barbieri <email address hidden>
Date: Thu Aug 27 17:20:19 2020 -0300
Allow tap interface with multiqueue
When vif_type="tap" (such as when using calico),
attempting to create an instance using an image that has
the property hw_vif_
the interface is always being created without multiqueue
flags.
This change checks if the property is defined and passes
the multiqueue parameter to create the tap interface
accordingly.
In case the multiqueue parameter is passed but the
vif_model is not virtio (or unspecified), the old
behavior is maintained.
Change-Id: I0307c43dcd0cac
Closes-bug: #1893263
(cherry picked from commit 84cfc8e9ab1396e
(cherry picked from commit a69845f3732843e
tags: | added: in-stable-train |
Reviewed: https:/
Committed: https:/
Submitter: Zuul
Branch: stable/stein
commit 8699156d86e0a40
Author: Rodrigo Barbieri <email address hidden>
Date: Thu Aug 27 17:20:19 2020 -0300
Allow tap interface with multiqueue
When vif_type="tap" (such as when using calico),
attempting to create an instance using an image that has
the property hw_vif_
the interface is always being created without multiqueue
flags.
This change checks if the property is defined and passes
the multiqueue parameter to create the tap interface
accordingly.
In case the multiqueue parameter is passed but the
vif_model is not virtio (or unspecified), the old
behavior is maintained.
Conflicts:
nova/
NOTE: The conflict is for not having change
Iab16a15a5f
Change-Id: I0307c43dcd0cac
Closes-bug: #1893263
(cherry picked from commit 84cfc8e9ab1396e
(cherry picked from commit a69845f3732843e
(cherry picked from commit 750655c19daf4e8
tags: | added: in-stable-stein |
description: | updated |
Rodrigo Barbieri (rodrigo-barbieri2010) wrote : Re: Cannot create instance with multiqueue image and vif_type=tap (calico) | #12 |
debdiff for SRU focal-ussuri
debdiff for SRU bionic-train
debdiff for SRU bionic-stein
summary: |
- Cannot create instance with multiqueue image and vif_type=tap (calico)
+ [SRU] Cannot create instance with multiqueue image and vif_type=tap
+ (calico)
tags: | added: sts-sru-needed |
Removed the Victoria task for this as the fix shipped in Victoria, and was backported from there.
no longer affects: | cloud-archive/victoria |
Changed in cloud-archive: | |
status: | New → Invalid |
Hello Rodrigo, or anyone else affected,
Accepted nova into train-proposed. The package will build now and be available in the Ubuntu Cloud Archive in a few hours, and then in the -proposed repository.
Please help us by testing this new package. To enable the -proposed repository:
sudo add-apt-repository cloud-archive:
sudo apt-get update
Your feedback will aid us getting this update out to other Ubuntu users.
If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-
Further information regarding the verification process can be found at https:/
tags: | added: verification-train-needed |
Hello Rodrigo, or anyone else affected,
Accepted nova into stein-proposed. The package will build now and be available in the Ubuntu Cloud Archive in a few hours, and then in the -proposed repository.
Please help us by testing this new package. To enable the -proposed repository:
sudo add-apt-repository cloud-archive:
sudo apt-get update
Your feedback will aid us getting this update out to other Ubuntu users.
If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-
Further information regarding the verification process can be found at https:/
tags: | added: verification-stein-needed |
Brian Murray (brian-murray) wrote : | #18 |
Hello Rodrigo, or anyone else affected,
Accepted nova into focal-proposed. The package will build now and be available at https:/
Please help us by testing this new package. See https:/
If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-
Further information regarding the verification process can be found at https:/
N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.
Changed in nova (Ubuntu Focal): | |
status: | New → Fix Committed |
tags: | added: verification-needed verification-needed-focal |
Hello Rodrigo, or anyone else affected,
Accepted nova into ussuri-proposed. The package will build now and be available in the Ubuntu Cloud Archive in a few hours, and then in the -proposed repository.
Please help us by testing this new package. To enable the -proposed repository:
sudo add-apt-repository cloud-archive:
sudo apt-get update
Your feedback will aid us getting this update out to other Ubuntu users.
If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-
Further information regarding the verification process can be found at https:/
tags: | added: verification-ussuri-needed |
Changed in nova (Ubuntu): | |
status: | New → Fix Released |
I was able to fix the issue I was having by forcing the hostname in /etc/felix/
FelixHostname = <full_hostname>
I've successfully validated bionic-stein and bionic-train, however, I cannot install the calico packages on focal:
root@juju-
This PPA contains packages for Calico's 3.17.x release series, mainly to support OpenStack-based systems (although the Felix package can be used stand-alone). It will be updated with patch releases for that series.
More info: https:/
Press [ENTER] to continue or Ctrl-c to cancel adding it.
Hit:1 http://
Hit:2 http://
Hit:3 http://
Hit:4 http://
Ign:5 http://
Err:6 http://
404 Not Found [IP: 91.189.95.83 80]
Reading package lists... Done
E: The repository 'http://
N: Updating from such a repository can't be done securely, and is therefore disabled by default.
N: See apt-secure(8) manpage for repository creation and user configuration details.
Slightly more context on my previous comment: When I said "I was able to fix the issue I was having by forcing the hostname in /etc/felix/
Mauricio Faria de Oliveira (mfo) wrote : | #22 |
> I've successfully validated bionic-stein and bionic-train, however, I cannot install the calico packages on focal:
> ...
> E: The repository 'http://
Apparently the issue is that the PPA does not build/have Focal yet (no '/dists/focal/', only trusty/
This is going to be resolved via an upstream point release for Nova that is proposed in https:/
Fix proposed to branch: master
Review: https://review.opendev.org/748533