Inconsistent nested KVM status with race conditions across multiple hosts: rmmod: ERROR: Module kvm_intel is in use
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
qemu (Ubuntu) | Fix Released | Undecided | Unassigned |
Bionic | Incomplete | Undecided | Unassigned |
Bug Description
Ubuntu bionic
qemu-system-x86: 1:2.11+
When installing qemu-system-x86, nested KVM will be enabled by default thanks to the file offered by the package:
/etc/modprobe.
and postinst (/var/lib/
# If the host had already installed kvm_intel.ko without nested=1, then
# re-load it now, honoring whatever is in qemu-system-
if [ "$1" = configure ] ; then
if grep -q kvm_intel /proc/modules && [ -f $INTEL_NESTED ]; then
if [ "x$v" != "xY" ]; then
fi
fi
fi
However, we found that on some of our 10+ hosts nested KVM was still disabled after the package installation. The install log then showed the error "rmmod: ERROR: Module kvm_intel is in use" while the package was being set up.
2019-11-18 17:29:55 DEBUG install Setting up qemu-system-x86 (1:2.11+
2019-11-18 17:29:55 DEBUG install rmmod: ERROR: Module kvm_intel is in use
2019-11-18 17:29:55 DEBUG install Setting up qemu-kvm (1:2.11+
2019-11-18 17:29:55 DEBUG install Setting up libpangocairo-
After running `rmmod kvm_intel && modprobe kvm_intel` by hand, nested KVM was enabled properly. So there seems to be some sort of race condition during the installation.
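To compare hosts, the current state can be read from the module parameter that the kernel exposes in sysfs. The following is only a minimal sketch: the function name `nested_status` and its optional path argument are hypothetical and exist so that it can be tried against a test file.

```shell
#!/bin/sh
# Hypothetical helper: report the nested KVM state of kvm_intel.
# $1 (optional): parameter file to read; defaults to the sysfs file
# that exists while the kvm_intel module is loaded.
nested_status() {
    param_file="${1:-/sys/module/kvm_intel/parameters/nested}"
    if [ ! -r "$param_file" ]; then
        # Module not loaded (or not an Intel host).
        echo "unloaded"
        return 0
    fi
    case "$(cat "$param_file")" in
        Y|1) echo "enabled" ;;
        *)   echo "disabled" ;;
    esac
}
```

Running `nested_status` without arguments on every host right after the installation should show which hosts hit the failed reload.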
FWIW, at the time of the rmmod failure, the kernel seemed to be running the L1TF check, as follows:
Nov 18 17:29:52 host kernel: [ 347.125789] audit: type=1400 audit(157409819
qemu_bridge_helper" pid=31696 comm="apparmor_
Nov 18 17:29:55 host kernel: [ 350.078487] ip6_tables: (C) 2000-2006 Netfilter Core Team
Nov 18 17:29:55 host kernel: [ 350.209707] Ebtables v2.0 registered
Nov 18 17:29:55 host kernel: [ 350.464461] L1TF CPU bug present and SMT on, data leak possible. See CVE-2018-3646 and https:/
/l1tf.html for details.
Nov 18 17:29:56 host kernel: [ 351.438393] bridge: filtering via arp/ip/ip6tables is no longer available by default. Update your scripts to load br_netfilter if you need this.
My goal here is a consistent nested KVM status, because if it is inconsistent across multiple hosts, live migration will fail with the following error:
[instance: afd27b8f-
The command executed during the automated installation with the Juju nova-compute charm:
Commandline: apt-get --assume-yes --option=
# apt policy qemu-system-x86
qemu-system-x86:
Installed: 1:2.11+
Candidate: 1:2.11+
Version table:
*** 1:2.11+
500 http://
500 http://
100 /var/lib/
1:
500 http://
Hi Nobotu,
thanks for this high quality bug report.
I'm glad that running "rmmod kvm_intel && modprobe kvm_intel" later worked for you.
I had several different thoughts on this, separated into the sections below.
---
The obvious solution, which holds for most kinds of initialization issues, would be to reboot after install, which also works reliably. The only reason I'm not suggesting this as a real fix is that I generally dislike reboots :-)
---
Unfortunately the postinst isn't allowed to "wait a while until it resolves", so I'm not going that way. Surely we could do a fast non-sleeping loop, but that would not make it reliable, which is what you'd want. After all, you already know that you can enable it with your manual reload.
Could you check whether an (almost) non-sleeping retry fixes your issue "statistically reliably"?
We'd know then a bit more about the size of that race window.
Let me know if you need my help for a custom build with that.
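To make the retry idea concrete, here is a minimal sketch of an (almost) non-sleeping retry. It is not the actual postinst code: the helper name `retry_nosleep` and the attempt count in the usage comment are hypothetical, and the real reload step needs root.

```shell
#!/bin/sh
# Hypothetical helper: run a command up to $1 times, back to back,
# without sleeping between attempts.
retry_nosleep() {
    max_attempts="$1"
    shift
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        if "$@"; then
            # Report how many tries it took; useful for sizing the race window.
            echo "succeeded after $attempt attempt(s)" >&2
            return 0
        fi
        attempt=$((attempt + 1))
    done
    echo "giving up after $max_attempts attempt(s)" >&2
    return 1
}

# Possible use in place of a single rmmod attempt (assumption, needs root):
# retry_nosleep 100 rmmod kvm_intel && modprobe kvm_intel
```

The attempt count printed on success would give a rough idea of how large the race window is on the affected hosts.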
Heads up: this surely won't be an SRU on its own.
For something not really supported, I'd not want to make everyone download a new version. But there are enough other changes pending that we could tuck it in alongside them.
---
Please be reminded that strictly speaking and in general nested is "as good as possible but not supported" [1]. But that never stopped us from helping as much as possible.
Nested support has improved a lot over time, and I'm confident that it might be supported soon and then also be enabled by default in the module, which will eliminate this whole potential toggle issue.
That default has been in place for quite some time for AMD (>2.6.32) and is available with kernel >4.20 [2] for Intel. Since then it has become even more stable, so if you use nested on Bionic, consider using the HWE kernels.
---
There is another option which is to stay at 4.15 but override the default.
You could obviously drop in an /etc/modprobe.d/*.conf before kvm_intel is ever loaded.
Even earlier, you can set kvm_intel.nested=1 on the kernel command line, which will flip the default from 0 to 1 and avoid later races.
That is part of controlling your deployment, but might be an option if upgrading to the HWE kernel isn't a valid option for you.
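As a rough sketch of the drop-in approach: `options kvm_intel nested=1` is the standard modprobe.d syntax for setting this parameter, while the function name `write_nested_conf`, the file name `kvm-nested.conf` and the optional directory argument are hypothetical choices for illustration.

```shell
#!/bin/sh
# Hypothetical helper: write a modprobe.d drop-in enabling nested KVM.
# $1 (optional): target directory; defaults to /etc/modprobe.d (needs root).
write_nested_conf() {
    conf_dir="${1:-/etc/modprobe.d}"
    # Only takes effect the next time kvm_intel is loaded, so it has to be
    # in place before the module is loaded for the first time.
    printf 'options kvm_intel nested=1\n' > "$conf_dir/kvm-nested.conf"
}
```

Because the option is only read at module load time, this avoids the reload entirely as long as the file is deployed before anything loads kvm_intel.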
---
I think the best options you have in order are:
1. Use the HWE kernel. This is available right now and will work right away (I prefer that, as you also get plenty of fixes for nested).
2. Control your deployment by overriding the default on the kernel command line. That will also make sure that it is 1 right from the beginning.
3. Reboot after install. Silly but effective, and some people might prefer it.
4. We can try the retry loop approach, but I'm not really convinced of it, and after all the SRU team might like it even less and reject it.
Let me know what you think and if #1 or #2 will work for you.
If you insist on trying #4 let me know if you need my support for a test build.
P.S. since >=Cosmic had a recent enough kernel I'll mark it Fix released but add a Bionic task.
[1]: https://git.launchpad.net/ubuntu/+source/qemu/tree/debian/qemu-system-x86.README.Debian?h=ubuntu/bionic-devel
[2]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=1e58e5...