Bionic kernel 4.15.0-136 causes dosemu2 (with kvm mode) freezes due to lack of KVM patch

Bug #1917138 reported by Bambang Pranoto on 2021-02-27
This bug affects 2 people
Affects           Status        Importance  Assigned to           Milestone
Dosemu2           Fix Released  Unknown
linux (Ubuntu)    Invalid       Undecided   Unassigned
  Bionic          Fix Released  High        Guilherme G. Piccoli

Bug Description

[Impact]
* Since kernel 4.15.0-136, the Bionic kernel has included a very complex KVM fix for a kind of "race" in the interrupt window with irqchip-split (reported in [0]). The fix was proposed as a series of 2 patches [1]; it was merged in Ubuntu through the stable tree, in the form of the following commit:
71cc849b7093 ("KVM: x86: Fix split-irqchip vs interrupt injection window request") [2]

* The problem is that this commit requires a companion commit, which was not proposed to the stable tree. In fact, the absence of that commit caused confusion between the KVM community and the stable maintainer [3]; because of that, the series was removed from stable trees 4.14.y and 4.9.y, but the first commit was merged on its own in the Ubuntu kernel.

* Without the companion patch, KVM may hit an infinite "loop" condition in the core IRQ handling, since the merged commit requires an extra check in kvm_cpu_has_extint() and a condition "inversion" in kvm_cpu_get_extint(), both present only in the missing companion patch. Users reported that dosemu2 (running in KVM mode) gets stuck on kernels 4.15.0-136 and -137, while it works fine on 4.15.0-135 and on -137 plus the companion patch.

* So, we hereby backport the companion commit, originally upstream patch: 72c3bcdcda ("KVM: x86: handle !lapic_in_kernel case in kvm_cpu_*_extint") [4]

[Test Case]
* The test case is the reported bug itself: run dosemu2 with KVM mode enabled; without the companion commit, it freezes.

* To test the correctness of both fixes together, we could rely on the test proposed in [0] (running a guest with "noapic"), but it wasn't consistent and the VMM wasn't mentioned, so there might be a workaround mechanism in qemu, for example, that prevents that test from reproducing the issue.
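
Before running the reproducer, it may help to confirm that the host exposes KVM at all. The following is a minimal sketch, assuming dosemu2's KVM mode relies on the standard /dev/kvm device node, as KVM userspace generally does. The function name kvm_node_usable is a hypothetical helper, not part of dosemu2 or the kernel.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the given node (default /dev/kvm)
# is a character device that the current user can read and write.
kvm_node_usable() {
    node=${1:-/dev/kvm}
    [ -c "$node" ] && [ -r "$node" ] && [ -w "$node" ]
}

if kvm_node_usable; then
    echo "/dev/kvm is usable by $(id -un)"
else
    echo "/dev/kvm is missing or not accessible; KVM mode cannot be tested"
fi
```

If the check fails, a freeze or failure in dosemu2 would not tell us anything about the companion commit, so it is worth ruling this out first.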

[Where problems could occur]
* Since this is a KVM core modification, it could affect interrupt handling in KVM, but without the fix we are already experiencing a bug. Also, both commits were backported to 5.4.y and 4.19.y, so Focal and subsequent releases are already running with them.

[0] https://lore<email address hidden>/

[1] https://<email address hidden>/

[2] http://git.kernel.org/linus/71cc849b70

[3] https://<email address hidden>/

[4] http://git.kernel.org/linus/72c3bcdcda

<Original description>

With the latest kernel 4.15.0-136 updates on Ubuntu 18.04 and Ubuntu 16.04, dosemu2 with KVM freezes at boot.

dosemu2 source: https://github.com/dosemu2/dosemu2

dosemu2 package can be obtained from https://launchpad.net/~dosemu2/+archive/ubuntu/ppa

1. ubuntu version
$ lsb_release -rd
Description: Ubuntu 18.04.5 LTS
Release: 18.04

2. package version
$ apt-cache policy dosemu2
dosemu2:
  Installed: (none)
  Candidate: (none)
  Version table:
     2.0~pre8-2 -1
        100 /var/lib/dpkg/status

3. What is expected to happen: The dosemu program runs fine, as with the previous kernel version
4. What happened instead: dosemu freezes on loading

I have also reported this problem to the dosemu2 developers; here is my bug report:
https://github.com/dosemu2/dosemu2/issues/1404


Hans-Christian (hc-koch) wrote :

I can confirm that I am also experiencing this bug on various 16.04 LTS systems since the latest update.

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in ubuntu:
status: New → Confirmed
Hans-Christian (hc-koch) on 2021-02-27
no longer affects: dosemu (Ubuntu)

Thank you for taking the time to report this bug and helping to make Ubuntu better. It seems that your bug report is not filed about a specific source package though, rather it is just filed against Ubuntu in general. It is important that bug reports be filed about source packages so that people interested in the package can find the bugs about it. You can find some hints about determining what package your bug might be about at https://wiki.ubuntu.com/Bugs/FindRightPackage. You might also ask for help in the #ubuntu-bugs irc channel on Freenode.

To change the source package that this bug is filed about visit https://bugs.launchpad.net/ubuntu/+bug/1917138/+editstatus and add the package name in the text box next to the word Package.

[This is an automated message. I apologize if it reached you inappropriately; please just reply to this message indicating so.]

tags: added: bot-comment
stsp (stsp-0) on 2021-02-27
affects: ubuntu → linux (Ubuntu)
affects: linux → dosemu2
Changed in dosemu2:
status: Unknown → New
Guilherme G. Piccoli (gpiccoli) wrote :

Thanks for the report! Are you using nested virtualization (a virtual machine inside another virtual machine), or do you run dosemu2 directly on your host/bare-metal system?
Thanks!

Guilherme G. Piccoli (gpiccoli) wrote :

Thanks again Bambang and Hans-Christian for the report; it does indeed seem to be a bug in the kernel. I think I know what's going on, but I'd like to ask you to perform some testing so that we can confirm it and then fix it.

So, first there will be 2 tests - depending on the result, a third test will be necessary.
Please be sure to try both (1) and (2) with dosemu2 using the KVM mode:

(1) For the first test, please get the Bionic proposed kernel (see [0] for information on how to do it) - the current proposed version is 4.15.0-137. Try to reproduce with that version - I expect it to reproduce.

(2) Now, get the build I made for you at https://kernel.ubuntu.com/~gpiccoli/lp1917138/ - note that if you have the "linux-modules-extra" and "linux-headers" packages installed on your system after test (1), you should also install them from my build, or else you might have issues in the boot process. To install the packages, download them from the link above to some directory and then run "dpkg -i *.deb" there (assuming no other .deb file is present in that directory).
Reboot the system after a successful installation and try to reproduce the issue - I suspect it won't reproduce, dosemu2 should work fine even in KVM mode.

Thanks in advance,

Guilherme

[0] https://wiki.ubuntu.com/Testing/EnableProposed
(don't forget to disable the proposed repository after your tests)
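
The install step in (2) can be sketched as a small shell function. This is only an illustration of the "dpkg -i *.deb" instruction above: the function name install_test_kernel and the download directory are assumptions, and sudo is used because dpkg needs root to install packages.

```shell
#!/bin/sh
# Hypothetical helper: install every .deb found in one directory.
# It refuses to run when the directory holds no .deb files, and it lists
# the files first so that unrelated packages can be spotted.
install_test_kernel() {
    dir=${1:-.}
    set -- "$dir"/*.deb
    # If the glob matched nothing, "$1" is the literal pattern and does not exist.
    if [ ! -e "$1" ]; then
        echo "no .deb files found in $dir" >&2
        return 1
    fi
    echo "Installing:"
    printf '  %s\n' "$@"
    sudo dpkg -i "$@"
}

# Example (the directory name is an assumption):
#   install_test_kernel "$HOME/lp1917138" && sudo reboot
```

After the reboot, "uname -r" shows which kernel is actually running before the dosemu2 test is retried.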

Dear Guilherme,

Thank you very much for your reply.

On Tue, Mar 2, 2021 at 3:00 AM Guilherme G. Piccoli <
<email address hidden>> wrote:

> Thanks for the report! Are you using nested virtualization (a virtual
> machine inside another virtual machine), or do you run dosemu2 directly
> on your host/bare-metal system?
>
>
I run it directly on the real hardware/bare-metal system.

Bambang Pranoto (bpranoto) wrote :

Dear Guilherme,

I will try and come back to you later.

Thanks again.

--
Bambang P
http://bpranoto.blogspot.com

Thanks Bambang, let me know as soon as possible, so we can maybe fix this cycle (worst case, next cycle).

Hello Guilherme,

Yes, you are right. The 137 from the pre-released updates doesn't work.

The 137 files from https://kernel.ubuntu.com/~gpiccoli/lp1917138/ work
perfectly.

Thank you very much! Out of curiosity, what went wrong with the 136?

Hi Bambang, thanks a lot for your testing! I'll need you to test one more kernel to be 100% sure of the issue and be able to fix it. Let me elaborate on what's happening.

So, since kernel 4.15.0-136 we have included a very complex KVM fix for a kind of "race" in the interrupt window, triggered by David Woodhouse [0] and analyzed by some members of the KVM community (it causes a live-lock; I suggest reading [0], a very informative thread). The fix was proposed as a series of 2 patches [1] from Paolo Bonzini. It was merged in Ubuntu through the stable tree, in the form of the following commit:

71cc849b7093 ("KVM: x86: Fix split-irqchip vs interrupt injection window request") [2]

The problem is that this commit requires a companion commit, which was not proposed to the stable tree. In fact, the absence of that commit caused confusion between the KVM community and the stable maintainer [3]; because of that, the series was removed from stable trees 4.14.y and 4.9.y, but the first commit was merged on its own in Ubuntu kernel 4.15.

My theory is that this commit alone is causing odd behavior (especially since dosemu2 seems to use the legacy PIC instead of the APIC), so the fix would be to merge the companion commit in the Ubuntu tree:

72c3bcdcda49 ("KVM: x86: handle !lapic_in_kernel case in kvm_cpu_*_extint") [4]

It's interesting to note that after the stable confusion in [3], both commits were removed from 4.9.y and 4.14.y trees - I intend to submit both to 4.14.y tree after the next test, as well as merge it on Ubuntu kernel.

I'll defer the test for next comment, in order to not pollute this one (which is already big and a bit over-detailed).
Cheers,

Guilherme

[0] https://lore<email address hidden>/
[1] https://<email address hidden>/
[2] http://git.kernel.org/linus/71cc849b70
[3] https://<email address hidden>/
[4] http://git.kernel.org/linus/72c3bcdcda

So Bambang and/or Hans-Christian, can you please test the following kernel build that includes the aforementioned missing commit:

https://kernel.ubuntu.com/~gpiccoli/lp1917138-2/

It's the same process as before, just install all the packages from the link above, reboot and retry the reproducer - be sure that dosemu2 is using KVM mode.
Let me know the results, so I can submit this patch properly to the mailing-list to fix Bionic kernel.

Thanks,

Guilherme

Changed in linux (Ubuntu):
status: Confirmed → In Progress
Changed in dosemu2:
status: New → Fix Released

Dear Guilherme,

On Wed, Mar 3, 2021 at 1:45 AM Guilherme G. Piccoli <
<email address hidden>> wrote:

> So Bambang and/or Hans-Christian, can you please test the following
> kernel build that includes the aforementioned missing commit:
>
> https://kernel.ubuntu.com/~gpiccoli/lp1917138-2/
>
>
Yes, it works very well.

Thank you very much!

summary: - kernel 4.15.0-136 causes dosemu2 with kvm freezes
+ Bionic kernel 4.15.0-136 causes dosemu2 (with kvm mode) freezes due to
+ lack of KVM patch
description: updated
Changed in linux (Ubuntu):
importance: Undecided → High
assignee: nobody → Guilherme G. Piccoli (gpiccoli)

Thanks a lot for the tests Bambang, and for the report! The patch was submitted to the Ubuntu kernel ML, soon it should be applied.
Cheers,

Guilherme

Changed in dosemu2:
status: Fix Released → New
Stefan Bader (smb) on 2021-03-04
Changed in linux (Ubuntu Bionic):
assignee: nobody → Guilherme G. Piccoli (gpiccoli)
importance: Undecided → High
status: New → In Progress
Changed in linux (Ubuntu):
assignee: Guilherme G. Piccoli (gpiccoli) → nobody
importance: High → Undecided
status: In Progress → Invalid
Stefan Bader (smb) on 2021-03-05
Changed in linux (Ubuntu Bionic):
status: In Progress → Fix Committed

Dear Guilherme,

I just want to let you know that today my kernel was upgraded to
version 4.15.0-139-generic, but the bug is still there.

Thank you.

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-bionic' to 'verification-done-bionic'. If the problem still exists, change the tag 'verification-needed-bionic' to 'verification-failed-bionic'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-bionic
tags: added: verification-done-bionic
removed: bot-comment verification-needed-bionic

Hi Bambang, the fix is present in the current proposed version, 4.15.0-141. Versions 4.15.0-139 and -140 are both CVE fixes only, so they don't include the KVM fix.
Cheers,

Guilherme
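
The version numbers discussed in this report can be turned into a quick check of the running kernel. This is a rough sketch: the function name bionic_kvm_fix_status is hypothetical, and treating every ABI number from 136 to 140 as affected is an assumption (this thread only confirms -136, -137, -139 and -140; -138 is not mentioned).

```shell
#!/bin/sh
# Hypothetical helper: classify a Bionic 4.15.0-NNN release string using
# the ABI numbers mentioned in this bug report.
#   <= 135     : before the first KVM commit was merged
#   136 .. 140 : first commit without its companion (assumed for -138)
#   >= 141     : companion commit included (4.15.0-141.145)
bionic_kvm_fix_status() {
    release=$1
    case "$release" in
        4.15.0-*) ;;
        *) echo "not-4.15"; return 0 ;;
    esac
    abi=${release#4.15.0-}   # strip the "4.15.0-" prefix
    abi=${abi%%[!0-9]*}      # keep the leading digits, dropping "-generic"
    if [ -z "$abi" ]; then
        echo "unknown"
    elif [ "$abi" -le 135 ]; then
        echo "unaffected"
    elif [ "$abi" -le 140 ]; then
        echo "affected"
    else
        echo "fixed"
    fi
}

bionic_kvm_fix_status "$(uname -r)"
```

The result only reflects the stock Ubuntu Bionic kernels named here; test builds such as the ones linked earlier in this report carry the fix under a -137 version and would be misclassified.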

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.15.0-141.145

---------------
linux (4.15.0-141.145) bionic; urgency=medium

  * bionic/linux: 4.15.0-141.145 -proposed tracker (LP: #1919536)

  * binary assembly failures with CONFIG_MODVERSIONS present (LP: #1919315)
    - [Packaging] quiet (nomially) benign errors in BUILD script

  * selftests: bpf verifier fails after sanitize_ptr_alu fixes (LP: #1920995)
    - bpf: Simplify alu_limit masking for pointer arithmetic
    - bpf: Add sanity check for upper ptr_limit
    - bpf, selftests: Fix up some test_verifier cases for unprivileged

  * Packaging resync (LP: #1786013)
    - update dkms package versions

  * CVE-2018-13095
    - xfs: More robust inode extent count validation

  * i40e PF reset due to incorrect MDD event (LP: #1772675)
    - i40e: change behavior on PF in response to MDD event

  * Bionic update: upstream stable patchset 2021-03-09 (LP: #1918330)
    - ACPI: sysfs: Prefer "compatible" modalias
    - ARM: dts: imx6qdl-gw52xx: fix duplicate regulator naming
    - wext: fix NULL-ptr-dereference with cfg80211's lack of commit()
    - net: usb: qmi_wwan: added support for Thales Cinterion PLSx3 modem family
    - drivers: soc: atmel: Avoid calling at91_soc_init on non AT91 SoCs
    - drivers: soc: atmel: add null entry at the end of at91_soc_allowed_list[]
    - KVM: x86/pmu: Fix HW_REF_CPU_CYCLES event pseudo-encoding in
      intel_arch_events[]
    - KVM: x86: get smi pending status correctly
    - xen: Fix XenStore initialisation for XS_LOCAL
    - leds: trigger: fix potential deadlock with libata
    - mt7601u: fix kernel crash unplugging the device
    - mt7601u: fix rx buffer refcounting
    - xen-blkfront: allow discard-* nodes to be optional
    - ARM: imx: build suspend-imx6.S with arm instruction set
    - netfilter: nft_dynset: add timeout extension to template
    - xfrm: Fix oops in xfrm_replay_advance_bmp
    - RDMA/cxgb4: Fix the reported max_recv_sge value
    - iwlwifi: pcie: use jiffies for memory read spin time limit
    - iwlwifi: pcie: reschedule in long-running memory reads
    - mac80211: pause TX while changing interface type
    - can: dev: prevent potential information leak in can_fill_info()
    - x86/entry/64/compat: Preserve r8-r11 in int $0x80
    - x86/entry/64/compat: Fix "x86/entry/64/compat: Preserve r8-r11 in int $0x80"
    - iommu/vt-d: Gracefully handle DMAR units with no supported address widths
    - iommu/vt-d: Don't dereference iommu_device if IOMMU_API is not built
    - NFC: fix resource leak when target index is invalid
    - NFC: fix possible resource leak
    - team: protect features update by RCU to avoid deadlock
    - tcp: fix TLP timer not set when CA_STATE changes from DISORDER to OPEN
    - kernel: kexec: remove the lock operation of system_transition_mutex
    - PM: hibernate: flush swap writer after marking
    - pNFS/NFSv4: Fix a layout segment leak in pnfs_layout_process()
    - net/mlx5: Fix memory leak on flow table creation error flow
    - rxrpc: Fix memory leak in rxrpc_lookup_local
    - net: dsa: bcm_sf2: put device node before return
    - ibmvnic: Ensure that CRQ entry read are correctly ordered
    - ACPI: thermal: Do...

Changed in linux (Ubuntu Bionic):
status: Fix Committed → Fix Released
Changed in dosemu2:
status: New → Fix Released
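
The changelog quoted above is truncated, so the KVM entry itself is not visible here. Since Ubuntu kernel changelog entries list the subject line of each applied patch, one way to confirm that a given changelog contains the companion commit is to search it for that subject. This is a minimal sketch: changelog_has_subject is a hypothetical helper, and the way the changelog text is obtained (here a file named changelog.txt) is an assumption.

```shell
#!/bin/sh
# Hypothetical helper: read changelog text on stdin and succeed if it
# contains a patch entry ("- <subject>") with the given commit subject.
# grep -F treats the subject as a fixed string, so "!" and "*" are literal.
changelog_has_subject() {
    grep -Fq -- "- $1"
}

# Example (changelog.txt is an assumed file holding the full changelog):
#   changelog_has_subject 'KVM: x86: handle !lapic_in_kernel case in kvm_cpu_*_extint' \
#       < changelog.txt && echo "companion commit listed"
```

Single quotes around the subject keep an interactive shell from interpreting the "!" and "*" characters.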