VM enter into hung status after triggering a crash

Bug #1882623 reported by Rakesh Ginjupalli
Affects                      Status        Importance  Assigned to    Milestone
linux-azure (Ubuntu)         Fix Released  Medium      Marcelo Cerri
  Bionic                     Fix Released  Medium      Marcelo Cerri
linux-azure-4.15 (Ubuntu)    Fix Released  Undecided   Unassigned
  Bionic                     Fix Released  Undecided   Unassigned

Bug Description

[Impact]

* When kdumping on trusty/4.15 in an Azure instance, we quite frequently observe a stall in the kdump kernel: it gets blocked, and soon we see a stack like the following:

[ 65.452007] INFO: rcu_sched detected stalls on CPUs/tasks:
[ 65.456004] 1-...!: (0 ticks this GP) idle=488/0/0 softirq=1/1 fqs=0
[ 65.456004] (detected by 0, t=15002 jiffies, g=707, c=706, q=8457)
[ 65.456004] rcu_sched kthread starved for 15002 jiffies! g707 c706 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x402 ->cpu=1

* By using the Azure serial console, we collected a sysrq-w when the issue happens:
[ 529.515013] sysrq: Show Blocked State
[ 529.517730] task PC stack pid father
[ 529.519006] kworker/u4:2 D 0 94 2 0x80000000
[ 529.519006] Workqueue: events_unbound fsnotify_mark_destroy_workfn
[ 529.519006] Call Trace:
[ 529.519006] __schedule+0x292/0x880
[ 529.519006] schedule+0x36/0x80
[ 529.519006] schedule_timeout+0x1d5/0x2f0
[ 529.519006] ? check_preempt_wakeup+0x162/0x260
[ 529.519006] wait_for_completion+0xa5/0x110
[ 529.519006] ? wake_up_q+0x80/0x80
[ 529.519006] __synchronize_srcu.part.14+0x67/0x80
[ 529.519006] ? trace_raw_output_rcu_utilization+0x50/0x50
[ 529.519006] ? __switch_to_asm+0x41/0x70
[ 529.519006] synchronize_srcu+0xd1/0xd6
[ 529.519006] fsnotify_mark_destroy_workfn+0x6d/0xc0
[ 529.519006] process_one_work+0x14e/0x390
[ 529.519006] worker_thread+0x1cc/0x3d0
[ 529.519006] kthread+0x105/0x140
[ 529.519006] ? max_active_store+0x60/0x60
[ 529.519006] ? kthread_bind+0x20/0x20
[ 529.519006] ret_from_fork+0x35/0x40
[ 529.519006] udevadm D 0 544 1 0x00000000
[ 529.519006] Call Trace:
[ 529.519006] __schedule+0x292/0x880
[ 529.519006] schedule+0x36/0x80
[ 529.519006] schedule_timeout+0x1d5/0x2f0
[ 529.519006] ? try_to_wake_up+0x4a/0x460
[ 529.519006] ? try_to_wake_up+0x4a/0x460
[ 529.519006] wait_for_completion+0xa5/0x110
[ 529.519006] ? wake_up_q+0x80/0x80
[ 529.519006] __flush_work.isra.29+0x119/0x1b0
[ 529.519006] ? destroy_worker+0x90/0x90
[ 529.519006] flush_delayed_work+0x3f/0x50
[ 529.519006] fsnotify_wait_marks_destroyed+0x15/0x20
[ 529.519006] fsnotify_destroy_group+0x4e/0xc0
[ 529.519006] inotify_release+0x1e/0x50
[ 529.519006] __fput+0xea/0x220
[ 529.519006] ____fput+0xe/0x10
[ 529.519006] task_work_run+0x8c/0xb0
[ 529.519006] exit_to_usermode_loop+0x70/0xa9
[ 529.519006] do_syscall_64+0x1b5/0x1e0
[ 529.519006] entry_SYSCALL_64_after_hwframe+0x41/0xa6
[ 529.519006] dhclient D 0 573 572 0x00000000
[ 529.519006] Call Trace:
[ 529.519006] __schedule+0x292/0x880
[ 529.519006] schedule+0x36/0x80
[ 529.519006] schedule_timeout+0x1d5/0x2f0
[ 529.519006] ? aa_profile_af_perm+0xb4/0xf0
[ 529.519006] wait_for_completion+0xa5/0x110
[ 529.519006] ? wake_up_q+0x80/0x80
[ 529.519006] __wait_rcu_gp+0x123/0x150
[ 529.519006] synchronize_sched+0x4e/0x60
[ 529.519006] ? __call_rcu+0x2f0/0x2f0
[ 529.519006] ? trace_raw_output_rcu_utilization+0x50/0x50
[ 529.519006] synchronize_net+0x1c/0x30
[ 529.519006] __unregister_prot_hook+0xcd/0xf0
[ 529.519006] packet_do_bind+0x1bd/0x250
[ 529.519006] packet_bind+0x2f/0x50
[ 529.519006] SYSC_bind+0xd8/0x110
[ 529.519006] ? sock_alloc_file+0x91/0x130
[ 529.519006] SyS_bind+0xe/0x10
[ 529.519006] do_syscall_64+0x80/0x1e0
[ 529.519006] entry_SYSCALL_64_after_hwframe+0x41/0xa6

* Bisecting mainline kernels, we found that v4.16 reproduced the issue whereas v4.17-rc1 did not. A fine-grained git bisect then led us to the fix: the following patch, when backported to a problematic version, resolves the issue: d8e462e19305 ("Drivers: hv: vmbus: Implement Direct Mode for stimer0")

* In Azure/Hyper-V, before the aforementioned commit, timer interrupts were delivered from the hypervisor to the guest through a VMBus message, one of the communication mechanisms between Hyper-V guests and the hypervisor. With the patch, the guest checks (through an MSR-like mechanism) whether the hypervisor supports direct delivery, and if so a direct timer IRQ mechanism is put in place instead of the VMBus channel.

* Our theory is that in the kdump kernel, especially due to its single-CPU nature, the VMBus-messaged timer IRQ could interfere with scheduling and create a deadlock condition, which is what we observe in the stack traces. Hence, we propose backporting this patch to the Azure 4.15 kernels.

* Our hypothesis for why this wasn't observed in Bionic is that changes in the kdump mechanism make it start earlier, which might hide the scheduling issue. It is also possible that the problem simply wasn't exercised due to lack of testing.

[Test Case]

* Perform a kdump on Trusty/4.15; after a few rounds, you should see the RCU stall trace above. With the patch applied, no issue is observed, and we were able to perform ~30 kdumps in a row.
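
A minimal repro helper for the test case above might look like this. It is a sketch, not an official tool: kdump-config is from Ubuntu's kdump-tools package, and the script is dry-run by default so it never panics the machine unless CONFIRM=yes is set explicitly.

```shell
#!/bin/sh
# Sketch of the LP#1882623 repro loop: check kdump is armed, then
# deliberately panic the kernel so the kdump (crash) kernel boots.
set -eu

CONFIRM="${CONFIRM:-no}"

echo "Checking kdump status:"
# kdump-config ships with Ubuntu's kdump-tools package
command -v kdump-config >/dev/null 2>&1 && kdump-config show || \
    echo "  kdump-config not found; install kdump-tools first"

if [ "$CONFIRM" = "yes" ]; then
    # sysrq 'c' triggers a NULL-pointer crash, handing control to kdump
    echo c > /proc/sysrq-trigger
else
    echo "Dry run: set CONFIRM=yes to actually panic the kernel"
fi
```

Repeating the crash several times (the description mentions ~30 rounds) is needed because the stall does not reproduce on every kdump.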

[Where problems could occur]

* Since the patch touches IRQ code (although restricted to Hyper-V), it is a somewhat invasive change and could lead to issues in the IRQ subsystem. However, the patch has been upstream since 2018 and we found no follow-up fixes for it (we checked for commits carrying a Fixes: tag referencing it).


Revision history for this message
lilideng (lilideng) wrote :

Last time I did verification against the gallery image, the default kernel version was 4.4.0-148-generic.
I tried the kernels below and found that linux-image-4.15.0-1042-azure is the first one on which we hit this bug; it is not specific to the ESM kernel.

# Kernel Version/Package Result
0 Default kernel 4.4.0-148-generic => Good
1 linux-image-4.15.0-1023-azure => Good
2 linux-image-4.15.0-1030-azure
3 linux-image-4.15.0-1031-azure
4 linux-image-4.15.0-1032-azure
5 linux-image-4.15.0-1035-azure
6 linux-image-4.15.0-1036-azure => Good
7 linux-image-4.15.0-1037-azure
8 linux-image-4.15.0-1039-azure
9 linux-image-4.15.0-1040-azure => Good
10 linux-image-4.15.0-1041-azure => Good
11 linux-image-4.15.0-1042-azure => Bad
12 linux-image-4.15.0-1045-azure => Bad
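
The good/bad sweep above is essentially a manual bisection over the ordered list of kernel builds. A minimal sketch of that process, assuming the endpoints are known good and bad: the is_bad predicate here merely encodes the results recorded in the table, standing in for actually booting each kernel and attempting a kdump.

```python
# Bisect an ordered list of kernel builds to find the first bad one.
# Versions are taken from the table above; is_bad() is a stand-in for
# a real "boot this kernel and try a kdump" test.

KERNELS = [
    "4.15.0-1023-azure",  # Good
    "4.15.0-1036-azure",  # Good
    "4.15.0-1040-azure",  # Good
    "4.15.0-1041-azure",  # Good
    "4.15.0-1042-azure",  # Bad (first bad, per the table)
    "4.15.0-1045-azure",  # Bad
]

def is_bad(kernel: str) -> bool:
    """Encodes the observed results (lexicographic compare works here
    because the build numbers have equal width)."""
    return kernel >= "4.15.0-1042-azure"

def first_bad(kernels):
    lo, hi = 0, len(kernels) - 1  # assumes kernels[0] good, kernels[-1] bad
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(kernels[mid]):
            hi = mid      # first bad is at mid or earlier
        else:
            lo = mid + 1  # first bad is after mid
    return kernels[lo]

print(first_bad(KERNELS))  # -> 4.15.0-1042-azure
```

With N candidate builds, bisection needs only about log2(N) boot-and-crash cycles instead of testing every version.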

Revision history for this message
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in linux-azure (Ubuntu):
status: New → Confirmed
Changed in linux-azure (Ubuntu):
status: Confirmed → In Progress
assignee: nobody → Guilherme G. Piccoli (gpiccoli)
description: updated
Changed in linux-azure (Ubuntu):
assignee: Guilherme G. Piccoli (gpiccoli) → nobody
assignee: nobody → Marcelo Cerri (mhcerri)
Changed in linux-azure (Ubuntu Bionic):
assignee: nobody → Marcelo Cerri (mhcerri)
status: New → In Progress
Changed in linux-azure (Ubuntu):
importance: Undecided → Medium
Changed in linux-azure (Ubuntu Bionic):
importance: Undecided → Medium
Changed in linux-azure (Ubuntu Bionic):
status: In Progress → Fix Committed
Changed in linux-azure-4.15 (Ubuntu):
status: New → Fix Committed
Changed in linux-azure-4.15 (Ubuntu Bionic):
status: New → Fix Committed
Revision history for this message
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-bionic' to 'verification-done-bionic'. If the problem still exists, change the tag 'verification-needed-bionic' to 'verification-failed-bionic'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-bionic
Marcelo Cerri (mhcerri)
tags: added: verification-done-bionic
removed: verification-needed-bionic
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux-azure-4.15 - 4.15.0-1123.136

---------------
linux-azure-4.15 (4.15.0-1123.136) bionic; urgency=medium

  * bionic/linux-azure-4.15: 4.15.0-1123.136 -proposed tracker (LP: #1939816)

  * VM enter into hung status after triggering a crash (LP: #1882623)
    - Drivers: hv: vmbus: Implement Direct Mode for stimer0

  [ Ubuntu: 4.15.0-156.163 ]

  * bionic/linux: 4.15.0-156.163 -proposed tracker (LP: #1940162)
  * linux (LP: #1940564)
    - SAUCE: Revert "scsi: core: Cap scsi_host cmd_per_lun at can_queue"
  * fails to launch linux L2 guests on AMD (LP: #1940134) // CVE-2021-3653
    - KVM: nSVM: avoid picking up unsupported bits from L2 in int_ctl
      (CVE-2021-3653)
  * fails to launch linux L2 guests on AMD (LP: #1940134)
    - SAUCE: Revert "UBUNTU: SAUCE: KVM: nSVM: avoid picking up unsupported bits
      from L2 in int_ctl"

  [ Ubuntu: 4.15.0-155.162 ]

  * bionic/linux: 4.15.0-155.162 -proposed tracker (LP: #1939833)
  * Packaging resync (LP: #1786013)
    - debian/dkms-versions -- update from kernel-versions (main/2021.08.16)
  * CVE-2021-3656
    - SAUCE: KVM: nSVM: always intercept VMLOAD/VMSAVE when nested
  * CVE-2021-3653
    - SAUCE: KVM: nSVM: avoid picking up unsupported bits from L2 in int_ctl
  * dev_forward_skb: do not scrub skb mark within the same name space
    (LP: #1935040)
    - dev_forward_skb: do not scrub skb mark within the same name space
  * 'ptrace trace' needed to readlink() /proc/*/ns/* files on older kernels
    (LP: #1890848)
    - apparmor: fix ptrace read check
  * Bionic update: upstream stable patchset 2021-08-03 (LP: #1938824)
    - ALSA: usb-audio: fix rate on Ozone Z90 USB headset
    - media: dvb-usb: fix wrong definition
    - Input: usbtouchscreen - fix control-request directions
    - net: can: ems_usb: fix use-after-free in ems_usb_disconnect()
    - usb: gadget: eem: fix echo command packet response issue
    - USB: cdc-acm: blacklist Heimann USB Appset device
    - ntfs: fix validity check for file name attribute
    - iov_iter_fault_in_readable() should do nothing in xarray case
    - Input: joydev - prevent use of not validated data in JSIOCSBTNMAP ioctl
    - ARM: dts: at91: sama5d4: fix pinctrl muxing
    - btrfs: send: fix invalid path for unlink operations after parent
      orphanization
    - btrfs: clear defrag status of a root if starting transaction fails
    - ext4: cleanup in-core orphan list if ext4_truncate() failed to get a
      transaction handle
    - ext4: fix kernel infoleak via ext4_extent_header
    - ext4: correct the cache_nr in tracepoint ext4_es_shrink_exit
    - ext4: remove check for zero nr_to_scan in ext4_es_scan()
    - ext4: fix avefreec in find_group_orlov
    - ext4: use ext4_grp_locked_error in mb_find_extent
    - can: gw: synchronize rcu operations before removing gw job entry
    - can: peak_pciefd: pucan_handle_status(): fix a potential starvation issue in
      TX path
    - SUNRPC: Fix the batch tasks count wraparound.
    - SUNRPC: Should wake up the privileged task firstly.
    - s390/cio: dont call css_wait_for_slow_path() inside a lock
    - rtc: stm32: Fix unbalanced clk_disable_unprepare() on probe error pa...

Changed in linux-azure-4.15 (Ubuntu Bionic):
status: Fix Committed → Fix Released
Changed in linux-azure-4.15 (Ubuntu):
status: Fix Committed → Fix Released
Changed in linux-azure (Ubuntu Bionic):
status: Fix Committed → Fix Released
Changed in linux-azure (Ubuntu):
status: In Progress → Fix Released
Revision history for this message
Mahesh Saptasagar (maheshsaptasagar) wrote :

Hi Team,

When kdumping on a 4.15.18 kernel in an Azure instance, we quite frequently observe a stall in the kdump kernel: it gets blocked, and soon we see a stack like the following:

INFO: rcu_sched detected stalls on CPUs/tasks:
1-...!: (0 ticks this GP) idle=488/0/0 softirq=1/1 fqs=0
(detected by 0, t=15002 jiffies, g=707, c=706, q=8457)
rcu_sched kthread starved for 15002 jiffies! g707 c706 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x402 ->cpu=1

We tried porting this change https://<email address hidden>/, which seems to address the cause of the above issue, but after applying it a regular Azure instance hangs during boot.

Please let me know if I am missing any patches to solve this issue.
