[Hyper-V] srcu: Lock srcu_data structure in srcu_gp_start()

Bug #1802021 reported by Joshua R. Poulson on 2018-11-07
60
This bug affects 8 people
Affects Status Importance Assigned to Milestone
linux (Ubuntu)
Undecided
Unassigned
Bionic
Undecided
Unassigned
Cosmic
Undecided
Unassigned
linux-azure (Ubuntu)
Medium
Marcelo Cerri
Bionic
Undecided
Marcelo Cerri
Cosmic
Undecided
Marcelo Cerri

Bug Description

We had a customer seeing traces like the following:

tack trace from kern.log:
2018-10-10T04:43:08.542464+00:00 hbp2ann-2 kernel: INFO: task kworker/u16:0:16678 blocked for more than 120 seconds.
2018-10-10T04:43:08.542503+00:00 hbp2ann-2 kernel: Not tainted 4.15.0-1023-azure #24~16.04.1-Ubuntu
2018-10-10T04:43:08.542513+00:00 hbp2ann-2 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
2018-10-10T04:43:08.547366+00:00 hbp2ann-2 kernel: kworker/u16:0 D 0 16678 2 0x80000000
2018-10-10T04:43:08.547386+00:00 hbp2ann-2 kernel: Workqueue: events_unbound fsnotify_mark_destroy_workfn
2018-10-10T04:43:08.547395+00:00 hbp2ann-2 kernel: Call Trace:
2018-10-10T04:43:08.547413+00:00 hbp2ann-2 kernel: __schedule+0x3d6/0x8b0
2018-10-10T04:43:08.547422+00:00 hbp2ann-2 kernel: ? check_preempt_wakeup+0xfb/0x240
2018-10-10T04:43:08.547431+00:00 hbp2ann-2 kernel: ? sched_clock_local+0x17/0x90
2018-10-10T04:43:08.547440+00:00 hbp2ann-2 kernel: schedule+0x36/0x80
2018-10-10T04:43:08.547448+00:00 hbp2ann-2 kernel: schedule_timeout+0x1db/0x370
2018-10-10T04:43:08.547458+00:00 hbp2ann-2 kernel: ? __enqueue_entity+0x5c/0x60
2018-10-10T04:43:08.547467+00:00 hbp2ann-2 kernel: ? enqueue_entity+0x112/0x670
2018-10-10T04:43:08.547477+00:00 hbp2ann-2 kernel: wait_for_completion+0xb4/0x140
2018-10-10T04:43:08.547486+00:00 hbp2ann-2 kernel: ? wake_up_q+0x70/0x70
2018-10-10T04:43:08.547510+00:00 hbp2ann-2 kernel: __synchronize_srcu.part.13+0x85/0xb0
2018-10-10T04:43:08.547535+00:00 hbp2ann-2 kernel: ? trace_raw_output_rcu_utilization+0x50/0x50
2018-10-10T04:43:08.547560+00:00 hbp2ann-2 kernel: synchronize_srcu+0xd3/0xe0
2018-10-10T04:43:08.547594+00:00 hbp2ann-2 kernel: ? synchronize_srcu+0xd3/0xe0
2018-10-10T04:43:08.547604+00:00 hbp2ann-2 kernel: fsnotify_mark_destroy_workfn+0x7c/0xe0
2018-10-10T04:43:08.547612+00:00 hbp2ann-2 kernel: process_one_work+0x14d/0x410
2018-10-10T04:43:08.547620+00:00 hbp2ann-2 kernel: worker_thread+0x4b/0x460
2018-10-10T04:43:08.547628+00:00 hbp2ann-2 kernel: kthread+0x105/0x140
2018-10-10T04:43:08.547637+00:00 hbp2ann-2 kernel: ? process_one_work+0x410/0x410
2018-10-10T04:43:08.547645+00:00 hbp2ann-2 kernel: ? kthread_destroy_worker+0x50/0x50
2018-10-10T04:43:08.547654+00:00 hbp2ann-2 kernel: ? do_syscall_64+0x73/0x130
2018-10-10T04:43:08.547677+00:00 hbp2ann-2 kernel: ? SyS_exit_group+0x14/0x20
2018-10-10T04:43:08.547685+00:00 hbp2ann-2 kernel: ret_from_fork+0x35/0x40

Error Code: INFO: task kworker/u16:0:16678 blocked for more than 120 seconds.

We are seeing more issue with fsnotify related callbacks. These are not a soft/hard lockup but seem to significantly degrade the responsiveness of systemd (and from there everything else).

The following upstream commit may fix this issue, but it is in Paul's RCU tree and not in linux-next or upstream yet:

https://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git/commit/?h=dev&id=1a05c0cd2fee234a10362cc8f66057557cbb291f

srcu: Lock srcu_data structure in srcu_gp_start()
The srcu_gp_start() function is called with the srcu_struct structure's
->lock held, but not with the srcu_data structure's ->lock. This is
problematic because this function accesses and updates the srcu_data
structure's ->srcu_cblist, which is protected by that lock. Failing to
hold this lock can result in corruption of the SRCU callback lists,
which in turn can result in arbitrarily bad results.

This commit therefore makes srcu_gp_start() acquire the srcu_data
structure's ->lock across the calls to rcu_segcblist_advance() and
rcu_segcblist_accelerate(), thus preventing this corruption.

Please investigate this issue and evaluate the proposed fix.

Joshua R. Poulson (jrp) on 2018-11-07
Changed in linux-azure (Ubuntu):
status: New → Confirmed
tags: added: kernel-da-key kernel-hyper-v
Changed in linux-azure (Ubuntu):
importance: Undecided → Medium
status: Confirmed → Triaged
Marcelo Cerri (mhcerri) wrote :

Hi, Josh. Do you have an specific workload that triggers that issue?

Joshua R. Poulson (jrp) wrote :

It's a heavy database workload from a online site, so it is difficult to make a simple repro for.

Changed in linux-azure (Ubuntu):
assignee: nobody → Joseph Salisbury (jsalisbury)
status: Triaged → In Progress
Joseph Salisbury (jsalisbury) wrote :

I built a test kernel with commit eb4c2382272ae7 from linux-next. This commit relies on commit d633198088bd9 for the definition of spin_lock_rcu_node and it's corresponding unlock. That commit was added to mainline in v4.16-rc1

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1802021

It sounds like this bug is difficult to reproduce, but it would be great if the affected customer was willing to test this kernel.

Note about installing test kernels:
• If the test kernel is prior to 4.15(Bionic) you need to install the linux-image and linux-image-extra .deb packages.
• If the test kernel is 4.15(Bionic) or newer, you need to install the linux-modules, linux-modules-extra and linux-image-unsigned .deb packages.

Thanks in advance!

Brad Figg (brad-figg) on 2018-12-10
tags: added: bjf
tags: removed: bjf
tags: added: bjf-tracking
Joshua R. Poulson (jrp) wrote :

Joe, is your kernel different than the one Gavin Guo built for us here? https://launchpad.net/~mimi0213kimo/+archive/ubuntu/sf00204509-rcu-backport

We have enabled extra debugging and given that kernel to our internal customer who attempting to repro. Since the repro takes a very long time it is difficult to decide when the fix is working or not.

Paul McKenney upstream has submitted a pull request for this patch (and others) to go into 4.21. Getting some "burn in" time upstream hasn't really started in earnest yet, but there is no negative discussion about the PR, and I am tempted to get this into the regular based on Paul's comment: "Lock srcu_data structure in srcu_gp_start(), fixing a an extremely rare but also extremely embarrassing concurrency bug, courtesy of Dennis Krein."

Gavin Guo (mimi0213kimo) wrote :

Hi Joshua,

I just reviewed the commit Joseph provided in the launchpad. It's the
same as the two patches I backported.

eb4c2382272a srcu: Lock srcu_data structure in srcu_gp_start()
d633198088bd srcu: Prohibit call_srcu() use under raw spinlocks

The commit id eb4c2382272a is the latest linux-next commit id. When I
backported the above two patches, the eb4c2382272a was in Paul's
rcu-next tree. So, for the SRU process, Joseph's backport are the formal one.

Joseph Salisbury (jsalisbury) wrote :

Hi Joshua,

I will touched base with Gavin to compare our trees. My test kernel is the current Azure kernel with two commits applied: d633198088bd9 and eb4c2382272ae7. Commit eb4c2382272ae7 being the patch from
Dennis Krein in linux-next:

eb4c2382272a ("srcu: Lock srcu_data structure in srcu_gp_start()")

Gavin and I both had the same set of commits. I can submit and SRU request for this if you don't want to wait for the testing, since it could take a long time. If I submit it this week, it won't land in the Azure kernel until the next SRU cycle in the new year. Just let us know what you think.

Here are the dates for the next cycle:

cycle: 14-Jan through 03-Feb
====================================================================
         11-Jan Last day for kernel commits for this cycle.
14-Jan - 18-Jan Kernel prep week.
21-Jan - 01-Feb Bug verification & Regression testing.
         31-Jan Release 18.04.2 kernels to -updates
         04-Feb Release remaining kernels to -updates.

Looking at these dates, we may want to SRU it this week, due to a Company shutdown between 24-Dec and 06-Jan.

Changed in linux-azure (Ubuntu):
assignee: Joseph Salisbury (jsalisbury) → Marcelo Cerri (mhcerri)
status: In Progress → Triaged
Marcelo Cerri (mhcerri) wrote :

Hi, Josh.

Did you have any feedback from the customer regarding the test kernel?

How do you want to proceed with that?

Joshua R. Poulson (jrp) wrote :

No new instances of the problem on the test cluster for many weeks. Let's move forward with this change.

overlord (lazamarius1) wrote :

Will this fix be available for Linux 4.15.0-generic x86_64, or is it available already?

I am currently on Linux 4.15.0-43-generic x86_64 and on some servers I have this issue, others are fine, I am not sure what triggers the problem but when it triggers kworker, dockerd, systemd, go in uninterruptible sleep and I need to reboot the servers to recover from the issue.

After a while the issue reappears, so I would like to patch the servers as fast as possible.
Thanks!

overlord (lazamarius1) wrote :

Using Ubuntu 16.04.5 LTS btw.

Joshua R. Poulson (jrp) wrote :

The fix was picked up for upstream stable 4.19.15 and 4.20.2. I would expect the generic kernels to eventually pick up this fix.

Marcelo Cerri (mhcerri) wrote :

Hi, Josh. Should we apply the fixes for the 4.15 and 4.18 linux-azure kernel then?

Joshua R. Poulson (jrp) wrote :

Yes, please, per comment #8

Marcelo Cerri (mhcerri) on 2019-01-31
Changed in linux-azure (Ubuntu Bionic):
assignee: nobody → Marcelo Cerri (mhcerri)
Changed in linux-azure (Ubuntu Cosmic):
assignee: nobody → Marcelo Cerri (mhcerri)
Changed in linux-azure (Ubuntu Bionic):
status: New → Confirmed
Changed in linux-azure (Ubuntu Cosmic):
status: New → Confirmed
Changed in linux-azure (Ubuntu):
status: Triaged → Confirmed
overlord (lazamarius1) on 2019-02-01
Changed in linux-hwe (Ubuntu):
status: New → Confirmed
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

overlord (lazamarius1) on 2019-02-01
affects: linux-hwe (Ubuntu) → linux-meta-hwe (Ubuntu)
Changed in linux-meta-hwe (Ubuntu Bionic):
status: New → Confirmed
Changed in linux-meta-hwe (Ubuntu Cosmic):
status: New → Confirmed
Marcelo Cerri (mhcerri) on 2019-02-01
no longer affects: linux-meta-hwe (Ubuntu)
no longer affects: linux-meta-hwe (Ubuntu Bionic)
no longer affects: linux-meta-hwe (Ubuntu Cosmic)
Marcelo Cerri (mhcerri) on 2019-02-01
no longer affects: linux-azure (Ubuntu Bionic)
no longer affects: linux-azure (Ubuntu Cosmic)
Changed in linux-azure (Ubuntu Bionic):
status: New → In Progress
Changed in linux-azure (Ubuntu Cosmic):
status: New → In Progress
Changed in linux-azure (Ubuntu Bionic):
assignee: nobody → Marcelo Cerri (mhcerri)
Changed in linux-azure (Ubuntu Cosmic):
assignee: nobody → Marcelo Cerri (mhcerri)
Changed in linux-azure (Ubuntu Bionic):
importance: Undecided → Medium
Changed in linux-azure (Ubuntu Cosmic):
importance: Undecided → Medium

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1802021

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Changed in linux (Ubuntu Bionic):
status: New → Incomplete
Changed in linux (Ubuntu Cosmic):
status: New → Incomplete
tags: added: bionic
overlord (lazamarius1) wrote :

Hey guys, I am not sure how to do this but can you also make a patch for Xenial, 4.15.0-generic, for linux-hwe package? We have the same problem and we can't upgrade the boxes to a newer release so this would really help us.

affects: linux (Ubuntu) → linux-hwe (Ubuntu)
Changed in linux-hwe (Ubuntu):
status: Incomplete → Confirmed
overlord (lazamarius1) wrote :

Can't run the logs collection tool since the system is stuck in D state and times out. (Also the tool is not installed on the systems). We just need this fix back-ported on 4.15.0-generic for Xenial on linux-hwe package (we don't use linux-azure)

Changed in linux-hwe (Ubuntu Bionic):
status: Incomplete → Confirmed
Changed in linux-hwe (Ubuntu Cosmic):
status: Incomplete → Confirmed
overlord (lazamarius1) on 2019-02-04
information type: Public → Public Security
Changed in linux-hwe (Ubuntu):
assignee: nobody → overlord (lazamarius1)
assignee: overlord (lazamarius1) → nobody
overlord (lazamarius1) on 2019-02-04
information type: Public Security → Public
Marcelo Cerri (mhcerri) wrote :

Hi, @overlord.

I changed it to "linux" instead because xenial/linux-hwe is simply a backport of bionic/linux, so we need to apply the fix to bionic/linux first and that will be include to xenial/linux-hwe automatically.

Marcelo @mhcerri,

Would you be able to provide a test kernel for bionic/linux-hwe so that @lazamarius1 can provide test results for -generic?

I'll be happy to do that as well if you're short on time right now.
(I guess the patchset is the same you posted for linux-azure.)

Thanks,
Mauricio

overlord (lazamarius1) wrote :

Thanks Marcelo! However I still see the unassigned package is linux-hwe, and I can't add Xenial tag.
I also notices I received an update today from 4.15.0-43-generic to 4.15.0-45-generic but I cannot tell if this update has the fix from this bug.
Could you help me with some directions on this? Where could I check ?
Thanks!

overlord (lazamarius1) wrote :

Mauricio @mfo, is there a way to reproduce the issue easily in order to test it? I was not able to find it.
The only way I can tell the issue is there or not is to apply the patch and wait for the servers to "hit" the problem (could take days or weeks...), and when that happens, in my case, docker tasks will end up in D state and load average will go to 100 very fast, then also a certain kworker will hit the D state and possibly in the end the init/systemd will go also in D state, and the only recovery action is to restart the box.

David Coronel (davecore) wrote :

Hi overlord. An easy way to check the updates included in released kernels is to look at the "-changes" mailing list for your Ubuntu release.

In this situation it would be https://lists.ubuntu.com/archives/xenial-changes/

And you can find that new kernel here:

https://lists.ubuntu.com/archives/xenial-changes/2019-February/023480.html

You can see this kernel 4.15.0-45 does not include the fix for this LP bug #1802021.

David Coronel (davecore) wrote :

@overlord: AFAIK, there is no simple reproducer test case for this issue. The ideal testing scenario for a bug and fix like this one would be for each user who reported this issue to use a test kernel with the fix in their environment and report back if the issue still manifests or not after some time has passed with that test kernel in your affected environment.

overlord (lazamarius1) wrote :

Thanks David, that was my intention to patch the systems and wait for it to reproduce (usually we get the issue back in 3-8 days or so...)
Also thanks for the pointers for checking the release notes for certain releases.
Is there a connection between the changes listed in the second link and the code base for those changes? I mean to say maybe the LP bug does not appear in the changes because it was not tagged with a correct tag for it to show up but maybe the code fix is already there ? Does this make sense? I know the code fix it's already available starting with 4.19.16 but I don't know how the back-porting is handled.

David Coronel (davecore) wrote :

You can clone the ubuntu-xenial kernel:

git clone git://kernel.ubuntu.com/ubuntu/ubuntu-xenial.git

And then grep for the commit you're looking for. There's a few different ways to do it, I do:

git log --oneline | grep "Expose SMT control init function"

Marcelo Cerri (mhcerri) on 2019-02-04
no longer affects: linux-hwe (Ubuntu)
no longer affects: linux-hwe (Ubuntu Bionic)
no longer affects: linux-hwe (Ubuntu Cosmic)
no longer affects: linux-azure (Ubuntu Bionic)
no longer affects: linux-azure (Ubuntu Cosmic)
Changed in linux-azure (Ubuntu Bionic):
status: New → Fix Committed
Changed in linux-azure (Ubuntu Cosmic):
status: New → Fix Committed
Changed in linux-azure (Ubuntu Bionic):
assignee: nobody → Marcelo Cerri (mhcerri)
Changed in linux-azure (Ubuntu Cosmic):
assignee: nobody → Marcelo Cerri (mhcerri)

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1802021

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Changed in linux (Ubuntu Bionic):
status: New → Incomplete
Changed in linux (Ubuntu Cosmic):
status: New → Incomplete
overlord (lazamarius1) on 2019-02-07
Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Changed in linux (Ubuntu Bionic):
status: Incomplete → Confirmed
Changed in linux (Ubuntu Cosmic):
status: Incomplete → Confirmed

Hi Marcelo (@mhcerri),

We have another user who confirmed the 2 patches submitted for linux-azure also fix the problem on linux(-generic).

  srcu: Prohibit call_srcu() use under raw spinlocks
  srcu: Lock srcu_data structure in srcu_gp_start()

Could they be submitted for linux as well?

Thank you very much,
Mauricio

Brad Figg (brad-figg) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-bionic' to 'verification-done-bionic'. If the problem still exists, change the tag 'verification-needed-bionic' to 'verification-failed-bionic'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-bionic
overlord (lazamarius1) wrote :

Hi @brad-figg,

Can we get a proposed fix for Xenial linux(-generic) package?
Or that is planned after the bionic one?

Hi @lazamarius1,

The fix for linux generic should be applied in the next kernel SRU cycle.
The current cycle ends on late February [1].

[1] https://kernel.ubuntu.com/

David Coronel (davecore) wrote :

@lazamarius1: Just to clarify, the fix is scheduled to go in the 4.15 kernel in Bionic which is the same kernel as the Xenial HWE kernel. So there's no need to add anything to the Affects section. You will see a new linux-hwe 4.15 kernel in xenial-proposed once this is ready to test. Thanks!

To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Duplicates of this bug

Other bug subscribers