Writing to GPT partition on NVM device triggers "Kernel BUG"

Bug #1700225 reported by Jukka Laurila
This bug affects 1 person
Affects: linux (Ubuntu)
Status: Confirmed
Importance: Medium
Assigned to: Unassigned

Bug Description

When I try writing a large amount of data to a GPT partition on my NVM SSD using dd, after a second or so I get "kernel BUG" messages in dmesg, and the writing comes to a halt. The machine remains responsive and I can write to other disks, but if I run "sync", it hangs forever.

The bug is 100% reproducible. To reproduce, run:

  dd if=/dev/zero of=/dev/nvme0n1p1

This happens only after roughly 0.5-1.5 GiB has been written to a GPT partition on the NVM device. The bug does not happen in the following cases:

- writing just 500 MB on a GPT partition
- writing to the raw device (/dev/nvme0n1)
- writing to a DOS partition (i.e. when the NVM device is using a DOS style partition table)
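
To narrow down the size threshold described above, the write can be capped with bs and count instead of running until the partition is full. This is a minimal sketch, assuming GNU dd; the function name write_zeros and the example sizes are illustrative and not part of the original report. It overwrites the target, so it should only be pointed at a partition whose contents are disposable.

```shell
#!/bin/sh
# write_zeros TARGET MIB
# Writes MIB mebibytes of zeros to TARGET in 1 MiB blocks (GNU dd syntax).
# WARNING: this destroys whatever is stored on TARGET.
write_zeros() {
    target=$1
    mib=$2
    dd if=/dev/zero of="$target" bs=1M count="$mib"
}

# Example invocations (hypothetical sizes, run as root):
#   write_zeros /dev/nvme0n1p1 476    # just under 500 MB, reported as safe
#   write_zeros /dev/nvme0n1p1 2048   # 2 GiB, above the reported threshold
```

Stepping the second argument between these two values on an affected kernel should show where the failure starts, and dmesg can be checked after each run.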

The device in question is a 1 TB Samsung 960 EVO M.2 SSD.
Ubuntu release: 17.04

I have dmesg logs available of four instances of this happening, with very similar stack traces. I will be happy to experiment and provide any additional information you may need.
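
For anyone checking their own machine for the same symptom, the relevant lines can be pulled out of the kernel log with grep. A minimal sketch; the helper name find_kernel_bugs is hypothetical, and it only relies on the "kernel BUG" text quoted above.

```shell
#!/bin/sh
# find_kernel_bugs: read kernel log text on stdin and print, with line
# numbers, every line containing "kernel BUG" (case-insensitive).
# Exits non-zero when nothing matches, like grep itself.
find_kernel_bugs() {
    grep -n -i 'kernel BUG'
}

# Example (reading the kernel log may require root):
#   dmesg | find_kernel_bugs
```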
---
ApportVersion: 2.20.4-0ubuntu4.1
Architecture: amd64
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC1: jl 1792 F.... pulseaudio
 /dev/snd/controlC0: jl 1792 F.... pulseaudio
CurrentDesktop: Unity:Unity7
DistroRelease: Ubuntu 17.04
IwConfig:
 lo no wireless extensions.

 enp0s31f6 no wireless extensions.
MachineType: System manufacturer System Product Name
NonfreeKernelModules: nvidia_uvm nvidia_drm nvidia_modeset nvidia
Package: linux (not installed)
ProcFB:

ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.10.0-24-generic root=UUID=1fb7b4e5-d026-4b10-a87b-c7726fe0b1a3 ro
ProcVersionSignature: Ubuntu 4.10.0-24.28-generic 4.10.15
RelatedPackageVersions:
 linux-restricted-modules-4.10.0-24-generic N/A
 linux-backports-modules-4.10.0-24-generic N/A
 linux-firmware 1.164.1
RfKill:

Tags: zesty
Uname: Linux 4.10.0-24-generic x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups: adm cdrom dip lpadmin plugdev sambashare sudo
_MarkForUpload: True
dmi.bios.date: 09/19/2016
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: 2202
dmi.board.asset.tag: Default string
dmi.board.name: Z170-A
dmi.board.vendor: ASUSTeK COMPUTER INC.
dmi.board.version: Rev 1.xx
dmi.chassis.asset.tag: Default string
dmi.chassis.type: 3
dmi.chassis.vendor: Default string
dmi.chassis.version: Default string
dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvr2202:bd09/19/2016:svnSystemmanufacturer:pnSystemProductName:pvrSystemVersion:rvnASUSTeKCOMPUTERINC.:rnZ170-A:rvrRev1.xx:cvnDefaultstring:ct3:cvrDefaultstring:
dmi.product.name: System Product Name
dmi.product.version: System Version
dmi.sys.vendor: System manufacturer

Ubuntu Foundations Team Bug Bot (crichton) wrote :

Thank you for taking the time to report this bug and helping to make Ubuntu better. It seems that your bug report is not filed about a specific source package though, rather it is just filed against Ubuntu in general. It is important that bug reports be filed about source packages so that people interested in the package can find the bugs about it. You can find some hints about determining what package your bug might be about at https://wiki.ubuntu.com/Bugs/FindRightPackage. You might also ask for help in the #ubuntu-bugs irc channel on Freenode.

To change the source package that this bug is filed about visit https://bugs.launchpad.net/ubuntu/+bug/1700225/+editstatus and add the package name in the text box next to the word Package.

[This is an automated message. I apologize if it reached you inappropriately; please just reply to this message indicating so.]

tags: added: bot-comment
Jukka Laurila (jlaurila)
affects: ubuntu → linux-hwe-edge (Ubuntu)
affects: linux-hwe-edge (Ubuntu) → linux (Ubuntu)
Joseph Salisbury (jsalisbury) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1700225

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
tags: added: zesty
Jukka Laurila (jlaurila) wrote : apport information

Attached files: AlsaInfo.txt, CRDA.txt, CurrentDmesg.txt, JournalErrors.txt, Lspci.txt, Lsusb.txt, ProcCpuinfo.txt, ProcCpuinfoMinimal.txt, ProcEnviron.txt, ProcInterrupts.txt, ProcModules.txt, PulseList.txt, UdevDb.txt, WifiSyslog.txt

tags: added: apport-collected
description: updated

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Jukka Laurila (jlaurila) wrote :

Ran apport. Note that although the kernel was marked as tainted during that apport run because of the proprietary NVIDIA driver, the bug also reproduces on an untainted kernel without that driver loaded. The dmesg logs I posted earlier are from that untainted state.

Joseph Salisbury (jsalisbury) wrote :

Did this issue start happening after an update/upgrade? Was there a prior kernel version where you were not having this particular problem?

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v4.12 kernel[0].

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.12

Changed in linux (Ubuntu):
importance: Undecided → Medium
status: Confirmed → Incomplete
tags: added: kernel-da-key
Jukka Laurila (jlaurila) wrote :

The issue did not start happening after an update - I just got this NVM device, and kernel 4.10.0-24.28-generic was the latest one at the time.

I tried the mainline kernel version
linux-image-4.12.0-041200-generic_4.12.0-041200.201707022031_amd64.deb
and the bug did _not_ reproduce on that. Tagging as kernel-fixed-upstream.

tags: added: kernel-fixed-upstream
Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Joseph Salisbury (jsalisbury) wrote :

Can you also give the latest upstream 4.10 kernel a test? It is available from:
http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.10.17/

Jukka Laurila (jlaurila) wrote : Re: [Bug 1700225] Re: Writing to GPT partition on NVM device triggers "Kernel BUG"

Yes, but it will take a few weeks since I am vacationing abroad. I'll try
it when I get back and let you know how it went.

Jukka Laurila (jlaurila) wrote :

Sorry this took so long, but I can confirm that the bug does not reproduce on the 4.10.17-041017-generic kernel that you linked to, or on 4.13.0-37-generic from 17.10. Interestingly, kernel 4.10.17 is 3-4x slower at writing to the NVM device than 4.13.0-37: with the former I get 260-290 MB/s, and with the latter about 1 GB/s.
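
The 3-4x figure can be checked from the quoted throughput numbers. A small sketch, assuming 1 GB/s is taken as 1000 MB/s; the helper name speedup is illustrative.

```shell
#!/bin/sh
# speedup FAST SLOW: print FAST/SLOW rounded to one decimal place.
speedup() {
    awk -v fast="$1" -v slow="$2" 'BEGIN { printf "%.1f\n", fast / slow }'
}

# Throughput figures from the comment above, in MB/s:
speedup 1000 290   # prints 3.4
speedup 1000 260   # prints 3.8
```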

In any case, the bug is fixed in 4.10.17.
