With kernel 4.13 btrfs scans for devices before all devices have been discovered

Bug #1752961 reported by Carl Reinke on 2018-03-02
This bug affects 1 person
Affects: linux (Ubuntu), status tracked in Cosmic

Series   Importance   Assigned to
Artful   Medium       Joseph Salisbury
Bionic   Medium       Joseph Salisbury
Cosmic   Medium       Joseph Salisbury

Bug Description

See attached dmesg outputs for booting kernels 4.11.x (working) and 4.13.x (not working).

dmesg-4.11.0-14-good.txt shows the dmesg output when booting kernel 4.11.x.
btrfs scans for devices after all 4 (sda, sdb, sdc, sdd) of the devices have been discovered by the kernel. The btrfs RAID1 filesystem mounts, and everything is good.

dmesg-4.13.0-36-fail.txt shows the dmesg output when booting kernel 4.13.x.
btrfs scans for devices after only 2 (sda, sdb) of the devices have been discovered by the kernel. The btrfs RAID1 filesystem fails to mount ("failed to read the system array: -5"). The remaining 2 devices (sdc, sdd) are discovered by the kernel immediately afterward. At the end of the log, I run `btrfs device scan` and mount the filesystem manually.
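
The ordering problem described above can be checked mechanically in a saved dmesg log. The following is a minimal sketch, assuming the usual "[timestamp] message" dmesg layout; the sample lines, timestamps, and fsid are made up for illustration and are not copied from the attached logs, and the function name is hypothetical.

```python
import re

# Illustrative sample only: NOT taken from the attached dmesg files.
SAMPLE_DMESG = """\
[    2.101000] sd 0:0:0:0: [sda] Attached SCSI disk
[    2.102000] sd 1:0:0:0: [sdb] Attached SCSI disk
[    2.350000] BTRFS: device fsid 00000000-0000-0000-0000-000000000000 devid 1 transid 100 /dev/sda1
[    2.360000] BTRFS error (device sda1): failed to read the system array: -5
[    2.900000] sd 2:0:0:0: [sdc] Attached SCSI disk
[    2.901000] sd 3:0:0:0: [sdd] Attached SCSI disk
"""

LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(.*)$")
DISK_RE = re.compile(r"\[(sd[a-z]+)\] Attached SCSI disk")


def disks_attached_after_first_btrfs_message(dmesg_text):
    """Return disks whose 'Attached SCSI disk' line follows the first BTRFS line."""
    first_btrfs = None
    attached = []  # (timestamp, disk name) pairs in log order
    for raw in dmesg_text.splitlines():
        match = LINE_RE.match(raw)
        if not match:
            continue  # skip lines without a timestamp prefix
        stamp, message = float(match.group(1)), match.group(2)
        if first_btrfs is None and message.startswith("BTRFS"):
            first_btrfs = stamp
        disk = DISK_RE.search(message)
        if disk:
            attached.append((stamp, disk.group(1)))
    if first_btrfs is None:
        return []  # no btrfs activity in this log
    return [name for stamp, name in attached if stamp > first_btrfs]


late = disks_attached_after_first_btrfs_message(SAMPLE_DMESG)
print(late)
```

An empty result would correspond to the 4.11.x behaviour (all disks present before btrfs scans); a non-empty result corresponds to the 4.13.x failure.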

Hardware:
  HP ProLiant MicroServer Gen8
  4x WDC WD20EFRX
---
AlsaDevices: Error: command ['ls', '-l', '/dev/snd/'] failed with exit code 2: ls: cannot access '/dev/snd/': No such file or directory
AplayDevices: aplay: device_list:270: no soundcards found...
ApportVersion: 2.20.9-0ubuntu7
Architecture: amd64
ArecordDevices: arecord: device_list:270: no soundcards found...
DistroRelease: Ubuntu 18.04
InstallationDate: Installed on 2015-10-15 (933 days ago)
InstallationMedia: Xubuntu 15.04 "Vivid Vervet" - Release amd64 (20150422.1)
MachineType: HP ProLiant MicroServer Gen8
Package: linux (not installed)
ProcEnviron:
 TERM=linux
 PATH=(custom, no user)
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcKernelCmdLine: BOOT_IMAGE=/@/boot/vmlinuz-4.15.0-20-generic root=UUID=d976ab07-8377-46dd-ac6c-f5f7312a8305 ro rootflags=subvol=@ rootdelay=10
ProcVersionSignature: Ubuntu 4.15.0-20.21-generic 4.15.17
PulseList: Error: command ['pacmd', 'list'] failed with exit code 1: No PulseAudio daemon running, or not running as session daemon.
Tags: bionic apport-hook-error
Uname: Linux 4.15.0-20-generic x86_64
UnreportableReason: This report is about a package that is not installed.
UpgradeStatus: Upgraded to bionic on 2018-05-05 (0 days ago)
UserGroups:

_MarkForUpload: True
dmi.bios.date: 07/16/2015
dmi.bios.vendor: HP
dmi.bios.version: J06
dmi.chassis.type: 7
dmi.chassis.vendor: HP
dmi.modalias: dmi:bvnHP:bvrJ06:bd07/16/2015:svnHP:pnProLiantMicroServerGen8:pvr:cvnHP:ct7:cvr:
dmi.product.family: ProLiant
dmi.product.name: ProLiant MicroServer Gen8
dmi.sys.vendor: HP

Carl Reinke (carlreinke) wrote :

This problem is still present as of 4.15.0-20.

Carl Reinke (carlreinke) on 2018-05-05
affects: linux-hwe-edge (Ubuntu) → linux (Ubuntu)

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1752961

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
tags: added: artful

apport information

tags: added: apport-collected apport-hook-error bionic
description: updated

Carl Reinke (carlreinke) on 2018-05-06
Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Joseph Salisbury (jsalisbury) wrote :

Did this issue start happening after an update/upgrade? Was there a prior kernel version where you were not having this particular problem?

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v4.17 kernel[0].

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.17-rc4

Changed in linux (Ubuntu):
importance: Undecided → Medium
status: Confirmed → Incomplete
tags: added: kernel-da-key
Carl Reinke (carlreinke) wrote :

This issue started with the version 4.13 series. The last version that did not have the issue is 4.11.0-14.

Mainline kernel version 4.17-rc4 also exhibits the issue.

tags: added: kernel-bug-exists-upstream
Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Joseph Salisbury (jsalisbury) wrote :

I'd like to perform a bisect to figure out what commit caused this regression. We need to identify the earliest kernel where the issue started happening as well as the last kernel that did not have this issue.

Can you test the following kernels and report back? We are looking for the first kernel version that exhibits this bug:

v4.12 Final: http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.12/
v4.13-rc1: http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.13-rc1/
v4.13-rc4: http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.13-rc4/
v4.13 Final: http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.13/

You don't have to test every kernel, just up until the kernel that first has this bug.
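
The "test until the first kernel that has the bug" procedure can be sketched as a simple oldest-first scan. This is only an illustration: the function name is hypothetical, and the two recorded results are the ones reported in this bug (v4.12 works, v4.13-rc1 does not).

```python
def bracket_regression(releases, has_bug):
    """Test releases oldest-first and stop at the first one with the bug.

    Returns (last_good, first_bad); either element may be None.
    """
    last_good = None
    for release in releases:
        if has_bug(release):
            return last_good, release
        last_good = release
    return last_good, None


releases = ["v4.12", "v4.13-rc1", "v4.13-rc4", "v4.13"]

# Results reported in this bug; later kernels never need testing.
observed = {"v4.12": False, "v4.13-rc1": True}
tested = []


def has_bug(release):
    tested.append(release)  # record which kernels were actually booted
    return observed[release]


bracket = bracket_regression(releases, has_bug)
print(bracket, tested)
```

The returned pair gives the good and bad endpoints for a commit-level bisect.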

Thanks in advance!

tags: added: performing-bisect
Carl Reinke (carlreinke) wrote :

v4.12 Final works. v4.13-rc1 does not.

Carl Reinke (carlreinke) wrote :

FWIW, v4.12.14 also works.

Changed in linux (Ubuntu Cosmic):
status: Confirmed → In Progress
Changed in linux (Ubuntu Bionic):
status: New → In Progress
Changed in linux (Ubuntu Artful):
status: New → In Progress
Changed in linux (Ubuntu Bionic):
importance: Undecided → Medium
Changed in linux (Ubuntu Artful):
importance: Undecided → Medium
assignee: nobody → Joseph Salisbury (jsalisbury)
Changed in linux (Ubuntu Bionic):
assignee: nobody → Joseph Salisbury (jsalisbury)
Changed in linux (Ubuntu Cosmic):
assignee: nobody → Joseph Salisbury (jsalisbury)
Joseph Salisbury (jsalisbury) wrote :

I started a kernel bisect between v4.12 final and v4.13-rc1. The kernel bisect will require testing of about 7-10 test kernels.
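
As a rough illustration of why a bisect needs only a handful of test kernels, here is a minimal binary-search sketch. It assumes a linear history in which the bug never disappears once introduced; real kernel history contains merges, so the actual number of steps can differ. The history size and the position of the regression are arbitrary placeholders, not counts taken from the kernel tree.

```python
import math


def find_first_bad(commits, has_bug):
    """Binary search for the first commit where has_bug() is true.

    Assumes commits are in history order, commits[0] is known good,
    and commits[-1] is known bad.
    Returns (first_bad_commit, number_of_tests).
    """
    good, bad = 0, len(commits) - 1
    tests = 0
    while bad - good > 1:
        mid = (good + bad) // 2
        tests += 1  # one test kernel built and booted
        if has_bug(commits[mid]):
            bad = mid
        else:
            good = mid
    return commits[bad], tests


# Hypothetical history of 10,000 commits with the regression at index 7,300.
history = list(range(10_000))
culprit, tests = find_first_bad(history, lambda commit: commit >= 7_300)
upper_bound = math.ceil(math.log2(len(history)))
print(culprit, tests, upper_bound)
```

Each test halves the remaining range, so the number of test kernels grows with the base-2 logarithm of the number of commits between the good and bad endpoints.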

I built the first test kernel, up to the following commit:
e5f76a2e0e84ca2a215ecbf6feae88780d055c56

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1752961

Can you test that kernel and report back if it has the bug or not? I will build the next test kernel based on your test results.

Thanks in advance

Carl Reinke (carlreinke) wrote :

e5f76a2e0e84ca2a215ecbf6feae88780d055c56 (4.12.0-041200.201805301435) does not work.

Carl Reinke (carlreinke) wrote :

I guess it's less ambiguous to say that e5f76a2 has the bug.

Joseph Salisbury (jsalisbury) wrote :

I built the next test kernel, up to the following commit:
1849f800fba32cd5a0b647f824f11426b85310d8

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1752961

Can you test that kernel and report back if it has the bug or not? I will build the next test kernel based on your test results.

Thanks in advance

Carl Reinke (carlreinke) wrote :

1849f80 has the bug.

Joseph Salisbury (jsalisbury) wrote :

Sorry for the delay.

I built the next test kernel, up to the following commit:
cbcd4f08aa637b74f575268770da86a00fabde6d

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1752961

Can you test that kernel and report back if it has the bug or not? I will build the next test kernel based on your test results.

Thanks in advance

Carl Reinke (carlreinke) wrote :

cbcd4f0 has the bug.

Joseph Salisbury (jsalisbury) wrote :

I built the next test kernel, up to the following commit:
1b044f1cfc65a7d90b209dfabd57e16d98b58c5b

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1752961

Can you test that kernel and report back if it has the bug or not? I will build the next test kernel based on your test results.

Thanks in advance

Carl Reinke (carlreinke) wrote :

1b044f1 does NOT have the bug.

Joseph Salisbury (jsalisbury) wrote :

I built the next test kernel, up to the following commit:
cbf4b3867875206aa548a8c6d7c886f3299d619e

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1752961

Can you test that kernel and report back if it has the bug or not? I will build the next test kernel based on your test results.

Thanks in advance

Carl Reinke (carlreinke) wrote :

cbf4b38 does NOT have the bug.

Joseph Salisbury (jsalisbury) wrote :

I built the next test kernel, up to the following commit:
4422d80ed7d4bdb2d6e9fb890c66c3d9250ba694

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1752961

Can you test that kernel and report back if it has the bug or not? I will build the next test kernel based on your test results.

Thanks in advance

Carl Reinke (carlreinke) wrote :

4422d80 does NOT have the bug.

Joseph Salisbury (jsalisbury) wrote :

I built the next test kernel, up to the following commit:
24040a58379e2f2fa6aa9466911b758073b6bdfa

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1752961

Can you test that kernel and report back if it has the bug or not? I will build the next test kernel based on your test results.

Thanks in advance

Carl Reinke (carlreinke) wrote :

24040a5 does NOT have the bug.

Joseph Salisbury (jsalisbury) wrote :

I built the next test kernel, up to the following commit:
44c891af576997763d1d4c790d50d10db9eff00f

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1752961

Can you test that kernel and report back if it has the bug or not? I will build the next test kernel based on your test results.

Thanks in advance

Carl Reinke (carlreinke) wrote :

44c891a does NOT have the bug.

Joseph Salisbury (jsalisbury) wrote :

I built the next test kernel, up to the following commit:
c94dc34f771a25b8c3e0955147fdc4f5e3d79908

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1752961

Can you test that kernel and report back if it has the bug or not? I will build the next test kernel based on your test results.

Thanks in advance

Carl Reinke (carlreinke) wrote :

c94dc34 does NOT have the bug.

Joseph Salisbury (jsalisbury) wrote :

I built the next test kernel, up to the following commit:
0cf9f5096da2200b52cee0e38139c99c4fc0151c

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1752961

Can you test that kernel and report back if it has the bug or not? I will build the next test kernel based on your test results.

Thanks in advance

Carl Reinke (carlreinke) wrote :

0cf9f50 does NOT have the bug.

Joseph Salisbury (jsalisbury) wrote :

I built the next test kernel, up to the following commit:
409acdd0412e9343095d965a9228f6e6a83a416f

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1752961

Can you test that kernel and report back if it has the bug or not? I will build the next test kernel based on your test results.

Thanks in advance

Carl Reinke (carlreinke) wrote :

409acdd does NOT have the bug.

Joseph Salisbury (jsalisbury) wrote :

I built the next test kernel, up to the following commit:
878c33a78811f90795f17333bc3a7c819a1589a7

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1752961

Can you test that kernel and report back if it has the bug or not? I will build the next test kernel based on your test results.

Thanks in advance

Carl Reinke (carlreinke) wrote :

878c33a does NOT have the bug.

This bug was nominated against a series that is no longer supported, i.e. artful. The bug task representing the artful nomination is being closed as Won't Fix.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu Artful):
status: In Progress → Won't Fix
Carl Reinke (carlreinke) wrote :

Any new build to test?

This issue is still present in 4.15.0-30.

Joseph Salisbury (jsalisbury) wrote :

Sorry for the delay. I missed the email regarding your test results of the last kernel.

I built the next test kernel, up to the following commit:

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1752961

Can you test that kernel and report back if it has the bug or not? I will build the next test kernel based on your test results.

Thanks in advance

Joseph Salisbury (jsalisbury) wrote :

The latest test kernel is for commit 362f6729cbb1d6bbab59e069f19441b0622ff7ec

Carl Reinke (carlreinke) wrote :

362f672 has the bug.

Joseph Salisbury (jsalisbury) wrote :

I built the next test kernel, up to the following commit:
6836796de4019944f4ba4c99a360e8250fd2e735

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1752961

Can you test that kernel and report back if it has the bug or not? I will build the next test kernel based on your test results.

Thanks in advance

Carl Reinke (carlreinke) wrote :

6836796 does NOT have the bug.

Joseph Salisbury (jsalisbury) wrote :

Commit 362f6729cbb1d6bbab59e069f19441b0622ff7ec was reported as the first bad commit. However, this commit is a merge tag, so we'll need to bisect into that merge.

Before we start that process, this bug has been open a while and 4.18 final is now available. Could you test that kernel to see if this bug was already fixed upstream?

http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.18

Carl Reinke (carlreinke) wrote :

It looks like 4.18 does not have the issue.

Joseph Salisbury (jsalisbury) wrote :

That's good news. We now have to identify the fix in 4.18 and ensure it is applied to Ubuntu. Can you test the following two upstream stable kernels to see if the fix was already sent to stable:

4.15.18: http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.15.18/
4.17.17: http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.17.17/

Carl Reinke (carlreinke) wrote :

Both 4.15.18 and 4.17.17 have the bug.

Joseph Salisbury (jsalisbury) wrote :

Thanks for testing. We need to narrow down the specific version of 4.18 that has the fix. Can you test the following kernels:

v4.18-rc1: http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.18-rc1/
v4.18-rc4: http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.18-rc4/

Carl Reinke (carlreinke) wrote :

Both 4.18-rc1 and 4.18-rc4 do NOT have the bug.

Joseph Salisbury (jsalisbury) wrote :

I reviewed the btrfs commits in v4.18-rc1, but there are quite a few of them. We should perform a "Reverse" bisect to identify the correct one that fixes this bug.

I started a "Reverse" bisect between v4.17 final and v4.18-rc1.
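
A reverse bisect is the same search with the question inverted: instead of looking for the first commit that has the bug, it looks for the first commit that no longer has it. A minimal sketch, assuming a linear history that starts broken and ends fixed; the commit names and the position of the fix are hypothetical.

```python
def find_first_match(commits, predicate):
    """Binary search for the first commit where predicate() is true.

    Assumes predicate is false at commits[0], true at commits[-1],
    and flips exactly once in between.
    """
    lo, hi = 0, len(commits) - 1
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if predicate(commits[mid]):
            hi = mid
        else:
            lo = mid
    return commits[hi]


# Hypothetical history between a broken release and a fixed one.
history = [f"commit-{i:03d}" for i in range(200)]
FIX_INDEX = 137  # arbitrary position of the fixing commit


def has_bug(commit):
    return history.index(commit) < FIX_INDEX


# Reverse bisect: search for "does NOT have the bug" instead of "has the bug".
first_fixed = find_first_match(history, lambda commit: not has_bug(commit))
print(first_fixed)
```

In this setup a test result of "has the bug" moves the search toward newer commits, which is the opposite of a regular bisect.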

I built the first test kernel, up to the following commit:
1c8c5a9d38f607c0b6fd12c91cbe1a4418762a21

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1752961

Can you test that kernel and report back if it has the bug or not? I will build the next test kernel based on your test results.

Thanks in advance

Carl Reinke (carlreinke) wrote :

1c8c5a9 has the bug.

Joseph Salisbury (jsalisbury) wrote :

I built the next test kernel, up to the following commit:
d60dafdca4b463405e5586df923f05b10e9ac2f9

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1752961

Can you test that kernel and report back if it has the bug or not? I will build the next test kernel based on your test results.

Thanks in advance

Carl Reinke (carlreinke) wrote :

d60dafd does NOT have the bug.

Joseph Salisbury (jsalisbury) wrote :

I built the next test kernel, up to the following commit:
fdea70d26a471e002f2afc3a48821323b699f1e6

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1752961

Can you test that kernel and report back if it has the bug or not? I will build the next test kernel based on your test results.

Thanks in advance

Carl Reinke (carlreinke) wrote :

fdea70d has the bug.

Joseph Salisbury (jsalisbury) wrote :

I built the next test kernel, up to the following commit:
6a8b25abf1b79db6877645335c73ad6a5061d9b0

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1752961

Can you test that kernel and report back if it has the bug or not? I will build the next test kernel based on your test results.

Thanks in advance

Carl Reinke (carlreinke) wrote :

The packages appear to still be the build from 10/01.

Joseph Salisbury (jsalisbury) wrote :

I rebuilt the kernel, so they should be there now:
http://kernel.ubuntu.com/~jsalisbury/lp1752961
