With kernel 4.13 btrfs scans for devices before all devices have been discovered

Bug #1752961 reported by Carl Reinke on 2018-03-02
This bug affects 1 person
Affects          Status                     Importance  Assigned to       Milestone
linux (Ubuntu)   Status tracked in Cosmic
  Artful                                    Medium      Joseph Salisbury
  Bionic                                    Medium      Joseph Salisbury
  Cosmic                                    Medium      Joseph Salisbury

Bug Description

See attached dmesg outputs for booting kernels 4.11.x (working) and 4.13.x (not working).

dmesg-4.11.0-14-good.txt shows the dmesg output when booting kernel 4.11.x.
btrfs scans for devices after all 4 (sda, sdb, sdc, sdd) of the devices have been discovered by the kernel. The btrfs RAID1 filesystem mounts, and everything is good.

dmesg-4.13.0-36-fail.txt shows the dmesg output when booting kernel 4.13.x.
btrfs scans for devices after only 2 (sda, sdb) of the devices have been discovered by the kernel. The btrfs RAID1 filesystem fails to mount ("failed to read the system array: -5"). The remaining 2 devices (sdc, sdd) are discovered by the kernel immediately afterward. At the end of the log, I run `btrfs device scan` and mount the filesystem manually.
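
To check the ordering described above without reading a whole log by eye, a small script can list the disks that only show up after the mount error. This is a minimal sketch, not part of the original report: the function name and the sample lines are hypothetical, and it assumes the log contains the usual "Attached SCSI disk" message for each disk plus the error text quoted above.

```python
import re

# Kernel message printed when a SCSI disk becomes available, e.g.
# "sd 2:0:0:0: [sdc] Attached SCSI disk".
ATTACHED = re.compile(r"\[(sd[a-z]+)\] Attached SCSI disk")
# Error text quoted in this report.
MOUNT_ERROR = "failed to read the system array"


def disks_attached_after_error(dmesg_lines):
    """Return disks whose 'Attached SCSI disk' line follows the first mount error."""
    error_seen = False
    late = []
    for line in dmesg_lines:
        if MOUNT_ERROR in line:
            error_seen = True
            continue
        match = ATTACHED.search(line)
        if match and error_seen:
            late.append(match.group(1))
    return late


# Hypothetical, abbreviated log lines for illustration only; they are not
# copied from the attached dmesg files.
sample = [
    "sd 0:0:0:0: [sda] Attached SCSI disk",
    "sd 1:0:0:0: [sdb] Attached SCSI disk",
    "BTRFS error (device sda): failed to read the system array: -5",
    "sd 2:0:0:0: [sdc] Attached SCSI disk",
    "sd 3:0:0:0: [sdd] Attached SCSI disk",
]
print(disks_attached_after_error(sample))
```

For the hypothetical sample this reports sdc and sdd, matching the failing case described here. An empty list would correspond to the 4.11.x behaviour, where all four disks are attached before btrfs tries to mount.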

Hardware:
  HP ProLiant MicroServer Gen8
  4x WDC WD20EFRX
---
AlsaDevices: Error: command ['ls', '-l', '/dev/snd/'] failed with exit code 2: ls: cannot access '/dev/snd/': No such file or directory
AplayDevices: aplay: device_list:270: no soundcards found...
ApportVersion: 2.20.9-0ubuntu7
Architecture: amd64
ArecordDevices: arecord: device_list:270: no soundcards found...
DistroRelease: Ubuntu 18.04
InstallationDate: Installed on 2015-10-15 (933 days ago)
InstallationMedia: Xubuntu 15.04 "Vivid Vervet" - Release amd64 (20150422.1)
MachineType: HP ProLiant MicroServer Gen8
Package: linux (not installed)
ProcEnviron:
 TERM=linux
 PATH=(custom, no user)
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcKernelCmdLine: BOOT_IMAGE=/@/boot/vmlinuz-4.15.0-20-generic root=UUID=d976ab07-8377-46dd-ac6c-f5f7312a8305 ro rootflags=subvol=@ rootdelay=10
ProcVersionSignature: Ubuntu 4.15.0-20.21-generic 4.15.17
PulseList: Error: command ['pacmd', 'list'] failed with exit code 1: No PulseAudio daemon running, or not running as session daemon.
Tags: bionic apport-hook-error
Uname: Linux 4.15.0-20-generic x86_64
UnreportableReason: This report is about a package that is not installed.
UpgradeStatus: Upgraded to bionic on 2018-05-05 (0 days ago)
UserGroups:

_MarkForUpload: True
dmi.bios.date: 07/16/2015
dmi.bios.vendor: HP
dmi.bios.version: J06
dmi.chassis.type: 7
dmi.chassis.vendor: HP
dmi.modalias: dmi:bvnHP:bvrJ06:bd07/16/2015:svnHP:pnProLiantMicroServerGen8:pvr:cvnHP:ct7:cvr:
dmi.product.family: ProLiant
dmi.product.name: ProLiant MicroServer Gen8
dmi.sys.vendor: HP

Carl Reinke (carlreinke) wrote :

This problem is still present as of 4.15.0-20.

Carl Reinke (carlreinke) on 2018-05-05
affects: linux-hwe-edge (Ubuntu) → linux (Ubuntu)

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1752961

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
tags: added: artful

apport information

tags: added: apport-collected apport-hook-error bionic
description: updated

Carl Reinke (carlreinke) on 2018-05-06
Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Joseph Salisbury (jsalisbury) wrote :

Did this issue start happening after an update/upgrade? Was there a prior kernel version where you were not having this particular problem?

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v4.17 kernel[0].

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.17-rc4

Changed in linux (Ubuntu):
importance: Undecided → Medium
status: Confirmed → Incomplete
tags: added: kernel-da-key
Carl Reinke (carlreinke) wrote :

This issue started with the version 4.13 series. The last version that did not have the issue is 4.11.0-14.

Mainline kernel version 4.17-rc4 also exhibits the issue.

tags: added: kernel-bug-exists-upstream
Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Joseph Salisbury (jsalisbury) wrote :

I'd like to perform a bisect to figure out what commit caused this regression. We need to identify the earliest kernel where the issue started happening as well as the last kernel that did not have this issue.

Can you test the following kernels and report back? We are looking for the first kernel version that exhibits this bug:

v4.12 Final: http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.12/
v4.13-rc1: http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.13-rc1/
v4.13-rc4: http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.13-rc4/
v4.13 Final: http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.13/

You don't have to test every kernel, just up until the kernel that first has this bug.

Thanks in advance!

tags: added: performing-bisect
Carl Reinke (carlreinke) wrote :

v4.12 Final works. v4.13-rc1 does not.

Carl Reinke (carlreinke) wrote :

FWIW, v4.12.14 also works.

Changed in linux (Ubuntu Cosmic):
status: Confirmed → In Progress
Changed in linux (Ubuntu Bionic):
status: New → In Progress
Changed in linux (Ubuntu Artful):
status: New → In Progress
Changed in linux (Ubuntu Bionic):
importance: Undecided → Medium
Changed in linux (Ubuntu Artful):
importance: Undecided → Medium
assignee: nobody → Joseph Salisbury (jsalisbury)
Changed in linux (Ubuntu Bionic):
assignee: nobody → Joseph Salisbury (jsalisbury)
Changed in linux (Ubuntu Cosmic):
assignee: nobody → Joseph Salisbury (jsalisbury)
Joseph Salisbury (jsalisbury) wrote :

I started a kernel bisect between v4.12 final and v4.13-rc1. The bisect will require testing about 7-10 kernels.

I built the first test kernel, up to the following commit:
e5f76a2e0e84ca2a215ecbf6feae88780d055c56

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1752961

Can you test that kernel and report back if it has the bug or not? I will build the next test kernel based on your test results.

Thanks in advance
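
For readers unfamiliar with bisecting, the test-and-report loop in this thread is essentially a binary search between a known-good and a known-bad kernel. The sketch below is a simplified illustration only: it assumes a linear list of commits (real kernel history has merges, which `git bisect` handles itself), and the names `first_bad` and `has_bug` and the integer commit IDs are hypothetical.

```python
def first_bad(commits, has_bug):
    """Binary search for the first commit with the bug.

    Assumes commits[0] is known good and commits[-1] is known bad,
    and that every commit after the first bad one is also bad.
    Returns the first bad commit and the number of kernels tested.
    """
    good, bad = 0, len(commits) - 1
    tests = 0
    while bad - good > 1:
        mid = (good + bad) // 2
        tests += 1  # one test kernel built and booted per step
        if has_bug(commits[mid]):
            bad = mid   # bug already present: look earlier
        else:
            good = mid  # still works: look later
    return commits[bad], tests


# Hypothetical history: 1001 commits, with the regression introduced at 700.
commits = list(range(1001))
culprit, tests = first_bad(commits, lambda c: c >= 700)
print(culprit, tests)
```

Each test result halves the remaining range, so the number of test kernels grows with the logarithm of the number of commits between the good and bad endpoints.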

Carl Reinke (carlreinke) wrote :

e5f76a2e0e84ca2a215ecbf6feae88780d055c56 (4.12.0-041200.201805301435) does not work.

Carl Reinke (carlreinke) wrote :

I guess it's less ambiguous to say that e5f76a2 has the bug.

Joseph Salisbury (jsalisbury) wrote :

I built the next test kernel, up to the following commit:
1849f800fba32cd5a0b647f824f11426b85310d8

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1752961

Can you test that kernel and report back if it has the bug or not? I will build the next test kernel based on your test results.

Thanks in advance

Carl Reinke (carlreinke) wrote :

1849f80 has the bug.

Joseph Salisbury (jsalisbury) wrote :

Sorry for the delay.

I built the next test kernel, up to the following commit:
cbcd4f08aa637b74f575268770da86a00fabde6d

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1752961

Can you test that kernel and report back if it has the bug or not? I will build the next test kernel based on your test results.

Thanks in advance

Carl Reinke (carlreinke) wrote :

cbcd4f0 has the bug.

Joseph Salisbury (jsalisbury) wrote :

I built the next test kernel, up to the following commit:
1b044f1cfc65a7d90b209dfabd57e16d98b58c5b

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1752961

Can you test that kernel and report back if it has the bug or not? I will build the next test kernel based on your test results.

Thanks in advance

Carl Reinke (carlreinke) wrote :

1b044f1 does NOT have the bug.

Joseph Salisbury (jsalisbury) wrote :

I built the next test kernel, up to the following commit:
cbf4b3867875206aa548a8c6d7c886f3299d619e

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1752961

Can you test that kernel and report back if it has the bug or not? I will build the next test kernel based on your test results.

Thanks in advance
