[regression] bionic 4.15.0-13-generic panics on qemu ppc64el

Bug #1761626 reported by Ryan Finnie
Affects Status Importance Assigned to Milestone
linux (Ubuntu)
Won't Fix

Bug Description

When upgrading to bionic on a qemu ppc64el instance, the 4.15.0 kernel panics before initrd. Going back to xenial 4.4.0 works correctly.

  Installed: 4.15.0-13.14
  Candidate: 4.15.0-13.14
  Version table:
 *** 4.15.0-13.14 500
        500 http://nibbler.snowman.lan/deb-proxy/ubuntu-ports bionic/main ppc64el Packages
        100 /var/lib/dpkg/status

ProblemType: Bug
DistroRelease: Ubuntu 18.04
Package: linux-image-4.15.0-13-generic 4.15.0-13.14
ProcVersionSignature: Ubuntu 4.4.0-119.143-generic 4.4.114
Uname: Linux 4.4.0-119-generic ppc64le
.var.log.platform: Error: [Errno 13] Permission denied: '/var/log/platform'
 total 0
 crw-rw---- 1 root audio 116, 1 Apr 5 16:11 seq
 crw-rw---- 1 root audio 116, 33 Apr 5 16:11 timer
AplayDevices: Error: [Errno 2] No such file or directory: 'aplay': 'aplay'
ApportVersion: 2.20.9-0ubuntu2
Architecture: ppc64el
ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord': 'arecord'
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
Date: Thu Apr 5 16:13:28 2018
HibernationDevice: RESUME=UUID=2e904a57-3ce7-4b31-97ed-6c63b81b2221
IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig': 'iwconfig'
Lsusb: Bus 001 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub

 PATH=(custom, no user)

ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinux-4.4.0-119-generic root=UUID=294515c8-75f2-4996-8e4f-a838e6c75804 ro
ProcLoadAvg: 3.04 2.18 0.92 3/119 1643
 1: POSIX ADVISORY WRITE 1225 00:12:470 0 EOF
 2: POSIX ADVISORY WRITE 841 00:12:459 0 EOF
 3: FLOCK ADVISORY WRITE 832 00:12:449 0 EOF
 4: POSIX ADVISORY WRITE 802 00:12:432 0 EOF
 5: POSIX ADVISORY WRITE 350 00:12:294 0 EOF
 Filename Type Size Used Priority
 /dev/vda3 partition 1408960 0 -1
ProcVersion: Linux version 4.4.0-119-generic (buildd@bos02-ppc64el-006) (gcc version 5.4.0 20160609 (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.9) ) #143-Ubuntu SMP Mon Apr 2 16:08:02 UTC 2018
 linux-restricted-modules-4.4.0-119-generic N/A
 linux-backports-modules-4.4.0-119-generic N/A
 linux-firmware 1.173
RfKill: Error: [Errno 2] No such file or directory: 'rfkill': 'rfkill'
SourcePackage: linux
UpgradeStatus: Upgraded to bionic on 2018-04-05 (0 days ago)
cpu_cores: Number of cores present = 1
cpu_coreson: Number of cores online = 1
cpu_smt: Error: command ['ppc64_cpu', '--smt'] failed with exit code 255: Machine is not SMT capable

Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Status changed to Confirmed

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed
Ryan Finnie (fo0bar) wrote :

Same result with 4.16.0-041600-generic and 4.13.0-38-generic

Changed in linux (Ubuntu):
importance: Undecided → High
tags: added: kernel-da-key
Changed in linux (Ubuntu Artful):
importance: Undecided → High
status: New → Triaged
Changed in linux (Ubuntu Bionic):
status: Confirmed → Triaged
tags: added: kernel-key
Joseph Salisbury (jsalisbury) wrote :

We should probably also report this upstream, since it still affects current mainline. https://wiki.ubuntu.com/Bugs/Upstream/kernel

We can also perform a kernel bisect to identify the commit that introduced this bug. We would first need to identify the last kernel version that did not have the bug and the first version that did. Can you test the following kernels:

v4.6 Final: http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.6-yakkety/
v4.8 Final: http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.8/
v4.10 Final: http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.10/

You don't have to test every kernel, just up until the kernel that first has this bug.

Thanks in advance!
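
As a rough sketch of the narrowing step described above (not an official kernel-team tool), the following shell function takes per-version test results and reports the last working and first failing version. The function name `bisect_bounds` and the `<version> <good|bad>` input format are assumptions made for illustration; the sample results are the ones reported in this thread. It relies on GNU `sort -V` for version ordering.

```shell
#!/bin/sh
# Hypothetical helper: read "<version> <good|bad>" lines on stdin and
# print the last good version seen before the first bad one.
bisect_bounds() {
    sort -V | awk '
        # Track the newest good version until the first bad one appears.
        $2 == "good" && !seen_bad { last_good = $1 }
        $2 == "bad"  && !seen_bad { first_bad = $1; seen_bad = 1 }
        END { print "good=" last_good " bad=" first_bad }'
}

# Results as reported in this bug (order does not matter):
bisect_bounds <<'EOF'
v4.15 bad
v4.4 good
v4.16 bad
v4.13 bad
v4.8 good
v4.9 bad
EOF
```

The two versions printed are the ones that would be handed to `git bisect start <bad> <good>`.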

tags: added: performing-bisect
Ryan Finnie (fo0bar) wrote :

I can test earlier mainlines, but keep in mind that ppc64el is very slow on my home qemu setup (and that is "only" very slow; ppc64 is "amazingly slow", presumably due to endian swapping). Does the Kernel team have a POWER8 playground machine, if it comes to compiling kernels for a bisect?

Ryan Finnie (fo0bar) wrote :

4.8.0 works, 4.9.0 panics. I'll see about bisecting.
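
To get a feel for how long a bisect between two releases might take, here is a minimal sketch that estimates the number of build-and-boot cycles as the base-2 logarithm of the commit count, rounded up. The function name `bisect_steps` is hypothetical, and the commit count of 16000 is an assumed round figure for illustration, not a measured value for v4.8..v4.9; in a real tree the count could be obtained with `git rev-list --count v4.8..v4.9`. The estimate git itself prints may differ by a step.

```shell
#!/bin/sh
# Hypothetical helper: estimate bisect steps for N candidate commits
# by counting how many doublings are needed to cover N.
bisect_steps() {
    n=$1
    steps=0
    span=1
    while [ "$span" -lt "$n" ]; do
        span=$((span * 2))
        steps=$((steps + 1))
    done
    echo "$steps"
}

# Assumed commit count, for illustration only.
bisect_steps 16000   # prints 14
```

At one kernel build plus one emulated boot per step, that is on the order of a dozen or more slow cycles on a fully emulated ppc64el guest.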

Ryan Finnie (fo0bar) wrote :

I'm not even remotely sure about things here, so some raw notes:

* Current bisect effort ("oh man, this will take forever"):

# bad: [69973b830859bc6529a7a0468ba0d80ee5117826] Linux 4.9
# good: [c8d2bc9bc39ebea8437fd974fdbc21847bb897a3] Linux 4.8
git bisect start 'v4.9' 'v4.8'
# good: [a5af7e1fc69a46f29b977fd4b570e0ac414c2338] rxrpc: Fix loss of PING RESPONSE ACK production due to PING ACKs
git bisect good a5af7e1fc69a46f29b977fd4b570e0ac414c2338
# bad: [a379f71a30dddbd2e7393624e455ce53c87965d1] Merge branch 'akpm' (patches from Andrew)
git bisect bad a379f71a30dddbd2e7393624e455ce53c87965d1

* Found the ppc_tm=off kernel parameter (it disables hardware transactional memory), tried it on 4.15.0 generic, and it boots!

* Might be a duplicate of https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1664622 - might be a qemu issue

* 4.15.0 does work on scalingstack bos02 POWER8 KVM xenial hosts (-machine pseries-xenial,accel=kvm,usb=off -cpu host), but I figured that had already been tested and would have been noticed by now. (In case I hadn't mentioned it before, the host running the ppc64el fullvirt VM at home is a xenial host.)

* See also http://lists.gnu.org/archive/html/qemu-ppc/2018-01/msg00021.html - might be mitigated with pseries-2.12 with a bionic host qemu setup, but I don't have an easy way to test with a bionic host + qemu at the moment.
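
For anyone wanting to try the ppc_tm=off workaround from the notes above, here is a minimal sketch of adding the parameter to a kernel command line without duplicating it. The helper name `add_kernel_param` is hypothetical, and the sample command line is shortened from the one in the report. To make the change persistent on Ubuntu, the usual approach is to add the parameter to `GRUB_CMDLINE_LINUX_DEFAULT` in `/etc/default/grub` and run `update-grub`; that step is not shown and was not tested on this setup.

```shell
#!/bin/sh
# Hypothetical helper: append a parameter to a kernel command line
# string unless it is already present as a whole word.
add_kernel_param() {
    cmdline=$1
    param=$2
    case " $cmdline " in
        *" $param "*) printf '%s\n' "$cmdline" ;;
        *)            printf '%s\n' "$cmdline $param" ;;
    esac
}

# Example with a shortened command line (the UUID is the one from this report):
add_kernel_param "root=UUID=294515c8-75f2-4996-8e4f-a838e6c75804 ro" "ppc_tm=off"
```

After rebooting, `cat /proc/cmdline` should show whether the parameter actually reached the kernel.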

tags: removed: kernel-key
Andy Whitcroft (apw) wrote : Closing unsupported series nomination.

This bug was nominated against a series that is no longer supported, i.e. artful. The bug task representing the artful nomination is being closed as Won't Fix.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu Artful):
status: Triaged → Won't Fix