Ubuntu 17.04: "Oops: Exception in kernel mode, sig: 5 [#1]" seen during fadump over ssh on Alpine machine.

Bug #1655241 reported by bugproxy on 2017-01-10
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone
linux (Ubuntu) | | High | Tim Gardner |
  Zesty | | High | Tim Gardner |
makedumpfile (Ubuntu) | | Critical | Unassigned |
  Zesty | | Critical | Unassigned |

Bug Description

Problem Description
================================
"Oops: Exception in kernel mode, sig: 5 [#1]" seen during fadump over ssh on Alpine machine.

Steps to Reproduce
============================
1. Configure fadump over ssh on the Alpine machine.
# ssh-keygen -t rsa

Add the lines below to /etc/default/kdump-tools:
SSH="ubuntu@9.114.15.240"
SSH_KEY=/root/.ssh/id_rsa

# kdump-config propagate

# kdump-config load

# kdump-config show

2. Trigger a crash.

Logs
=======
ubuntu@alp9:~$ [ 41.884641] usercopy: kernel memory exposure attempt detected from c0000000fb001020 (task_struct) (61408 bytes)
[ 41.884668] kernel BUG at /build/linux-okcqyo/linux-4.9.0/mm/usercopy.c:75!
[ 41.884672] Oops: Exception in kernel mode, sig: 5 [#1]
[ 41.884674] SMP NR_CPUS=2048 [ 41.884676] NUMA
[ 41.884677] pSeries
[ 41.884679] Modules linked in: pseries_rng vmx_crypto ib_iser rdma_cm iw_cm ib_cm ib_core configfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 btrfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear ibmvscsi lpfc crc32c_vpmsum scsi_transport_fc
[ 41.884714] CPU: 8 PID: 3977 Comm: makedumpfile Not tainted 4.9.0-11-generic #12-Ubuntu
[ 41.884717] task: c000000151fcdc00 task.stack: c000000150064000
[ 41.884719] NIP: c000000000312978 LR: c000000000312974 CTR: 00000000006338e4
[ 41.884722] REGS: c0000001500678d0 TRAP: 0700 Not tainted (4.9.0-11-generic)
[ 41.884725] MSR: 8000000000029033 <SF,EE,ME,IR,DR,RI,LE>[ 41.884732] CR: 28002222 XER: 00000004
[ 41.884734] CFAR: c000000000b26cac SOFTE: 1
GPR00: c000000000312974 c000000150067b50 c00000000141a400 0000000000000063
GPR04: c000000179a0ade8 c000000179a1fc40 0000000000a1b6ef 0000000000000000
GPR08: 0000000000000007 c000000000f7f87c 0000000178a90000 0000000000003ff0
GPR12: 0000000000002200 c00000000e794800 00003fff7f4d0010 00003fff7f4d0010
GPR16: 00000000bb010000 0000000054150c98 000000005412da08 0000000000010000
GPR20: 00000000540fea40 00003ffff8448150 0000000000000001 c000000001717798
GPR24: 0000000000010000 c000000150067cf0 0000000000000000 0000000000001020
GPR28: c0000000fb010000 0000000000000001 000000000000efe0 c0000000fb001020
NIP [c000000000312978] __check_object_size+0x88/0x2c0
[ 41.884777] LR [c000000000312974] __check_object_size+0x84/0x2c0
[ 41.884780] Call Trace:
[ 41.884782] [c000000150067b50] [c000000000312974] __check_object_size+0x84/0x2c0 (unreliable)
[ 41.884787] [c000000150067bd0] [c00000000006aea4] copy_to_user+0x64/0xa0
[ 41.884791] [c000000150067c10] [c000000000042360] copy_oldmem_page+0x140/0x1d0
[ 41.884796] [c000000150067c60] [c0000000003cd5a8] read_from_oldmem.part.0+0x138/0x150
[ 41.884800] [c000000150067cd0] [c0000000003cd6fc] read_vmcore+0x13c/0x270
[ 41.884803] [c000000150067d40] [c0000000003b8918] proc_reg_read+0x88/0xd0
[ 41.884807] [c000000150067d70] [c000000000318f4c] __vfs_read+0x3c/0x70
[ 41.884811] [c000000150067d90] [c00000000031a1ac] vfs_read+0xbc/0x1b0
[ 41.884814] [c000000150067de0] [c00000000031bdc8] SyS_read+0x68/0x110
[ 41.884818] [c000000150067e30] [c00000000000bd84] system_call+0x38/0xe0
[ 41.884820] Instruction dump:
[ 41.884823] 60000000 60420000 3c82ff93 3ca2ff9d 38847130 38a5b3f8 3c62ff94 7fc8f378
[ 41.884830] 7fe6fb78 38635f20 488142dd 60000000 <0fe00000> 60420000 2ba30010 409d017c
[ 41.884838] ---[ end trace c33ccad89db3894a ]---
[ 41.884840]
Copying data : [ 43.7 %] -889527+439 records in
[ 41.805683] kdump-tools[3621]: 889744+1 records out
[ 41.805920] kdump-tools[3621]: 455549276 bytes (456 MB, 434 MiB) copied, 24.5065 s, 18.6 MB/s
[ 42.263162] kdump-tools[3621]: * kdump-tools: saved vmcore in ubuntu@9.114.15.240:/home/ubuntu/9.114.15.239-201701090710
[ 42.264882] kdump-tools[3621]: * running makedumpfile --dump-dmesg /proc/vmcore /tmp/dmesg.201701090710
[ 42.268482] kdump-tools[3621]: The kernel version is not supported.
[ 42.268810] kdump-tools[3621]: The makedumpfile operation may be incomplete.
[ 42.269050] kdump-tools[3621]: The dmesg log is saved to /tmp/dmesg.201701090710.
[ 42.269236] kdump-tools[3621]: makedumpfile Completed.
[ 42.652028] kdump-tools[3621]: * kdump-tools: saved dmesg content in ubuntu@9.114.15.240:/home/ubuntu/9.114.15.239-201701090710
[ 42.654261] kdump-tools[3621]: Mon, 09 Jan 2017 07:10:38 -0500
[ 42.783431] kdump-tools[3621]: Failed to read reboot parameter file: No such file or directory
[ 42.783811] kdump-tools[3621]: Rebooting.
[ 42.864714] reboot: Restarting system
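
A quick arithmetic check on the numbers in the usercopy message and register dump above, offered as a sketch rather than a diagnosis. The 64 KiB page size is an assumption (the report does not state it, though GPR19 and GPR24 both hold 0x10000), and the variable names are illustrative only.

```python
# Assumption: 64 KiB pages (not stated in the report).
PAGE_SIZE = 0x10000

# Values copied from the "usercopy: kernel memory exposure attempt" line.
start = 0xC0000000FB001020  # source address (also GPR31 in the dump)
length = 61408              # number of bytes the copy tried to expose

offset_in_page = start % PAGE_SIZE  # where the copy starts inside its page
end = start + length                # first address past the copied range

print(hex(length))           # 0xefe0
print(hex(offset_in_page))   # 0x1020
print(hex(end))              # 0xc0000000fb010000
print(end % PAGE_SIZE == 0)  # True
```

Under that assumption the rejected copy starts 0x1020 bytes into a page and runs exactly to the page boundary, which matches GPR27 (0x1020), GPR30 (0xefe0) and GPR28 (c0000000fb010000) in the dump.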

== Comment: #1 - Vaishnavi Bhat <email address hidden> - 2017-01-09 23:24:03 ==
>
> "Oops: Exception in kernel mode, sig: 5 [#1]" seen during
> fadump over ssh on Alpine machine.
>

$ cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=17.04
DISTRIB_CODENAME=zesty
DISTRIB_DESCRIPTION="Ubuntu Zesty Zapus (development branch)"

$ uname -a
Linux alp9 4.9.0-11-generic #12-Ubuntu SMP Mon Dec 12 16:16:45 UTC 2016 ppc64le ppc64le ppc64le GNU/Linux

$ dpkg -l | grep makedumpfile
ii makedumpfile 1:1.6.0-4 ppc64el VMcore extraction tool

== Comment: #2 - Vaishnavi Bhat <email address hidden> - 2017-01-09 23:41:44 ==
> Copying data : [ 43.7 %] -889527+439 records in
> [ 41.805683] kdump-tools[3621]: 889744+1 records out
> [ 41.805920] kdump-tools[3621]: 455549276 bytes (456 MB, 434 MiB) copied,
> 24.5065 s, 18.6 MB/s
> [ 42.263162] kdump-tools[3621]: * kdump-tools: saved vmcore in
> ubuntu@9.114.15.240:/home/ubuntu/9.114.15.239-201701090710
> [ 42.264882] kdump-tools[3621]: * running makedumpfile --dump-dmesg
> /proc/vmcore /tmp/dmesg.201701090710
> [ 42.268482] kdump-tools[3621]: The kernel version is not supported.
> [ 42.268810] kdump-tools[3621]: The makedumpfile operation may be
> incomplete.
> [ 42.269050] kdump-tools[3621]: The dmesg log is saved to
> /tmp/dmesg.201701090710.
> [ 42.269236] kdump-tools[3621]: makedumpfile Completed.
> [ 42.652028] kdump-tools[3621]: * kdump-tools: saved dmesg content in
> ubuntu@9.114.15.240:/home/ubuntu/9.114.15.239-201701090710
> [ 42.654261] kdump-tools[3621]: Mon, 09 Jan 2017 07:10:38 -0500
> [ 42.783431] kdump-tools[3621]: Failed to read reboot parameter file: No
> such file or directory
> [ 42.783811] kdump-tools[3621]: Rebooting.
> [ 42.864714] reboot: Restarting system

For kernel 4.8 and above, makedumpfile 1.6.1 is needed in order to avoid the "The kernel version is not supported." message.
Mirroring to Canonical and requesting that they include the latest makedumpfile package in Ubuntu 17.04 (4.9.0-11-generic).
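
The rule stated above (kernel 4.8 or later needs makedumpfile 1.6.1) can be sketched as a small check. This is a simplified illustration, not how kdump-tools or dpkg compare versions: the function names are hypothetical, and the parsing assumes a purely numeric dotted upstream version, ignoring the epoch and the Debian revision.

```python
def upstream_version(debian_version: str) -> tuple:
    """Return the upstream part of a Debian version as a tuple of ints.

    Simplified: drops the epoch ("1:") and the Debian revision ("-4") and
    assumes the rest is dotted integers such as "1.6.0".
    """
    without_epoch = debian_version.split(":", 1)[-1]
    upstream = without_epoch.rsplit("-", 1)[0]
    return tuple(int(part) for part in upstream.split("."))


def kernel_major_minor(uname_r: str) -> tuple:
    """Return (major, minor) from a release string like '4.9.0-11-generic'."""
    major, minor = uname_r.split(".")[:2]
    return int(major), int(minor)


def needs_newer_makedumpfile(uname_r: str, makedumpfile_version: str) -> bool:
    """True when the kernel is >= 4.8 but makedumpfile is older than 1.6.1."""
    if kernel_major_minor(uname_r) < (4, 8):
        return False  # the rule in the comment says nothing about older kernels
    return upstream_version(makedumpfile_version) < (1, 6, 1)


# Versions taken from the report: the failing setup and the later package.
print(needs_newer_makedumpfile("4.9.0-11-generic", "1:1.6.0-4"))  # True
print(needs_newer_makedumpfile("4.9.0-11-generic", "1:1.6.1-1"))  # False
```

On a real system, `dpkg --compare-versions` is the more reliable way to compare package versions.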

bugproxy (bugproxy) on 2017-01-10
tags: added: architecture-ppc64le bugnameltc-150360 severity-high targetmilestone-inin1704
Changed in ubuntu:
assignee: nobody → Taco Screen team (taco-screen-team)
affects: ubuntu → makedumpfile (Ubuntu)

------- Comment From <email address hidden> 2017-01-11 10:55 EDT-------
Kernel commit f5509cc18daa ("mm: Hardened usercopy") introduced the
BUG() we are hitting here.

This BUG() was also hit while reading kcore, which was fixed with
kernel commit df04abfd181a ("fs/proc/kcore.c: Add bounce buffer
for ktext data"). I am not convinced that a similar fix is ideal here.
Working on the fix.

This issue is observed not just with the ssh dump target but also with
other dump targets. Updated the bug summary accordingly.

Thanks
Hari

Manoj Iyer (manjo) wrote :

Leann, could you please get someone to pull in the commit referenced in this bug to fix the oops?

Changed in makedumpfile (Ubuntu):
assignee: Taco Screen team (taco-screen-team) → Leann Ogasawara (leannogasawara)
importance: Undecided → Critical
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2017-01-24 01:38 EDT-------
(In reply to comment #7)
> Leann, could you please get someone to pull in commit referenced in this bug
> to fix the oops ?

The commit I mentioned fixes the Oops for /proc/kcore reads, but the issue
still remains for /proc/vmcore reads. Will try to post a fix soon.

Thanks
Hari

It looks like Zesty already has commit df04abfd181a. We'll stand by for the /proc/vmcore fix. Thanks.

$ git describe --contains df04abfd181a
Ubuntu-4.8.0-22.24~506

$ git show df04abfd181a
commit df04abfd181acc276ba6762c8206891ae10ae00d
Author: Jiri Olsa <email address hidden>
Date: Thu Sep 8 09:57:08 2016 +0200

    fs/proc/kcore.c: Add bounce buffer for ktext data

Changed in makedumpfile (Ubuntu):
assignee: Leann Ogasawara (leannogasawara) → nobody
Changed in linux (Ubuntu):
assignee: nobody → Canonical Kernel Team (canonical-kernel-team)
importance: Undecided → High
status: New → Triaged

It also appears that makedumpfile for zesty is now at 1.6.1, so I'll close the makedumpfile task as Fix Released.

https://launchpad.net/ubuntu/+source/makedumpfile/1:1.6.1-1

Changed in makedumpfile (Ubuntu):
status: New → Fix Released
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2017-02-15 12:46 EDT-------
The issue is reproducible only when the memory reserved for fadump is more than 50%
of the total available memory. Posted the fix upstream at:

https://lists.ozlabs.org/pipermail/linuxppc-dev/2017-February/154111.html

Thanks
Hari
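
The reproduction condition in the comment above can be expressed as a simple check. This is a minimal sketch with a hypothetical function name and made-up sizes. It does not read the values from a running system and it is not the logic of the upstream patch.

```python
GIB = 1024 ** 3


def reservation_exceeds_half(reserved_bytes: int, total_bytes: int) -> bool:
    """True when the fadump reservation is more than 50% of total memory."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    # Integer comparison avoids floating-point rounding at the 50% boundary.
    return 2 * reserved_bytes > total_bytes


# Hypothetical sizes, not taken from the affected machine.
print(reservation_exceeds_half(5 * GIB, 8 * GIB))  # True
print(reservation_exceeds_half(4 * GIB, 8 * GIB))  # False
```

A reservation of exactly half the memory does not satisfy "more than 50%", so the second call returns False.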

------- Comment (attachment only) From <email address hidden> 2017-02-15 12:49 EDT-------

Tim Gardner (timg-tpi) wrote :

I applied the patch in comment #7

Changed in linux (Ubuntu Zesty):
assignee: Canonical Kernel Team (canonical-kernel-team) → Tim Gardner (timg-tpi)
status: Triaged → Fix Committed

------- Comment on attachment From <email address hidden> 2017-02-27 07:07 EDT-------

Posted v2 of the patch upstream: http://patchwork.ozlabs.org/patch/732136

Attaching the backported version here. Though the patch is backported to
linux-image-4.9.0-15-generic, it should apply cleanly to the 4.10.0 kernel
as well, since the fadump files are unchanged there and the patch has no
problematic external dependencies.

Thanks
Hari

Tim Gardner (timg-tpi) wrote :

Applied V2

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.10.0-9.11

---------------
linux (4.10.0-9.11) zesty; urgency=low

  [ Tim Gardner ]

  * Release Tracking Bug
    - LP: #1666214

  * linux: disable CONFIG_PCIEPORTBUS in the kernel (LP: #1665404)
    - [Config] CONFIG_PCIEPORTBUS=n for ppc64el

  * linux-lts-xenial 4.4.0-63.84~14.04.2 ADT test failure with linux-lts-xenial
    4.4.0-63.84~14.04.2 (LP: #1664912)
    - SAUCE: apparmor: fix link auditing failure due to, uninitialized var

  * Ubuntu 17.04: "Oops: Exception in kernel mode, sig: 5 [#1]" seen during
    fadump over ssh on Alpine machine. (LP: #1655241)
    - SAUCE: powerpc/fadump: set an upper limit for boot memory size

  * In Ubuntu 17.04 : after reboot getting message in console like Unable to
    open file: /etc/keys/x509_ima.der (-2) (LP: #1656908)
    - SAUCE: ima: Downgrade error to warning

  * NFS client : permission denied when trying to access subshare, since kernel
    4.4.0-31 (LP: #1649292)
    - fs: Better permission checking for submounts

  * Miscellaneous Ubuntu changes
    - SAUCE: (noup) Update spl to 0.6.5.9-1, zfs to 0.6.5.9-2
    - [Config] CONFIG_SCSI_HISI_SAS=m on arm64
    - d-i: Add hisi_sas_v2_hw to scsi-modules
    - d-i: Add hns_enet_drv to nic-modules
    - d-i: Add supporting modules for hns_enet_drv to nic-modules
    - rebase to v4.10

  [ Upstream Kernel Changes ]

  * rebase to v4.10

 -- Tim Gardner <email address hidden> Wed, 15 Feb 2017 11:18:07 -0700

Changed in linux (Ubuntu Zesty):
status: Fix Committed → Fix Released

------- Comment From <email address hidden> 2017-03-06 06:58 EDT-------
Hi,

Tested on 4.10.0-9-generic. Fadump is working on alpine LPAR.

Thanks,
Pavithra
