CAPI:Ubuntu: Kernel panic while rebooting
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
linux (Ubuntu) | Fix Released | High | Seth Forshee | |
Yakkety | Fix Released | High | Seth Forshee | |
Bug Description
default desc
bugproxy (bugproxy) wrote : sosreport | #1 |
tags: | added: architecture-ppc64le bugnameltc-151332 severity-high targetmilestone-inin1704 |
bugproxy (bugproxy) wrote : systemclt-a | #2 |
Changed in ubuntu: | |
assignee: | nobody → Taco Screen team (taco-screen-team) |
affects: | ubuntu → linux (Ubuntu) |
Changed in linux (Ubuntu): | |
status: | New → Incomplete |
Manoj Iyer (manjo) wrote : | #3 |
Please file bugs with bug description and process to recreate.
bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla | #4 |
Default Comment by Bridge
------- Comment From <email address hidden> 2017-02-24 03:56 EDT-------
> The fix (together with other fixes) were posted and tracked by:
> https:/
>
Note that the full series (3 patches) should be backported. That was:
https:/
https:/
https:/
All 3 are in the powerpc maintainer's next tree:
https:/
pci/hotplug/
https:/
pci/hotplug/
https:/
pci/hotplug/
Thanks!
------- Comment From <email address hidden> 2017-02-24 05:46 EDT-------
(In reply to comment #25)
> > The fix (together with other fixes) were posted and tracked by:
> > https:/
> >
>
> Note that the full series (3 patches) should be backported. That was:
> https:/
> https:/
> https:/
>
> All 3 are in the powerpc maintainer's next tree:
>
> https:/
> ?id=36c7c9da40c
> pci/hotplug/
>
> https:/
> ?id=303529d6ef1
> pci/hotplug/
>
> https:/
> ?id=49f4b08e615
> pci/hotplug/
>
> Thanks!
Hello Canonical,
Please include above 3 patches for Ubuntu 16.04 LTS and Ubuntu 17.04.
All 3 are already accepted by ppc maintainer.
Michael Hohnbaum (hohnbaum) wrote : Re: [Bug 1667599] Comment bridged from LTC Bugzilla | #5 |
Leann,
Now that there is some context in this bug, it appears to be patches for
the Kernel Team to evaluate.
On 02/24/2017 11:41 AM, bugproxy wrote:
> default desc
>
> Default Comment by Bridge
>
> Default Comment by Bridge
>
> ------- Comment From <email address hidden> 2017-02-24 03:56 EDT-------
>> The fix (together with other fixes) were posted and tracked by:
>> https:/
>>
> Note that the full series (3 patches) should be backported. That was:
> https:/
> https:/
> https:/
>
> All 3 are in the powerpc maintainer's next tree:
>
> https:/
> pci/hotplug/
>
> https:/
> pci/hotplug/
>
> https:/
> pci/hotplug/
>
> Thanks!
> default desc
>
> Default Comment by Bridge
>
> Default Comment by Bridge
>
> ------- Comment From <email address hidden> 2017-02-24 05:46 EDT-------
> (In reply to comment #25)
>>> The fix (together with other fixes) were posted and tracked by:
>>> https:/
>>>
>> Note that the full series (3 patches) should be backported. That was:
>> https:/
>> https:/
>> https:/
>>
>> All 3 are in the powerpc maintainer's next tree:
>>
>> https:/
>> ?id=36c7c9da40c
>> pci/hotplug/
>>
>> https:/
>> ?id=303529d6ef1
>> pci/hotplug/
>>
>> https:/
>> ?id=49f4b08e615
>> pci/hotplug/
>>
>> Thanks!
> Hello Canonical,
>
> Please include above 3 patches for Ubuntu 16.04 LTS and Ubuntu 17.04.
> All 3 are already accepted by ppc maintainer.
> default desc
>
> Default Comment by Bridge
>
> Default Comment by Bridge
>
--
Michael Hohnbaum
OIL Program Manager
Power (ppc64el) Development Project Manager
Canonical, Ltd.
bugproxy (bugproxy) wrote : systemclt-a | #6 |
Changed in linux (Ubuntu): | |
assignee: | Taco Screen team (taco-screen-team) → Canonical Kernel Team (canonical-kernel-team) |
importance: | Undecided → High |
status: | Incomplete → Triaged |
Vipin K Parashar (vipin-g) wrote : | #7 |
---Problem Description---
While rebooting a Firestone machine that has a CAPI card, the kernel panics just after the reboot command is issued.
---uname output---
Linux ltc84-pkvm1 4.9.0-15-generic #16-Ubuntu SMP Fri Jan 20 15:28:49 UTC 2017 ppc64le ppc64le ppc64le GNU/Linux
---Additional Hardware Info---
0001:01:00.0 Processing accelerators: IBM Device 0477 (rev 01)
0002:00:00.0 Processing accelerators: IBM Device 4350 (rev 0a)
Machine Type = PowerNV 8335-GTA
---System Hang---
reset via ipmitool
---Steps to Reproduce---
1. On a Firestone machine with a CAPI Nallatech card, run the reboot command from the terminal.
Contact Information = <email address hidden>
Oops output:
Ubuntu 17.04. . . .
[ 283.410962] Oops: Exception in kernel mode, sig: 5 [#1]
[ 283.411564] SMP NR_CPUS=2048
[ 283.411962] NUMA
[ 283.412166] PowerNV
[ 283.412505] Modules linked in: kvm_hv kvm_pr kvm usb_f_tcm libcomposite udc_core target_core_mod configfs ip_set nfnetlink bridge stp llc joydev input_leds mac_hid hid_generic usbhid hid ofpart cmdlinepart at24 nvmem_core powernv_flash ipmi_powernv ipmi_msghandler mtd uio_pdrv_genirq uio ibmpowernv vmx_crypto opal_prd powernv_rng binfmt_misc x_tables autofs4 uas usb_storage nouveau crc32c_vpmsum ast i2c_algo_bit ttm ahci drm_kms_helper syscopyarea tg3 libahci sysfillrect sysimgblt fb_sys_fops drm cxl pnv_php [last unloaded: ip_tables]
[ 283.417833] CPU: 19 PID: 1 Comm: systemd-shutdow Tainted: G W 4.9.0-15-generic #16-Ubuntu
[ 283.418795] task: c0000063bb938600 task.stack: c0000027f1188000
[ 283.419412] NIP: c00000000064fd40 LR: c000000000630564 CTR: c0000000006304f0
[ 283.420277] REGS: c0000027f118b8b0 TRAP: 0700 Tainted: G W (4.9.0-15-generic)
[ 283.421161] MSR: 900000000282b033 <SF,
[ 283.422060] CR: 44288888 XER: 00000000
[ 283.422475] CFAR: c000000000630560 ...
Vipin K Parashar (vipin-g) wrote : | #8 |
I think the issue is caused by pcieport_drv, which enables MSI on 0021:02:01.0. Behind the PLX downstream port there is a surprise-hotpluggable slot. When pnv-php.ko is loaded, it also tries to enable MSI, and that is when the backtrace is thrown. Two things can be done to avoid this awkward situation:
(1) Prohibit pcieport_drv. It does nothing on the PowerNV platform, and pnv-php.ko doesn't depend on any functionality exported by it.
(2) I will add some code in pnv-php.ko to skip enabling MSIx/MSI if the PLX downstream port has an associated driver. With it, we hopefully won't see the backtrace. However, the surprise hotplug functionality is lost.
root@ltc84-pkvm1:~# lspci -vvvs 0021:02:01.0 | grep "Kernel driver in use"
Kernel driver in use: pcieport
root@ltc84-pkvm1:~# echo 0021:02:01.0 > /sys/bus/
root@ltc84-pkvm1:~# lspci -vvvs 0021:02:01.0 | grep "Kernel driver in use"
root@ltc84-pkvm1:~# insmod /lib/modules/
root@ltc84-pkvm1:~# dmesg | tail -30
[ 8.677661] audit: type=1400 audit(148712268
[ 8.677671] audit: type=1400 audit(148712268
[ 8.679095] audit: type=1400 audit(148712268
[ 8.751964] audit: type=1400 audit(148712268
[ 8.751974] audit: type=1400 audit(148712268
[ 8.751979] audit: type=1400 audit(148712268
[ 8.751984] audit: type=1400 audit(148712268
[ 8.781147] audit: type=1400 audit(148712268
[ 12.169618] ip6_tables: (C) 2000-2006 Netfilter Core Team
[ 12.196455] Ebtables v2.0 registered
[ 12.229787] systemd[1]: apt-daily.timer: Adding 8h 33min 26.836791s random time.
[ 12.238212] nf_conntrack version 0.5.0 (65536 buckets, 262144 max)
[ 12.367033] IPv6: ADDRCONF(
[ 12.443774] bridge: filtering via arp/ip/ip6tables is no longer available by default. Update your scripts to load br_netfilter if you need this.
[ 12.460617] systemd[1]: apt-daily.timer: Adding 1h 54min 42.960931s random time.
[ 12.483745] Netfilter messages via NETLINK v0.30.
[ 12.539679] ip_set: protoco...
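The "Kernel driver in use" check in the transcript above can also be done by reading sysfs directly. The following is a minimal sketch and not part of the original report. It assumes the standard Linux sysfs layout, in which a PCI device that has a driver bound exposes a `driver` symlink under `/sys/bus/pci/devices/<address>/`. The helper name `bound_driver` and its `sysfs_root` parameter are hypothetical and exist only for illustration.

```python
import os


def bound_driver(pci_addr, sysfs_root="/sys"):
    """Return the name of the driver bound to a PCI device, or None.

    Assumption: a bound device has a `driver` symlink in its sysfs
    directory whose target's last path component is the driver name.
    `sysfs_root` is a hypothetical parameter so the lookup can be
    pointed at a test directory instead of the real /sys.
    """
    link = os.path.join(sysfs_root, "bus", "pci", "devices", pci_addr, "driver")
    if not os.path.islink(link):
        # No symlink means no driver is currently bound to the device.
        return None
    return os.path.basename(os.readlink(link))
```

On the machine from this report, `bound_driver("0021:02:01.0")` would be expected to name `pcieport` before the unbind step and return `None` afterwards, matching the two `lspci` invocations.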
Vipin K Parashar (vipin-g) wrote : | #9 |
Added bug comments as above.
Seth Forshee (sforshee) wrote : | #10 |
Setting CONFIG_
Changed in linux (Ubuntu): | |
status: | Triaged → Incomplete |
bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla | #11 |
------- Comment From <email address hidden> 2017-02-28 00:41 EDT-------
(In reply to comment #34)
> Setting CONFIG_
> response to bug #1665404. If pcieportdrv is to blame I believe this should
> take care of the problem. Can you confirm? Thanks!
The PCIe port bus driver simply exposed this issue with the pnv-php module, and since it's not needed on ppc64 we requested that it be disabled. Since the core issue lies with pnv-phb, ideally we would like the above patches to be applied so that the issue doesn't get reproduced in any other way.
~ Vaibhav
Changed in linux (Ubuntu Yakkety): | |
assignee: | nobody → Seth Forshee (sforshee) |
importance: | Undecided → High |
status: | New → In Progress |
Changed in linux (Ubuntu): | |
assignee: | Canonical Kernel Team (canonical-kernel-team) → Seth Forshee (sforshee) |
status: | Incomplete → Fix Committed |
Seth Forshee (sforshee) wrote : | #12 |
I've applied all three patches to zesty. Only the first patch is applicable for 16.04.2 (yakkety 4.8 kernel) as it does not have 360aebd85a4c "drivers/
bugproxy (bugproxy) wrote : sosreport | #13 |
bugproxy (bugproxy) wrote : systemclt-a | #14 |
Seth Forshee (sforshee) wrote : | #15 |
bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla | #16 |
------- Comment From <email address hidden> 2017-03-01 07:11 EDT-------
(In reply to comment #38)
> I've applied all three patches to zesty. Only the first patch is applicable
> for 16.04.2 (yakkety 4.8 kernel) as it does not have 360aebd85a4c
> "drivers/
> introduces the code being changed by the latter two patches. The 4.4 xenial
> kernel does not have the PowerNV PCI hotplug driver at all, so none of the
> patches are applicable.
I tried cherry-picking 360aebd85a4c onto 4.8 kernel and subsequently all
3 patches got applied successfully.
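The applicability reasoning in this exchange can be summarized as a small decision function. This is only an illustrative sketch of the logic stated in the thread, not something that was run by anyone in it. The function name, the flag names, and the numbering of the patches as 1 to 3 are hypothetical.

```python
def applicable_patches(has_pnv_php_driver, has_surprise_hotplug_commit):
    """Return which of the three fixes (numbered 1-3) apply to a kernel tree.

    Encodes the reasoning quoted in the thread:
    - without the PowerNV PCI hotplug driver, none of the patches apply;
    - without commit 360aebd85a4c, only the first patch applies, because
      that commit introduces the code changed by the latter two;
    - with both present, all three apply.
    """
    if not has_pnv_php_driver:
        return []
    if not has_surprise_hotplug_commit:
        return [1]
    return [1, 2, 3]


# Kernel trees as described in the thread (labels are illustrative).
KERNELS = {
    "zesty": {"has_pnv_php_driver": True, "has_surprise_hotplug_commit": True},
    "yakkety 4.8": {"has_pnv_php_driver": True, "has_surprise_hotplug_commit": False},
    "xenial 4.4": {"has_pnv_php_driver": False, "has_surprise_hotplug_commit": False},
}

SUMMARY = {name: applicable_patches(**flags) for name, flags in KERNELS.items()}
```

Cherry-picking 360aebd85a4c onto the 4.8 tree corresponds to flipping `has_surprise_hotplug_commit` to `True`, which is why all three patches then apply.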
Seth Forshee (sforshee) wrote : Re: [Bug 1667599] Comment bridged from LTC Bugzilla | #17 |
On Wed, Mar 01, 2017 at 12:20:50PM -0000, bugproxy wrote:
> ------- Comment From <email address hidden> 2017-03-01 07:11 EDT-------
> (In reply to comment #38)
> > I've applied all three patches to zesty. Only the first patch is applicable
> > for 16.04.2 (yakkety 4.8 kernel) as it does not have 360aebd85a4c
> > "drivers/
> > introduces the code being changed by the latter two patches. The 4.4 xenial
> > kernel does not have the PowerNV PCI hotplug driver at all, so none of the
> > patches are applicable.
>
> I tried cherry-picking 360aebd85a4c onto 4.8 kernel and subsequently all
> 3 patches got applied successfully.
Yes, that commit adds the surprise hotplug feature to the PowerNV PCI
driver. Is the request then that we should add the surprise hotplug
feature to yakkety along with the subsequent bug fixes for that feature?
bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla | #18 |
------- Comment From <email address hidden> 2017-03-02 07:20 EDT-------
(In reply to comment #41)
> Yes, that commit adds the surprise hotplug feature to the PowerNV PCI
> driver. Is the request then that we should add the surprise hotplug
> feature to yakkety along with the subsequent bug fixes for that feature?
I have checked with Gavin, and he prefers having the surprise hotplug feature added to yakkety along with the subsequent bug fixes for the feature. We are currently trying to identify the needed patches for the 4.8 kernel.
bugproxy (bugproxy) wrote : | #19 |
------- Comment From <email address hidden> 2017-03-02 22:52 EDT-------
(In reply to comment #44)
> (In reply to comment #41)
> > Yes, that commit adds the surprise hotplug feature to the PowerNV PCI
> > driver. Is the request then that we should add the surprise hotplug
> > feature to yakkety along with the subsequent bug fixes for that feature?
>
> Have checked with Gavin and he prefers having the surprise hotplug feature
> added to yakkety with the subsequent bug fixes for the feature. Right now
> trying to identify the needed patches for 4.8 kernel.
Checked with Gavin today, and he is of the opinion that since the backport effort for 16.04.2 is potentially high, it would be better to target 16.04.3 for this feature, which would be based on a newer kernel (>=4.9) with most of the needed patches already present.
So we request that Canonical ignore this bug for the 16.04/16.10 HWE kernel trees.
~ Vaibhav
Tim Gardner (timg-tpi) wrote : | #20 |
Marking won't fix for Yakkety according to comment #19
Changed in linux (Ubuntu Yakkety): | |
status: | In Progress → Won't Fix |
bugproxy (bugproxy) wrote : | #21 |
------- Comment From <email address hidden> 2017-03-06 23:29 EDT-------
*** Bug 152125 has been marked as a duplicate of this bug. ***
Launchpad Janitor (janitor) wrote : | #22 |
This bug was fixed in the package linux - 4.10.0-11.13
---------------
linux (4.10.0-11.13) zesty; urgency=low
[ Tim Gardner ]
* Release Tracking Bug
- LP: #1669127
* linux-tools-common should Depends: lsb-release (LP: #1667571)
- [Config] linux-tools-common depends on lsb-release
* Ubuntu (Zesty): When we miss LSI/INTx interrupts on slot, message is too
imprecise (LP: #1668382)
- of/irq: improve error report on irq discovery process failure
* Zesty update to v4.10.1 stable release (LP: #1668993)
- ptr_ring: fix race conditions when resizing
- ip: fix IP_CHECKSUM handling
- net: socket: fix recvmmsg not returning error from sock_error
- tty: serial: msm: Fix module autoload
- USB: serial: mos7840: fix another NULL-deref at open
- USB: serial: cp210x: add new IDs for GE Bx50v3 boards
- USB: serial: ftdi_sio: fix modem-status error handling
- USB: serial: ftdi_sio: fix extreme low-latency setting
- USB: serial: ftdi_sio: fix line-status over-reporting
- USB: serial: spcp8x5: fix modem-status handling
- USB: serial: opticon: fix CTS retrieval at open
- USB: serial: ark3116: fix register-accessor error handling
- USB: serial: console: fix uninitialised spinlock
- x86/platform/
- goldfish: Sanitize the broken interrupt handler
- netfilter: nf_ct_helper: warn when not applying default helper assignment
- ACPICA: Linuxize: Restore and fix Intel compiler build
- block: fix double-free in the failure path of cgwb_bdi_init()
- rtlwifi: rtl_usb: Fix for URB leaking when doing ifconfig up/down
- xfs: clear delalloc and cache on buffered write failure
- Linux 4.10.1
* [UBUNTU Zesty] mlx5 - Improve OVS offload driver (LP: #1668019)
- net/sched: cls_flower: Disallow duplicate internal elements
- net/sched: cls_flower: Properly handle classifier flags dumping
- net/sched: cls_matchall: Dump the classifier flags
- net/sched: Reflect HW offload status
- net/sched: cls_flower: Reflect HW offload status
- net/sched: cls_matchall: Reflect HW offloading status
- net/sched: cls_u32: Reflect HW offload status
- net/sched: cls_bpf: Reflect HW offload status
- net/mlx5: Push min-inline mode resolution helper into the core
- IB/mlx5: Enable Eth VFs to query their min-inline value for user-space
- net/mlx5: Use exact encap header size for the FW input buffer
- net/mlx5e: Add TC offloads matching on IPv6 encapsulation headers
- net/mlx5e: TC ipv4 tunnel encap offload cosmetic changes
- net/mlx5e: Use the full tunnel key info for encapsulation offload house-keeping
- net/mlx5e: Maximize ip tunnel key usage on the TC offloading path
- net/mlx5e: Support SRIOV TC encapsulation offloads for IPv6 tunnels
- net/mlx5: E-Switch, Enlarge the FDB size for the switchdev mode
- net/mlx5: Fix static checker warnings
* [Hyper-V] Ubuntu 14.04.2 LTS Generation 2 SCSI Errors on VSS Based Backups
(LP: #1470250)
- SAUCE: Tools: hv: vss: Thaw the filesystem and continue after freeze fails
* Ubuntu17.04: Need more patches for aacraid to bring up Bost...
Changed in linux (Ubuntu): | |
status: | Fix Committed → Fix Released |
Thadeu Lima de Souza Cascardo (cascardo) wrote : | #23 |
This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-
If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.
See https:/
tags: | added: verification-needed-yakkety |
bugproxy (bugproxy) wrote : | #24 |
------- Comment From <email address hidden> 2017-03-22 04:40 EDT-------
(In reply to comment #46)
> Marking won't fix for Yakkety according to comment #19
Hello Canonical,
This bug is marked as "won't fix" for Yakkety,
so there shouldn't be any verification needed for Yakkety.
Can you please confirm about "verification-
and fix added for Yakkety?
Seth Forshee (sforshee) wrote : | #25 |
We did apply "pci/hotplug/
Changed in linux (Ubuntu Yakkety): | |
status: | Won't Fix → Fix Committed |
bugproxy (bugproxy) wrote : | #26 |
------- Comment From <email address hidden> 2017-03-24 10:34 EDT-------
Hello Canonical,
Same as for bug 1667239, the fix doesn't seem to be in the yakkety -proposed kernel Ubuntu-4.8.0-44.47.
Same reason?
Seth Forshee (sforshee) wrote : | #27 |
Yes. This is going to be true for all bugs against xenial or yakkety for which verification was requested on Mar 21.
tags: | removed: verification-needed-yakkety |
Kleber Sacilotto de Souza (kleber-souza) wrote : | #28 |
This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-
If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.
See https:/
tags: | added: verification-needed-yakkety |
tags: |
added: verification-done-yakkety removed: verification-needed-yakkety |
Launchpad Janitor (janitor) wrote : | #29 |
This bug was fixed in the package linux - 4.8.0-49.52
---------------
linux (4.8.0-49.52) yakkety; urgency=low
* linux: 4.8.0-49.52 -proposed tracker (LP: #1684427)
* [Hyper-V] hv: util: move waiting for release to hv_utils_transport itself
(LP: #1682561)
- Drivers: hv: util: move waiting for release to hv_utils_transport itself
linux (4.8.0-48.51) yakkety; urgency=low
* linux: 4.8.0-48.51 -proposed tracker (LP: #1682034)
* [Hyper-V] hv: vmbus: Raise retry/wait limits in vmbus_post_msg()
(LP: #1681893)
- Drivers: hv: vmbus: Raise retry/wait limits in vmbus_post_msg()
linux (4.8.0-47.50) yakkety; urgency=low
* linux: 4.8.0-47.50 -proposed tracker (LP: #1679678)
* CVE-2017-6353
- sctp: deny peeloff operation on asocs with threads sleeping on it
* CVE-2017-5986
- sctp: avoid BUG_ON on sctp_wait_
* vfat: missing iso8859-1 charset (LP: #1677230)
- [Config] NLS_ISO8859_1=y
* [Hyper-V] pci-hyperv: Use device serial number as PCI domain (LP: #1667527)
- net/mlx4_core: Use cq quota in SRIOV when creating completion EQs
* Regression: KVM modules should be on main kernel package (LP: #1678099)
- [Config] powerpc: Add kvm-hv and kvm-pr to the generic inclusion list
* linux-lts-xenial 4.4.0-63.84~14.04.2 ADT test failure with linux-lts-xenial
4.4.
- SAUCE: apparmor: fix link auditing failure due to, uninitialized var
* regession tests failing after stackprofile test is run (LP: #1661030)
- SAUCE: fix regression with domain change in complain mode
* Permission denied and inconsistent behavior in complain mode with 'ip netns
list' command (LP: #1648903)
- SAUCE: fix regression with domain change in complain mode
* unexpected errno=13 and disconnected path when trying to open /proc/1/ns/mnt
from a unshared mount namespace (LP: #1656121)
- SAUCE: apparmor: null profiles should inherit parent control flags
* apparmor refcount leak of profile namespace when removing profiles
(LP: #1660849)
- SAUCE: apparmor: fix ns ref count link when removing profiles from policy
* tor in lxd: apparmor="DENIED" operation=
namespace=
name=
- SAUCE: apparmor: Fix no_new_privs blocking change_onexec when using stacked
namespaces
* apparmor oops in bind_mnt when dev_path lookup fails (LP: #1660840)
- SAUCE: apparmor: fix oops in bind_mnt when dev_path lookup fails
* apparmor auditing denied access of special apparmor .null file
(LP: #1660836)
- SAUCE: apparmor: Don't audit denied access of special apparmor .null file
* apparmor label leak when new label is unused (LP: #1660834)
- SAUCE: apparmor: fix label leak when new label is unused
* apparmor reference count bug in label_merge_
- SAUCE: apparmor: fix reference count bug in label_merge_
* apparmor's raw_data file in securityfs is sometimes truncated (LP: #1638996)
- SAUCE: apparmor: fix replacement race in reading rawdata
* unix domain socket cross permission check failing with n...
Changed in linux (Ubuntu Yakkety): | |
status: | Fix Committed → Fix Released |