EPOW related RTAS event messages in kernel logs

Bug #1539102 reported by bugproxy on 2016-01-28
This bug affects 1 person
Affects          Importance   Assigned to
linux (Ubuntu)   High         Tim Gardner
Wily             Undecided    Tim Gardner
Xenial           High         Tim Gardner

Bug Description

Problem Description
===============================
EPOW related RTAS event messages in kernel logs

Machine Type = POWER8

Steps to Reproduce
=============================
1) Install Ubuntu 14.04 on a PowerVM LPAR.
2) After reboot, the following events are logged to dmesg and /var/log/syslog:

[216112.739470] Non critical power or cooling issue cleared
[216122.992606] RTAS: event: 187589, Type: EPOW, Severity: 1
[216122.992609] Non critical power or cooling issue cleared
[216129.240731] RTAS: event: 187590, Type: EPOW, Severity: 1
[216129.240733] Non critical power or cooling issue cleared
[216141.612689] RTAS: event: 187591, Type: EPOW, Severity: 1
[216141.612691] Non critical power or cooling issue cleared
[216147.830578] RTAS: event: 187592, Type: EPOW, Severity: 1
[216147.830580] Non critical power or cooling issue cleared
[216158.334408] RTAS: event: 187593, Type: EPOW, Severity: 1
[216158.334410] Non critical power or cooling issue cleared
[216199.883138] RTAS: event: 187594, Type: EPOW, Severity: 1
[216199.883140] Non critical power or cooling issue cleared
[216244.596335] RTAS: event: 187595, Type: EPOW, Severity: 1
[216244.596337] Non critical power or cooling issue cleared
[216250.853514] RTAS: event: 187596, Type: EPOW, Severity: 1
[216250.853516] Non critical power or cooling issue cleared
[216263.319986] RTAS: event: 187597, Type: EPOW, Severity: 1
[216263.319989] Non critical power or cooling issue cleared
[216271.438925] RTAS: event: 187598, Type: EPOW, Severity: 1

The log is filled with several such messages along with

RTAS: event: 13884, Type: Platform Error, Severity: 2
RTAS: event: 13885, Type: Platform Error, Severity: 2
RTAS: event: 13886, Type: Platform Error, Severity: 2
RTAS: event: 13887, Type: Platform Error, Severity: 2
RTAS: event: 13888, Type: Platform Error, Severity: 2
RTAS: event: 13889, Type: Platform Error, Severity: 2
RTAS: event: 13890, Type: Platform Error, Severity: 2
RTAS: event: 13891, Type: Platform Error, Severity: 2

Regards
Praveen
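
To gauge how many events of each kind are in a log like the one above, the RTAS lines can be tallied by type and severity. This is a minimal sketch, assuming the log lines follow the exact "RTAS: event: N, Type: T, Severity: S" layout quoted in this report; the function name `tally_rtas_events` is hypothetical.

```python
import re
from collections import Counter

# Matches kernel log lines such as:
# [216122.992606] RTAS: event: 187589, Type: EPOW, Severity: 1
RTAS_RE = re.compile(
    r"RTAS: event: (?P<id>\d+), Type: (?P<type>[^,]+), Severity: (?P<sev>\d+)"
)

def tally_rtas_events(lines):
    """Count RTAS events per (type, severity) pair (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        m = RTAS_RE.search(line)
        if m:
            counts[(m.group("type").strip(), int(m.group("sev")))] += 1
    return counts

sample = [
    "[216122.992606] RTAS: event: 187589, Type: EPOW, Severity: 1",
    "[216122.992609] Non critical power or cooling issue cleared",
    "RTAS: event: 13884, Type: Platform Error, Severity: 2",
]
for (etype, sev), n in sorted(tally_rtas_events(sample).items()):
    print(f"{etype} (severity {sev}): {n}")
```

In practice the `lines` argument could be an open file handle on /var/log/syslog or the captured output of dmesg.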

== Comment: #2 - VASANT HEGDE <email address hidden> - ==
The upstream patch below fixes this issue. I believe it will apply cleanly on top of the Ubuntu 14.04.4 kernel.

commit b4af279a7cba5cc1f665485e8ecdf272f1ba0cc5
Author: Vipin K Parashar <email address hidden>
Date: Tue Dec 1 16:43:42 2015 +0530

    powerpc/pseries: Limit EPOW reset event warnings

    Kernel prints respective warnings about various EPOW events for
    user information/action after parsing EPOW interrupts. At times
    below EPOW reset event warning is seen to be flooding kernel log
    over a period of time.
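
The commit message is cut off here, so the exact mechanism of the patch is not shown. One plausible way to limit such warnings, sketched below purely as an illustration, is to print the "cleared" message only when an earlier EPOW warning is still outstanding. The class `EpowReporter`, the meaning given to `action_code`, and the message text for warnings are assumptions, not code from the commit.

```python
class EpowReporter:
    """Hypothetical reporter that logs a 'cleared' message only when needed."""

    def __init__(self):
        self.outstanding = 0  # EPOW warnings reported and not yet cleared
        self.log = []

    def handle(self, action_code):
        # Assumption: action_code 0 stands for an EPOW reset event,
        # any other value for a power or cooling warning.
        if action_code == 0:
            if self.outstanding > 0:
                self.outstanding = 0
                self.log.append("Non critical power or cooling issue cleared")
            # A reset with nothing outstanding is dropped silently.
        else:
            self.outstanding += 1
            self.log.append(f"EPOW warning, action code {action_code}")

reporter = EpowReporter()
for code in [0, 0, 3, 0, 0]:
    reporter.handle(code)
print(reporter.log)
```

With this approach, repeated reset events that follow no warning produce no log output, which is the behaviour the reporter of this bug was missing.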

bugproxy (bugproxy) on 2016-01-28
tags: added: architecture-ppc64le bugnameltc-136190 severity-high targetmilestone-inin---
Changed in ubuntu:
assignee: nobody → Taco Screen team (taco-screen-team)

Thank you for taking the time to report this bug and helping to make Ubuntu better. It seems that your bug report is not filed about a specific source package though, rather it is just filed against Ubuntu in general. It is important that bug reports be filed about source packages so that people interested in the package can find the bugs about it. You can find some hints about determining what package your bug might be about at https://wiki.ubuntu.com/Bugs/FindRightPackage. You might also ask for help in the #ubuntu-bugs irc channel on Freenode.

To change the source package that this bug is filed about visit https://bugs.launchpad.net/ubuntu/+bug/1539102/+editstatus and add the package name in the text box next to the word Package.

[This is an automated message. I apologize if it reached you inappropriately; please just reply to this message indicating so.]

tags: added: bot-comment
vaishnavi (vaishnavi) on 2016-01-28
affects: ubuntu → linux (Ubuntu)
Changed in linux (Ubuntu):
assignee: Taco Screen team (taco-screen-team) → Canonical Kernel Team (canonical-kernel-team)
importance: Undecided → High
status: New → Triaged
bugproxy (bugproxy) on 2016-01-29
tags: added: targetmilestone-inin14044
removed: targetmilestone-inin---
Tim Gardner (timg-tpi) on 2016-01-29
Changed in linux (Ubuntu Wily):
assignee: nobody → Tim Gardner (timg-tpi)
status: New → In Progress
Changed in linux (Ubuntu Xenial):
assignee: Canonical Kernel Team (canonical-kernel-team) → Tim Gardner (timg-tpi)
status: Triaged → Fix Committed
Brad Figg (brad-figg) on 2016-02-01
Changed in linux (Ubuntu Wily):
status: In Progress → Fix Committed
Brad Figg (brad-figg) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-wily' to 'verification-done-wily'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-wily

------- Comment From <email address hidden> 2016-02-08 00:51 EDT-------
(In reply to comment #8)

Hi, I tried the kernel from the proposed repository, and the issue seems to be fixed.

kernel : 4.4.0-guilherme+

tags: added: verification-done-wily
removed: verification-needed-wily
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2016-02-08 02:20 EDT-------
(In reply to comment #9)

Sorry, the kernel version is 4.2.0-28-generic.

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2016-02-08 04:14 EDT-------
The EPOW logging issue seems to be fixed, but Platform Error events are still reported in /var/log/syslog:

[ 6136.670687] RTAS: event: 190912, Type: Platform Error, Severity: 2
[ 6144.658688] RTAS: event: 190913, Type: Platform Error, Severity: 2
[ 6152.646689] RTAS: event: 190914, Type: Platform Error, Severity: 2
[ 6160.634688] RTAS: event: 190915, Type: Platform Error, Severity: 2
[ 6168.814692] RTAS: event: 190916, Type: Platform Error, Severity: 2
[ 6176.802699] RTAS: event: 190917, Type: Platform Error, Severity: 2
[ 6184.790703] RTAS: event: 190918, Type: Platform Error, Severity: 2
[ 6192.778684] RTAS: event: 190919, Type: Platform Error, Severity: 2
[ 6200.766701] RTAS: event: 190920, Type: Platform Error, Severity: 2

Regards
Praveen
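
The spacing between the Platform Error events above can be checked by differencing the kernel timestamps. This is a small sketch, assuming each line starts with a bracketed timestamp in seconds as in the excerpt; the helper name `intervals` is hypothetical, and the walrus operator requires Python 3.8 or later.

```python
import re

# Leading kernel timestamp, e.g. "[ 6136.670687]"
TS_RE = re.compile(r"^\[\s*(\d+\.\d+)\]")

def intervals(lines):
    """Return the gaps in seconds between consecutive timestamped lines."""
    stamps = [float(m.group(1)) for line in lines if (m := TS_RE.match(line))]
    return [round(b - a, 3) for a, b in zip(stamps, stamps[1:])]

sample = [
    "[ 6136.670687] RTAS: event: 190912, Type: Platform Error, Severity: 2",
    "[ 6144.658688] RTAS: event: 190913, Type: Platform Error, Severity: 2",
    "[ 6152.646689] RTAS: event: 190914, Type: Platform Error, Severity: 2",
]
print(intervals(sample))
```

For the three sample lines the gaps come out at just under 8 seconds each, which suggests the events are reported at a steady interval rather than in bursts.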

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2016-02-09 07:27 EDT-------
>> [ 6184.790703] RTAS: event: 190918, Type: Platform Error, Severity: 2
>> [ 6192.778684] RTAS: event: 190919, Type: Platform Error, Severity: 2
>> [ 6200.766701] RTAS: event: 190920, Type: Platform Error, Severity: 2 <---

ubuntu@alp11:/var/log/dump$ servicelog --query='id=2457'
Servicelog ID: 2457
Log Timestamp: Mon Feb 8 07:45:02 2016
Event Timestamp: Mon Feb 8 11:55:13 2016
Update Timestamp: Mon Feb 8 08:45:02 2016
Type: Power Platform (RTAS) Event
Severity: 4 (WARNING)
Platform: ppc64le
Model/Serial: 8408-E8E/10A7A3V
Node Name: alp11
Reference Code: 11007691
Serviceable Event: Yes
Predictive Event: No
Disposition: 2 (Unrecoverable, Bypassed)
Call Home Status: 1 (Call Home Candidate)
Status: Open
Action Flags: a804
Event Type: 224 - Platform Event
Kernel ID: 190920 <-----
Platform ID: 5001f586
Creator ID: E - Service Processor
Subsystem ID: 63 - Power/Cooling subsystem
RTAS Severity: 44 - Unrecoverable Error, bypassed with loss of redundancy
Event Subtype: 00 - Not applicable

Extended Reference Codes:
2: 003c0001 3: 00008514 4: 00000000 5: 00000000
6: 0f9b2564 7: 1010db38 8: 1010db38 9: 0000013d

Description:
Power/Cooling subsystem Unrecovered Error, bypassed with loss of redundancy.
Refer to the system service documentation for more information.

<< Callout 1 >>
Priority H
Type 16
Procedure Id:
Location: U78C7.001.RCH0042-A9
FRU: 00LU710
Serial: YSTORCAGE0A9
CCIN: 6B75

<< Callout 2 >>
Priority L
Type 16
Procedure Id:
Location: U78C7.001.RCH0042-P3
FRU: 00E3991
Serial: YA30BG4CK013
CCIN: 2CD2

ubuntu@alp11:/var/log/dump$

These events are due to Power/Cooling subsystem errors. The system seems to have a cooling issue that is causing them.
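
The link between a kernel log entry and its servicelog record is the "Kernel ID" field, which carries the RTAS event number (190920 in the output above). The sketch below shows one way to pull that field out of the servicelog text; it assumes the "Key: value" layout shown in this comment, and `parse_servicelog` is a hypothetical helper, not part of the servicelog tool.

```python
def parse_servicelog(text):
    """Parse 'Key: value' lines from servicelog output into a dict.

    Only the first occurrence of each key is kept, so repeated keys in
    later callout sections do not overwrite the top-level fields.
    """
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        key = key.strip()
        if sep and key and key not in fields:
            fields[key] = value.strip()
    return fields

sample = """\
Servicelog ID: 2457
Type: Power Platform (RTAS) Event
Severity: 4 (WARNING)
Kernel ID: 190920 <-----
Subsystem ID: 63 - Power/Cooling subsystem
"""

fields = parse_servicelog(sample)
# The annotation arrow after the number is dropped by taking the first token.
kernel_id = int(fields["Kernel ID"].split()[0])
print(kernel_id, fields["Subsystem ID"])
```

The resulting `kernel_id` can then be compared with the event number in the "RTAS: event:" lines from dmesg.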

------- Comment From <email address hidden> 2016-02-09 07:30 EDT-------
The fix in this defect addresses the errors below, which were spurious in nature:
"Non critical power or cooling issue cleared"
These are no longer seen with the fix. The fix is working as designed.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.4.0-4.19

---------------
linux (4.4.0-4.19) xenial; urgency=low

  * update ZFS and SPL to 0.6.5.4 (LP: #1542296)
    - [Config] update spl/zfs version
    - SAUCE: (noup) Update spl to 0.6.5.4-0ubuntu2, zfs to 0.6.5.4-0ubuntu1
    - [Config] reconstruct -- drop links for zfs userspace components
    - [Config] reconstruct -- drop links for zfs userspace components -- restore spec links

  * recvmsg() fails SCM_CREDENTIALS request with EOPNOTSUPP. (LP: #1540731)
    - Revert "af_unix: Revert 'lock_interruptible' in stream receive code"

  * lxc: ADT exercise test failing with linux-4.4.0-3.17 (LP: #1542049)
    - Revert "UBUNTU: SAUCE: apparmor: fix sleep from invalid context"

  * WARNING: at /build/linux-lts-wily-W0lTWH/linux-lts-wily-4.2.0/net/core/skbuff.c:4174 (Travis IB) (LP: #1541326)
    - SAUCE: IB/IPoIB: Do not set skb truesize since using one linearskb

  * backport Microsoft Precision Touchpad palm rejection patch (LP: #1541671)
    - HID: multitouch: enable palm rejection if device implements confidence usage

  * [Ubuntu 16.04] Update qla2xxx driver for POWER (QLogic) (LP: #1541456)
    - qla2xxx: Remove unavailable firmware files
    - qla2xxx: Enable Extended Logins support
    - qla2xxx: Enable Exchange offload support.
    - qla2xxx: Enable Target counters in DebugFS.
    - qla2xxx: Add FW resource count in DebugFS.
    - qla2xxx: Added interface to send explicit LOGO.
    - qla2xxx: Delete session if initiator is gone from FW
    - qla2xxx: Wait for all conflicts before ack'ing PLOGI
    - qla2xxx: Replace QLA_TGT_STATE_ABORTED with a bit.
    - qla2xxx: Remove dependency on hardware_lock to reduce lock contention.
    - qla2xxx: Add irq affinity notification
    - qla2xxx: Add selective command queuing
    - qla2xxx: Move atioq to a different lock to reduce lock contention
    - qla2xxx: Disable ZIO at start time.
    - qla2xxx: Set all queues to 4k
    - qla2xxx: Check for online flag instead of active reset when transmitting responses
    - scsi: qla2xxxx: avoid type mismatch in comparison

  * [Hyper-V] PCI Passthrough (LP: #1541120)
    - x86/irq: Export functions to allow MSI domains in modules
    - genirq/msi: Export functions to allow MSI domains in modules

  * Update lpfc driver to 11.0.0.10 (LP: #1541592)
    - lpfc: Fix FCF Infinite loop in lpfc_sli4_fcf_rr_next_index_get.
    - lpfc: Fix the FLOGI discovery logic to comply with T11 standards
    - lpfc: Fix RegLogin failed error seen on Lancer FC during port bounce
    - lpfc: Fix driver crash when module parameter lpfc_fcp_io_channel set to 16
    - lpfc: Fix crash in fcp command completion path.
    - lpfc: Modularize and cleanup FDMI code in driver
    - lpfc: Fix RDP Speed reporting.
    - lpfc: Fix RDP ACC being too long.
    - lpfc: Make write check error processing more resilient
    - lpfc: Use new FDMI speed definitions for 10G, 25G and 40G FCoE.
    - lpfc: Fix mbox reuse in PLOGI completion
    - lpfc: Fix external loopback failure.
    - lpfc: Add logging for misconfigured optics.
    - lpfc: Delete unnecessary checks before the function call "mempool_destroy"
    - lpfc: Use kzalloc instead of kmalloc
...


Changed in linux (Ubuntu Xenial):
status: Fix Committed → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.2.0-30.35

---------------
linux (4.2.0-30.35) wily; urgency=low

  [ Seth Forshee ]

  * SAUCE: cred: Add clone_cred() interface
    - LP: #1531747, #1534961, #1535150
    - CVE-2016-1575 CVE-2016-1576
  * SAUCE: overlayfs: Use mounter's credentials instead of selectively
    raising caps
    - LP: #1531747, #1534961, #1535150
    - CVE-2016-1575 CVE-2016-1576
  * SAUCE: overlayfs: Skip permission checking for trusted.overlayfs.*
    xattrs
    - LP: #1531747, #1534961, #1535150
    - CVE-2016-1575 CVE-2016-1576
  * SAUCE: overlayfs: Be more careful about copying up sxid files
    - LP: #1534961, #1535150
    - CVE-2016-1575 CVE-2016-1576
  * SAUCE: overlayfs: Propogate nosuid from lower and upper mounts
    - LP: #1534961, #1535150
    - CVE-2016-1575 CVE-2016-1576

linux (4.2.0-29.34) wily; urgency=low

  [ Luis Henriques ]

  * Release Tracking Bug
    - LP: #1543167

  [ Brad Figg ]

  * Revert "SAUCE: apparmor: fix sleep from invalid context"
    - LP: #1542049

  [ Upstream Kernel Changes ]

  * Revert "af_unix: Revert 'lock_interruptible' in stream receive code"
    - LP: #1540731

linux (4.2.0-28.33) wily; urgency=low

  [ Brad Figg ]

  * Release Tracking Bug
    - LP: #1540634

  [ Brad Figg ]

  * CONFIG: CONFIG_DEBUG_UART_BCM63XX is not set

  [ J. R. Okajima ]

  * SAUCE: ubuntu: aufs: tiny, extract a new func xino_fwrite_wkq()
    - LP: #1533043
  * SAUCE: ubuntu: aufs: for 4.3, XINO handles EINTR from the dying process
    - LP: #1533043

  [ John Johansen ]

  * SAUCE: (no-up): apparmor: fix for failed mediation of socket that is
    being shutdown
    - LP: #1446906
  * SAUCE: apparmor: fix sleep from invalid context
    - LP: #1539349

  [ Tim Gardner ]

  * [Config] Add pvpanic to virtual flavour
    - LP: #1537923

  [ Upstream Kernel Changes ]

  * Revert "ACPI / LPSS: allow to use specific PM domain during ->probe()"
    - LP: #1540532
  * tools: Add a "make all" rule
    - LP: #1536370
  * vf610_adc: Fix internal temperature calculation
    - LP: #1536370
  * iio: lpc32xx_adc: fix warnings caused by enabling unprepared clock
    - LP: #1536370
  * iio:ad5064: Make sure ad5064_i2c_write() returns 0 on success
    - LP: #1536370
  * iio: ad5064: Fix ad5629/ad5669 shift
    - LP: #1536370
  * iio:ad7793: Fix ad7785 product ID
    - LP: #1536370
  * iio: adc: vf610_adc: Fix division by zero error
    - LP: #1536370
  * mmc: mmc: Improve reliability of mmc_select_hs200()
    - LP: #1536370
  * mmc: mmc: Fix HS setting in mmc_select_hs400()
    - LP: #1536370
  * mmc: mmc: Move mmc_switch_status()
    - LP: #1536370
  * mmc: mmc: Improve reliability of mmc_select_hs400()
    - LP: #1536370
  * crypto: qat - don't use userspace pointer
    - LP: #1536370
  * iio: si7020: Swap data byte order
    - LP: #1536370
  * iio: adc: xilinx: Fix VREFN scale
    - LP: #1536370
  * ipmi: Start the timer and thread on internal msgs
    - LP: #1536370
  * drm/i915: quirk backlight present on Macbook 4, 1
    - LP: #1536370
  * drm/i915: get runtime PM reference around GEM set_caching IOCTL
    - LP: #1536370
  * drm/radeon: Disable uncacheable CPU mappings of GTT with RV6xx
    - LP: #1536370
  *...

Changed in linux (Ubuntu Wily):
status: Fix Committed → Fix Released