[opal-prd] opal-prd is consuming 100% CPU

Bug #1765460 reported by bugproxy
Affects                           Status        Importance  Assigned to                 Milestone
The Ubuntu-power-systems project  Fix Released  Critical    Canonical Foundations Team
skiboot (Ubuntu)                  Fix Released  Critical    Canonical Foundations Team

Bug Description

---Problem Description---
opal-prd fix for Ubuntu 18.04

In some corner cases where opal-prd fails to access or read the flash device (/dev/mtd0), opal-prd consumes 100% CPU. This mostly happens at boot time, when the opal-prd daemon tries to start before the mtd driver is loaded. It is not easy to reproduce; it happens randomly.

---uname output---
Ubuntu 18.04

Machine Type = OpenPower System

Userspace tool common name: opal-prd

The upstream patch below fixes this issue.

commit cb16e55a234b91fd42112904cff15094fbae680d
Author: Vasant Hegde <email address hidden>
Date: Tue Apr 3 23:08:41 2018 +0530

    opal-prd: Insert powernv_flash module

    Explictly load powernv_flash module on BMC based system so that we are sure
    that flash device is created before starting opal-prd daemon.

-Vasant
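
For illustration only, the idea in the commit message (make sure the flash device exists before the daemon relies on it) can be sketched as follows. This is not the upstream patch. The function name ensure_flash_device, its parameters, and the timeout and interval values are hypothetical; only the module name powernv_flash and the device path /dev/mtd0 come from this report. The sketch runs modprobe and then waits a bounded time for the device node, sleeping between checks rather than spinning.

```python
import os
import subprocess
import time


def ensure_flash_device(module="powernv_flash", device="/dev/mtd0",
                        timeout=5.0, interval=0.5,
                        run=subprocess.run, exists=os.path.exists,
                        sleep=time.sleep):
    """Load `module`, then wait up to `timeout` seconds for `device`.

    Hypothetical helper: returns True once the device node exists,
    False if modprobe fails or the device does not appear in time.
    The run/exists/sleep parameters exist only so the logic can be
    exercised without root privileges or real hardware.
    """
    # modprobe succeeds without doing anything if the module is already loaded.
    result = run(["modprobe", module])
    if result.returncode != 0:
        return False

    waited = 0.0
    while not exists(device):
        if waited >= timeout:
            return False
        sleep(interval)  # sleep between checks instead of busy-looping
        waited += interval
    return True
```

A caller would run this (as root) before starting the daemon and refuse to start, or retry later, when it returns False.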

bugproxy (bugproxy)
tags: added: architecture-ppc64le bugnameltc-166963 severity-critical targetmilestone-inin1804
Changed in ubuntu:
assignee: nobody → Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage)
affects: ubuntu → skiboot (Ubuntu)
Frank Heimes (fheimes)
Changed in ubuntu-power-systems:
status: New → Triaged
importance: Undecided → Critical
assignee: nobody → Canonical Foundations Team (canonical-foundations)
tags: added: triage-g
Manoj Iyer (manjo)
Changed in skiboot (Ubuntu):
assignee: Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage) → Canonical Foundations Team (canonical-foundations)
importance: Undecided → Critical
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package skiboot - 5.10~rc4-1ubuntu1

---------------
skiboot (5.10~rc4-1ubuntu1) bionic; urgency=medium

  * debian/patches/0001-opal-prd-Insert-powernv_flash-module.patch:
    cherry-pick from upstream to fix opal-prd spinning and consuming 100%
    CPU if it starts on boot before the mtd module has been loaded.
    LP: #1765460.

 -- Steve Langasek <email address hidden> Mon, 23 Apr 2018 09:50:43 -0700

Changed in skiboot (Ubuntu):
status: New → Fix Released
Changed in ubuntu-power-systems:
status: Triaged → Fix Released
bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2018-04-25 02:49 EDT-------
Tested with package version 5.10~rc4-1ubuntu1; it is working fine.

root@ltc-boston125:~# service opal-prd status
● opal-prd.service - OPAL PRD daemon
   Loaded: loaded (/lib/systemd/system/opal-prd.service; enabled; vendor preset: enabled)
   Active: active (running) since Wed 2018-04-25 00:40:44 CDT; 1min 38s ago
     Docs: man:opal-prd(8)
 Main PID: 3750 (opal-prd)
    Tasks: 1 (limit: 19660)
   CGroup: /system.slice/opal-prd.service
           └─3750 /usr/sbin/opal-prd

Apr 25 00:40:46 ltc-boston125 opal-prd[3750]: HBRT: PRDF:<<PRDF::initialize()
Apr 25 00:40:46 ltc-boston125 opal-prd[3750]: HBRT: ATTN_SLOW:I>Service::enableAttns() enter
Apr 25 00:40:46 ltc-boston125 opal-prd[3750]: HBRT: TARG:[TARG] I> getNextTarget: Using next node 1
Apr 25 00:40:46 ltc-boston125 opal-prd[3750]: HBRT: TARG:[TARG] E> getNextTarget: Node 0 targets: first 0x702ede5efc04, current 0x702ede5f4ae8, last 0x702ede5f4ae8
Apr 25 00:40:46 ltc-boston125 opal-prd[3750]: HBRT: TARG:[TARG] E> getNextTarget: Target not found
Apr 25 00:40:46 ltc-boston125 opal-prd[3750]: HBRT: ATTN_SLOW:I>Service::enableAttns() exit
Apr 25 00:40:46 ltc-boston125 opal-prd[3750]: HBRT: ATTN_SLOW:I><<ATTN_RT::enableAttns rc: 0
Apr 25 00:40:46 ltc-boston125 opal-prd[3750]: HBRT: calling get_ipoll_events
Apr 25 00:40:46 ltc-boston125 opal-prd[3750]: HBRT: enabling IPOLL events 0x5b90000000000000
Apr 25 00:40:46 ltc-boston125 opal-prd[3750]: FW: writing init message
root@ltc-boston125:~# opal-prd --version
opal-prd opal-prd-5.10~rc4
root@ltc-boston125:~# dpkg -l | grep -i opal
ii opal-prd 5.10~rc4-1ubuntu1 ppc64el OPAL Processor Recovery Diagnostics daemon
ii opal-utils 5.10~rc4-1ubuntu1 ppc64el OPAL firmware utilities
root@ltc-boston125:~#

root@ltc-boston125:~# service opal-prd status
● opal-prd.service - OPAL PRD daemon
   Loaded: loaded (/lib/systemd/system/opal-prd.service; enabled; vendor preset: enabled)
   Active: active (running) since Wed 2018-04-25 00:40:44 CDT; 19min ago
     Docs: man:opal-prd(8)
 Main PID: 3750 (opal-prd)
    Tasks: 1 (limit: 19660)
   CGroup: /system.slice/opal-prd.service
           └─3750 /usr/sbin/opal-prd

Apr 25 00:40:46 ltc-boston125 opal-prd[3750]: HBRT: PRDF:<<PRDF::initialize()
Apr 25 00:40:46 ltc-boston125 opal-prd[3750]: HBRT: ATTN_SLOW:I>Service::enableAttns() enter
Apr 25 00:40:46 ltc-boston125 opal-prd[3750]: HBRT: TARG:[TARG] I> getNextTarget: Using next node 1
Apr 25 00:40:46 ltc-boston125 opal-prd[3750]: HBRT: TARG:[TARG] E> getNextTarget: Node 0 targets: first 0x702ede5efc04, current 0x702ede5f4ae8, last 0x702ede5f4ae8
Apr 25 00:40:46 ltc-boston125 opal-prd[3750]: HBRT: TARG:[TARG] E> getNextTarget: Target not found
Apr 25 00:40:46 ltc-boston125 opal-prd[3750]: HBRT: ATTN_SLOW:I>Service::enableAttns() exit
Apr 25 00:40:46 ltc-boston125 opal-prd[3750]: HBRT: ATTN_SLOW:I><<ATTN_RT::enableAttns rc: 0
Apr 25 00:40:46 ltc-boston125 opal-prd[3750]: HBRT: calling get_ipoll_events
Apr 25 00:40:46 ltc-boston125 opal-prd[3750]: HBRT: enabling IPOLL events 0x5b90000000000000
Apr 25 00:40:46 ltc-boston125 opal-prd[3750]: FW: writing init message
...
