Surelock: GA2: ppc64-diag package opal_errd not autostarting on Ubuntu 15.10 (in systemd)

Bug #1505088 reported by bugproxy
This bug affects 1 person
Affects              Status        Importance  Assigned to       Milestone
ppc64-diag (Ubuntu)  Fix Released  Undecided   Taco Screen team
Wily                 Fix Released  Critical    Steve Langasek

Bug Description

[SRU justification]
The switch from upstart to systemd in 15.10 has left the ppc64-diag package's daemons not correctly autostarting on boot. This prevents the proper operation of the package's functionality.

[Regression potential]
Because we are now running code that was previously not being autorun on Ubuntu 15.10, regressions are possible. However, this same upstream version of code is also included in trusty-updates and vivid-updates, where the services do correctly autostart because those releases use upstart rather than systemd, and there are no reports of problems against them.

[Test case]
1. Install ppc64-diag on wily. Confirm that the opal service is not running:
$ systemctl status opal_errd.service
 opal_errd.service - opal_errd (PowerNV platform error handling) Service
   Loaded: loaded (/usr/lib/systemd/system/opal_errd.service; disabled; vendor preset: enabled)
   Active: inactive (dead)
2. Install ppc64-diag from wily-proposed. Confirm that the opal service is now running.
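
The check in the test case can also be scripted. The following is a minimal sketch, assuming systemctl is on PATH; check_unit is a hypothetical helper name and is not part of ppc64-diag. It relies on "systemctl is-enabled" and "systemctl is-active", which print the unit state and return non-zero when the unit is not enabled or not active.

```shell
# Hypothetical helper: report whether a unit is enabled at boot and running.
# Returns success only if both conditions hold.
check_unit() {
    # "|| true" keeps the script alive under "set -e" when the state is negative.
    enabled=$(systemctl is-enabled "$1" 2>/dev/null) || true
    active=$(systemctl is-active "$1" 2>/dev/null) || true
    echo "$1: enabled=${enabled:-unknown} active=${active:-unknown}"
    [ "$enabled" = "enabled" ] && [ "$active" = "active" ]
}
```

Running "check_unit opal_errd.service" should fail with the original wily package and succeed once the package from wily-proposed is installed.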

[Original bug report]
== Comment: #0 - Application Cdeadmin <email address hidden> - 2015-09-19 18:46:05 ==

== Comment: #1 - Application Cdeadmin <email address hidden> - 2015-09-19 18:46:07 ==
==== State: Open by: belldi on 19 September 2015 17:35:01 ====

Headline
++++++++++
Surelock: GA2: z138e: System and FSP dumps are not moving from FSP to Ubuntu 15.10 partition during continuous user dump test, TER 93148

Note(s)
++++++
This system is in use for Surelock GA2 Hardware Systems Test. You should contact the system tester or defect submitter before performing any disruptive actions.

Problem Description
++++++++++++++++++++
On the Ubuntu 15.10 partition, I created the directory /var/log/dump and installed the ppc64-diag package. I ran the following commands to complete both tasks.
sudo mkdir /var/log/dump #Create dump directory
sudo apt-get install ppc64-diag #install PPC64 diagnostic package

Afterwards, I ran HTX to exercise the hardware in the system, specifically focused on exercising the path between the Corsa-CAPI cards and the flash system. Then, I started TER 93148 for continuous user dump. For TER 93148, I ran a script that continuously initiates a USER dump about every 2 hours. The system dump and FSP dump are expected to move from the FSP to the Ubuntu partition after the system reaches runtime and boots the Ubuntu partition. However, the system and FSP dumps aren't getting moved from the FSP to the Ubuntu partition.

System and FSP dumps still present on FSP
Not getting moved to ubuntu partition despite
creating /var/log/dump/ and installing ppc64-diag
on ubuntu partition
++++++++++++++++++++++++++++++++++++++++++++++
$ rtim timeofday

System time is valid: 2015/09/19 22:05:14.744413 <-- GMT time

mchollin@cougarp01:~$ sudo systemctl status opal_errd.service
● opal_errd.service - opal_errd (PowerNV platform error handling) Service
   Loaded: loaded (/usr/lib/systemd/system/opal_errd.service; disabled; vendor preset: enabled)
   Active: inactive (dead)
mchollin@cougarp01:~$ sudo systemctl enable opal_errd.service
Synchronizing state of opal_errd.service with SysV init with /lib/systemd/systemd-sysv-install...
Executing /lib/systemd/systemd-sysv-install enable opal_errd
Created symlink from /etc/systemd/system/multi-user.target.wants/opal_errd.service to /usr/lib/systemd/system/opal_errd.service.
mchollin@cougarp01:~$ sudo systemctl status opal_errd.service
● opal_errd.service - opal_errd (PowerNV platform error handling) Service
   Loaded: loaded (/usr/lib/systemd/system/opal_errd.service; enabled; vendor preset: enabled)
   Active: inactive (dead)
mchollin@cougarp01:~$

The unit is "disabled" initially, and after I manually run "enable" it becomes "enabled". This seems to indicate that the debs we installed aren't doing the right things after installation. This is a bug that needs to go to Launchpad to be fixed in Ubuntu; it is not specific to Surelock.
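
Note that in the session above the unit is still "inactive (dead)" after "enable": enabling only creates the symlink used at the next boot and does not start the daemon. A minimal sketch of doing both steps, assuming systemctl is available and the caller has root privileges (enable_and_start is a hypothetical helper name):

```shell
# Hypothetical helper: enable a unit for future boots, then start it now.
# "enable" alone only creates the wants/ symlink; it does not start the unit.
enable_and_start() {
    systemctl enable "$1" && systemctl start "$1"
}
```

For example, "enable_and_start opal_errd.service" run as root. This is only a manual workaround; the package itself should enable and start the daemon on installation.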

It's a good find either way.

== Comment: #16 - Application Cdeadmin <email address hidden> - 2015-10-09 17:16:04 ==
==== State: Open by: belldi on 09 October 2015 16:05:27 ====

I'm reopening defect. Issue isn't resolved yet.

== Comment: #17 - Stewart Smith <email address hidden> - 2015-10-09 18:17:41 ==
Moving to BugsAgainstDistros for Ubuntu.

== Comment: #18 - Application Cdeadmin <email address hidden> - 2015-10-11 22:25:07 ==
==== State: Open by: belldi on 11 October 2015 12:48:45 ====

Action = [reopen]

==== State: Open by: mchollin on 11 October 2015 21:22:39 ====

Summary: ppc64-diag, when installed on Ubuntu 15.10 does not automatically "enable" the opal_errd daemon. This means that opal_errd, while installed, does not actually -start- on boot.

LTC - please move this to the appropriate component (launchpad? elsewhere?) so that the ppc64-diag package for Ubuntu 15.10 can be fixed. Any other distro that relies on systemd should probably be checked as well, to make sure all expected daemons actually -start- when ppc64-diag is installed.

bugproxy (bugproxy)
tags: added: architecture-ppc64 bugnameltc-130889 severity-high targetmilestone-inin---
Ubuntu Foundations Team Bug Bot (crichton) wrote :

Thank you for taking the time to report this bug and helping to make Ubuntu better. It seems that your bug report is not filed about a specific source package though, rather it is just filed against Ubuntu in general. It is important that bug reports be filed about source packages so that people interested in the package can find the bugs about it. You can find some hints about determining what package your bug might be about at https://wiki.ubuntu.com/Bugs/FindRightPackage. You might also ask for help in the #ubuntu-bugs irc channel on Freenode.

To change the source package that this bug is filed about visit https://bugs.launchpad.net/ubuntu/+bug/1505088/+editstatus and add the package name in the text box next to the word Package.

[This is an automated message. I apologize if it reached you inappropriately; please just reply to this message indicating so.]

tags: added: bot-comment
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2015-10-14 20:35 EDT-------
==== State: Open by: belldi on 14 October 2015 15:32:35 ====

I manually enabled the opal_errd daemon on surelock02p03.aus.stglabs.ibm.com by running the following commands. The opal_errd daemon now starts automatically after booting the Ubuntu partition. The next step is for opal_errd to start automatically after installing the ppc64-diag package.

Enable opal_errd daemon
++++++++++++++++++++++++
systemctl enable opal_errd
systemctl start opal_errd

opal_errd daemon is running
+++++++++++++++++++++++++++
ubuntu@surelock02p03:~$ ps -ef|grep -i opal_errd
root 8684 1 0 Oct12 ? 00:02:03 /usr/sbin/opal_errd
ubuntu 96160 8934 0 14:18 hvc0 00:00:00 grep --color=auto -i opal_errd

After starting opal_errd daemon, dumps are moved from FSP to /var/log/dump on partition
+++++++++++++++++++++++++++++++++++++++++++++++++++++++
ubuntu@surelock02p03:~$ ls -l /var/log/dump/
total 37392
-r--r----- 1 root root 6563460 Oct 8 11:17 FSPDUMP.101194A.6A000005.20151007022920
-r--r----- 1 root root 6683326 Oct 13 14:47 FSPDUMP.101194A.7A000005.20151013194230
-r--r----- 1 root root 12524752 Oct 8 11:17 SYSDUMP.101194A.0000000F.20151007021819
-r--r----- 1 root root 12509728 Oct 13 14:47 SYSDUMP.101194A.00000010.20151013193135

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2015-11-04 06:01 EDT-------
Dionysius,

Can you please provide system access details (login/pw) of surelock02p03.aus.stglabs.ibm.com

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2015-11-06 07:37 EDT-------
Going through the ppc64-diag packaging, I found an Ubuntu-specific
patch named "no-upstream-init.patch", which comments out the
installation of the opal_errd service unit that is required for opal_errd
to start at boot.

By installation, I mean the result of manually running the
systemctl command:

# systemctl enable opal_errd.service

In brief, this command creates a symbolic link to the
opal_errd.service unit at /systemd/system/multi-user.target.wants/opal_errd.service,
which enables opal_errd to run during boot.

The patch description reads: "Description: Don't install upstream systemd or sysvinit
scripts for now. Skip installing systemd units for now, since they also seem
to depend on installing upstream's version of the init jobs."

I am not aware of the history of this patch. Can the Ubuntu maintainer
comment on it, and is there any extra step or package that could install
the systemd service or add it to the init scripts?
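
Whether that link is present can be checked directly. This is a minimal sketch, assuming the full link location shown in the earlier "systemctl enable" output (/etc/systemd/system/multi-user.target.wants/); has_wants_link and its optional root argument are hypothetical and only there to allow checking a chroot or an unpacked image.

```shell
# Hypothetical helper: succeed if the multi-user.target.wants symlink for
# the given unit exists. $1 = unit name, $2 = optional filesystem root.
has_wants_link() {
    [ -L "${2:-}/etc/systemd/system/multi-user.target.wants/$1" ]
}
```

For example: has_wants_link opal_errd.service && echo "enabled at boot".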

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2015-11-17 11:45 EDT-------
This looks very similar to #127390, which is Launchpad #1481580.

We want both daemons to be installed by default, as the same ISO is supported in various environments.

-Vasant

tags: added: severity-critical
removed: severity-high
Changed in ubuntu:
assignee: nobody → Taco Screen team (taco-screen-team)
Frédéric Bonnard (frediz) wrote :

Here is a patch.
Some comments :
- the Makefile installed the systemd unit files in the wrong system unit directory, so they were not detected and not enabled/started by dh_installinit, so I added a small debian/patch.
- then, once the path is fixed, the call to invoke-rc.d in the generated postinst did not start the daemons because "We never start disabled jobs;" (cf. invoke-rc.d code), and we need to enable the daemons in the post-installation. I couldn't do it with "update-rc.d enable" in the configure step of the postinst script, as "update-rc.d defaults" needs to be done before... So I used dh_systemd, which uses dh_systemd_enable and dh_systemd_start.
It worked here.
Please review, thanks!
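
The unit location issue described in the first point can be spotted from a package file list. This is a rough sketch, assuming the list comes from something like "dpkg -L ppc64-diag"; classify_units is a hypothetical helper. It treats /lib/systemd/system as the expected directory, because that is where the unit is loaded from once the fixed package is installed, while the original package shipped it under /usr/lib/systemd/system.

```shell
# Hypothetical helper: read a file list on stdin and flag .service files
# that are not installed under /lib/systemd/system.
classify_units() {
    grep '\.service$' | while IFS= read -r path; do
        case "$path" in
            /lib/systemd/system/*) echo "ok    $path" ;;
            *)                     echo "check $path" ;;
        esac
    done
}
```

For example: dpkg -L ppc64-diag | classify_units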

Ubuntu Foundations Team Bug Bot (crichton) wrote :

The attachment "enable-and-start-daemons.patch" seems to be a patch. If it isn't, please remove the "patch" flag from the attachment, remove the "patch" tag, and if you are a member of the ~ubuntu-reviewers, unsubscribe the team.

[This is an automated message performed by a Launchpad user owned by ~brian-murray, for any issues please contact him.]

tags: added: patch
bugproxy (bugproxy)
tags: added: targetmilestone-inin1510
removed: targetmilestone-inin---
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2015-12-10 19:48 EDT-------
Hello Canonical,

Can you review the patch please? Thanks.

Breno Leitão (breno-leitao) wrote :

This is a high priority bug from IBM side.

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2016-01-08 08:52 EDT-------
(In reply to comment #41)
> This is a high priority bug from IBM side.

Any Update here ?

-Vasant

Steve Langasek (vorlon) wrote :

Fix uploaded to xenial:

ppc64-diag (2.6.9-0ubuntu2) xenial; urgency=medium

  [ Frederic Bonnard ]
  * Fix failure to autostart the service on install on systemd-using systems
    (LP: #1505088):
    - debian/patches/systemd-installdir.patch: Fix path for systemd system
      services.
    - build-depend on dh-systemd.

affects: ubuntu → ppc64-diag (Ubuntu)
Changed in ppc64-diag (Ubuntu):
status: New → Fix Released
Breno Leitão (breno-leitao) wrote :

Thanks Steve.

Could you also "backport" it to 15.10? This is the only OS supported for Surelock GA2, and we would like to have it fixed there while 16.04 is still unavailable.

Steve Langasek (vorlon) wrote : Re: [Bug 1505088] Re: Surelock: GA2: ppc64-diag package opal_errd not autostarting on Ubuntu 15.10 (in systemd)

On Mon, Jan 11, 2016 at 06:12:58PM -0000, Breno Leitão wrote:
> Could you also "backport" it to 15.10? This is the only OS supported
> for Surelock GA2, and we would like to have it fixed there while
> 16.04 is still unavailable.

I'm not familiar with Surelock, but if this is a question of hardware not
supported by older kernels, will this be supported on 14.04.4, to be
released soon?

bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2016-01-12 08:15 EDT-------
Hi Steve,

> I'm not familiar with Surelock, but if this is a question of hardware not
> supported by older kernels, will this be supported on 14.04.4, to be
> released soon?

Sorry, this is a hardware solution used with flash CAPI[1] and enabled on the Ubuntu 15.10 kernel (4.2).

Ubuntu 15.10 is the only OS that supports it at the moment, and that is why we need the fix in 15.10.
Ubuntu 14.04.4 might technically be able to run the hardware, but I am not sure we will make it a supported OS. I understand that the tests were done only on 15.10. 16.04 might be the alternative once 15.10 reaches EOL.

[1] http://www-03.ibm.com/systems/power/solutions/bigdata-analytics/data-engine/

Steve Langasek (vorlon)
Changed in ppc64-diag (Ubuntu Wily):
status: New → Triaged
assignee: nobody → Steve Langasek (vorlon)
importance: Undecided → Critical
Steve Langasek (vorlon)
description: updated
Chris J Arges (arges) wrote : Please test proposed package

Hello bugproxy, or anyone else affected,

Accepted ppc64-diag into wily-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/ppc64-diag/2.6.9-0ubuntu1.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in ppc64-diag (Ubuntu Wily):
status: Triaged → Fix Committed
tags: added: verification-needed
bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2016-01-18 13:11 EDT-------
Patch provided appears to fix the problem

root@z1381:/var/log# systemctl status opal_errd.service
opal_errd.service
Loaded: masked (/dev/null)
Active: inactive (dead) since Mon 2016-01-18 11:40:12 CST; 3min 29s ago
Main PID: 1009 (code=exited, status=0/SUCCESS)

Jan 18 11:25:45 z1381 systemd[1]: Starting opal_errd (PowerNV platform erro.....
Jan 18 11:25:45 z1381 systemd[1]: Started opal_errd (PowerNV platform error...e.
Jan 18 11:31:59 z1381 ELOG[1009]: LID[50db8358]::SRC[B1763321]::Other Subsy...ed
Jan 18 11:40:12 z1381 systemd[1]: Stopping opal_errd (PowerNV platform erro.....
Jan 18 11:40:12 z1381 systemd[1]: Stopped opal_errd (PowerNV platform error...e.
Hint: Some lines were ellipsized, use -l to show in full

root@z1381:/SW322827# ls
ppc64-diag_2.6.9-0ubuntu1.1_ppc64el.deb
ppc64-diag-dbgsym_2.6.9-0ubuntu1.1_ppc64el.ddeb
root@z1381:/SW322827# dpkg -i *
Selecting previously unselected package ppc64-diag.
(Reading database ... 130376 files and directories currently installed.)
Preparing to unpack ppc64-diag_2.6.9-0ubuntu1.1_ppc64el.deb ...
Unpacking ppc64-diag (2.6.9-0ubuntu1.1) ...
Selecting previously unselected package ppc64-diag-dbgsym.
Preparing to unpack ppc64-diag-dbgsym_2.6.9-0ubuntu1.1_ppc64el.ddeb ...
Unpacking ppc64-diag-dbgsym (2.6.9-0ubuntu1.1) ...
Setting up ppc64-diag (2.6.9-0ubuntu1.1) ...
Setting up ppc64-diag-dbgsym (2.6.9-0ubuntu1.1) ...
Processing triggers for man-db (2.7.4-1) ...
Processing triggers for ureadahead (0.100.0-19) ...
Processing triggers for systemd (225-1ubuntu9) ...

root@z1381:/SW322827# systemctl status opal_errd.service
opal_errd.service - opal_errd (PowerNV platform error handling) Service
Loaded: loaded (/lib/systemd/system/opal_errd.service; enabled; vendor preset: enabled)
Active: active (running) since Mon 2016-01-18 11:44:52 CST; 56s ago
Main PID: 2552 (opal_errd)
CGroup: /system.slice/opal_errd.service
2552 /usr/sbin/opal_errd

Jan 18 11:44:52 z1381 systemd[1]: Starting opal_errd (PowerNV platform erro.....
Jan 18 11:44:52 z1381 ELOG[2552]: LID[50db8680]::SRC[B1763321]::Other Subsy...ed
Jan 18 11:44:52 z1381 systemd[1]: Started opal_errd (PowerNV platform error...e.
Hint: Some lines were ellipsized, use -l to show in full.
root@z1381:/SW322827# date
Mon Jan 18 11:45:52 CST 2016
root@z1381:/SW322827#

tags: added: verification-done
removed: verification-needed
Brian Murray (brian-murray) wrote : Update Released

The verification of the Stable Release Update for ppc64-diag has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package ppc64-diag - 2.6.9-0ubuntu1.1

---------------
ppc64-diag (2.6.9-0ubuntu1.1) wily; urgency=medium

  [ Frederic Bonnard ]
  * Fix failure to autostart the service on install on systemd-using systems
    (LP: #1505088):
    - debian/patches/systemd-installdir.patch: Fix path for systemd system
      services.
    - build-depend on dh-systemd.

 -- Steve Langasek <email address hidden> Fri, 08 Jan 2016 09:59:40 -0800

Changed in ppc64-diag (Ubuntu Wily):
status: Fix Committed → Fix Released