[witherspoon] does not respond to ipmi commands

Bug #1811439 reported by Manoj Iyer
This bug affects 2 people
Affects                           Status   Importance  Assigned to  Milestone
Checkbox Provider - Base          Invalid  High        Unassigned
The Ubuntu-power-systems project  Invalid  High        bugproxy

Bug Description

Witherspoon BMC does not respond to IPMI commands from an external system (like MAAS).

BMC FIRMWARE VERSION: ibm-v2.1-438-g0030304-r12-0-g5ee4fb0

== From the witherspoon BMC ==

# systemctl status phosphor-ipmi-host

● phosphor-ipmi-host.service - Phosphor Inband IPMI
   Loaded: loaded (/lib/systemd/system/phosphor-ipmi-host.service; enabled; vendor preset: enabled)
   Active: active (running) since Wed 2019-01-09 17:27:33 UTC; 1 day 22h ago
 Main PID: 6094 (ipmid)
   CGroup: /system.slice/phosphor-ipmi-host.service
           └─6094 ipmid

Jan 11 13:09:31 witherspoon ipmid[6094]: Host time: 1547210155580801, Fri Jan 11 12:35:55 2019
Jan 11 13:09:31 witherspoon ipmid[6094]: Handling Set-SEL-Time:[0xA], Cmd:[0x49], Data:[0x5C38904C]
Jan 11 13:09:31 witherspoon ipmid[6094]: To Set host time: 1547210828000000, Fri Jan 11 12:47:08 2019
Jan 11 13:09:31 witherspoon ipmid[6094]: IPMI Handling GET-SEL-TIME
Jan 11 13:09:31 witherspoon ipmid[6094]: Host time: 1547210827681434, Fri Jan 11 12:47:07 2019
Jan 11 13:09:31 witherspoon ipmid[6094]: Handling Set-SEL-Time:[0xA], Cmd:[0x49], Data:[0x5C3892EC]
Jan 11 13:09:31 witherspoon ipmid[6094]: To Set host time: 1547211500000000, Fri Jan 11 12:58:20 2019
Jan 11 13:09:31 witherspoon ipmid[6094]: IPMI Handling GET-SEL-TIME
Jan 11 13:09:31 witherspoon ipmid[6094]: Host time: 1547211499565471, Fri Jan 11 12:58:19 2019
Jan 11 13:09:32 witherspoon ipmid[6094]: Handling Set-SEL-Time:[0xA], Cmd:[0x49], Data:[0x5C38958C]
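
A side note on reading these log lines: the Data value in each Set-SEL-Time entry looks like a 32-bit count of seconds since the Unix epoch. 0x5C38904C is 1547210828, which matches the 1547210828000000 microsecond value in the "To Set host time" entry that follows it. A minimal sketch for decoding such values, assuming a POSIX shell and (for the second helper) GNU date; the function names are made up for illustration:

```shell
# Hypothetical helper: convert the hex Data value from an ipmid
# "Set-SEL-Time" log line into seconds since the Unix epoch.
sel_time_to_epoch() {
    # Accept the value with or without a leading 0x.
    hex=${1#0x}
    printf '%d\n' "0x${hex}"
}

# Hypothetical helper: print the decoded time in UTC.
# The -d @SECONDS form is GNU date syntax (an assumption about the host).
sel_time_to_utc() {
    date -u -d "@$(sel_time_to_epoch "$1")" '+%a %b %d %H:%M:%S %Y'
}

sel_time_to_epoch 0x5C38904C   # prints 1547210828
# Example (not run here): sel_time_to_utc 0x5C38904C
```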

== From my local system ==
$ ipmitool -I lanplus -H 10.245.71.111 -U root -P 0penBmc sol activate
Error: Unable to establish IPMI v2 / RMCP+ session

$ ping 10.245.71.111
PING 10.245.71.111 (10.245.71.111) 56(84) bytes of data.
64 bytes from 10.245.71.111: icmp_seq=1 ttl=60 time=54.9 ms
64 bytes from 10.245.71.111: icmp_seq=2 ttl=60 time=54.2 ms
64 bytes from 10.245.71.111: icmp_seq=3 ttl=60 time=68.5 ms
64 bytes from 10.245.71.111: icmp_seq=4 ttl=60 time=60.7 ms

--- 10.245.71.111 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 6ms
rtt min/avg/max/mdev = 54.187/59.593/68.503/5.739 ms

I tried stopping and starting the ipmid and netipmid services on the BMC, but it still does not respond to IPMI commands.
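
Since ping only exercises ICMP, it may be worth checking separately whether anything answers on the standard IPMI-over-LAN port, UDP 623. One way is an RMCP/ASF presence ping, the same kind of 12-byte probe that FreeIPMI's rmcpping tool sends. Whether this BMC firmware answers presence pings is not confirmed anywhere in this report. A sketch, assuming `nc` and `od` are installed; the function names and the 2 second timeout are illustrative:

```shell
# Emit the 12-byte RMCP/ASF presence ping on stdout.
# Octal escapes are used because \x escapes are not portable in printf.
rmcp_ping_packet() {
    # RMCP header: version 0x06, reserved 0x00, sequence 0xff (no ack),
    # message class 0x06 (ASF).
    printf '\006\000\377\006'
    # ASF header: IANA enterprise number 4542 (0x000011be), message type
    # 0x80 (presence ping), tag 0x00, reserved 0x00, data length 0x00.
    printf '\000\000\021\276\200\000\000\000'
}

# Hypothetical probe: send the ping to HOST on UDP 623 and hex-dump any reply.
# Option handling differs between netcat variants, so treat this as a sketch.
rmcp_probe() {
    rmcp_ping_packet | nc -u -w 2 "$1" 623 | od -An -tx1
}

# Example (not run here): rmcp_probe 10.245.71.111
```

A reply would suggest that UDP 623 is reachable and that the failure lies in session setup rather than in the network path. No reply is inconclusive, since a BMC is free to ignore presence pings.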

Manoj Iyer (manjo)
description: updated
bugproxy (bugproxy)
tags: added: architecture-ppc64le bugnameltc-174679 severity-high targetmilestone-inin---
Changed in ubuntu-power-systems:
status: New → Triaged
Manoj Iyer (manjo) wrote :

Manoj will upload the firmware from ftp://testcase.software.ibm.com/fromibm/linux/witherspoon_prod.pnor.squashfs.tar to the BMC, then flash and retest.

Manoj Iyer (manjo) wrote :

Installed the latest firmware image.

root@witherspoon:~# cat /var/lib/phosphor-software-manager/pnor/ro/VERSION

IBM-witherspoon-ibm-OP9-v2.0.11-2.6-prod
 op-build-v2.0.11-4-g362008d
 buildroot-2018.05.1-9-gc99f2ee
 skiboot-00c8ab5-pdfdccea
 hostboot-3bf7fcf-pfb5d59d
 occ-3a505ff
 linux-4.17.12-openpower1-p23ee713
 petitboot-v1.7.3-pf71644b
 machine-xml-dce824c
 hostboot-binaries-hw112818a.op920
 capp-ucode-p9-dd2-v4
 sbe-7aa21f6
 hcode-hw112818a.op920

IPMI service seems to be active:

● phosphor-ipmi-host.service - Phosphor Inband IPMI
   Loaded: loaded (/lib/systemd/system/phosphor-ipmi-host.service; enabled; vendor preset: enabled)
   Active: active (running) since Mon 2019-01-28 15:49:58 UTC; 3min 50s ago
 Main PID: 1323 (ipmid)
   CGroup: /system.slice/phosphor-ipmi-host.service
           └─1323 ipmid

Jan 28 15:50:43 witherspoon ipmid[1323]: Pushing cmd on to queue
Jan 28 15:50:43 witherspoon ipmid[1323]: Asserting SMS Attention
Jan 28 15:50:44 witherspoon ipmid[1323]: SMS Attention asserted
Jan 28 15:50:49 witherspoon ipmid[1323]: wd_reset: ERROR in mapper call
Jan 28 15:50:49 witherspoon ipmid[1323]: ERROR:[0xFF] handling NetFn:[0x6], Cmd:[0x22]
Jan 28 15:50:59 witherspoon ipmid[1323]: ERROR:[0x80] handling NetFn:[0x6], Cmd:[0x22]
Jan 28 15:53:41 witherspoon ipmid[1323]: Pushing cmd on to queue
Jan 28 15:53:41 witherspoon ipmid[1323]: Pushing cmd on to queue
Jan 28 15:53:41 witherspoon ipmid[1323]: Asserting SMS Attention
Jan 28 15:53:43 witherspoon ipmid[1323]: SMS Attention asserted

But the IPMI (ipmitool) command fails:
$ ipmitool -I lanplus -H 10.245.71.111 -U root -P 0penBmc mc info
Error: Unable to establish IPMI v2 / RMCP+ session

I connect to the BMC using the IP address 10.245.71.111, which was assigned by DHCP. I also see that the interface eth0 has another IP address assigned: 169.254.233.244
root@witherspoon:~# ifconfig
eth0 Link encap:Ethernet HWaddr 08:94:EF:80:27:30
          inet addr:169.254.233.244 Bcast:169.254.255.255 Mask:255.255.0.0
          inet6 addr: fe80::a94:efff:fe80:2730%959/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
          RX packets:1212 errors:0 dropped:14 overruns:0 frame:0
          TX packets:1611 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:150981 (147.4 KiB) TX bytes:1065143 (1.0 MiB)
          Interrupt:19

Is this the interface and IP address that responds to IPMI commands? If so, that could explain why the BMC does not respond to IPMI requests on 10.245.71.111.
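
For context, 169.254.0.0/16 is the IPv4 link-local range (RFC 3927). Such addresses are not routed beyond the local network segment, so a remote client would not normally reach the BMC through 169.254.233.244. A minimal sketch of that check; the function name is made up for illustration:

```shell
# Hypothetical helper: succeed if the argument is in 169.254.0.0/16.
# This is a prefix match only; it does not validate the address.
is_ipv4_link_local() {
    case "$1" in
        169.254.*.*) return 0 ;;
        *) return 1 ;;
    esac
}

for addr in 169.254.233.244 10.245.71.111; do
    if is_ipv4_link_local "$addr"; then
        echo "$addr: link-local"
    else
        echo "$addr: not link-local"
    fi
done
```

Note that ifconfig output of this kind shows a single IPv4 address per interface. If the `ip` tool is available on the BMC (an assumption), `ip -4 addr show dev eth0` would list every address on eth0, which may show 10.245.71.111 alongside the link-local one.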

Mike Ranweiler (mranweil) wrote :

Yes, I agree - I think it's the extra IP address and removing it should make it respond correctly.

Manoj Iyer (manjo) wrote :

IBM, this should be documented in the release notes or in a README.

1. I had to log into the BMC after updating the firmware and do:
root@witherspoon:~# systemctl restart systemd-networkd

It resets the eth0 address to the BMC's (web GUI) IP address of 10.245.71.111, but the address then quickly reverts to 169.254.233.244. I am not sure why it does that, but it would be useful to know what eth0 is, why it has that address, and why I don't see any other interface with the 10.245.71.111 address that serves the web GUI.

2. The reason ipmitool commands did not work for me does not seem to depend on the BMC's eth0 IP address. It looks to me like the default IPMI user is nameless, i.e. do not use '-U' with the ipmitool command.

== For example this won't work: ==
$ ipmitool -I lanplus -H 10.245.71.111 -U root -P 0penBmc chassis status
Error: Unable to establish IPMI v2 / RMCP+ session

== But this will work: ==
$ ipmitool -I lanplus -H 10.245.71.111 -P 0penBmc chassis status
System Power : on
Power Overload : false
Power Interlock : inactive
Main Power Fault : false
Power Control Fault : false
Power Restore Policy : always-off
Last Power Event :
Chassis Intrusion : inactive
Front-Panel Lockout : inactive
Drive Fault : false
Cooling/Fan Fault : false
Front Panel Control : none

Manoj Iyer (manjo) wrote :

MAAS uses ipmipower, and ipmipower commands with a nameless user and a password do not work.

ubuntu@maas-dev:~$ ipmipower -D LAN_2_0 -p 0penBmc -h 10.245.71.111 --stat
10.245.71.111: username invalid
ubuntu@maas-dev:~$

Is there a version dependency on ipmipower, or is the firmware known not to work with the ipmipower tool?

Manoj Iyer (manjo) wrote :

Posting detailed output from ipmipower. Is there a way to add a named user just for IPMI commands? I could not find one in the BMC GUI, and could not find a way to add one using xyz/openbmc_project/user/ management either.

ubuntu@maas-dev:~$ ipmipower -D LAN_2_0 -p 0penBmc -h 10.245.71.111 --stat --debug
10.245.71.111: =====================================================
10.245.71.111: IPMI 2.0 Get Channel Authentication Capabilities Request
10.245.71.111: =====================================================
10.245.71.111: RMCP Header:
10.245.71.111: ------------
10.245.71.111: [ 6h] = version[ 8b]
10.245.71.111: [ 0h] = reserved[ 8b]
10.245.71.111: [ FFh] = sequence_number[ 8b]
10.245.71.111: [ 7h] = message_class.class[ 5b]
10.245.71.111: [ 0h] = message_class.reserved[ 2b]
10.245.71.111: [ 0h] = message_class.ack[ 1b]
10.245.71.111: IPMI Session Header:
10.245.71.111: --------------------
10.245.71.111: [ 0h] = authentication_type[ 8b]
10.245.71.111: [ 0h] = session_sequence_number[32b]
10.245.71.111: [ 0h] = session_id[32b]
10.245.71.111: [ 9h] = ipmi_msg_len[ 8b]
10.245.71.111: IPMI Message Header:
10.245.71.111: --------------------
10.245.71.111: [ 20h] = rs_addr[ 8b]
10.245.71.111: [ 0h] = rs_lun[ 2b]
10.245.71.111: [ 6h] = net_fn[ 6b]
10.245.71.111: [ C8h] = checksum1[ 8b]
10.245.71.111: [ 81h] = rq_addr[ 8b]
10.245.71.111: [ 0h] = rq_lun[ 2b]
10.245.71.111: [ 24h] = rq_seq[ 6b]
10.245.71.111: IPMI Command Data:
10.245.71.111: ------------------
10.245.71.111: [ 38h] = cmd[ 8b]
10.245.71.111: [ Eh] = channel_number[ 4b]
10.245.71.111: [ 0h] = reserved1[ 3b]
10.245.71.111: [ 1h] = get_ipmi_v2.0_extended_data[ 1b]
10.245.71.111: [ 3h] = maximum_privilege_level[ 4b]
10.245.71.111: [ 0h] = reserved2[ 4b]
10.245.71.111: IPMI Trailer:
10.245.71.111: --------------
10.245.71.111: [ 26h] = checksum2[ 8b]
10.245.71.111: =====================================================
10.245.71.111: IPMI 2.0 Get Channel Authentication Capabilities Response
10.245.71.111: =====================================================
10.245.71.111: RMCP Header:
10.245.71.111: ------------
10.245.71.111: [ 6h] = version[ 8b]
10.245.71.111: [ 0h] = reserved[ 8b]
10.245.71.111: [ FFh] = sequence_number[ 8b]
10.245.71.111: [ 7h] = message_class.class[ 5b]
10.245.71.111: [ 0h] = message_class.reserved[ 2b]
10.245.71.111: [ 0h] = message_class.ack[ 1b]
10.245.71.111: IPMI Session Header:
10.245.71.111: --------------------
10.245.71.111: [ 0h] = authentication_type[ 8b]
10.245.71.111: [ 0h] = session_sequence_number[32b]
10.245.71.111: [ 0h] = session_id[32b]
10.245.71.111: [ 10h] = ipmi_msg_len[ 8b]
10.245.71.111: IPMI Message Header:
10.245.71.111: --------------------
10.245.71.111: [ 81h] = rq_a...


Manoj Iyer (manjo) wrote :

Also, it might be worth noting that ipmipower with the nameless (null) user works on bostons. It would be worth figuring out the difference in BMC firmware implementation between bostons and witherspoons.

bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2019-02-01 10:52 EDT-------
For security reasons, we recently moved OpenBMC to not support the nameless option anymore. So the -U parameter will always be required in latest firmware. Will this be ok for you or do your tools need -U to be optional? We could backport that change which accepts (and requires) -U.

Manoj Iyer (manjo) wrote :

@<email address hidden>,

The current version of the firmware does not support the -U option because the user is nameless. Using the ipmipower command with no -U option fails; however, ipmitool seems to work just fine with the nameless user (no -U).

We use the ipmipower command in MAAS to power control the system, and since ipmipower fails with the nameless user, we are unable to power control the system from MAAS. It would be ideal if we could use a named user for IPMI, i.e. use the -U option.

If your latest firmware release has a named user for IPMI commands, that would be good enough to solve the issues we are having with witherspoon on MAAS. So having -U <username> is good.

Where can I get access to the latest firmware files (the ones that support the -U option) so that I can flash the BMC and test? My current BMC firmware version is as follows:

root@witherspoon:~# cat /var/lib/phosphor-software-manager/pnor/ro/VERSION
IBM-witherspoon-ibm-OP9-v2.0.10-2.15-prod
 op-build-v2.0.8-7-g607412e-dirty
 buildroot-2018.02.1-6-ga8d1126
 skiboot-a7a3d18-p95ddfaf
 hostboot-d033213-p8b99fdc
 occ-084756c
 linux-4.16.13-openpower1-p2c7a5c1
 petitboot-v1.7.2-p81d13a6
 machine-xml-7cd20a6
 hostboot-binaries-hw080418a.op920
 capp-ucode-p9-dd2-v4
 sbe-55d6eb2
 hcode-hw082318a.op920
root@witherspoon:~#

If my current firmware already has a named user (other than root) that I can use, could you please let me know that user name?

Andres Rodriguez (andreserl) wrote :

Hi all,

1. As I understand it, the current witherspoon OpenBMC supports not passing the 'user' parameter for power management (aka nameless), e.g. ipmitool -I lanplus -H <ip> -P <password> chassis status.

2. This works with ipmitool (e.g. ipmitool -I lanplus -H <ip> -P <password> chassis status), but doesn't work with ipmipower from freeipmi-tools (e.g. ipmipower -p admin -D LAN_2_0 -h 10.245.143.113 --stat).

3. Other non-OpenBMC hardware also supports not passing the 'user' parameter, for example P8. However, P8 systems work with both ipmitool and ipmipower (from freeipmi-tools). For example, on a P8 system:

roaksoax@rivals:~/project/qa-lab-tests$ ipmipower -p admin -D LAN_2_0 -h 10.245.143.113 --stat
10.245.143.113: on
roaksoax@rivals:~/project/qa-lab-tests$ ipmitool -I lanplus -H 10.245.143.113 -P admin chassis status
System Power : on
Power Overload : false
Power Interlock : inactive
Main Power Fault : false
Power Control Fault : false
Power Restore Policy : always-off
Last Power Event :
Chassis Intrusion : inactive
Front-Panel Lockout : inactive
Drive Fault : false
Cooling/Fan Fault : false

So, given the above, my question is whether any specific changes have been made in ipmitool to support OpenBMC. If so, does anyone know what they are?

Andres Rodriguez (andreserl) wrote :

To compare, I'm providing the debug output from ipmipower for:

1. P9 where it doesn't work: http://paste.ubuntu.com/p/G7s6FSPKQn/
2. P8 where it does work: http://paste.ubuntu.com/p/WJ6QhKdPjk/

Andres Rodriguez (andreserl) wrote :

On the output above, I see a difference:

In the P9 where this doesn't work, we see:

10.245.71.111: [ 0h] = authentication_status.anonymous_login[ 1b]
10.245.71.111: [ 0h] = authentication_status.null_username[ 1b]
10.245.71.111: [ 0h] = authentication_status.non_null_username[ 1b]

In the P8 where this does work:

10.245.143.113: [ 0h] = authentication_status.anonymous_login[ 1b]
10.245.143.113: [ 1h] = authentication_status.null_username[ 1b]
10.245.143.113: [ 1h] = authentication_status.non_null_username[ 1b]

From the IPMI spec:

If an implementation [supports] only one user with a fixed username, then the fixed username must be null (all zeros).

From the IPMI spec text above, I would understand that since OpenBMC doesn't support multiple users, the username should be all zeros. The question then becomes: why does it work with ipmitool and not with freeipmi-tools? Could it be that the OpenBMC implementation is not actually setting zeros but something else, causing a mismatch in freeipmi-tools, which may be expecting zeros?

Andres Rodriguez (andreserl) wrote :

Or, I actually wonder whether freeipmi-tools, when the user parameter is not passed, actually sends zeros to the BMC, and OpenBMC interprets that as a user being passed and hence fails authentication.

In comparison, ipmitool may not be sending anything at all. Or maybe it is the other way around.

Newell Jensen (newell-jensen) wrote :

My take on this is that authentication_status.null_username in the BMC's capabilities (in the response) is set to 0h, which tells us that the BMC does not support a null username. ipmipower seems to be handling this properly: since a null username (i.e. no username) is provided and the BMC does not support this, it errors out. ipmitool just returns when the username is not set and does not raise an error (take a look at the function ipmi_intf_session_set_username in https://github.com/ipmitool/ipmitool/blob/master/src/plugins/ipmi_intf.c).

So it seems the BMC is not configured correctly if it is expected to work with a null username, since the capabilities tell us that this is not supported. I think the BMC firmware should be updated so that authentication_status.null_username is set to 1h; from what I can tell, ipmipower should then work.
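
The capability bits discussed above can be pulled out of the `ipmipower --debug` output mechanically, which makes it easier to compare BMCs or firmware levels. A minimal sketch, assuming the debug line format quoted earlier in this report (`[ 0h] = authentication_status.null_username[ 1b]`); the function name is hypothetical:

```shell
# Hypothetical helper: read ipmipower --debug output on stdin and print the
# value (for example "0h" or "1h") of one authentication_status flag.
auth_status_flag() {
    # $1 is the flag name without the "authentication_status." prefix,
    # e.g. null_username, non_null_username or anonymous_login.
    sed -n "s/.*\[ *\([0-9A-Fa-f]*h\)\] = authentication_status\.$1\[.*/\1/p" |
        head -n 1
}

# Example (not run here), using the ipmipower options from this report:
#   ipmipower -D LAN_2_0 -p <password> -h <bmc-ip> --stat --debug 2>&1 |
#       auth_status_flag null_username
```

On the P9 output quoted above this would print 0h for null_username, and on the P8 output it would print 1h.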

Jeff Lane  (bladernr)
Changed in plainbox-provider-checkbox:
status: New → Confirmed
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2019-02-12 17:00 EDT-------
Upstream patches/review:
https://gerrit.openbmc-project.xyz/#/q/topic:freeipmi_support+status:open

Manoj Iyer (manjo) wrote :

Tested the following BMC firmware build from IBM, and ipmipower was found to work.

 == BMC FIRMWARE VERSION ==
ibm-v2.3-476-g2d622cb-r30-0-g862e40a

 == use ipmipower command on BMC 10.245.71.111 ==
$ ipmipower -D LAN_2_0 -p 0penBmc -h 10.245.71.111 --stat
10.245.71.111: on

$ ipmipower -D LAN_2_0 -p 0penBmc -h 10.245.71.111 --off
10.245.71.111: ok

$ ipmipower -D LAN_2_0 -p 0penBmc -h 10.245.71.111 --stat
10.245.71.111: off

$ ipmipower -D LAN_2_0 -p 0penBmc -h 10.245.71.111 --on
10.245.71.111: ok

$ ipmipower -D LAN_2_0 -p 0penBmc -h 10.245.71.111 --stat
10.245.71.111: on
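
For scripted power control of the kind MAAS needs, the `--stat` output can be polled until the chassis reaches the requested state. A minimal sketch, assuming the `HOST: on` / `HOST: off` output format shown above; the function names, the BMC_PASSWORD variable, and the retry count are illustrative, and this is not how MAAS itself is implemented:

```shell
# Extract "on" or "off" from ipmipower --stat output read on stdin.
# Any other text (for example "username invalid") yields no output.
parse_power_state() {
    awk -F': ' '$2 == "on" || $2 == "off" { print $2; exit }'
}

# Hypothetical helper: poll HOST until it reports STATE, up to 30 tries.
# BMC_PASSWORD is an assumed environment variable holding the IPMI password.
wait_for_power_state() {
    host=$1
    want=$2
    tries=30
    while [ "$tries" -gt 0 ]; do
        state=$(ipmipower -D LAN_2_0 -p "$BMC_PASSWORD" -h "$host" --stat |
            parse_power_state)
        [ "$state" = "$want" ] && return 0
        tries=$((tries - 1))
        sleep 2
    done
    return 1
}

# Example (not run here):
#   ipmipower -D LAN_2_0 -p "$BMC_PASSWORD" -h 10.245.71.111 --off &&
#       wait_for_power_state 10.245.71.111 off
```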

Changed in ubuntu-power-systems:
status: Triaged → Fix Committed
Manoj Iyer (manjo) wrote :

After firmware update I see the following "High Error" on the BMC webui.

Hostboot has become unresponsive _PID=3897

FQPSPCR0023M: Hostboot has become unresponsive
_PID=3897 MESSAGE=org.open_power.Host.Boot.Error.WatchdogTimedOut

There are 4 such messages.

Manoj Iyer (manjo) wrote :

dmesg on the BMC has the following:

[ 4336.608602] ------------[ cut here ]------------
[ 4336.608682] WARNING: CPU: 0 PID: 7065 at /usr/src/kernel/drivers/gpu/drm/drm_atomic_helper.c:1362 drm_atomic_helper_wait_for_vblanks.part.2+0x28c/0x2a0
[ 4336.608695] [CRTC:30:crtc-0] vblank wait timed out
[ 4336.608717] CPU: 0 PID: 7065 Comm: fbterm Not tainted 4.18.12-b953727d56c4de63eb0fad1f079f41004e61d713 #1
[ 4336.608724] Hardware name: Generic DT based system
[ 4336.608779] [<80109bec>] (unwind_backtrace) from [<801075fc>] (show_stack+0x20/0x24)
[ 4336.608807] [<801075fc>] (show_stack) from [<80644628>] (dump_stack+0x20/0x28)
[ 4336.608832] [<80644628>] (dump_stack) from [<80116f4c>] (__warn+0xdc/0x104)
[ 4336.608852] [<80116f4c>] (__warn) from [<80116fc8>] (warn_slowpath_fmt+0x54/0x74)
[ 4336.608892] [<80116fc8>] (warn_slowpath_fmt) from [<803c4e74>] (drm_atomic_helper_wait_for_vblanks.part.2+0x28c/0x2a0)
[ 4336.608924] [<803c4e74>] (drm_atomic_helper_wait_for_vblanks.part.2) from [<803c6efc>] (drm_atomic_helper_commit_tail+0x64/0x74)
[ 4336.608944] [<803c6efc>] (drm_atomic_helper_commit_tail) from [<803c6f88>] (commit_tail+0x7c/0x80)
[ 4336.608961] [<803c6f88>] (commit_tail) from [<803c74ac>] (drm_atomic_helper_commit+0xc8/0x138)
[ 4336.608987] [<803c74ac>] (drm_atomic_helper_commit) from [<803e7d18>] (drm_atomic_commit+0x50/0x60)
[ 4336.609019] [<803e7d18>] (drm_atomic_commit) from [<803cb6cc>] (restore_fbdev_mode_atomic+0x1c4/0x220)
[ 4336.609050] [<803cb6cc>] (restore_fbdev_mode_atomic) from [<803cd7c8>] (drm_fb_helper_pan_display+0xb0/0x1c4)
[ 4336.609082] [<803cd7c8>] (drm_fb_helper_pan_display) from [<80387164>] (fb_pan_display+0xb8/0x13c)
[ 4336.609105] [<80387164>] (fb_pan_display) from [<80387984>] (do_fb_ioctl+0x348/0x720)
[ 4336.609124] [<80387984>] (do_fb_ioctl) from [<80387da8>] (fb_ioctl+0x4c/0x5c)
[ 4336.609149] [<80387da8>] (fb_ioctl) from [<8023f088>] (do_vfs_ioctl+0xac/0x8b0)
[ 4336.609170] [<8023f088>] (do_vfs_ioctl) from [<8023f8d0>] (ksys_ioctl+0x44/0x68)
[ 4336.609188] [<8023f8d0>] (ksys_ioctl) from [<8023f90c>] (sys_ioctl+0x18/0x1c)
[ 4336.609208] [<8023f90c>] (sys_ioctl) from [<80101000>] (ret_fast_syscall+0x0/0x54)
[ 4336.609219] Exception stack(0x9cba1fa8 to 0x9cba1ff0)
[ 4336.609236] 1fa0: 016037a8 00000001 00000003 00004606 000329f8 000329b4
[ 4336.609253] 1fc0: 016037a8 00000001 00012030 00000036 01602080 00000000 490b0fa4 00000000
[ 4336.609265] 1fe0: 000320d8 7e86bca4 0001642c 49189bdc
[ 4336.609276] ---[ end trace 5af8ac314ec99668 ]---
[ 4412.545792] ucd9000 11-0064: Device ID UCD90160A|2.4.3.0809|160916
[ 4415.197913] occ-hwmon occ-hwmon.1: host is not ready
[ 4415.678192] occ-hwmon occ-hwmon.2: host is not ready
[ 4502.167464] ucd9000 11-0064: Device ID UCD90160A|2.4.3.0809|160916
[ 4504.828296] occ-hwmon occ-hwmon.1: host is not ready
[ 4505.194613] occ-hwmon occ-hwmon.2: host is not ready

Changed in ubuntu-power-systems:
status: Fix Committed → Incomplete
Manoj Iyer (manjo) wrote :

Booting into the old BMC firmware (ibm-v2.1-438-g0030304-r12-0-g5ee4fb0), I don't see these errors in the BMC's dmesg output. But I am not able to power on the server anymore. I will post further findings and data points soon.

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2019-03-04 16:20 EDT-------
Hi Manjo - so the system is working, other than seeing that error? You restarted the BMC after the update, then restarted the system using ipmipower, so it looks like it's largely working.

Can you double check your PNOR level - in the GUI go to the Server Config -> Firmware and the functional Server Image - are you at IBM-witherspoon-ibm-OP9-v2.0.11-2.6-prod or something else?

------- Comment From <email address hidden> 2019-03-04 16:25 EDT-------
Manoj - could you just list the Functional and any Active for both BMC and Server images, please?

Manoj Iyer (manjo) wrote :

Switching back to old BMC firmware ibm-v2.1 I get the following error on the webui

FQPSPAA0007G: A system checkstop occurred
_PID=18366
MESSAGE=org.open_power.Host.Boot.Error.Checkstop

Power control with ipmipower/ipmitool returns ok, but the server does not seem to be power cycling.

Manoj Iyer (manjo) wrote :
Manoj Iyer (manjo)
Changed in ubuntu-power-systems:
status: Incomplete → Fix Committed
Frank Heimes (fheimes)
Changed in plainbox-provider-checkbox:
status: Confirmed → Fix Committed
Andrew Cloke (andrew-cloke) wrote :

Awaiting f/w GA release, ETA mid May 2019.

Manoj Iyer (manjo)
Changed in plainbox-provider-checkbox:
importance: Undecided → High
Andrew Cloke (andrew-cloke) wrote :

F/w upgrade will be performed during machine's processor upgrade which requires RMA. ETA to bug closure likely to be June 2019.

Andrew Cloke (andrew-cloke) wrote :

Witherspoon has now been upgraded to the latest BMC release, but this issue is still occurring. So, the assumption is that the fix did not make it into IBM's latest BMC f/w release.

Awaiting confirmation from IBM as to which firmware release this issue will be fixed in.

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2019-07-22 13:50 EDT-------
Could you please provide the firmware level installed? Running cat /etc/os-release from an ssh shell, or the firmware page in the web UI, should show it.
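
/etc/os-release is a file of shell-style KEY=value assignments, so the firmware level can be read out of it in one step. A minimal sketch; the function name is made up, and which field carries the full build string on a given OpenBMC image (VERSION_ID is assumed here) should be checked against the actual file:

```shell
# Hypothetical helper: print VERSION_ID from an os-release style file.
firmware_version_from() {
    # Source the file in a subshell so its variables do not leak into the
    # caller. Pass a path containing a slash, since "." may otherwise
    # search PATH for the file.
    (
        unset VERSION_ID
        . "$1" && printf '%s\n' "${VERSION_ID:-unknown}"
    )
}

# Example, on the BMC (not run here): firmware_version_from /etc/os-release
```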

I did a quick test and verified the sol activate worked ok on my system with latest OpenBMC code.

Frank Heimes (fheimes) wrote :

changed to Incomplete while waiting on (blocking) RT 120488

Changed in ubuntu-power-systems:
status: Fix Committed → Incomplete
Frank Heimes (fheimes) wrote :

The system was down (powered off for whatever reason; I saw lots of "FQPSPAA0007G: A system checkstop occurred" in the BMC log) and has now been restarted (and partly set up again, now with a different address: .132).

For now it responds to IPMI commands for me: power, sol, etc.

Nevertheless here are the fw levels:

Taken from the OpenBMC web UI:

BMC images
 Functional firmware version: ibm-v2.3-476-g2d622cb-r30-0-g862e40a
Functional: ibm-v2.3-476-g2d622cb-r30-0-g862e40a
Active: ibm-v2.1-438-g0030304-r12-0-g5ee4fb0

Server images
 Functional firmware version: IBM-witherspoon-OP9-v2.2-9.56
Functional: IBM-witherspoon-OP9-v2.2-9.56
Active: IBM-witherspoon-ibm-OP9-v2.2-rc2-9.44-prod

Taken from IPMI:

$ ipmitool -I lanplus -H 10.245.71.132 -P 0penBmc mc info
Device ID : 0
Device Revision : 0
Firmware Revision : 2.03
IPMI Version : 2.0
Manufacturer ID : 42817
Manufacturer Name : Unknown (0xA741)
Product ID : 16975 (0x424f)
Product Name : Unknown (0x424F)
Device Available : yes
Provides Device SDRs : yes
Additional Device Support :
    Sensor Device
    SEL Device
    FRU Inventory Device
    Chassis Device
Aux Firmware Rev Info :
    0x04
    0x76
    0x00
    0x30

But I'm changing this ticket to Invalid since ipmi obviously works for me now:

ubuntu@P8lpar3:~$ ipmitool -I lanplus -H 10.245.71.132 -P <password> sol info
Info: SOL parameter 'Nonvolatile Bitrate (5)' not supported
Info: SOL parameter 'Volatile Bitrate (6)' not supported
Set in progress : set-complete
Enabled : true
Force Encryption : true
Force Authentication : true
Privilege Level : USER
Character Accumulate Level (ms) : 100
Character Send Threshold : 1
Retry Count : 7
Retry Interval (ms) : 100
Volatile Bit Rate (kbps) : IPMI-Over-Serial-Setting
Non-Volatile Bit Rate (kbps) : IPMI-Over-Serial-Setting
Payload Channel : 1 (0x01)
Payload Port : 623
ubuntu@P8lpar3:~$ ipmitool -I lanplus -H 10.245.71.132 -P <password> sol deactivate
ubuntu@P8lpar3:~$ ipmitool -I lanplus -H 10.245.71.132 -P <password> sol activate
[SOL Session operational. Use ~? for help]
ep 4.13
istep 4.14
istep 4.15
...

Changed in ubuntu-power-systems:
status: Incomplete → Invalid
Changed in plainbox-provider-checkbox:
status: Fix Committed → Invalid