Rack controller cannot access the BMC, even if accessible on the OS level on HMC Z
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
MAAS | Status tracked in 3.6 | | |
3.6 | Invalid | Medium | Unassigned |
Ubuntu on IBM z Systems | New | Undecided | Skipper Bug Screeners |
Bug Description
I'm at the beginning of setting up MAAS v3.4 on s390x, using 'Power type' "IBM Hardware Management Console (HMC) for Z" (the BMC is called HMC here).
I'm pretty sure that the data specified for the Power Type is correct, since on the OS level I can reach the HMC with netcat via its specific port:
$ nc -vz <hmc IP> 6794
Connection to <hmc IP> 6794 port [tcp/*] succeeded!
(However, ICMP is disabled, so ping is not possible; I hope that it is not needed/used:
$ sudo ping -c 3 <hmc IP>
PING <hmc IP> (<hmc IP>) 56(84) bytes of data.
--- 10.103.16.10 ping statistics ---
3 packets transmitted, 0 received, 100% packet loss, time 2073ms )
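The netcat check above can also be scripted, which is handy when the same test has to be repeated from the rack controller host. The following is a minimal sketch using only the Python standard library; the function name `tcp_port_open` and the 3-second default timeout are illustrative choices, not anything taken from MAAS.

```python
import socket


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    This mirrors `nc -vz <host> <port>`: it only proves that the TCP
    handshake works, not that the HMC credentials are valid.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable networks.
        return False
```

Calling `tcp_port_open("<hmc IP>", 6794)` from the machine that runs the rack controller would then be the scripted equivalent of the `nc` call. Like `nc`, it does not depend on ICMP.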
One thing is special on s390x: the Power driver does not connect directly to the HMC (BMC), but goes through the intermediate 'zhmcclient' (Python API).
I have a brief Python script that I can run outside maas to check the accessibility to the HMC via the zhmcclient and it works:
$ cat test.py
#!/usr/bin/env python3
import zhmcclient
import requests.
requests.
# Set these variables for your environment:
host = "<hmc IP>"
userid = "<userid>"
password = "password"
verify_cert = False
session = zhmcclient.
client = zhmcclient.
console = client.
partitions = console.
for part in partitions:
cpc = part.manager.parent
print("{} {}".format(
$ ./test.py
P00711B8 MAAS-RRC-01
P00711B8 MAAS-RRC-01-G01
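For readers who want to run a comparable check themselves, here is a hedged sketch of what such a test can look like with zhmcclient. It is not the script from this report (which is cut off above); it assumes zhmcclient 1.0 or later, where `Session` accepts `verify_cert` and `Console` offers `list_permitted_partitions()`. The helper names `connect` and `permitted_partitions` are made up for illustration.

```python
def connect(host, userid, password, verify_cert=False):
    """Open a zhmcclient session and return a Client.

    zhmcclient is imported lazily so that the helper below can be used
    (and tested) without the library installed. verify_cert=False skips
    TLS certificate verification, as in the report.
    """
    import zhmcclient

    session = zhmcclient.Session(host, userid, password,
                                 verify_cert=verify_cert)
    return zhmcclient.Client(session)


def permitted_partitions(client):
    """Return (CPC name, partition name) pairs visible to the logged-on user."""
    console = client.consoles.console
    return [
        (part.manager.parent.name, part.name)
        for part in console.list_permitted_partitions()
    ]
```

With real credentials, `permitted_partitions(connect("<hmc IP>", "<userid>", "<password>"))` should list the same CPC and partition names as the output shown above, provided the user has access to them.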
I'm obviously using the same credentials - in the Power driver and my test script.
(I just have "verify_cert = False" in my script, and changed /usr/lib/
(Note that the HMC has an IP address that is outside of the network the MAAS server is in, but I think that this is often the case and shouldn't matter, since connectivity is given on OS level.)
Nevertheless, whenever I try to do something with a system that I've manually enlisted (via Add Hardware --> Machine), I get error messages like these:
in the UI:
- on commissioning:
"Error:
No rack controllers can access the BMC of node MAASH1G1"
- on Power cycle, Power on:
"Error:
Action failed for 1 machine: MAASH1G1"
(probably because it is on, but the Power status is shown as unknown)
- on Power cycle, Power off:
"Error:
No rack controllers can access the BMC of node MAASH1G1"
(I could provide screenshots of these error messages in the UI, but I doubt that they provide further details.)
(Btw. the system that I've manually added, an LPAR, has a normal (long) name [MAAS-RRC-01-G01] and a short name [MAASH1G1]. I wasn't sure which one to take, so I tried both cases, but no luck.
Hence one will find both in the logs ...)
Further details:
Version and build:
$ apt list maas
Listing... Done
maas/jammy,now 1:3.4.2-
$ dpkg -l | grep maas
ii maas 1:3.4.2-
ii maas-cli 1:3.4.2-
ii maas-common 1:3.4.2-
ii maas-dhcp 1:3.4.2-
ii maas-netmon 1:3.4.2-
ii maas-proxy 1:3.4.2-
ii maas-rack-
ii maas-region-api 1:3.4.2-
ii maas-region-
ii python3-django-maas 1:3.4.2-
ii python3-maas-client 1:3.4.2-
ii python3-
I've attached the relevant log files here.
I did quite a bit of investigation on this, reading through LP, Discourse and other sources, and found several cases where the message "No rack controllers can access the BMC" was mentioned, but none of them seems close to my case.
I hope that this is not a misleading message; I started wondering about that when I came across this: https:/
(Btw. I can make this system accessible for Canonical engineers for further analysis ...)
Changed in ubuntu-z-systems: | |
assignee: | nobody → Skipper Bug Screeners (skipper-screen-team) |
summary: |
- Rack controller can access the BMC, even if accessible on the OS level
+ Rack controller cannot access the BMC, even if accessible on the OS level |
Changed in maas: | |
status: | New → Triaged |
importance: | Undecided → Low |
importance: | Low → Medium |
milestone: | none → 3.5.x |
tags: | added: bug-council |
Changed in maas: | |
assignee: | nobody → Anton Troyanov (troyanov) |
summary: |
- Rack controller cannot access the BMC, even if accessible on the OS level
+ [HMC Z] Rack controller cannot access the BMC, even if accessible on the OS level |
summary: |
- [HMC Z] Rack controller cannot access the BMC, even if accessible on the OS level
+ Rack controller cannot access the BMC, even if accessible on the OS level on HMC Z |
These are the only relevant lines in the logs that I could find:
/var/log/maas/maas.log
2024-06-26T05:49:39.966121+00:00 maas-on-z maas.rpc.rackcontrollers: message repeated 12 times: [ [info] Existing rack controller 'maas-on-z' running version 3.4.2-14353-g.5a5221d57 has connected to region 'maas-on-z'.]
2024-06-26T05:51:12.227011+00:00 maas-on-z maas.node: [info] MAASH1G1: Status transition from NEW to COMMISSIONING
2024-06-26T05:51:48.160046+00:00 maas-on-z maas.node: [warn] MAASH1G1: Could not change the power state. No rack controllers can access the BMC.
2024-06-26T05:51:48.160447+00:00 maas-on-z maas.node: [info] MAASH1G1: Aborting COMMISSIONING and reverted to NEW. Unable to power control the node. Please check power credentials.
2024-06-26T05:51:48.162483+00:00 maas-on-z maas.node: [info] MAASH1G1: Status transition from COMMISSIONING to NEW
2024-06-26T05:51:48.170507+00:00 maas-on-z maas.node: [error] MAASH1G1: Could not start node for commissioning: No rack controllers can access the BMC of node MAASH1G1
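To pull only the warnings and errors for one node out of a longer maas.log, a small filter can help. The sketch below assumes the line layout seen in the excerpt above (timestamp, host, logger name, bracketed level, message starting with the node name); the function name `problems_for_node` is hypothetical and not part of MAAS.

```python
import re

# Layout assumed from the excerpt: "<timestamp> <host> <logger>: [<level>] <message>"
LOG_LINE = re.compile(
    r"^(?P<timestamp>\S+) (?P<host>\S+) (?P<logger>[\w.]+): "
    r"\[(?P<level>\w+)\] (?P<message>.*)$"
)


def problems_for_node(lines, node, levels=("warn", "error")):
    """Return (timestamp, level, message) for matching entries about `node`.

    Lines that do not follow the assumed layout (for example the
    "message repeated N times" wrapper) are silently skipped.
    """
    hits = []
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if not match:
            continue
        if match["level"] in levels and match["message"].startswith(node + ":"):
            hits.append((match["timestamp"], match["level"], match["message"]))
    return hits
```

For example, `problems_for_node(open("/var/log/maas/maas.log"), "MAASH1G1")` would return just the `[warn]` and `[error]` entries for that machine.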