MAAS incorrectly detects / sets-up BMC information on Dell PowerEdge servers

Bug #1287964 reported by Kent Baxley
This bug affects 2 people
Affects          Status         Importance   Assigned to   Milestone
MAAS             Fix Released   High         Jason Hobbs
maas (Ubuntu)    Fix Released   Undecided    Jason Hobbs

Bug Description

Beginning with MAAS 1.5+bzr1977-0ubuntu2 and up to today's daily builds (1.5+bzr1977+2064+245~ppa0~ubuntu14.04.1), maas no longer seems to either detect or properly configure the BMCs on Dell PowerEdge servers during enlistment.

I've noticed two scenarios:

1) When the system enlists, the enlistment itself appears to work OK, but, when I look at the node in the MAAS WebUI, MAAS tells me that there is no power type set and that I need to set one.

2) The system enlists OK, and the node's Power Parameters in the UI appear to be filled out fine. In other words, the correct Power driver is populated as well as the correct IP address for the BMC. I also have a 'maas' power user and a pre-generated password. However, if I click 'commission node' nothing happens (i.e. the node doesn't power on).

Workarounds:

In scenario #1 if I go and fill out all of the IPMI information by hand for an affected node, then commissioning, etc. works fine after that.

In scenario #2 if I change the Power User and Power Password parameters in the UI from 'maas' and the pre-generated password to what's already on the BMC (i.e. root/calvin), then commissioning, etc. works fine after that.

This is the first time in a while that PowerEdge didn't 'just work' with MAAS with regard to power settings. I know there has been some churn in the IPMI department lately for MAAS.

I'll be attaching logs for each one of the PowerEdges affected by the respective scenarios.

Tags: server-hwe

Kent Baxley (kentb) wrote :

I'm attaching the cloud-init logs from a PowerEdge R805, which exhibits scenario #1.

Kent Baxley (kentb) wrote :

Here are the logs from a PowerEdge M610, which exhibits scenario #2.

Scenario #2 seems to be the most common problem I've been having as of late. If you need logs from the commissioning phase on one of these I can try and grab those, too.

description: updated
Changed in maas (Ubuntu):
assignee: nobody → Jason Hobbs (jason-hobbs)
Jason Hobbs (jason-hobbs) wrote :

I see this on the cloud-init-output.log for the R805:

Traceback (most recent call last):
  File "/tmp/sh.zyCerh/bin/maas-ipmi-autodetect", line 309, in <module>
    main()
  File "/tmp/sh.zyCerh/bin/maas-ipmi-autodetect", line 276, in main
    apply_ipmi_user_settings(user_settings)
  File "/tmp/sh.zyCerh/bin/maas-ipmi-autodetect", line 194, in apply_ipmi_user_settings
    verify_ipmi_user_settings(ipmi_user_number, user_settings)
  File "/tmp/sh.zyCerh/bin/maas-ipmi-autodetect", line 183, in verify_ipmi_user_settings
    raise IPMIError(message)
__main__.IPMIError: IPMI user setting verification failures: for 'Enable_User', expected 'Yes', actual 'None'.

This isn't something I've seen before, and none of the recent fixes address it. I don't see anything immediately wrong in the M610 log - please try that again with tomorrow's daily.
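For readers following the traceback: an error of this shape is consistent with a check that reads the user settings back from the BMC and compares them with what was just written. The sketch below is not the code from maas-ipmi-autodetect; the function name `verify_user_settings`, the `Username` key, and the dictionary layout are assumptions made for illustration. It only shows how a setting the BMC never reports ends up as actual 'None' in the message.

```python
class IPMIError(Exception):
    """Raised when the BMC does not report the settings we tried to apply."""


def verify_user_settings(expected, actual):
    """Compare expected settings with what the BMC reported; raise on mismatch."""
    failures = []
    for key, want in expected.items():
        got = actual.get(key)  # None when the BMC did not report the key at all
        if got != want:
            failures.append(
                "for '%s', expected '%s', actual '%s'" % (key, want, got))
    if failures:
        raise IPMIError(
            "IPMI user setting verification failures: %s." % ", ".join(failures))


# Hypothetical example: the BMC echoes the user name but not Enable_User.
expected = {"Username": "maas", "Enable_User": "Yes"}
reported = {"Username": "maas"}
try:
    verify_user_settings(expected, reported)
    failure_message = None
except IPMIError as error:
    failure_message = str(error)
print(failure_message)
```

If this reading is right, the R805 BMC accepted the write but did not return an Enable_User value when queried, which is a different failure from the power-on problem seen on the M610.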

Changed in maas (Ubuntu):
status: New → In Progress
Julian Edwards (julian-edwards) wrote :

Please do not file project bugs on the source package, use the maas project instead.

Changed in maas (Ubuntu):
status: In Progress → Invalid
Changed in maas:
status: New → In Progress
assignee: nobody → Jason Hobbs (jason-hobbs)
importance: Undecided → High
tags: added: server-hwe
Andres Rodriguez (andreserl) wrote :

This bug also affects the source package, so updating accordingly.

Changed in maas (Ubuntu):
status: Invalid → Confirmed
Kent Baxley (kentb) wrote :

I tried 1.5+bzr1977+2072+245~ppa0~ubuntu14.04.1 today and the results were no different for the M610 or the R805.

For the M610 I had to power the system up manually or change the username and password parameters in the UI to get the system to commission, etc. The logs don't show anything that jumps out on systems that behave this way. MAAS appears to be OK with the settings it passes to the BMC.

Today, I found that I can run the ipmipower command manually with the generated maas username and password and that works on the M610:

$ sudo ipmipower -h 10.0.0.39 -u maas -p Hg6awJQtZ --stat
10.0.0.39: off

$ sudo ipmipower -h 10.0.0.39 -u maas -p Hg6awJQtZ --on
10.0.0.39: ok

$ sudo ipmipower -h 10.0.0.39 -u maas -p Hg6awJQtZ --off
10.0.0.39: ok

However, clicking 'commission node' or 'start node' in the Web UI still won't power up the machine unless I enter the username and password that are already on the BMC (i.e. root/calvin).

For the R805, the same traceback occurs at enlistment time (which is no surprise given what Jason had said yesterday).

I'll also try the version of maas that's in the main trusty repos that Andres uploaded (1.5+bzr1977-0ubuntu3) to see if things are any different.

Kent Baxley (kentb) wrote :

Also tested with a freshly-installed maas server using 1.5+bzr1977-0ubuntu3 on Trusty.

I am also seeing the same symptoms as originally reported. The power settings fail to get configured for the R805 and the M610 also displays the same behaviors as listed in the previous comment.

I will also test out a few more sample systems to see what things look like.

Kent Baxley (kentb) wrote :

So far, one machine actually gets set up correctly: the PowerEdge T410.

After enlistment, MAAS assigns the maas username and password and sets the driver type to IPMI 2.0. On this system, I can actually click "commission node" and it powers on without issues via the Web UI.

Kent Baxley (kentb) wrote :

The PowerEdge R620 also works fine with regard to enlisting and commissioning. At the end of enlistment it's also set to IPMI 2.0 and I can power it up from the WebUI.

Kent Baxley (kentb) wrote :

For the M610 using version 1.5+bzr1977-0ubuntu3 I think I found a reason why it refused to power up via the UI after enlistment.

At enlistment time, the Power Driver settings for that BMC are improperly being identified as IPMI 2.0, when they should be IPMI 1.5. If I go and set the IPMI Power Driver in the UI for the node to 1.5 and leave the maas-generated username and password alone, then the machine will power up when I click "commission node".

The odd thing, though, seems to be that after commissioning the node, the IPMI power settings do not stick...they fall back to IPMI 2.0 from 1.5 and it looks like I have a new maas-generated password on the BMC. I have to set it back to 1.5 every time before trying to power the node up again via the UI.

I also saw this with a PowerEdge M915.

So, it looks as if MAAS is setting the BMCs to IPMI 2.0 no matter what?

Kent Baxley (kentb) wrote :

PowerEdge R710 also works fine with the IPMI driver defaulted to 2.0 after enlistment.

Jason Hobbs (jason-hobbs) wrote :

MAAS should only be setting the BMCs to 2.0 if the BMC claims it has 2.0 support. IPMI settings are updated after commissioning based on what settings are changed/discovered in commissioning - see lp #1287828.

Jason Hobbs (jason-hobbs) wrote :

Kent,

Can you please get the output of the bmc-info command, from freeipmi-utils, for one of the BMCs that isn't working?

You can run it remotely like this:
bmc-info -h 192.168.12.34 -u xxx -p xxx

Or locally like this:
bmc-info
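The rule described above (use IPMI 2.0 only when the BMC claims 2.0 support) could be checked against saved bmc-info output roughly as follows. This is a sketch under assumptions: it presumes the output contains a line of the form `IPMI Version : 2.0`, the function name `pick_driver` is hypothetical, and `LAN` / `LAN_2_0` are the values passed to ipmipower's `-D` option. It is not how MAAS itself makes the decision.

```python
import re


def pick_driver(bmc_info_text):
    """Return the ipmipower -D value suggested by the advertised IPMI version."""
    match = re.search(
        r"^IPMI Version\s*:\s*(\d+)\.(\d+)", bmc_info_text, re.MULTILINE)
    if match is None:
        return None  # version not found: leave the driver to auto-detection
    version = (int(match.group(1)), int(match.group(2)))
    return "LAN_2_0" if version >= (2, 0) else "LAN"


# Illustrative input only, not output captured from the affected machines.
print(pick_driver("IPMI Version : 2.0\n"))
print(pick_driver("IPMI Version : 1.5\n"))
```

Note that the advertised version is only a claim by the BMC; a 2.0 session can still fail for other reasons.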

Kent Baxley (kentb) wrote :

Here you go. Attaching output from M610 and M915. Both are blades sitting in the same chassis. Each server has its own dedicated BMC and both look to be advertising IPMI 2.0. However, unless I select auto-detect or IPMI 1.5 from the UI, the systems will not power up when selecting 'commission node' or 'start node'.

This ipmipower command works:

$ sudo ipmipower -h 10.0.0.39 -u maas -p Hg6awJQtZ --on
10.0.0.39: ok

but the same command when I use the LAN_2_0 parameter doesn't:

$ sudo ipmipower -D LAN_2_0 -h 10.0.0.39 -u maas -p Hg6awJQtZ --on
10.0.0.39: BMC Error

So, for some reason even though the BMC says it can do IPMI 2.0 I can't run any commands against it using IPMI 2.0

Kent Baxley (kentb) wrote :
Jason Hobbs (jason-hobbs) wrote :

Kent,

This looks like bug 1288297. Can you try this:

sudo ipmipower -W opensesspriv -D LAN_2_0 -h 10.0.0.39 -u maas -p Hg6awJQtZ --on

Thanks,
Jason

Kent Baxley (kentb) wrote :

Ok. With the addition of the "-W opensesspriv" parameter to the command for that blade, I finally get a sane status back from the BMC. On the node in question I can run --stat, --on, or --off and all three commands now work when they didn't before.
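The invocations tried in this thread differ only in the `-D` and `-W` options, so a small helper that builds the three variants makes it easier to repeat the comparison on other BMCs. This is a sketch, not part of MAAS: the function name `ipmipower_argv` and the `SECRET` placeholder are assumptions, and the code only builds argument lists without running anything.

```python
def ipmipower_argv(host, user, password, action, driver=None, workaround=None):
    """Build an ipmipower argument list; nothing is executed here."""
    if action not in ("--stat", "--on", "--off"):
        raise ValueError("unsupported action: %r" % (action,))
    argv = ["ipmipower"]
    if workaround:
        argv += ["-W", workaround]  # e.g. opensesspriv
    if driver:
        argv += ["-D", driver]  # e.g. LAN_2_0
    argv += ["-h", host, "-u", user, "-p", password, action]
    return argv


# Placeholder credentials; substitute the values shown on the node's page.
variants = {
    "default driver": ipmipower_argv("10.0.0.39", "maas", "SECRET", "--stat"),
    "LAN_2_0": ipmipower_argv(
        "10.0.0.39", "maas", "SECRET", "--stat", driver="LAN_2_0"),
    "LAN_2_0 + opensesspriv": ipmipower_argv(
        "10.0.0.39", "maas", "SECRET", "--stat",
        driver="LAN_2_0", workaround="opensesspriv"),
}
for label, argv in variants.items():
    print("%-24s %s" % (label, " ".join(argv)))
```

Each list can be handed to `subprocess.run(argv, capture_output=True, text=True)` on a machine with ipmipower installed; using `--stat` keeps the comparison from changing the power state.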

Jason Hobbs (jason-hobbs) wrote :

If you're blocked by this issue, you can work around it by editing /etc/maas/templates/power/ipmi.template to pass -W opensesspriv. I've tested this on a handful of systems that don't require the workaround, and it seems to work fine everywhere so far.

=== modified file 'etc/maas/templates/power/ipmi.template'
--- etc/maas/templates/power/ipmi.template 2014-02-26 07:05:31 +0000
+++ etc/maas/templates/power/ipmi.template 2014-03-05 23:12:24 +0000
@@ -55,9 +55,9 @@
     fi

     echo workaround |\
- ${ipmi_chassis_config} ${driver_option} -h ${power_address} -u ${power_user} -p ${power_pass} --commit --filename ${config}
+ ${ipmi_chassis_config} -W opensesspriv ${driver_option} -h ${power_address} -u ${power_user} -p ${power_pass} --commit --filename ${config}
     echo workaround |\
- ${ipmipower} ${driver_option} -h ${power_address} -u ${power_user} -p ${power_pass} "$@"
+ ${ipmipower} -W opensesspriv ${driver_option} -h ${power_address} -u ${power_user} -p ${power_pass} "$@"
 }

 # This script deliberately does not check the current power state
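For anyone applying the same change to several MAAS servers, the edit in the diff above can be scripted. The following is a minimal sketch, assuming `${ipmipower}` and `${ipmi_chassis_config}` appear in the template only where the commands are invoked; the function name `add_opensesspriv` is hypothetical. Keep a backup of the template before writing anything back.

```python
import re

# Match either command variable followed by a space, unless the
# workaround flag is already present right after it.
_COMMAND = re.compile(
    r"(\$\{(?:ipmipower|ipmi_chassis_config)\}) (?!-W opensesspriv)")


def add_opensesspriv(template_text):
    """Insert '-W opensesspriv' after each command variable (idempotent)."""
    return _COMMAND.sub(r"\1 -W opensesspriv ", template_text)


sample = (
    '${ipmi_chassis_config} ${driver_option} -h ${power_address} --commit\n'
    '${ipmipower} ${driver_option} -h ${power_address} "$@"\n'
)
patched = add_opensesspriv(sample)
print(patched)
```

Running the function a second time on already patched text leaves it unchanged, so it is safe to re-run after a package upgrade restores the original template.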

Kent Baxley (kentb) wrote :

Same here. I just tested on a machine that didn't need the parameter, and it didn't seem to have a negative impact.

I will leverage this workaround until you guys decide the final fix / resolution. I'll let you know if there are any negative side effects along the way.

Thanks!

Jason Hobbs (jason-hobbs) wrote :

I read freeipmi's code to see what the workaround does - it's very innocuous. It boils down to IPMI sessions being opened with the privilege level required for the command being used, rather than the maximum privilege level available for the user/ciphers available. For ipmipower, freeipmi uses the OPERATOR privilege level, which matches the IPMI spec's privilege level for the chassis control commands, which are used to do on/off operations.

So, this workaround shouldn't have any impact on systems that don't require it, unless a system breaks IPMI compliance by requiring a privilege level higher than OPERATOR for power commands.
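The behaviour described above can be modelled in a few lines. This is only an illustration of the explanation, not freeipmi's implementation: the function name `session_privilege` is hypothetical, and the numeric values follow the IPMI specification's ordering of privilege levels.

```python
from enum import IntEnum


class Privilege(IntEnum):
    """IPMI session privilege levels, lowest to highest."""
    CALLBACK = 1
    USER = 2
    OPERATOR = 3
    ADMINISTRATOR = 4


def session_privilege(required, user_max, opensesspriv):
    """Privilege level requested when the session is opened."""
    if required > user_max:
        raise PermissionError("user cannot reach %s" % required.name)
    # With the workaround, ask only for what the command needs;
    # without it, ask for the most the user is allowed.
    return required if opensesspriv else user_max


# Power on/off needs OPERATOR; assume the maas user is an ADMINISTRATOR.
without = session_privilege(
    Privilege.OPERATOR, Privilege.ADMINISTRATOR, opensesspriv=False)
with_flag = session_privilege(
    Privilege.OPERATOR, Privilege.ADMINISTRATOR, opensesspriv=True)
print(without.name, with_flag.name)
```

In this model the only systems affected by the flag are those that would reject an OPERATOR session for chassis control, which matches the compliance caveat above.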

Changed in maas:
status: In Progress → Fix Committed
Changed in maas:
milestone: none → 14.04
Changed in maas (Ubuntu):
status: Confirmed → Fix Committed
Changed in maas:
status: Fix Committed → Fix Released
Changed in maas (Ubuntu):
status: Fix Committed → Fix Released