ipmi_test crashes when ipmi_config output is not utf-8

Bug #1892983 reported by Ike Panhc
This bug affects 4 people
Affects                    Status      Importance  Assigned to  Milestone
Checkbox Provider - Base   Expired     Medium      Jeff Lane
MAAS                       Incomplete  Medium      Unassigned

Bug Description

We have some servers whose ipmi-config output contains non-UTF-8 bytes, which causes ipmi_test to crash.

The non-UTF-8 bytes are in Lan_Conf_Security_Keys, and Popen.communicate tries to decode them as UTF-8.

$ sudo /usr/sbin/ipmi-config --checkout --lan-channel-number 1 | grep -a -A5 '^Section Lan_Conf_Security_Keys'
Section Lan_Conf_Security_Keys
 ## Give string or blank to clear. Max 20 chars
 K_R Ӷ:΁,ء�Z��*�N��
 ## Give string or blank to clear. Max 20 bytes, prefix with 0x to enter hex
 K_G 0x0000000000000000000000000000000000000000
EndSection

Traceback (most recent call last):
  File "/tmp/nest-jh8wu8ym.14373cc61f93798d1a9ac09341ae554943f71bd04dad06f96b13defe4aee3784/ipmi_test.py", line 324, in <module>
    sys.exit(main())
  File "/tmp/nest-jh8wu8ym.14373cc61f93798d1a9ac09341ae554943f71bd04dad06f96b13defe4aee3784/ipmi_test.py", line 306, in main
    results = ipmi_test.run_test()
  File "/tmp/nest-jh8wu8ym.14373cc61f93798d1a9ac09341ae554943f71bd04dad06f96b13defe4aee3784/ipmi_test.py", line 277, in run_test
    self.ipmi_channel(),
  File "/tmp/nest-jh8wu8ym.14373cc61f93798d1a9ac09341ae554943f71bd04dad06f96b13defe4aee3784/ipmi_test.py", line 205, in ipmi_channel
    self._ipmi_channel_hlpr(i, matches, channel)
  File "/tmp/nest-jh8wu8ym.14373cc61f93798d1a9ac09341ae554943f71bd04dad06f96b13defe4aee3784/ipmi_test.py", line 186, in _ipmi_channel_hlpr
    output = self._subproc_logging(cmd)
  File "/tmp/nest-jh8wu8ym.14373cc61f93798d1a9ac09341ae554943f71bd04dad06f96b13defe4aee3784/ipmi_test.py", line 91, in _subproc_logging
    output, error = process.communicate(timeout=self.subproc_timeout)
  File "/usr/lib/python3.6/subprocess.py", line 863, in communicate
    stdout, stderr = self._communicate(input, endtime, timeout)
  File "/usr/lib/python3.6/subprocess.py", line 1574, in _communicate
    self.stdout.errors)
  File "/usr/lib/python3.6/subprocess.py", line 760, in _translate_newlines
    data = data.decode(encoding, errors)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe6 in position 21281: invalid continuation byte
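
The failure mode in the traceback can be reproduced without IPMI hardware. The following is a minimal sketch, not code from ipmi_test.py: the child process is a hypothetical stand-in for ipmi-config that writes a single 0xe6 byte followed by ':', and the function names are made up for illustration. In text mode, communicate() decodes the output and raises; in bytes mode, the raw output comes back untouched.

```python
import subprocess
import sys

# Hypothetical stand-in for ipmi-config: writes 0xe6 followed by ':',
# which is not a valid UTF-8 sequence.
CHILD = "import sys; sys.stdout.buffer.write(b'K_R \\xe6:\\n')"


def run_text_mode():
    """Text mode: communicate() decodes stdout and raises on bad bytes."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD],
        stdout=subprocess.PIPE,
        encoding="utf-8",
    )
    try:
        output, _ = proc.communicate(timeout=30)
    except UnicodeDecodeError as exc:
        proc.wait()
        return "decode error: {}".format(exc.reason)
    return output


def run_bytes_mode():
    """Bytes mode: no decoding happens, so nothing can fail to decode."""
    proc = subprocess.Popen([sys.executable, "-c", CHILD], stdout=subprocess.PIPE)
    output, _ = proc.communicate(timeout=30)
    return output


if __name__ == "__main__":
    print(run_text_mode())
    print(run_bytes_mode())
```

Reading bytes and decoding later moves the decision about how to handle bad input into the script instead of leaving it inside subprocess.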

Revision history for this message
Ike Panhc (ikepanhc) wrote :

Manually decoding the output as cp437 solves this issue. Please see the attachment.
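
The attachment is not reproduced here, so the following is only a sketch of what manual cp437 decoding could look like, with a hypothetical helper name (run_and_decode). Python's cp437 codec maps all 256 byte values, so the decode itself cannot fail, although the resulting characters may be meaningless.

```python
import subprocess
import sys


def run_and_decode(cmd, encoding="cp437", timeout=60):
    """Run cmd, capture raw bytes, and decode them explicitly.

    Hypothetical helper: the subprocess stays in bytes mode, so the
    decoding happens here rather than inside communicate().
    """
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    raw_out, raw_err = proc.communicate(timeout=timeout)
    return raw_out.decode(encoding), raw_err.decode(encoding)


if __name__ == "__main__":
    # Stand-in for ipmi-config that emits one non-UTF-8 byte.
    child = "import sys; sys.stdout.buffer.write(b'K_R \\xe6:\\n')"
    out, err = run_and_decode([sys.executable, "-c", child])
    print(repr(out))
```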

Jeff Lane  (bladernr)
Changed in plainbox-provider-checkbox:
assignee: nobody → Jeff Lane (bladernr)
importance: Undecided → Medium
status: New → In Progress
milestone: none → 0.55.0
Revision history for this message
Jeff Lane  (bladernr) wrote :

Near as I can tell, this is out of spec. IPMI v2 spec explicitly suggests UTF-8 or UNICODE or ASCII+Latin1 for encoding. Anything outside of that is not within IPMI v2 specs.

However, the section on security keys does not mention specific encoding for the key when returned. This is a bit of a grey area. I'm not a fan of changing the encoding for the entire script to something that's not UTF-8 or Unicode when every other server in the world uses that standard.

Changed in plainbox-provider-checkbox:
status: In Progress → Incomplete
Revision history for this message
Jeff Lane  (bladernr) wrote :

I'll also want to verify that Kr is even set on most of the servers I have access to....

Revision history for this message
Jeff Lane  (bladernr) wrote :

Lenovo x3650 M2:
$ sudo /usr/sbin/ipmi-config --checkout --lan-channel-number 1 | grep -a -A5 '^Section Lan_Conf_Security_Keys'
Section Lan_Conf_Security_Keys
        ## Give string or blank to clear. Max 20 chars
        K_R
        ## Give string or blank to clear. Max 20 bytes, prefix with 0x to enter hex
        K_G 0x0000000000000000000000000000000000000000
EndSection

Dell R710
$ sudo /usr/sbin/ipmi-config --checkout --lan-channel-number 1 | grep -a -A5 '^Section Lan_Conf_Security_Keys'
Section Lan_Conf_Security_Keys
        ## Give string or blank to clear. Max 20 bytes, prefix with 0x to enter hex
        K_G 0x0000000000000000000000000000000000000000
EndSection
#
# Section Lan_Conf_User_Security Comments

HP ProLiant SL230
ubuntu@gurley:~$ sudo /usr/sbin/ipmi-config --checkout --lan-channel-number 2 | grep -a -A5 '^Section Lan_Conf_Security_Keys'; sudo dmidecode |grep -A5 "System Information"
Section Lan_Conf_Security_Keys
        ## Give string or blank to clear. Max 20 bytes, prefix with 0x to enter hex
        K_G 0x0000000000000000000000000000000000000000
EndSection
#
# Section Lan_Conf_User_Security Comments

Fujitsu RX2530 M5
$ sudo /usr/sbin/ipmi-config --checkout --lan-channel-number 2 | grep -a -A5 '^Section Lan_Conf_Security_Keys'; sudo dmidecode |grep -A5 "System Information"

Section Lan_Conf_Security_Keys
        ## Give string or blank to clear. Max 20 chars
        K_R
        ## Give string or blank to clear. Max 20 bytes, prefix with 0x to enter hex
        K_G 0x0000000000000000000000000000000000000000
EndSection

Changed in plainbox-provider-checkbox:
status: Incomplete → Invalid
assignee: Jeff Lane (bladernr) → nobody
milestone: 0.55.0 → none
Revision history for this message
Jeff Lane  (bladernr) wrote :

Gonna mark this invalid... this feels more like something specific to what they've done in their firmware with non-unicode/utf-8 characters. We could revisit this later, but first I think it's a better approach to fix this in firmware (or verify this on production hardware, not the mishmash of questionable loaners we have for SoC work).

Revision history for this message
Jeff Lane  (bladernr) wrote :

Feel free to reopen if you disagree.

Revision history for this message
Jeff Lane  (bladernr) wrote :

For reference, the IPMI spec only mentions UTF-8, Unicode, and ASCII + Latin 1 as valid character encodings:

file:///home/bladernr/Downloads/ipmi-second-gen-interface-spec-v2-rev1-1.pdf

Changed in plainbox-provider-checkbox:
status: Invalid → Incomplete
Revision history for this message
Jeff Lane  (bladernr) wrote :

Also, I changed my mind and moved it back to Incomplete for now. I plan to check with firmware first, and then revisit if that is not fixable.

Revision history for this message
Ike Panhc (ikepanhc) wrote :

Thanks for providing the spec. It also occurs to me that we can try to clear the LAN security keys before running checkbox, since in most common cases people use UTF-8 characters for passwords.

Let me try that first.

Revision history for this message
Andrew Cloke (andrew-cloke) wrote :

Even if the "Lan_Conf_Security_Keys" section in the IPMI configuration data has not been (re)set by the user, it would be reasonable to expect it to be empty on new systems. At the very least it shouldn't contain garbage/illegal chars.
Is that fair?
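
That expectation could be checked mechanically. The following is a rough sketch under assumptions: it takes the raw bytes of ipmi-config --checkout output in the layout shown earlier in this report, and the function name k_r_is_clean is hypothetical. It reports whether every K_R value in the Lan_Conf_Security_Keys section is empty or valid UTF-8.

```python
def k_r_is_clean(raw):
    """Return True if each K_R value in Lan_Conf_Security_Keys is valid UTF-8.

    raw is the undecoded (bytes) output of ipmi-config --checkout.
    An empty K_R value counts as clean.
    """
    in_section = False
    for line in raw.splitlines():
        stripped = line.strip()
        if stripped.startswith(b"Section Lan_Conf_Security_Keys"):
            in_section = True
        elif stripped == b"EndSection":
            in_section = False
        elif in_section and stripped.split(None, 1)[:1] == [b"K_R"]:
            value = stripped[3:].strip()
            try:
                value.decode("utf-8")
            except UnicodeDecodeError:
                return False
    return True


if __name__ == "__main__":
    sample = (
        b"Section Lan_Conf_Security_Keys\n"
        b"        K_R \xe6:\n"
        b"        K_G 0x0000000000000000000000000000000000000000\n"
        b"EndSection\n"
    )
    print(k_r_is_clean(sample))
```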

Revision history for this message
Ike Panhc (ikepanhc) wrote :

Yes, I agree. Providing correct default settings is part of the firmware's job.

I tried setting an empty value for K_R in Lan_Conf_Security_Keys, and then checkbox passed.

$ cat test.txt
Section Lan_Conf_Security_Keys
 ## Give string or blank to clear. Max 20 chars
 K_R
 ## Give string or blank to clear. Max 20 bytes, prefix with 0x to enter hex
 K_G 0x0000000000000000000000000000000000000000
EndSection
$ sudo ipmi-config --commit --lan-channel-number 1 --filename=test.txt
$ sudo /usr/lib/plainbox-provider-checkbox/bin/ipmi_test.py
## Running IPMI Tests ##
-----------------------
Verifying kernel modules:
- ipmi_si already loaded
- ipmi_devintf already loaded
* Unable to load module ipmi_powernv!
  **********************************************
  Warning: proceeding, but in-band IPMI may fail
  **********************************************
- ipmi_ssif already loaded
- ipmi_msghandler already loaded

-----------------------
Fetching chassis status:
- Fetched chassis status!

-----------------------
Fetching power status:
- Fetched power status!

-----------------------
Fetching BMC information:
- Fetched BMC information!

-----------------------
Validating IPMI version:
- IPMI version: 2.0
  IPMI version compliant!

-----------------------
Fetching IPMI channels:
- Found 1 channel(s)!
  IPMI channel(s): [1]

-----------------------
## IPMI tests passed! ##

Revision history for this message
Ike Panhc (ikepanhc) wrote :

Set status to "Invalid" because this is a firmware issue.

Changed in plainbox-provider-checkbox:
status: Incomplete → Invalid
Revision history for this message
Rod Smith (rodsmith) wrote :

Even if it's caused by firmware being out of spec, our script should probably do a better job of catching the resulting error. It could then report a test failure with a helpful message, like "unsupported encoding in IPMI output," rather than fail with a stack trace. Alternatively, if we wanted to allow out-of-spec output, the "except:" clause could try other encodings.
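
Both options could look roughly like the sketch below. It is an illustration only, not the change later made to ipmi_test.py: the exception class, function names, and return codes are hypothetical. With the default argument, only UTF-8 is accepted and a bad byte becomes a readable failure; passing extra encodings enables the more lenient behaviour.

```python
class UnsupportedEncodingError(Exception):
    """Raised when output cannot be decoded with any allowed encoding."""


def decode_output(raw, encodings=("utf-8",)):
    """Try each encoding in order and return the first successful decode."""
    for encoding in encodings:
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue
    raise UnsupportedEncodingError(
        "unsupported encoding in IPMI output (tried: {})".format(
            ", ".join(encodings)
        )
    )


def check_output(raw, encodings=("utf-8",)):
    """Return 0 if raw decodes, 1 otherwise, printing a message on failure."""
    try:
        decode_output(raw, encodings)
    except UnsupportedEncodingError as exc:
        print("* Test failed: {}".format(exc))
        return 1
    return 0


if __name__ == "__main__":
    bad = b"K_R \xe6:\n"
    check_output(bad)                      # strict: reports a failure
    check_output(bad, ("utf-8", "cp437"))  # lenient: falls back to cp437
```

The order of the encodings matters: a codec such as cp437 accepts any byte sequence, so it only makes sense as the last entry.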

Revision history for this message
Jeff Lane  (bladernr) wrote :

Reopening, as Mao got pinged for this as well, so I'm trying some changes to ipmi_test.py to adjust codecs when necessary.

Changed in plainbox-provider-checkbox:
assignee: nobody → Jeff Lane (bladernr)
status: Invalid → In Progress
Jeff Lane  (bladernr)
tags: added: servcert-221
Seyeong Kim (seyeongkim)
tags: added: sts
Bill Wear (billwear)
Changed in maas:
status: New → Triaged
importance: Undecided → Medium
Revision history for this message
Seyeong Kim (seyeongkim) wrote :
Revision history for this message
Jeff Lane  (bladernr) wrote :

I should add that the original complainant stumbled on this bug while searching for the issue. The MAAS task is new, and has been triaged. Fixing this bug from my end (the Checkbox end) WILL NOT fix the issue in MAAS, so these are separate issues that have manifested from the same underlying problem with Huawei's use of non-UTF-8 characters here. IMO the MAAS part should have been a separate bug.

So just to set expectations, the problem with MAAS commissioning will be addressed separately by the MAAS team and the checkbox part of this bug is very low priority and will likely not be addressed for some time to come, since we are not currently certifying Huawei hardware until we work out how to handle the spun-off x86 server line.

Revision history for this message
Seyeong Kim (seyeongkim) wrote :
Revision history for this message
Jerzy Husakowski (jhusakowski) wrote :

The MAAS issue mentioned in comments is reported and fixed here: https://bugs.launchpad.net/maas/+bug/1929478 Is this still reproducible after the fix?

Changed in maas:
status: Triaged → Incomplete
Revision history for this message
Maksim Beliaev (beliaev-maksim) wrote :

Bug was migrated to GitHub: https://github.com/canonical/checkbox/issues/116.
This bug is no longer monitored here.

Changed in plainbox-provider-checkbox:
status: In Progress → Expired
Alberto Donato (ack)
Changed in maas:
status: Incomplete → New
status: New → Incomplete