[xenial][maas beta5] [arm64] system tries to enlist when I commission.

Bug #1590121 reported by Manoj Iyer
This bug affects 1 person
Affects  Status   Importance  Assigned to    Milestone
MAAS     Invalid  Critical    Newell Jensen

Bug Description

An ARM64 system with UEFI tries to enlist the node again when you try to commission it. It then fails as follows:

[ 121.020478] cloud-init[1267]: Processing triggers for ureadahead (0.100.0-19) ...
[ 124.739959] cloud-init[1267]: Cloud-init v. 0.7.7 running 'modules:config' at Thu, 11 Feb 2016 16:28:22 +0000. Up 86.22 seconds.
[ OK ] Started Apply the settings specified in cloud-config.
         Starting Execute cloud user/final scripts...
[ 128.784828] cloud-init[2746]: % Total % Received % Xferd Average Speed Time Time Time Current
[ 128.786270] cloud-init[2746]: Dload Upload Total Spent Left Speed
100 250 0 0 100 250 0 156 0:00:01 0:00:01 --:--:-- 156
[ 130.385886] cloud-init[2746]: curl: (22) The requested URL returned error: 400 BAD REQUEST
[ 130.515864] cloud-init[2746]: % Total % Received % Xferd Average Speed Time Time Time Current
[ 130.516862] cloud-init[2746]: Dload Upload Total Spent Left Speed
100 250 0 0 100 250 0 159 0:00:01 0:00:01 --:--:-- 159
[ 132.088443] cloud-init[2746]: curl: (22) The requested URL returned error: 400 BAD REQUEST
[ 132.135514] cloud-init[2746]: =============================================
[ 132.136606] cloud-init[2746]: failed to enlist system maas server
[ 132.137338] cloud-init[2746]: sleeping 60 seconds then poweroff
[ 132.138041] cloud-init[2746]: login with 'ubuntu:ubuntu' to debug and disable poweroff
[ 132.138982] cloud-init[2746]: =============================================
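
When comparing several console logs like the one above, a small script can pull out the failure signature. This is only a minimal sketch, assuming the console output has been saved as text; the function name `find_enlist_failure` and the shape of its return value are hypothetical and not part of MAAS or cloud-init.

```python
import re

# Patterns taken from the console log in this report.
CURL_ERROR = re.compile(r"curl: \((\d+)\) The requested URL returned error: (\d{3})")
ENLIST_FAILED = "failed to enlist system maas server"


def find_enlist_failure(console_text):
    """Return a summary of enlistment failure markers found in console_text."""
    # Collect the HTTP status code from every curl error line.
    http_errors = [int(m.group(2)) for m in CURL_ERROR.finditer(console_text)]
    return {
        "enlist_failed": ENLIST_FAILED in console_text,
        "http_errors": http_errors,
    }
```

Run against the log above, this would report the enlistment failure together with the two 400 responses.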

Changed in maas:
assignee: nobody → Newell Jensen (newell-jensen)
status: New → Triaged
importance: Undecided → Critical
Newell Jensen (newell-jensen) wrote :

I ran tcpdump on the TFTP traffic for a failing and a succeeding attempt:

http://paste.ubuntu.com/17122532/

We think this is a regression in the firmware as this was working on an earlier firmware version.

Changed in maas:
status: Triaged → In Progress
Mike Pontillo (mpontillo) wrote :

Newell, could we do a full traffic capture rather than just grabbing the text-based tcpdump output? That is, you can run tcpdump as follows (from an SSH session, which is why the extra filter is in there):

    tcpdump -s 0 'port not 22' -n -w attempt1.pcap

(Of course, kill tcpdump between attempts and change the filename to get both the success and failure attempts.)
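
To keep the success and failure captures apart, the same command can be generated per attempt. A minimal sketch, assuming tcpdump is installed and the script runs with enough privileges to capture; the helper names `capture_command` and `capture` are hypothetical, and the flags are the ones from the command above.

```python
import subprocess


def capture_command(attempt, exclude_port=22):
    """Build the tcpdump argument list for one numbered attempt."""
    return [
        "tcpdump",
        "-s", "0",                       # do not truncate packets
        "-n",                            # do not resolve names
        "-w", f"attempt{attempt}.pcap",  # one capture file per attempt
        f"port not {exclude_port}",      # keep the SSH session out of the capture
    ]


def capture(attempt):
    """Run tcpdump for one attempt until interrupted with Ctrl-C."""
    try:
        subprocess.run(capture_command(attempt), check=False)
    except KeyboardInterrupt:
        pass  # Ctrl-C ends the capture; the pcap file is already written
```

For example, call `capture(1)` before the failing boot, stop it with Ctrl-C, then call `capture(2)` before the succeeding one.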

Newell Jensen (newell-jensen) wrote :

Mike,

They are already downgrading the FW to the version that was not showing this issue. In the event that this is still an issue for some reason, I can do what you mentioned. Thanks for mentioning this.

Mike Pontillo (mpontillo) wrote :

Right, I didn't realize the text output was enough to diagnose the issue this time. Thanks!

Manoj Iyer (manjo) wrote :

Grub2 fails to load the configfile named after net_default_mac, although the firmware passes in the correct value for net_default_mac. (I even exported that variable so that the configfile can see it.) It seems that re-requesting the same configfile a second time works. I have asked folks familiar with this FW to take a look at this from a FW perspective, because I believe this request involves the FW too.

>>Start PXE over IPv4.
  Station IP address is 10.110.48.102

  Server IP address is 10.110.48.210
  NBP filename is grubaa64.efi
  NBP filesize is 1952768 Bytes
 Downloading NBP file...

  NBP file downloaded successfully.
Unknown command `#'.
Try `help' for usage
Unknown command `#'.
Try `help' for usage
DAY=9
HOUR=16
MINUTE=50
MONTH=6
SECOND=52
WEEKDAY=Thursday
YEAR=2016
check_signatures=no
cmdpath=(tftp,10.110.48.210)
color_highlight=black/light-gray
color_normal=light-gray/black
feature_200_final=y
feature_all_video_module=y
feature_chainloader_bpb=y
feature_default_font_path=y
feature_menuentry_id=y
feature_menuentry_options=y
feature_nativedisk_cmd=y
feature_ntldr=y
feature_platform_search_hint=y
feature_timeout_style=y
grub_cpu=arm64
grub_platform=efi
lang=
locale_dir=
net_default_interface=efinet0
net_default_ip=10.110.48.102
net_default_mac=e4:1d:2d:bb:68:00
net_default_server=10.110.48.210
net_efinet0_boot_file=grubaa64.efi
net_efinet0_domain=maas
net_efinet0_ip=10.110.48.102
net_efinet0_mac=e4:1d:2d:bb:68:00
pager=
prefix=(tftp,10.110.48.210)/boot/grub
pxe_default_server=10.110.48.210
root=tftp,10.110.48.210
secondary_locale_dir=
exporting variable net_default_mac
Requesting grub/grub.cfg-net-default-mac

Requesting grub/grub.cfg-net-default-mac failed
Re-try Requesting grub/grub.cfg-net-default-mac

Booting under MAAS direction...
Commission Phase
e4:1d:2d:bb:68:00
Commission Phase
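
The variable dump above shows net_default_mac set correctly, so the failure is in fetching the per-MAC configfile rather than in the variable itself. The sketch below parses such a dump and models the retry that worked around the failure. It assumes the per-MAC file is named `grub/grub.cfg-<mac>` and that a failed fetch would otherwise fall through to a default (enlistment) config, which would explain the symptom; both are assumptions about the MAAS GRUB setup, not something confirmed in this report. `fetch` is a hypothetical callable standing in for the TFTP request.

```python
def parse_grub_vars(dump):
    """Parse KEY=VALUE lines from a GRUB variable dump into a dict."""
    variables = {}
    for line in dump.splitlines():
        key, sep, value = line.strip().partition("=")
        # Skip lines without '=' (e.g. "Unknown command" noise) and odd keys.
        if sep and key and " " not in key:
            variables[key] = value
    return variables


def choose_config(variables, fetch, retries=1):
    """Try the per-MAC config, retrying on failure, then fall back.

    `fetch(path)` is a hypothetical stand-in for the TFTP request and
    returns True on success. The path layout is an assumption.
    """
    path = "grub/grub.cfg-" + variables["net_default_mac"]
    for _ in range(1 + retries):
        if fetch(path):
            return path
    return "default (enlist) config"  # assumed fallback when every try fails
```

With one retry, a fetch that fails once and then succeeds still selects the per-MAC config, which matches the "Commission Phase" output above.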

Newell Jensen (newell-jensen) wrote :

Seeing that an earlier version of FW was known to work (as per what Manoj mentioned) and no changes have been made to the UEFI ARM64 boot method code, AFAICT this is a regression in the FW. I just spoke with Manoj and he is still waiting to downgrade the FW to verify.

Newell Jensen (newell-jensen) wrote :

Setting to Incomplete until we get the required information from Manoj. We still need to understand why there was a regression when upgrading the firmware for this hardware (it was known to work with an earlier firmware version).

Changed in maas:
status: In Progress → Incomplete
Changed in maas:
status: Incomplete → Invalid