rack controller list-boot-images shows status 'unknown'

Bug #1764830 reported by John George
This bug affects 1 person
Affects: MAAS
Status: Incomplete
Importance: Critical
Assigned to: Blake Rouse
Milestone: (not shown)

Bug Description

With 2.4.0~beta2-6807-gaa74361-0ubuntu1~18.04.1~20180409~ubuntu18.04.1 the solutions QA CI times out waiting for images to sync.

At least one rack controller's status in rack controller list-boot-images stays in 'unknown' forever.

The regiond.log has a Traceback complaining about 'simplestreams.util.SignatureMissingException: No signature found!', but this may be a red herring as we see it in successful runs too, and one of the rack controllers synced successfully, meaning the region controller was able to import images successfully. Also, this shouldn't cause a rack controller's status to be unknown.

This is the third bug in a series of "racks never finish syncing" bugs that have been treated as separate bugs by the MAAS team:

bug 1760958
bug 1754493

John George (jog) wrote :
summary: - Images do not sync after a simplestreams No signature found error is
- reported in the regiond.log
+ rack controller list-boot-images shows status 'unknown'
description: updated
description: updated
Andres Rodriguez (andreserl) wrote :

I'm marking this as a duplicate of #1760958. You are also testing against a daily build of 2.4, which does not reflect the *released* version.

Please discontinue testing daily builds in favor of proper development releases, as daily builds are not a stable test target.

Also, the rev you are using doesn’t include the fix for 1760958.

Andres Rodriguez (andreserl) wrote :

Also, please feel free to re-open this if you can confirm this specific issue (rack controllers can’t sync, but regions have) by reproducing it on the 2.4~beta2 release.

Also, this could be fallout from https://bugs.launchpad.net/bugs/1754493, e.g. because simplestreams raises an exception, MAAS doesn’t have all images or descriptions, which causes the rack to fail.

Jason Hobbs (jason-hobbs) wrote :

Actually, John's bug report is against maas git rev aa74361, which is from April 9th and which does include the fix for 1760958, which landed on April 4th in bbaa2fec.

We also hit this on maas_2.4.0~beta3-6848-gb5aad23-0ubuntu1~18.04.1~20180416~ubuntu18.04.1 - logs attached from that.

We are testing daily because the releases are broken, and you've asked us to test with fixed builds.
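For reference, both package versions quoted above appear to embed the git revision and the build date, which is how aa74361 maps to April 9th. A minimal Python sketch, assuming the layout <upstream>-<commit count>-g<short hash>-<packaging>~<YYYYMMDD>~... (inferred only from the two strings in this report, not from any MAAS packaging documentation); parse_build and its field names are hypothetical:

```python
import re
from datetime import date

# Assumed layout, inferred from the two version strings in this report:
#   [maas_]<upstream>-<commit count>-g<short hash>-<packaging>~<YYYYMMDD>~...
VERSION_RE = re.compile(
    r"^(?:maas_)?(?P<upstream>[^-]+)-(?P<count>\d+)-g(?P<rev>[0-9a-f]+)-"
    r".*?~(?P<day>\d{8})~"
)


def parse_build(version):
    """Split a daily-build version string into upstream, rev and build date."""
    match = VERSION_RE.match(version)
    if match is None:
        raise ValueError("unrecognised version string: %r" % version)
    day = match.group("day")
    return {
        "upstream": match.group("upstream"),
        "commits": int(match.group("count")),
        "rev": match.group("rev"),
        "built": date(int(day[:4]), int(day[4:6]), int(day[6:])),
    }
```

This only recovers which revision a build was cut from. Whether that revision contains a given fix still has to be checked against the git history, for example with `git merge-base --is-ancestor bbaa2fec aa74361` in a MAAS checkout.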

Changed in maas:
assignee: nobody → Blake Rouse (blake-rouse)
importance: Undecided → Critical
status: New → Triaged
milestone: none → 2.4.0rc1
status: Triaged → In Progress
Andres Rodriguez (andreserl) wrote :

After some investigation, this issue is:

1. MAAS Region attempts to download images from images.maas.io
2. The Region fails to add the boot source selections, because simplestreams is unable to download image descriptions.
3. It is unable to download image descriptions because the proxy in use doesn't allow downloading that file.
4. Since the region was unable to download images, the rack is unable to download them either.

This is a side effect of https://bugs.launchpad.net/maas/+bug/1761813 . I'm setting it as incomplete for now until you guys can work around the issue of simplestreams.
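One way to confirm step 3 independently of MAAS is to request the same file both directly and through the proxy, then compare the results. A minimal sketch using only the Python standard library; fetch_status is a hypothetical helper, and the exact file and proxy address are not named in this report, so both have to be supplied by whoever runs it:

```python
import urllib.error
import urllib.request


def fetch_status(url, proxy=None, opener=None, timeout=10):
    """Try to GET url, optionally through proxy; return (ok, detail).

    `opener` can be injected for testing; by default one is built with
    urllib, adding a ProxyHandler only when a proxy is given.
    """
    if opener is None:
        handlers = []
        if proxy:
            handlers.append(
                urllib.request.ProxyHandler({"http": proxy, "https": proxy})
            )
        opener = urllib.request.build_opener(*handlers)
    try:
        with opener.open(url, timeout=timeout) as resp:
            return True, "HTTP %s" % resp.status
    except urllib.error.HTTPError as exc:
        # The server or the proxy answered, but refused the request.
        return False, "HTTP %s" % exc.code
    except urllib.error.URLError as exc:
        # No usable answer at all (DNS failure, refused connection, ...).
        return False, "error: %s" % exc.reason
```

If fetch_status(url) succeeds while fetch_status(url, proxy="http://proxy.example:3128") (placeholder address) returns an HTTP error, that points at the proxy rather than at MAAS or simplestreams.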

Changed in maas:
status: In Progress → Incomplete
Jason Hobbs (jason-hobbs) wrote :

Andres,

Thanks for finding the proxy issue.

This does not explain why one rack controller's image status is "Unknown", while the other's is "Synced".

Changed in maas:
status: Incomplete → New
Andres Rodriguez (andreserl) wrote :

Marking this as incomplete: in the latest runs, after resolving #5 and [1], this issue has not surfaced again.

Given that this bug was filed while the issue in #5 was present, and that [1] was found during the investigation, we believe this issue could now be fixed. Marking this as incomplete until we can reproduce it and get some data with the issues already mentioned resolved.

[1]: LP: #1765056

Changed in maas:
status: New → Incomplete
Chris Gregan (cgregan) wrote :

2018-04-22-09:26:47 DEBUG maas root rack-controllers read
2018-04-22-09:26:48 DEBUG maas root rack-controller list-boot-images rgcceg
2018-04-22-09:26:50 DEBUG maas root rack-controller list-boot-images 7ag7wq
2018-04-22-09:26:51 DEBUG maas root rack-controller list-boot-images bg48qb
2018-04-22-09:27:22 INFO {'synced', 'unknown'}
2018-04-22-09:27:52 DEBUG maas root rack-controllers read
2018-04-22-09:27:53 DEBUG maas root rack-controller list-boot-images rgcceg
2018-04-22-09:27:54 DEBUG maas root rack-controller list-boot-images 7ag7wq
2018-04-22-09:27:55 DEBUG maas root rack-controller list-boot-images bg48qb
2018-04-22-09:28:26 INFO {'synced', 'unknown'}
2018-04-22-09:28:56 DEBUG maas root rack-controllers read
2018-04-22-09:28:57 DEBUG maas root rack-controller list-boot-images rgcceg
2018-04-22-09:28:59 DEBUG maas root rack-controller list-boot-images 7ag7wq
2018-04-22-09:29:00 DEBUG maas root rack-controller list-boot-images bg48qb
2018-04-22-09:29:31 INFO {'synced', 'unknown'}
Traceback (most recent call last):
  File "foundation/bin/wait-for-rack-controllers", line 85, in <module>
    main(sys.argv)
  File "foundation/bin/wait-for-rack-controllers", line 80, in main
    wait_for_rack_image_sync(maas_profile)
  File "/home/ubuntu/cpe/foundation/bin/maas_cli.py", line 685, in wait_for_rack_image_sync
    sync_status))
Exception: Timed out waiting for racks to sync images: {'synced', 'unknown'}
Makefile:98: recipe for target 'infra' failed
make: *** [infra] Error 1
Connection to 10.227.81.70 closed.

Chris Gregan (cgregan)
Changed in maas:
status: Incomplete → Confirmed
Andres Rodriguez (andreserl) wrote :

Chris,

For future bugs like this, would it be possible for you guys to actually print the whole API output rather than just what you filter? It would be more helpful if the full API output is provided.
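As an illustration of that request, here is a minimal sketch of a wait loop that keeps the full per-rack API output and includes it in the timeout error. The real wait_for_rack_image_sync in maas_cli.py is not shown in this report, so this is not its implementation. list_boot_images is a hypothetical callable that returns the parsed JSON of rack-controller list-boot-images for one rack, and the "status" key is an assumption based on the 'synced' and 'unknown' values in the log above:

```python
import json
import time


def wait_for_rack_image_sync(list_boot_images, rack_ids, timeout=1800,
                             interval=30, sleep=time.sleep,
                             clock=time.monotonic):
    """Poll every rack until all report 'synced' or the timeout expires.

    On timeout the error carries the complete API responses, not only the
    filtered set of status strings.
    """
    deadline = clock() + timeout
    while True:
        # Keep the unfiltered response for each rack.
        responses = {rack_id: list_boot_images(rack_id) for rack_id in rack_ids}
        statuses = {
            rack_id: resp.get("status", "unknown")
            for rack_id, resp in responses.items()
        }
        if all(status == "synced" for status in statuses.values()):
            return statuses
        if clock() >= deadline:
            raise TimeoutError(
                "Timed out waiting for racks to sync images: %s\n"
                "Full API output:\n%s"
                % (statuses, json.dumps(responses, indent=2, sort_keys=True))
            )
        sleep(interval)
```

Keying the output by rack ID also shows directly which controller is stuck in 'unknown', which the set {'synced', 'unknown'} does not.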

Andres Rodriguez (andreserl) wrote :

Looking at the timestamps on .31

1. At 09:02, simplestreams starts the import. It looks like at that time, only the /bootloaders/ are imported in the region:

2018-04-22 09:02:15 sstreams: [info] maas:v2:download/maas:boot:grub-efi-signed:amd64:generic:uefi: to_add=['20180224.0'] to_remove=[]
2018-04-22 09:02:15 sstreams: [info] maas:v2:download/maas:boot:grub-efi:arm64:generic:uefi: to_add=['20180224.0'] to_remove=[]
2018-04-22 09:02:15 sstreams: [info] maas:v2:download/maas:boot:grub-ieee1275:ppc64el:generic:open-firmware: to_add=['20180224.0'] to_remove=[]
2018-04-22 09:02:15 sstreams: [info] maas:v2:download/maas:boot:pxelinux:i386:generic:pxe: to_add=['20160930.0'] to_remove=[]
2018-04-22 09:02:16 sstreams: [info] maas:v2:download/maas:boot:grub-efi-signed:amd64:generic:uefi: to_add=['20180224.0'] to_remove=[]
2018-04-22 09:02:16 sstreams: [info] maas:v2:download/maas:boot:grub-efi:arm64:generic:uefi: to_add=['20180224.0'] to_remove=[]
2018-04-22 09:02:16 sstreams: [info] maas:v2:download/maas:boot:grub-ieee1275:ppc64el:generic:open-firmware: to_add=['20180224.0'] to_remove=[]
2018-04-22 09:02:16 sstreams: [info] maas:v2:download/maas:boot:pxelinux:i386:generic:pxe: to_add=['20160930.0'] to_remove=[]

2. at 09:29, the CI checks if images are imported and receives a status of 'unknown':

2018-04-22-09:29:00 DEBUG maas root rack-controller list-boot-images bg48qb
2018-04-22-09:29:31 INFO {'synced', 'unknown'}

3. at 09:07, from the region controller logs I see this:

2018-04-22 09:07:04 sstreams: [info] com.ubuntu.maas:daily:v3:download/com.ubuntu.maas.daily:v3:boot:16.04:amd64:ga-16.04: to_add=['20180420'] to_remove=[]
2018-04-22 09:07:04 sstreams: [info] com.ubuntu.maas:daily:v3:download/com.ubuntu.maas.daily:v3:boot:16.04:amd64:ga-16.04-lowlatency: to_add=['20180420'] to_remove=[]
2018-04-22 09:07:04 sstreams: [info] com.ubuntu.maas:daily:v3:download/com.ubuntu.maas.daily:v3:boot:16.04:amd64:hwe-16.04: to_add=['20180420'] to_remove=[]
2018-04-22 09:07:04 sstreams: [info] com.ubuntu.maas:daily:v3:download/com.ubuntu.maas.daily:v3:boot:16.04:amd64:hwe-16.04-edge: to_add=['20180420'] to_remove=[]
2018-04-22 09:07:04 sstreams: [info] com.ubuntu.maas:daily:v3:download/com.ubuntu.maas.daily:v3:boot:16.04:amd64:hwe-16.04-lowlatency: to_add=['20180420'] to_remove=[]
2018-04-22 09:07:04 sstreams: [info] com.ubuntu.maas:daily:v3:download/com.ubuntu.maas.daily:v3:boot:16.04:amd64:hwe-16.04-lowlatency-edge: to_add=['20180420'] to_remove=[]
2018-04-22 09:07:04 sstreams: [info] com.ubuntu.maas:daily:1:bootloader-download/com.ubuntu.maas.daily:1:grub-efi-signed:uefi:amd64: to_add=['20180224.0'] to_remove=[]
2018-04-22 09:07:04 sstreams: [info] com.ubuntu.maas:daily:1:bootloader-download/com.ubuntu.maas.daily:1:grub-efi:uefi:arm64: to_add=['20180224.0'] to_remove=[]
2018-04-22 09:07:04 sstreams: [info] com.ubuntu.maas:daily:1:bootloader-download/com.ubuntu.maas.daily:1:grub-ieee1275:open-firmware:ppc64el: to_add=['20180224.0'] to_remove=[]
2018-04-22 09:07:04 sstreams: [info] com.ubuntu.maas:daily:1:bootloader-download/com.ubuntu.maas.daily:1:pxelinux:pxe:i386: to_add=['20160930.0'] to_remove=[]

What it looks like to me from this log is that:

1. Region quic...

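A timeline like the one above can be rebuilt mechanically from the region log. A minimal sketch, assuming the sstreams line format seen in the excerpts quoted in this comment; parse_sstreams and first_seen are hypothetical helper names, and lines that do not match the pattern are skipped:

```python
import re
from datetime import datetime

# Pattern inferred from the "sstreams: [info]" lines quoted in this report.
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) sstreams: \[info\] "
    r"(?P<product>\S+): to_add=\[(?P<add>[^\]]*)\] "
    r"to_remove=\[(?P<remove>[^\]]*)\]$"
)


def parse_sstreams(lines):
    """Return (timestamp, product, versions_to_add) for each matching line."""
    events = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # not an sstreams to_add/to_remove line
        events.append((
            datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S"),
            match.group("product"),
            re.findall(r"'([^']*)'", match.group("add")),
        ))
    return events


def first_seen(events):
    """Map each product to the earliest timestamp at which it was logged."""
    seen = {}
    for timestamp, product, _versions in sorted(events, key=lambda e: e[0]):
        seen.setdefault(product, timestamp)
    return seen
```

Comparing first_seen for the bootloader products against the 16.04 image products gives the gap between the 09:02 and 09:07 entries without reading the log by eye.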

Andres Rodriguez (andreserl) wrote :

Marking as incomplete as we still need logs with debug logging enabled. Beta3 was released with improved debug logging support and this has now been enabled in your CI environment.

If you reproduce this issue in your CI environment with MAAS 2.4b3, we need the logs with better debug logging.

Changed in maas:
status: Confirmed → Incomplete