[2.1.1] 2.1.0 preseeds with '{{node_disable_pxe_data|escape.shell}}' cause a failed deployment if preseed wasn't upgraded on 2.1.1.

Bug #1642996 reported by Jeff Lane 
This bug affects 1 person
Affects   Status         Importance   Assigned to   Milestone
MAAS      Fix Released   Critical     Unassigned
2.1       Fix Released   Critical     Unassigned

Bug Description

It seems that every recent update to MAAS is modifying curtin_userdata. We have customized curtin_userdata files that do some things we need done on deployment. These are broken with the last three updates I've performed (and broken in customer environments that use our tooling as well).

I expected to have to fix things moving from MAAS 1.9 to 2.0. And we did so, and so our curtin_userdata files were based on this, from MAAS 2.0:
#cloud-config
debconf_selections:
 maas: |
  {{for line in str(curtin_preseed).splitlines()}}
  {{line}}
  {{endfor}}
{{if third_party_drivers and driver}}
early_commands:
  {{py: key_string = ''.join(['\\x%x' % x for x in map(ord, driver['key_binary'])])}}
  driver_00_get_key: /bin/echo -en '{{key_string}}' > /tmp/maas-{{driver['package']}}.gpg
  driver_01_add_key: ["apt-key", "add", "/tmp/maas-{{driver['package']}}.gpg"]
  driver_02_add: ["add-apt-repository", "-y", "deb {{driver['repository']}} {{node.get_distro_series()}} main"]
  driver_03_update_install: ["sh", "-c", "apt-get update --quiet && apt-get --assume-yes install {{driver['package']}}"]
  driver_04_load: ["sh", "-c", "depmod && modprobe {{driver['module']}}"]
{{endif}}
late_commands:
  maas: [wget, '--no-proxy', '{{node_disable_pxe_url|escape.shell}}', '--post-data', '{{node_disable_pxe_data|escape.shell}}', '-O', '/dev/null']
{{if third_party_drivers and driver}}
  driver_00_key_get: curtin in-target -- sh -c "/bin/echo -en '{{key_string}}' > /tmp/maas-{{driver['package']}}.gpg"
  driver_02_key_add: ["curtin", "in-target", "--", "apt-key", "add", "/tmp/maas-{{driver['package']}}.gpg"]
  driver_03_add: ["curtin", "in-target", "--", "add-apt-repository", "-y", "deb {{driver['repository']}} {{node.get_distro_series()}} main"]
  driver_04_update_install: ["curtin", "in-target", "--", "apt-get", "update", "--quiet"]
  driver_05_install: ["curtin", "in-target", "--", "apt-get", "-y", "install", "{{driver['package']}}"]
  driver_06_depmod: ["curtin", "in-target", "--", "depmod"]
  driver_07_update_initramfs: ["curtin", "in-target", "--", "update-initramfs", "-u"]
{{endif}}

power_state:
  mode: reboot

{{if node.split_arch()[0] in {'i386', 'amd64'} }}
apt_mirrors:
  ubuntu_archive: http://{{main_archive_hostname}}/{{main_archive_directory}}
  ubuntu_security: http://{{main_archive_hostname}}/{{main_archive_directory}}
{{else}}
apt_mirrors:
  ubuntu_archive: http://{{ports_archive_hostname}}/{{ports_archive_directory}}
  ubuntu_security: http://{{ports_archive_hostname}}/{{ports_archive_directory}}
{{endif}}

{{if enable_http_proxy}}
{{if http_proxy }}
apt_proxy: {{http_proxy}}
{{else}}
apt_proxy: http://{{server_host}}:8000/
{{endif}}
{{endif}}

Now, after doing a dist-upgrade once 2.1 appeared in maas/stable, deployments started failing. Investigation revealed this was because between 2.0 and 2.1, the last several lines of curtin_userdata, specifically these:
{{if node.split_arch()[0] in {'i386', 'amd64'} }}
apt_mirrors:
  ubuntu_archive: http://{{main_archive_hostname}}/{{main_archive_directory}}
  ubuntu_security: http://{{main_archive_hostname}}/{{main_archive_directory}}
{{else}}
apt_mirrors:
  ubuntu_archive: http://{{ports_archive_hostname}}/{{ports_archive_directory}}
  ubuntu_security: http://{{ports_archive_hostname}}/{{ports_archive_directory}}
{{endif}}

{{if enable_http_proxy}}
{{if http_proxy }}
apt_proxy: {{http_proxy}}
{{else}}
apt_proxy: http://{{server_host}}:8000/
{{endif}}
{{endif}}

were removed. So, just two days ago, I modified our custom curtin_userdata files to match the one in MAAS 2.1 (MAAS Version 2.1.0+bzr5480-0ubuntu1), which looks like this:
#cloud-config
debconf_selections:
 maas: |
  {{for line in str(curtin_preseed).splitlines()}}
  {{line}}
  {{endfor}}
{{if third_party_drivers and driver}}
early_commands:
  {{py: key_string = ''.join(['\\x%x' % x for x in map(ord, driver['key_binary'])])}}
  driver_00_get_key: /bin/echo -en '{{key_string}}' > /tmp/maas-{{driver['package']}}.gpg
  driver_01_add_key: ["apt-key", "add", "/tmp/maas-{{driver['package']}}.gpg"]
  driver_02_add: ["add-apt-repository", "-y", "deb {{driver['repository']}} {{node.get_distro_series()}} main"]
  driver_03_update_install: ["sh", "-c", "apt-get update --quiet && apt-get --assume-yes install {{driver['package']}}"]
  driver_04_load: ["sh", "-c", "depmod && modprobe {{driver['module']}}"]
{{endif}}
late_commands:
  maas: [wget, '--no-proxy', '{{node_disable_pxe_url|escape.shell}}', '--post-data', '{{node_disable_pxe_data|escape.shell}}', '-O', '/dev/null']
{{if third_party_drivers and driver}}
  driver_00_key_get: curtin in-target -- sh -c "/bin/echo -en '{{key_string}}' > /tmp/maas-{{driver['package']}}.gpg"
  driver_02_key_add: ["curtin", "in-target", "--", "apt-key", "add", "/tmp/maas-{{driver['package']}}.gpg"]
  driver_03_add: ["curtin", "in-target", "--", "add-apt-repository", "-y", "deb {{driver['repository']}} {{node.get_distro_series()}} main"]
  driver_04_update_install: ["curtin", "in-target", "--", "apt-get", "update", "--quiet"]
  driver_05_install: ["curtin", "in-target", "--", "apt-get", "-y", "install", "{{driver['package']}}"]
  driver_06_depmod: ["curtin", "in-target", "--", "depmod"]
  driver_07_update_initramfs: ["curtin", "in-target", "--", "update-initramfs", "-u"]
{{endif}}

power_state:
  mode: reboot

Today, another update appears in maas/stable and apparently Xenial Updates:
bladernr@critical-maas:~$ apt-cache policy maas
maas:
  Installed: 2.1.1+bzr5544-0ubuntu1~16.04.1
  Candidate: 2.1.1+bzr5544-0ubuntu1~16.04.1
  Version table:
 *** 2.1.1+bzr5544-0ubuntu1~16.04.1 500
        500 http://us.archive.ubuntu.com/ubuntu xenial-updates/main amd64 Packages
        500 http://us.archive.ubuntu.com/ubuntu xenial-updates/main i386 Packages
        100 /var/lib/dpkg/status
     2.1.1+bzr5544-0ubuntu1~16.04.1 500
        500 http://ppa.launchpad.net/maas/stable/ubuntu xenial/main amd64 Packages
        500 http://ppa.launchpad.net/maas/stable/ubuntu xenial/main i386 Packages

And that one makes further changes to curtin_userdata which, once again, break deployments. NOW curtin_userdata looks like so:
#cloud-config
debconf_selections:
 maas: |
  {{for line in str(curtin_preseed).splitlines()}}
  {{line}}
  {{endfor}}
{{if third_party_drivers and driver}}
early_commands:
  {{py: key_string = ''.join(['\\x%x' % x for x in map(ord, driver['key_binary'])])}}
  driver_00_get_key: /bin/echo -en '{{key_string}}' > /tmp/maas-{{driver['package']}}.gpg
  driver_01_add_key: ["apt-key", "add", "/tmp/maas-{{driver['package']}}.gpg"]
  driver_02_add: ["add-apt-repository", "-y", "deb {{driver['repository']}} {{node.get_distro_series()}} main"]
  driver_03_update_install: ["sh", "-c", "apt-get update --quiet && apt-get --assume-yes install {{driver['package']}}"]
  driver_04_load: ["sh", "-c", "depmod && modprobe {{driver['module']}}"]
{{endif}}
late_commands:
  maas: [wget, '--no-proxy', {{node_disable_pxe_url|escape.json}}, '--post-data', {{node_disable_pxe_data|escape.json}}, '-O', '/dev/null']
{{if third_party_drivers and driver}}
  driver_00_key_get: curtin in-target -- sh -c "/bin/echo -en '{{key_string}}' > /tmp/maas-{{driver['package']}}.gpg"
  driver_02_key_add: ["curtin", "in-target", "--", "apt-key", "add", "/tmp/maas-{{driver['package']}}.gpg"]
  driver_03_add: ["curtin", "in-target", "--", "add-apt-repository", "-y", "deb {{driver['repository']}} {{node.get_distro_series()}} main"]
  driver_04_update_install: ["curtin", "in-target", "--", "apt-get", "update", "--quiet"]
  driver_05_install: ["curtin", "in-target", "--", "apt-get", "-y", "install", "{{driver['package']}}"]
  driver_06_depmod: ["curtin", "in-target", "--", "depmod"]
  driver_07_update_initramfs: ["curtin", "in-target", "--", "update-initramfs", "-u"]
{{endif}}

power_state:
  mode: reboot

That's twice in a week I've had to fix problems in our version of curtin_userdata due to changes pushed into Xenial updates.

This time, the culprit is this:
Old version:
maas: [wget, '--no-proxy', '{{node_disable_pxe_url|escape.shell}}', '--post-data', '{{node_disable_pxe_data|escape.shell}}', '-O', '/dev/null']
New version:
maas: [wget, '--no-proxy', {{node_disable_pxe_url|escape.json}}, '--post-data', {{node_disable_pxe_data|escape.json}}, '-O', '/dev/null']
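To make the difference between the two lines concrete, here is a minimal sketch of how each form might render. It assumes that escape.shell behaves like Python's shlex.quote and escape.json like json.dumps; that mapping is an assumption for illustration, not something confirmed in this report, and the URL and POST data are made-up values. It does not claim to reproduce the failure reported here.

```python
import json
import shlex


def render_old(url: str, data: str) -> str:
    """2.1.0-style line: the template supplies single quotes around a
    shell-escaped value (assumed equivalent to shlex.quote)."""
    return (
        "maas: [wget, '--no-proxy', '%s', '--post-data', '%s', '-O', '/dev/null']"
        % (shlex.quote(url), shlex.quote(data))
    )


def render_new(url: str, data: str) -> str:
    """2.1.1-style line: the filter emits its own double quotes
    (assumed equivalent to json.dumps), so the template adds none."""
    return (
        "maas: [wget, '--no-proxy', %s, '--post-data', %s, '-O', '/dev/null']"
        % (json.dumps(url), json.dumps(data))
    )


# Hypothetical values, not taken from a real MAAS server.
url = "http://maas.example/MAAS/metadata/latest/by-id/abc123/"
simple_data = "op=netboot_off"
tricky_data = "a=1&b=2"

print(render_old(url, simple_data))
print(render_new(url, simple_data))
# With a value that needs shell quoting, the old form ends up doubly quoted.
print(render_old(url, tricky_data))
print(render_new(url, tricky_data))
```

Under these assumptions the two forms agree for simple values, but they diverge as soon as the value contains a character the shell escaper quotes, which is one reason the quoting has to match what the running MAAS version expects.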

SO... we need a way to allow for customizations to curtin_userdata that don't break every time there's a MAAS update pushed to stable or updates.

Perhaps there's a way to have a default curtin_userdata that imports things in a curtin_userdata.d directory? Or a method in the MAAS UI to add customizations that MAAS then injects into the stock curtin_userdata template?
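As a rough illustration of the curtin_userdata.d idea, the sketch below merges site-specific late_commands into a stock set without touching the stock entries. Nothing like this exists in MAAS as far as this report shows; the function name, the fragment layout, and the example commands are all hypothetical.

```python
from typing import Dict, List, Union

Command = Union[str, List[str]]


def merge_late_commands(
    stock: Dict[str, Command], fragments: List[Dict[str, Command]]
) -> Dict[str, Command]:
    """Return the stock late_commands plus every fragment entry.

    Fragments may only add new keys; overriding a stock key is refused so
    that the line MAAS itself maintains always comes from the current release.
    """
    merged = dict(stock)
    for fragment in fragments:
        for name, command in fragment.items():
            if name in stock:
                raise ValueError("fragment may not override stock command %r" % name)
            merged[name] = command
    return merged


# Hypothetical stock entry and site additions.
stock = {"maas": ["wget", "--no-proxy", "URL", "--post-data", "DATA", "-O", "/dev/null"]}
site = [
    {"cert_00_install": ["curtin", "in-target", "--", "apt-get", "-y", "install", "some-package"]},
]

merged = merge_late_commands(stock, site)
print(sorted(merged))
```

The point of the design is that the stock "maas" entry is never copied into the customization, so a change to it in a point release cannot leave a stale copy behind.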

Or perhaps another solution. I understand that we have a fork of curtin_userdata, but I have a hard time believing that Cert is the only customer who has curtin customizations for deployment.

Jeff Lane  (bladernr)
description: updated
Andres Rodriguez (andreserl) wrote :
Changed in maas:
status: New → Won't Fix
Andres Rodriguez (andreserl) wrote :
Jeff Lane  (bladernr) wrote :

That is not a valid answer. That does not address the issue.

1: user-provided-preseeds does NOT address the fact that this line of code in today's MAAS:

maas: [wget, '--no-proxy', {{node_disable_pxe_url|escape.json}}, '--post-data', {{node_disable_pxe_data|escape.json}}, '-O', '/dev/null']

does not get imported into a user override (e.g. curtin_userdata_ubuntu)

2: and this line in curtin_userdata_ubuntu from MAAS earlier this week still breaks today's MAAS deployments:

maas: [wget, '--no-proxy', '{{node_disable_pxe_url|escape.shell}}', '--post-data', '{{node_disable_pxe_data|escape.shell}}', '-O', '/dev/null']

Let's start with the stock Ubuntu images, and let's assume I have the following on Tuesday:
MAAS 2.1.0 from maas/stable or Xenial updates
curtin_userdata (stock from MAAS 2.1.0)
curtin_userdata_ubuntu (my customized user-provided-preseed that takes curtin_userdata and adds some extra lines to install certain packages and is based on the stock file from MAAS 2.1.0)

And I did deployments and MAAS uses curtin_userdata_ubuntu and deployments work fine.

Overnight, MAAS 2.1.1 is pushed to stable or Xenial updates and I end up with this on Friday:
MAAS 2.1.1 from maas/stable or Xenial updates
curtin_userdata (stock from MAAS 2.1.1)
curtin_userdata_ubuntu (same customized file, still based on the stock file from MAAS 2.1.0)

NOW when I do deployments, just as I did on Tuesday and Wednesday and Thursday, MAAS uses the curtin_userdata_ubuntu file and deployments fail across the board.

Because now the code in curtin_userdata in 2.1.1 is expected to be different from the code in curtin_userdata in 2.1.0.

I apologise if I'm not adequately describing the problem here. So this is what the expectation should be:

The documented user-provided-preseeds should NOT BREAK between minor releases of MAAS. As I tried to explain before, I would expect breakage going from 1.x to 2.x. I do NOT expect breakage going from 2.X to 2.Y, nor from 2.X.a to 2.X.b.

And it's entirely possible that the fault lies in Curtin instead, if it's a change in Curtin that now breaks things. If so, a Curtin task should be added after some triage/investigation.

So let's forget entirely that there are customizations in the file. Because ultimately, the customizations are not the issue here; they are a red herring. The issue is that customizations set up a very fast avenue to having out-of-sync default code in curtin_userdata* that breaks deployments instantly.

The real issue at hand is that the code in curtin_userdata from 2.1.0 should NOT break deployments in 2.1.1.

Andres Rodriguez (andreserl) wrote :

Ok, I think I see what you mean now. You mean that preseeds from 2.1.0 that had the following:

 maas: [wget, '--no-proxy', '{{node_disable_pxe_url|escape.shell}}', '--post-data', '{{node_disable_pxe_data|escape.shell}}', '-O', '/dev/null']

no longer deploy successfully, and as such, you had to update to what 2.1.1 provides now, which is:

maas: [wget, '--no-proxy', {{node_disable_pxe_url|escape.json}}, '--post-data', {{node_disable_pxe_data|escape.json}}, '-O', '/dev/null']

 ?

summary: - Every update to MAAS seems to change curtin_userdata, breaking
- deployments using older versions with customizations
+ [2.1.1] 2.1.0 preseeds with '{{node_disable_pxe_data|escape.shell}}'
+ cause a failed deployment if preseed wasn't upgraded on 2.1.1.
Changed in maas:
importance: Undecided → Critical
status: Won't Fix → New
milestone: none → 2.2.0
Jeff Lane  (bladernr) wrote :

Yes, exactly. Sorry for the confusion with the talk about customizations. The diff between the stock curtin_userdata and our version is only the addition of some late commands that install packages and configure NICs. We don't change ANY of the default lines in curtin_userdata.

There's also an added stanza of Python code that gets some info about MAAS itself and passes that along to the node during late commands, but even that is just an addition that doesn't affect usability, not a modification to the base curtin_userdata code that MAAS uses by default.

Jeff Lane  (bladernr) wrote :

I don't have an easy way to test this, unfortunately, because it requires rebuilding my MAAS server (I really need to sort out running MAAS in containers so I can bounce between them more quickly)...

But this is what I've experienced so far this week:

MAAS 2.0 + curtin_userdata from MAAS 2.0: works
MAAS 2.1.0 + curtin_userdata from MAAS 2.0: broken
MAAS 2.1.0 + curtin_userdata from MAAS 2.1.0: works
MAAS 2.1.1 + curtin_userdata from MAAS 2.1.0: broken
MAAS 2.1.1 + curtin_userdata from MAAS 2.1.1: works
MAAS 2.0 + curtin_userdata from MAAS 2.1.1: broken

(That last one was tested by Rod; the others were ones I've hit over the last couple of weeks as my server was updated from 2.0 to 2.1 to 2.1.1.)
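Given the matrix above, one quick way to spot a custom preseed that is out of step with a 2.1.1 server is to look for the 2.1.0-style filter on the PXE-disable line. This is only a sketch: the helper name is made up, and it checks for nothing beyond the one difference identified in this report.

```python
import re

# Matches {{node_disable_pxe_url|escape.shell}} or the _data variant,
# tolerating optional whitespace inside the braces.
OLD_FILTER = re.compile(
    r"\{\{\s*node_disable_pxe_(?:url|data)\s*\|\s*escape\.shell\s*\}\}"
)


def uses_old_escape(template_text: str) -> bool:
    """Return True if the template still carries the 2.1.0-style line."""
    return bool(OLD_FILTER.search(template_text))


old_line = (
    "  maas: [wget, '--no-proxy', '{{node_disable_pxe_url|escape.shell}}', "
    "'--post-data', '{{node_disable_pxe_data|escape.shell}}', '-O', '/dev/null']"
)
new_line = (
    "  maas: [wget, '--no-proxy', {{node_disable_pxe_url|escape.json}}, "
    "'--post-data', {{node_disable_pxe_data|escape.json}}, '-O', '/dev/null']"
)

print(uses_old_escape(old_line))
print(uses_old_escape(new_line))
```

In practice the text would be read from the custom file (for example curtin_userdata_ubuntu) before an upgrade is rolled out.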

Andres Rodriguez (andreserl) wrote : Re: [Bug 1642996] Re: [2.1.1] 2.1.0 preseeds with '{{node_disable_pxe_data|escape.shell}}' cause a failed deployment if preseed wasn't upgraded on 2.1.1.

Thanks Jeff. We understand what the problem is so we will have a fix for
backwards compat.

Thanks for testing.

On Fri, Nov 18, 2016 at 3:11 PM Jeff Lane <email address hidden>
wrote:

> I don't have an easy way to test this, unfortunately, because it
> requires rebuildng my maas server (I really need to sort out running
> MAAS in containers so I can bounce between them more quickly)...
>
> But this is what I've experienced so far this week:
>
> MAAS 2.0 + curtin_userdata from MAAS 2.0: works
> MAAS 2.1.0 + curtin_userdata from MAAS 2.0: broken
> MAAS 2.1.0 + curtin_userdata from MAAS 2.1.0: works
> MAAS 2.1.1 + curtin_userdata from MAAS 2.1.0: broken
> MAAS 2.1.1 + curtin_userdata from MAAS 2.1.1: works
> MAAS 2.0 + curtin_userdata from MAAS 2.1.1: broken
>
> (That last one was tested by Rod the others were ones I've hit over the
> last couple weeks as my server was updated from 2.0 to 2.1 to 2.1.1)

Changed in maas:
status: New → Fix Committed
Changed in maas:
milestone: 2.2.0 → none
status: Fix Committed → Fix Released