Bootstrap fails with enable-os-refresh-update/upgrade = false

Bug #1819219 reported by Adam Israel
Affects: Canonical Juju
Status: Expired
Importance: Medium
Assigned to: Unassigned

Bug Description

tl;dr: Juju bootstrap to LXD will fail when enable-os-refresh-update and enable-os-upgrade are false.

# Summary

I recently tried to bootstrap a controller with enable-os-refresh-update and enable-os-upgrade set to false, which led to the discovery of a few issues involving these flags.

I tested Xenial and Bionic, but not Trusty.

# Xenial
With Xenial, the bootstrap seemed to be taking longer than normal on the "Running machine configuration script..." step. I inspected /var/log/cloud-init-output.log on the controller and noticed it was stuck in a loop, trying to install packages that were not found in the repository.

Log:
Ign:2 http://security.ubuntu.com/ubuntu xenial-security/main amd64 libnss3 amd64 2:3.28.4-0ubuntu0.16.04.4
Err:1 http://security.ubuntu.com/ubuntu xenial-security/main amd64 libnss3-nssdb all 2:3.28.4-0ubuntu0.16.04.4
  404 Not Found [IP: 91.189.92.201 80]
Err:2 http://security.ubuntu.com/ubuntu xenial-security/main amd64 libnss3 amd64 2:3.28.4-0ubuntu0.16.04.4
  404 Not Found [IP: 91.189.92.201 80]
E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/main/n/nss/libnss3-nssdb_3.28.4-0ubuntu0.16.04.4_all.deb 404 Not Found [IP: 91.189.92.201 80]

E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/main/n/nss/libnss3_3.28.4-0ubuntu0.16.04.4_amd64.deb 404 Not Found [IP: 91.189.92.201 80]

E: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?

I manually ran `apt-get update`, resolving the problem. The bootstrap then completed.
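
The manual fix can be sketched roughly as below, assuming the controller is an LXD container reachable with `lxc exec`. The function name, the `DRY_RUN` switch, and the container name are illustrative placeholders, not part of Juju or LXD; the real container name has to be looked up with `lxc list`.

```shell
# Sketch only: refresh the apt package index inside a bootstrap container
# that is stuck retrying package installs.
# refresh_apt_index and DRY_RUN are hypothetical helpers for this example.
refresh_apt_index() {
    container="$1"    # placeholder; find the real name with `lxc list`
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Preview mode: print the command instead of running it.
        echo "lxc exec $container -- apt-get update"
    else
        lxc exec "$container" -- apt-get update
    fi
}
```

Calling it with `DRY_RUN=1` first shows the exact command before anything touches the container.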

The LXD image used was current:

$ lxc image list
+-------------------+--------------+--------+-----------------------------------------------+--------+----------+-----------------------------+
| ALIAS | FINGERPRINT | PUBLIC | DESCRIPTION | ARCH | SIZE | UPLOAD DATE |
+-------------------+--------------+--------+-----------------------------------------------+--------+----------+-----------------------------+
| juju/bionic/amd64 | 0b969c724698 | no | ubuntu 18.04 LTS amd64 (release) (20190307) | x86_64 | 177.72MB | Mar 8, 2019 at 4:22pm (UTC) |
+-------------------+--------------+--------+-----------------------------------------------+--------+----------+-----------------------------+
| juju/xenial/amd64 | f43e709ea5b9 | no | ubuntu 16.04 LTS amd64 (release) (20190212) | x86_64 | 158.27MB | Mar 8, 2019 at 7:10pm (UTC) |
+-------------------+--------------+--------+-----------------------------------------------+--------+----------+-----------------------------+
| | 35f6bff57c25 | no | ubuntu 18.04 LTS amd64 (release) (20190212.1) | x86_64 | 177.63MB | Mar 5, 2019 at 6:38pm (UTC) |
+-------------------+--------------+--------+-----------------------------------------------+--------+----------+-----------------------------+

# Bionic

When bootstrapping a Bionic controller, I hit a different error:

$ juju bootstrap lxd osm --config enable-os-refresh-update=false --config enable-os-upgrade=false

[...]
Fetching Juju agent version 2.5.1 for amd64
Attempt 1 to download agent binaries from https://streams.canonical.com/juju/tools/agent/2.5.1/juju-2.5.1-ubuntu-amd64.tgz...
agent binaries from https://streams.canonical.com/juju/tools/agent/2.5.1/juju-2.5.1-ubuntu-amd64.tgz downloaded: HTTP 200; time 21.701269s; size 29110296 bytes; speed 1341426.000 bytes/s
Agent binaries downloaded successfully.
21b3e45522657aaa76491e493ba3ace92e713aa4b960cdd41d7e14e3d851e32f /var/lib/juju/tools/2.5.1-bionic-amd64/tools.tar.gz
f98f377d076dda630d80edc37f6fb6c475cac7f65d1dde4b525128d575db1192 /var/lib/juju/gui/gui.tar.bz2
Installing Juju machine agent
2019-03-08 16:24:33 INFO juju.cmd supercommand.go:57 running jujud [2.5.1 gc go1.11.5]
2019-03-08 16:24:34 INFO juju.agent identity.go:22 writing system identity file
2019-03-08 16:24:34 ERROR juju.mongo mongo.go:567 could not set the value of "/proc/sys/net/core/netdev_max_backlog" to "1000" because of: "/proc/sys/net/core/netdev_max_backlog" does not exist, will not set "1000"
2019-03-08 16:24:34 ERROR juju.mongo mongo.go:567 could not set the value of "/proc/sys/net/ipv4/tcp_fin_timeout" to "30" because of: "/proc/sys/net/ipv4/tcp_fin_timeout" does not exist, will not set "30"
2019-03-08 16:24:34 ERROR juju.mongo mongo.go:567 could not set the value of "/sys/kernel/mm/transparent_hugepage/enabled" to "never" because of: open /sys/kernel/mm/transparent_hugepage/enabled: permission denied
2019-03-08 16:24:34 ERROR juju.mongo mongo.go:567 could not set the value of "/sys/kernel/mm/transparent_hugepage/defrag" to "never" because of: open /sys/kernel/mm/transparent_hugepage/defrag: permission denied
2019-03-08 16:24:34 ERROR juju.mongo mongo.go:567 could not set the value of "/proc/sys/net/ipv4/tcp_max_syn_backlog" to "4096" because of: "/proc/sys/net/ipv4/tcp_max_syn_backlog" does not exist, will not set "4096"
2019-03-08 16:24:34 ERROR juju.mongo mongo.go:567 could not set the value of "/proc/sys/net/core/somaxconn" to "16384" because of: "/proc/sys/net/core/somaxconn" does not exist, will not set "16384"
2019-03-08 16:24:34 INFO juju.mongo mongo.go:439 Ensuring mongo server is running; data directory /var/lib/juju; port 37017
2019-03-08 16:24:34 INFO juju.mongo mongo.go:640 installing [mongodb-server-core mongodb-clients]
2019-03-08 16:24:34 INFO juju.packaging.manager utils.go:64 Running: apt-get --option=Dpkg::Options::=--force-confold --option=Dpkg::options::=--force-unsafe-io --assume-yes --quiet install mongodb-server-core
2019-03-08 16:24:34 ERROR juju.packaging.manager utils.go:109 packaging command failed: encountered fatal error: unable to locate package; cmd: "apt-get --option=Dpkg::Options::=--force-confold --option=Dpkg::options::=--force-unsafe-io --assume-yes --quiet install mongodb-server-core"; output: Reading package lists...
Building dependency tree...
Reading state information...
E: Unable to locate package mongodb-server-core

2019-03-08 16:24:34 ERROR juju.mongo mongo.go:463 cannot install/upgrade mongod (will proceed anyway): packaging command failed: encountered fatal error: unable to locate package
ERROR failed to start mongo: could not find a viable 'mongod' not found
ERROR failed to bootstrap model: subprocess encountered error code 1

# Conclusion

Bootstrapping a Juju controller to LXD, targeting either Xenial or Bionic, fails when enable-os-refresh-update and enable-os-upgrade are both disabled. Even a freshly downloaded LXD image may ship a stale apt package index (or, in the case of Bionic, one that is missing Universe).
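
To tell the two failure modes apart in a saved log, a small grep-based check along these lines may help. It is only a sketch: the function name and the labels it prints are made up for this example, and the patterns are simply the two error strings quoted in this report.

```shell
# Sketch only: classify a saved bootstrap log by the error signatures
# seen in this report. classify_bootstrap_log is a hypothetical helper.
classify_bootstrap_log() {
    log="$1"
    if grep -q 'E: Unable to locate package' "$log"; then
        # Bionic case: the package is unknown to the image's apt index.
        echo "missing-package"
    elif grep -q 'E: Failed to fetch .* 404 *Not Found' "$log"; then
        # Xenial case: the index points at .deb files no longer on the mirror.
        echo "stale-index"
    else
        echo "unknown"
    fi
}
```

Either result points to the same remedy used above: the apt index in the image needs a refresh before packages can be installed.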

As a workaround, and perhaps better practice, we can bootstrap without those parameters and then use `juju model-defaults` to change those flags for subsequent models.
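
A minimal sketch of that workaround follows, using the same cloud and controller name as the failing command in this report. The wrapper function and the `DRY_RUN` switch are illustrative additions, not Juju features.

```shell
# Sketch only: bootstrap with default update/upgrade behaviour, then turn
# both flags off as model defaults for models created afterwards.
# run, bootstrap_then_disable_updates and DRY_RUN are hypothetical helpers.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"     # preview mode: print the command only
    else
        "$@"
    fi
}

bootstrap_then_disable_updates() {
    cloud="$1"
    controller="$2"
    # Step 1: bootstrap without the two flags so the apt index gets refreshed.
    run juju bootstrap "$cloud" "$controller" &&
    # Step 2: set the flags as defaults for subsequent models.
    run juju model-defaults enable-os-refresh-update=false enable-os-upgrade=false
}
```

With `DRY_RUN=1`, `bootstrap_then_disable_updates lxd osm` prints the two commands in order. Model defaults only apply to models created later; for a model that already exists, `juju model-config` is the per-model equivalent.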

The initial approach, however, used to work and probably still should.

Tags: osm
John A Meinel (jameinel) wrote : Re: [Bug 1819219] [NEW] Bootstrap fails with enable-os-refresh-update/upgrade = false

Isn't that a case of needing better default images if you want to bootstrap
with "enable-os-refresh-update=false"? We have that flag so that if you *know*
that your apt cache information is up to date, you can ask us to skip that
step. If that isn't the case, then you shouldn't be setting the flag to
false.

On Sat, Mar 9, 2019 at 12:35 AM Adam Israel <email address hidden>
wrote:

> Public bug reported:
>
> tl;dr: Juju bootstrap to LXD will fail when enable-os-refresh-update and
> enable-os-upgrade are false.

John A Meinel (jameinel) wrote :

I feel like this is functionally "images being published have incomplete archive details".
It is likely that we can only support "juju bootstrap --config enable-os-upgrade=false", which is actually a setting I use with LXD.
I don't think there is anything *juju* can do about stale images.

Changed in juju:
importance: Undecided → Medium
status: New → Incomplete
John A Meinel (jameinel) wrote :

If I try to bootstrap with "enable-os-refresh-update=false" I end up with errors like this:
E: Failed to fetch http://archive.ubuntu.com/ubuntu/pool/main/c/ceph/librados2_12.2.12-0ubuntu0.18.04.1_amd64.deb 404 Not Found [IP: 91.189.88.24 80]

I don't see that if I only set "enable-os-upgrade=false". It seems that something with the base images is caching archive IP addresses that are not guaranteed to be available everywhere.

John A Meinel (jameinel) wrote :

What is odd is that 91.189.88.24 is still listed as one of the IP addresses that resolve for 'archive.ubuntu.com', and I can hit that URL and see a website. (Though I'm currently at a Canonical conference, whose network redirects archive.ubuntu.com to archive.conference so that archive downloads are served locally.)

Launchpad Janitor (janitor) wrote :

[Expired for juju because there has been no activity for 60 days.]

Changed in juju:
status: Incomplete → Expired