Occasional stucking of manual provisioning

Bug #1904877 reported by David
This bug affects 1 person
Affects: Canonical Juju
Status: Triaged
Importance: Low
Assigned to: Unassigned

Bug Description

Hello,

I have seen an issue with the manual provisioning of charms that happens very occasionally.

The issue is basically that the charm machine gets completely stuck in the pending state.

While debugging it, I found a message in the cloud-init logs saying there is no installation candidate for cpu-checker. Running `apt update` solved the problem.

(Sorry I don't have the logs)

I think this issue could be solved by checking the provisioning script that Juju generates; there is probably something we could improve there to make it more robust. I imagine that maybe the first `apt update` is not finishing properly, but I'm not sure.

summary: - Ocasional stucking of manual provisioning
+ Occasional stucking of manual provisioning
Heather Lanigan (hmlanigan) wrote :

Interesting, when I just provisioned a manual bionic machine, `apt-get update` and `apt-get upgrade` were run.

What are the enable-os-refresh-update and enable-os-upgrade model config values? If both are set to false, Juju will not run `apt update`. They are true by default.

What series is the manual machine?

It'd be helpful to have the /var/log/cloud-init-output.log when this occurs.
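
The check described above can be sketched as follows. This is a minimal sketch, not part of Juju: the function name `apt_refresh_expected` is hypothetical, and the rule it encodes is only the one stated in this comment (no `apt update` when both values are false).

```bash
# Hypothetical helper: given the two model config values, report whether
# Juju would be expected to run apt update during provisioning.
# Rule taken from the comment above: skipped only if BOTH values are false.
apt_refresh_expected() {
    refresh="$1"   # value of enable-os-refresh-update
    upgrade="$2"   # value of enable-os-upgrade
    if [ "$refresh" = "false" ] && [ "$upgrade" = "false" ]; then
        echo "no"
    else
        echo "yes"
    fi
}

# Usage (requires the juju CLI and a selected model):
#   apt_refresh_expected \
#       "$(juju model-config enable-os-refresh-update)" \
#       "$(juju model-config enable-os-upgrade)"
```

With the default values (both true) the helper prints `yes`.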

Changed in juju:
status: New → Incomplete
David (davigar15) wrote :

Here are some more logs: https://pastebin.canonical.com/p/GnbYpqTN95/

You can see these in the logs:

E: Package 'cpu-checker' has no installation candidate
Reading package lists...
Building dependency tree...
Reading state information...
Package cpu-checker is not available, but is referred to by another package.
This may mean that the package is missing, has been obsoleted, or
is only available from another source

It stays stuck there indefinitely, but once I ran `apt update` on the machine, the provisioning finished successfully.
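
To spot this condition on a stuck machine, one could scan the log for the error line quoted above. This is only a diagnostic sketch: the function name `missing_candidates` is hypothetical, and the default path assumes the log is `/var/log/cloud-init-output.log`, as requested earlier in this thread.

```bash
# Hypothetical diagnostic: list packages that apt reported as having
# no installation candidate in a provisioning log.
missing_candidates() {
    log_file="${1:-/var/log/cloud-init-output.log}"
    # Matches lines such as:
    #   E: Package 'cpu-checker' has no installation candidate
    sed -n "s/^E: Package '\(.*\)' has no installation candidate\$/\1/p" "$log_file" | sort -u
}

# Usage:
#   missing_candidates /var/log/cloud-init-output.log
```

An empty result means the log contains no such error; any package name printed suggests the apt index was stale or incomplete when the install step ran.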

Pen Gale (pengale) wrote :

I'm not sure that the solution here lies within Juju. With manual provisioning, you're taking on a certain amount of responsibility for ensuring that the machine can come up to a ready state. It sounds like you need to troubleshoot what's breaking with apt on these machines, and fix it. The Juju machine agent can't do too much about it ...

Launchpad Janitor (janitor) wrote :

[Expired for juju because there has been no activity for 60 days.]

Changed in juju:
status: Incomplete → Expired
David (davigar15) wrote :

According to the script and the logs, the update does not complete properly. If I log in to the VM and run a simple `apt update`, the script gets unblocked.

The fact that the provisioning script provided by the controller sometimes behaves inconsistently got me thinking this might be an issue on the Juju side (at least in the script).

Provisioning script:

```bash
echo 'Running apt-get update' >&$JUJU_PROGRESS_FD
package_manager_loop apt-get --option=Dpkg::Options::=--force-confold --option=Dpkg::Options::=--force-unsafe-io --assume-yes --quiet update
echo 'Running apt-get upgrade' >&$JUJU_PROGRESS_FD
package_manager_loop apt-get --option=Dpkg::Options::=--force-confold --option=Dpkg::Options::=--force-unsafe-io --assume-yes --quiet upgrade
echo 'Installing curl, cpu-checker, bridge-utils, tmux, ubuntu-fan' >&$JUJU_PROGRESS_FD
package_manager_loop apt-get --option=Dpkg::Options::=--force-confold --option=Dpkg::Options::=--force-unsafe-io --assume-yes --quiet install curl
```
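
The excerpt calls `package_manager_loop` without showing its definition. For readers unfamiliar with the pattern, a bounded retry wrapper could look like the sketch below. This is a hypothetical stand-in, not the definition Juju ships: the name `retry_command` and the `RETRY_MAX` and `RETRY_DELAY` variables are illustrative assumptions.

```bash
# Hypothetical stand-in for a package-manager retry wrapper.
# NOT Juju's package_manager_loop; names and defaults are assumptions.
# Runs the given command until it succeeds or RETRY_MAX attempts are used,
# sleeping RETRY_DELAY seconds between attempts.
retry_command() {
    max="${RETRY_MAX:-5}"
    delay="${RETRY_DELAY:-10}"
    attempt=1
    while true; do
        "$@" && return 0
        rc=$?                      # exit status of the failed command
        if [ "$attempt" -ge "$max" ]; then
            return "$rc"           # give up, preserving the last status
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
}

# Usage:
#   retry_command apt-get --assume-yes --quiet update
```

Note that retrying only the install step would not help if the package index itself is incomplete; in the reports above, it was re-running the update that unblocked provisioning.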

Output of /var/log/cloud-init-output.log

```
Running apt-get update
Get:1 http://security.ubuntu.com/ubuntu bionic-security InRelease [88.7 kB]
Hit:2 http://archive.ubuntu.com/ubuntu bionic InRelease
Get:3 http://archive.ubuntu.com/ubuntu bionic-updates InRelease [88.7 kB]
Get:4 http://archive.ubuntu.com/ubuntu bionic-backports InRelease [74.6 kB]
Get:5 http://security.ubuntu.com/ubuntu bionic-security/main amd64 Packages [1573 kB]
Get:6 http://archive.ubuntu.com/ubuntu bionic/universe amd64 Packages [8570 kB]
Get:7 http://security.ubuntu.com/ubuntu bionic-security/universe amd64 Packages [1113 kB]
Cloud-init v. 20.4.1-0ubuntu1~18.04.1 running 'modules:final' at Mon, 15 Mar 2021 15:36:02 +0000. Up 23.46 seconds.
Cloud-init v. 20.4.1-0ubuntu1~18.04.1 finished at Mon, 15 Mar 2021 15:36:02 +0000. Datasource DataSourceOpenStackLocal [net,ver=2]. Up 23.58 seconds
Get:8 http://security.ubuntu.com/ubuntu bionic-security/universe Translation-en [249 kB]
Get:9 http://security.ubuntu.com/ubuntu bionic-security/multiverse amd64 Packages [19.1 kB]
Get:10 http://security.ubuntu.com/ubuntu bionic-security/multiverse Translation-en [4412 B]
Get:11 http://archive.ubuntu.com/ubuntu bionic/universe Translation-en [4941 kB]
Get:12 http://archive.ubuntu.com/ubuntu bionic/multiverse amd64 Packages [151 kB]
Get:13 http://archive.ubuntu.com/ubuntu bionic/multiverse Translation-en [108 kB]
Get:14 http://archive.ubuntu.com/ubuntu bionic-updates/main amd64 Packages [1937 kB]
Get:15 http://archive.ubuntu.com/ubuntu bionic-updates/main Translation-en [397 kB]
Get:16 http://archive.ubuntu.com/ubuntu bionic-updates/restricted amd64 Packages [274 kB]
Get:17 http://archive.ubuntu.com/ubuntu bionic-updates/restricted Translation-en [36.5 kB]
Get:18 http://archive.ubuntu.com/ubuntu bionic-updates/universe amd64 Packages [1720 kB]
Get:19 http://archive.ubuntu.com/ubuntu bionic-updates/universe Translation-en [364 kB]
Get:20 http://archive.ubuntu.com/ubuntu bionic-updates/multiverse amd64 Packages [24.9 kB]
Get:21 http://archive.ubuntu.com/ubuntu bionic-updates/multiverse Translation-en [6464 B]
Get:22 http://archive.ubuntu.com/ubuntu bionic-backports/main amd64 Packages [10.0 kB]
Get:23 http://archive.ubuntu.co...
```

David (davigar15) wrote :

This has not happened again since we started running this command before the provisioning:

```
cloud-init status --wait
```

I wonder if this could simply be added to the provisioning script, rather than leaving that responsibility to the user.
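
A guard along these lines could run before the first `apt-get update`. This is a minimal sketch of the idea, not the change made in Juju: the function name `wait_for_cloud_init` and the `CLOUD_INIT` override variable are hypothetical, and only `cloud-init status --wait` comes from the workaround above.

```bash
# Hypothetical guard: block until cloud-init has finished, if it is installed.
# CLOUD_INIT is an illustrative override so the helper can be tried with a stub.
wait_for_cloud_init() {
    cloud_init_bin="${CLOUD_INIT:-cloud-init}"
    if ! command -v "$cloud_init_bin" >/dev/null 2>&1; then
        # Manually provisioned machines may not have cloud-init at all.
        echo "cloud-init not found, skipping wait"
        return 0
    fi
    echo "Waiting for cloud-init to finish"
    # Returns the exit status of cloud-init, so callers can react to failures.
    "$cloud_init_bin" status --wait >/dev/null
}

# Usage, before any package manager call:
#   wait_for_cloud_init && apt-get --assume-yes --quiet update
```

Skipping the wait when cloud-init is absent keeps the guard harmless on machines that were never initialised by cloud-init.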

Pen Gale (pengale) wrote :

Re-opening as bitesize, as I like the above suggestion, and it should be trivial to implement.

Changed in juju:
status: Expired → Triaged
importance: Undecided → Medium
tags: added: bitesize
Canonical Juju QA Bot (juju-qa-bot) wrote :

This Medium-priority bug has not been updated in 60 days, so we're marking it Low importance. If you believe this is incorrect, please update the importance.

Changed in juju:
importance: Medium → Low
tags: added: expirebugs-bot