unit failing to get unit-get private-address in the install hook

Bug #1577556 reported by David Ames on 2016-05-02
This bug affects 4 people
Affects                          Status  Importance  Assigned to  Milestone
OpenStack Charm Test Infra               High        Unassigned
juju                                     High        Unassigned
juju-core                                Critical    Unassigned
1.25                                     Critical    Unassigned
ubuntu-openstack-ci                      Critical    Unassigned
mysql (Juju Charms Collection)           High        Unassigned

Bug Description

2016-05-02 19:33:02 INFO install error: private-address not set
2016-05-02 19:33:02 INFO install Traceback (most recent call last):
2016-05-02 19:33:02 INFO install File "/var/lib/juju/agents/unit-mysql-5/charm/hooks/install.real", line 6, in <module>
2016-05-02 19:33:02 INFO install import lib.utils as utils
2016-05-02 19:33:02 INFO install File "/var/lib/juju/agents/unit-mysql-5/charm/hooks/lib/utils.py", line 151, in <module>
2016-05-02 19:33:02 INFO install def get_host_ip(hostname=unit_get('private-address')):
2016-05-02 19:33:02 INFO install File "/var/lib/juju/agents/unit-mysql-5/charm/hooks/lib/utils.py", line 126, in unit_get
2016-05-02 19:33:02 INFO install value = subprocess.check_output(cmd).strip() # IGNORE:E1103
2016-05-02 19:33:02 INFO install File "/usr/lib/python2.7/subprocess.py", line 573, in check_output
2016-05-02 19:33:02 INFO install raise CalledProcessError(retcode, cmd, output=output)
2016-05-02 19:33:02 INFO install subprocess.CalledProcessError: Command '['unit-get', 'private-address']' returned non-zero exit status 1
2016-05-02 19:33:02 ERROR juju.worker.uniter.operation runhook.go:107 hook "install" failed: exit status 1
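
The traceback shows `unit_get('private-address')` being evaluated as a default argument of `get_host_ip`, which means it runs once when `lib/utils.py` is imported, so a single failed lookup aborts the whole hook before any charm code runs. A minimal sketch of deferring the lookup to call time follows. The `unit_get` body is a stand-in rather than the charm's actual helper, the `lookup` parameter is hypothetical, and the name resolution the real `get_host_ip` presumably performs is omitted.

```python
import subprocess


def unit_get(attribute):
    """Stand-in for the charm helper: shells out to the unit-get hook tool."""
    cmd = ["unit-get", attribute]
    return subprocess.check_output(cmd).decode().strip()


def get_host_ip(hostname=None, lookup=unit_get):
    """Resolve the default hostname when called, not when the module loads.

    `lookup` is a hypothetical parameter so the function can be exercised
    outside a hook context; resolution of the hostname is left out.
    """
    if hostname is None:
        hostname = lookup("private-address")
    return hostname
```

With this shape, importing the module never invokes `unit-get`; the command only runs when a caller needs the default.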

David Ames (thedac) on 2016-05-02
Changed in mysql (Juju Charms Collection):
status: New → Triaged
importance: Undecided → High
milestone: none → 16.07
Marco Ceppi (marcoceppi) wrote :

What version of Juju is this?

David Ames (thedac) wrote :

Marco,

This is 1.25 on xenial. It is possible this is a juju bug, though I have only seen this on the mysql charm so far.

1.25.5-xenial-amd64

Trent Lloyd (lathiat) wrote :

I saw this same issue with someone trying to deploy openstack-base in #juju: the same error, on multiple charms, with Xenial + 1.25.
It's unclear to me why private-address would fail to fetch; it works OK when I try it manually, and just returned the public-address.

One thing I also saw that may or may not be related, particularly with the lxd provider: if you have both IPv4 and IPv6, the private/public address randomly has either the IPv4 or the IPv6 address. If you deploy 4 units you might get 1 IPv6 and 3 IPv4, or some other permutation. This causes issues with this code, where get_host_ip will fail but the juju get-address part works fine. So it's probably a separate issue, but I mention it just in case it twigs something here.
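
To illustrate the mixed-family case, here is a small sketch of telling an IPv4 literal, an IPv6 literal, and a hostname apart before deciding how to handle the value. The function name `address_family` is hypothetical, and this is not taken from the charm; it only assumes Python 3's standard `ipaddress` module.

```python
import ipaddress


def address_family(value):
    """Return 4 or 6 for an IP literal, or None if value is not an IP literal."""
    try:
        return ipaddress.ip_address(value).version
    except ValueError:
        # Not a literal address, so it is presumably a hostname.
        return None
```

A caller could use the result to skip name resolution for literals of either family and only resolve genuine hostnames.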

James Page (james-page) on 2016-05-03
Changed in mysql (Juju Charms Collection):
status: Triaged → Fix Released
Edward Hope-Morley (hopem) wrote :
Changed in juju-core:
status: New → Triaged
importance: Undecided → High
Changed in juju-core:
milestone: none → 1.25.6
Ryan Beisner (1chb1n) wrote :

FYI: This continues to impact openstack charm test automation, in all charms. It's not specific to percona or mysql.

James Page (james-page) wrote :

Example error from UOSCI:

https://openstack-ci-reports.ubuntu.com/artifacts/test_charm_pipeline_amulet_full/openstack/charm-ceph-osd/298412/7/2016-07-06_18-12-23/index.html

Symptomatically the deployment fails with a hook execution error; juju status reveals that the unit is in a problem state:

mysql/0 error idle 1.25.5 10 3306/tcp hook failed: "shared-db-relation-changed" for glance:shared-db

...

10 pending 1.25.5 a8a71066-18db-4d94-83ee-c0bb0de97635 trusty arch=amd64 cpu-cores=1 mem=1536M root-disk=10240M availability-zone=nova

Ryan Beisner (1chb1n) wrote :

Changed topic as this is not specific to any one charm.

summary: - mysql charm is failing to get unit-get private-address in the install
- hook
+ unit failing to get unit-get private-address in the install hook
James Page (james-page) on 2016-07-11
Changed in ubuntu-openstack-ci:
status: New → Triaged
importance: Undecided → Critical
Curtis Hovey (sinzui) on 2016-07-14
Changed in juju-core:
milestone: 1.25.6 → 1.25.7
Martin Packman (gz) on 2016-07-21
tags: added: intermittent-failure network
tags: added: kanban-cross-team
tags: removed: kanban-cross-team
Curtis Hovey (sinzui) on 2016-08-23
Changed in juju-core:
milestone: 1.25.7 → none
importance: High → Undecided
status: Triaged → Incomplete
Changed in juju:
status: New → Triaged
importance: Undecided → High
milestone: none → 2.0-beta18
Changed in juju-core:
status: Incomplete → Invalid
Curtis Hovey (sinzui) on 2016-09-09
Changed in juju:
milestone: 2.0-beta18 → 2.0-beta19
Changed in juju:
milestone: 2.0-beta19 → 2.0-rc1
Changed in juju:
milestone: 2.0-rc1 → 2.0-rc2
Changed in juju:
milestone: 2.0-rc2 → 2.0.1
Changed in juju-core:
status: Invalid → Triaged
importance: Undecided → Critical
Curtis Hovey (sinzui) on 2016-10-28
Changed in juju:
milestone: 2.0.1 → none
Changed in juju:
milestone: none → 2.1.0
Ryan Beisner (1chb1n) on 2016-11-28
Changed in charm-test-infra:
status: New → Confirmed
importance: Undecided → High
tags: added: uosci
Ryan Beisner (1chb1n) wrote :

In ServerStack, this is observable as follows. Juju status output indicates:

 - A machine is stuck in a pending state.

 - The machine has an instance ID.

 - The machine instance is alive and well from the cloud and cloudinit perspective.

 - The unit is reachable and the console log shows normal acquisition of addresses and metadata.

 - The unit has no address in juju status.

 - The unit is in an error workload state, with a failed hook.

It appears that hooks are firing before private addresses are available from Juju.
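
If that hypothesis holds, a charm-side workaround would be to retry the lookup for a short while instead of failing on the first non-zero exit. The following is only a sketch under that assumption: `unit_get_with_retry` and its `attempts`, `delay`, `run`, and `sleep` parameters are hypothetical names, and only the `unit-get private-address` command itself comes from the traceback in this report.

```python
import subprocess
import time


def unit_get_with_retry(attribute, attempts=5, delay=2.0,
                        run=subprocess.check_output, sleep=time.sleep):
    """Call unit-get, retrying while it fails or returns an empty value.

    `run` and `sleep` are injectable so the sketch can be tried without Juju.
    """
    cmd = ["unit-get", attribute]
    for attempt in range(1, attempts + 1):
        try:
            value = run(cmd).decode().strip()
            if value:
                return value
        except subprocess.CalledProcessError:
            # unit-get exited non-zero, e.g. "private-address not set".
            pass
        if attempt < attempts:
            # Linear backoff: wait a little longer after each failure.
            sleep(delay * attempt)
    raise RuntimeError(
        "%s still not set after %d attempts" % (attribute, attempts))
```

This would mask the symptom rather than explain it, so it does not replace gathering the logs needed to find the root cause.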

Changed in juju-core:
milestone: none → 1.25.11
John A Meinel (jameinel) wrote :

"The unit is reachable and the console log ...": do you mean the machine? (Generally, a unit refers to a particular installation of an application.)

It would be interesting to have the context of cloud-init-output.log and machine-X.log as well as the 'machine-0.log' from the controller.

I don't think we can fire any hooks until we have addresses, because the machine agent is the thing that starts the unit agent, and it would have reported its addresses as the first action.

It's possible it is something like "the juju agent fails to get installed" or "the juju agent fails to contact the controller", which leaves us in a state where we know the machine we want to put the Unit on, but as far as we can tell that machine is not ready for us. (Readiness is signaled by the machine agent calling back to the controller.)

As soon as you do a "deploy" or an "add-unit", Juju has a record of the Unit you want to be adding, so just having that unit in "pending" state or waiting for "allocation" doesn't mean we've actually gotten to the point where we can do anything with that unit.

Other sources of context are things like "juju status --format=yaml" which includes a lot of details that don't fit well in tabular view.
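
As a sketch of how such structured status output could be checked for the symptom described in this report, the snippet below scans a status document for machines that have an instance ID but no address. It uses JSON (as from `juju status --format=json`) rather than YAML so that only the standard library is needed. The key names `machines`, `instance-id`, and `dns-name` are assumptions about the status layout, and the sample document is hand-written, not real Juju output.

```python
import json


def machines_without_address(status):
    """Return ids of machines that have an instance ID but no dns-name."""
    stuck = []
    for machine_id, info in sorted(status.get("machines", {}).items()):
        if info.get("instance-id") and not info.get("dns-name"):
            stuck.append(machine_id)
    return stuck


# Hand-written sample for illustration only; key names are assumed.
sample = json.loads("""
{
  "machines": {
    "0": {"instance-id": "i-0", "dns-name": "10.5.0.3"},
    "10": {"instance-id": "a8a71066-18db-4d94-83ee-c0bb0de97635"}
  }
}
""")

print(machines_without_address(sample))
```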

Changed in juju:
status: Triaged → Incomplete
Changed in juju-core:
status: Triaged → Incomplete
Changed in juju:
milestone: 2.1.0 → none
Changed in juju-core:
milestone: 1.25.11 → none
Ryan Beisner (1chb1n) on 2017-02-15
tags: added: repro-needed
Ryan Beisner (1chb1n) on 2017-10-13
Changed in charm-test-infra:
status: Confirmed → Invalid