unit "(AnyCharm)" is not assigned to a machine when deploying with juju 1.22-beta5

Bug #1430049 reported by Jason Hobbs
This bug affects 2 people
Affects: juju-core
Status: Fix Released
Importance: Critical
Assigned to: Andrew Wilkins
Milestone: 1.22-beta6

Bug Description

I hit this error in jujuclient.py when deploying with juju-core 1.22-beta5:

  File "/usr/lib/python2.7/dist-packages/jujuclient.py", line 281, in _rpc
    raise EnvError(result)
EnvError: <Env Error - Details:
 { u'Error': u'unit "ceph/0" is not assigned to a machine',
    u'ErrorCode': u'not assigned',
    u'RequestId': 1,
    u'Response': { }}
 >

Longer back trace:
https://pastebin.canonical.com/127213/

It did not show up with 1.21.3 or 1.22-beta4.

Here's my debug all-machines.log:
https://pastebin.canonical.com/127212/

Revision history for this message
Jason Hobbs (jason-hobbs) wrote :

I've tested twice now and hit this both times.

Here's my juju related versions:
ii juju-core 1.22-beta5-0ubuntu1~12.04.1~juju1 Juju is devops distilled - client
ii juju-deployer 0.4.3-0ubuntu1~ubuntu12.04.1~ppa1 Deploy complex stacks of services using Juju
ii python-jujuclient 0.50.1-2 Python API client for juju-core

tags: added: oil
Revision history for this message
Kapil Thangavelu (hazmat) wrote : Re: [Bug 1430049] Re: unit "ceph/0" is not assigned to a machine when deploying with juju 1.22-beta5

Can you paste that to a public pastebin? Otherwise it's private to Canonical
employees only, and there isn't enough information in this bug to evaluate
what's going on.

On Mon, Mar 9, 2015 at 7:11 PM, Jason Hobbs <email address hidden>
wrote:

> I've tested twice now and hit this both times.
>
> Here's my juju related versions:
> ii juju-core 1.22-beta5-0ubuntu1~12.04.1~juju1
> Juju is devops distilled - client
> ii juju-deployer 0.4.3-0ubuntu1~ubuntu12.04.1~ppa1
> Deploy complex stacks of services using Juju
> ii python-jujuclient 0.50.1-2
> Python API client for juju-core

Revision history for this message
Jason Hobbs (jason-hobbs) wrote : Re: unit "ceph/0" is not assigned to a machine when deploying with juju 1.22-beta5

The debug log has auth keys in it so I can't post it without figuring out how to filter them all. Here's the traceback though:

http://pastebin.ubuntu.com/10571920/

Revision history for this message
Ian Booth (wallyworld) wrote :

Deployer traceback:

http://pastebin.ubuntu.com/10571882/

all-machines.log is full of credentials, so it can't be pasted publicly.
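
For anyone who needs to share a log like this, a rough redaction pass along the following lines might help. This is only a sketch under assumptions: the key-name fragments (`password`, `secret`, `token`, `key`) and the `key: value` / `"key": "value"` line shapes are guesses, not taken from the actual all-machines.log format, and the `redact` helper is hypothetical. It errs on the side of over-redacting, and the output should still be reviewed by hand before posting.

```python
import re

# Assumed shapes: key: value, key=value, or "key": "value".
# The key-name fragments are guesses; extend them for your own logs.
PATTERN = re.compile(
    r"""(["']?[\w-]*(?:password|secret|token|key)[\w-]*["']?\s*[:=]\s*)"""
    r"""("[^"]*"|'[^']*'|[^,}\s][^,}\n]*)""",
    re.IGNORECASE,
)


def redact(line):
    """Replace the value of any sensitive-looking key with a placeholder."""
    return PATTERN.sub(lambda m: m.group(1) + "<redacted>", line)


if __name__ == "__main__":
    print(redact("admin-secret: hunter2"))
    print(redact('{"password": "abc def", "name": "ceph"}'))
```

Unquoted values are redacted up to the next comma, closing brace, or end of line, so multi-word values such as SSH public keys are covered on a single line. Values that span several lines are not handled.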

Revision history for this message
Ian Booth (wallyworld) wrote :

Here's a pastebin of a snippet from the all machines log

http://pastebin.ubuntu.com/10571986/

It looks like the allwatcher and other juju infrastructure are trying to access ceph/0 before it is deployed. The deployment starts to happen after that.

Revision history for this message
Jason Hobbs (jason-hobbs) wrote :

FWIW I tested a third time and it failed the same way - so this is definitely a blocker for OIL as it seems to fail every time.

Revision history for this message
Ian Booth (wallyworld) wrote :

The only two juju-core changes between beta4 and beta5 were a fix to allow lxc containers to start on PPC64 hosts, and this one: https://github.com/juju/juju/pull/1707/files

I don't know much about how the deployer interfaces with the allwatcher in core, but this line from the log seems suspicious:

user-admin@local 153.352us {"RequestId":1,"Error":"unit \"ceph/0\" is not assigned to a machine","ErrorCode":"not assigned","Response":{}} AllWatcher["1"].Next

Getting the next unit from the watcher will trigger the unit's open ports to be loaded, which will fail if the unit is not assigned, but the code seems to handle that and continue without error.

The changes in PR 1707 seem innocuous in relation to the issue, but it would be interesting to test with that change backed out to see if the issue goes away.

There were some changes to introduce a new NotAssigned error but these were done around 10th Feb, well before beta4 was released.

Curtis Hovey (sinzui)
Changed in juju-core:
importance: Undecided → Critical
status: New → Triaged
Revision history for this message
Andrew Wilkins (axwalk) wrote :

Looks like the bug is here: https://github.com/juju/juju/pull/1707/files#diff-a3159d5e7a710a6e3b9c026dbd3d5a2eL118
For some reason there was a change from `IsNotAssigned(errors.Cause(err))` to `IsNotAssigned(err)`. IsNotAssigned in 1.22 does not automatically extract the cause, so the backport is broken.
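
To make the difference concrete, here is a small Python model of the behaviour described above. It is not the juju code (which is Go): `NotAssignedError`, `Annotated`, `cause` and `is_not_assigned` are hypothetical stand-ins for the not-assigned error, an error wrapped with extra context, `errors.Cause`, and the 1.22 `IsNotAssigned`. The "cannot load open ports" message is invented for illustration.

```python
class NotAssignedError(Exception):
    """Stand-in for the 'unit is not assigned to a machine' error."""


class Annotated(Exception):
    """Stand-in for an error wrapped with context; keeps the original in .cause."""

    def __init__(self, message, cause):
        super().__init__(message)
        self.cause = cause


def cause(err):
    """Follow .cause links down to the innermost error (models errors.Cause)."""
    while isinstance(err, Annotated):
        err = err.cause
    return err


def is_not_assigned(err):
    """Inspect only the error handed in, without unwrapping (models 1.22)."""
    return isinstance(err, NotAssignedError)


wrapped = Annotated(
    "cannot load open ports",  # hypothetical context message
    NotAssignedError('unit "ceph/0" is not assigned to a machine'),
)

print(is_not_assigned(wrapped))         # False: the wrapper hides the cause
print(is_not_assigned(cause(wrapped)))  # True: unwrap first, then check
```

In this model, a caller that drops the `cause(...)` step no longer recognises the wrapped error as "not assigned", so the error propagates instead of being handled, which matches the symptom reported here.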

Andrew Wilkins (axwalk)
Changed in juju-core:
status: Triaged → In Progress
assignee: nobody → Andrew Wilkins (axwalk)
milestone: none → 1.22-beta6
Revision history for this message
Dimiter Naydenov (dimitern) wrote :

Thanks for fixing this, Andrew! Since I reviewed the backport, I should've checked whether the behavior of IsNotAssigned differs between trunk and 1.22.

Andrew Wilkins (axwalk)
Changed in juju-core:
status: In Progress → Fix Committed
Revision history for this message
Ryan Beisner (1chb1n) wrote :

We are seeing the same in UOSCI with 1.22beta5. About 2 of 10 deployer deployments result in something like:
http://paste.ubuntu.com/10575332/

Revision history for this message
Ryan Beisner (1chb1n) wrote :

FYI, an example deployer loop & check showing 1 of 10 failing:
http://paste.ubuntu.com/10575487/

See L785 Iteration 8.
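
A loop-and-check of this kind could be sketched roughly as follows. This is not the script from the paste above: `count_failures` is a hypothetical helper, and the command to run (deploy plus teardown) is left to the caller, since the exact deployer invocation is not shown in this bug. It only counts runs that exit non-zero with the "not assigned" message in their output.

```python
import subprocess
import sys


def count_failures(cmd, runs, marker="is not assigned to a machine"):
    """Run cmd `runs` times; count runs that fail with `marker` in their output."""
    failures = 0
    for _ in range(runs):
        proc = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # tracebacks usually go to stderr
            universal_newlines=True,
        )
        if proc.returncode != 0 and marker in proc.stdout:
            failures += 1
    return failures


if __name__ == "__main__":
    # Placeholder command that always succeeds; substitute your own
    # deploy-and-destroy command as an argument list.
    demo = [sys.executable, "-c", "print('ok')"]
    print("%d of %d runs failed" % (count_failures(demo, 3), 3))
```

Because the failure is intermittent, a reasonably large number of runs is needed before a clean result means much.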

Revision history for this message
Curtis Hovey (sinzui) wrote :

We have test debs located at
     http://juju-ci.vapour.ws:8080/job/publish-revision/1565/artifact/

Contact sinzui/curtis to get credentials if needed. I can also provide newer debs if CI has built them.

Revision history for this message
Ryan Beisner (1chb1n) wrote :

Issue still exists with 1.22-beta6-0ubuntu1~14.04.1~juju1

4 of 25 deploys exhibited the symptom.

Tested with debs from http://juju-ci.vapour.ws:8080/view/Juju%20Revisions/job/publish-revision/1566/artifact/.

Reproducer and full output @ http://paste.ubuntu.com/10578007/.

Summary:

juju-core 1.22-beta6-0ubuntu1~14.04.1~juju1
juju-deployer 0.3.6-0ubuntu2
python-jujuclient 0.17.5-0ubuntu2

2015-03-10 21:51:37 [DEBUG] deployer.import: Adding units...
Traceback (most recent call last):
  File "/usr/bin/juju-deployer", line 9, in <module>
    load_entry_point('juju-deployer==0.3.6', 'console_scripts', 'juju-deployer')()
  File "/usr/lib/python2.7/dist-packages/deployer/cli.py", line 127, in main
    run()
  File "/usr/lib/python2.7/dist-packages/deployer/cli.py", line 225, in run
    importer.Importer(env, deployment, options).run()
  File "/usr/lib/python2.7/dist-packages/deployer/action/importer.py", line 195, in run
    self.add_units()
  File "/usr/lib/python2.7/dist-packages/deployer/action/importer.py", line 22, in add_units
    env_status = self.env.status()
  File "/usr/lib/python2.7/dist-packages/deployer/env/go.py", line 244, in status
    return self.client.get_stat()
  File "/usr/lib/python2.7/dist-packages/jujuclient.py", line 497, in get_stat
    return StatusTranslator().run(watch)
  File "/usr/lib/python2.7/dist-packages/jujuclient.py", line 886, in run
    change_set = watch.next()
  File "/usr/lib/python2.7/dist-packages/jujuclient.py", line 224, in next
    'Id': self.watcher_id})
  File "/usr/lib/python2.7/dist-packages/jujuclient.py", line 152, in _rpc
    raise EnvError(result)
jujuclient.EnvError: <Env Error - Details:
 { u'Error': u'unit "ubuntu/1" is not assigned to a machine',
    u'ErrorCode': u'not assigned',
    u'RequestId': 1,
    u'Response': { }}

Revision history for this message
Andrew Wilkins (axwalk) wrote :

For posterity, the environment is running 1.22-beta5 still, as can be seen in the output of "juju status". Ryan is testing with --upload-tools, and will check back in on it.

Revision history for this message
Ryan Beisner (1chb1n) wrote :

Ah yes, I didn't use upload-tools in that last run. With 1.22beta6 tools, this bug appears to be resolved.

25 of 25 deploys succeeded.

Reproducer script & full output: http://paste.ubuntu.com/10579638/

Ryan Beisner (1chb1n)
tags: added: openstack uosci
summary: - unit "ceph/0" is not assigned to a machine when deploying with juju
+ unit "(AnyCharm)" is not assigned to a machine when deploying with juju
1.22-beta5
Revision history for this message
Andrew Wilkins (axwalk) wrote :

Excellent, thanks for the update.

Curtis Hovey (sinzui)
Changed in juju-core:
status: Fix Committed → Fix Released