juju add-storage doesn't always grab ebs volumes on aws

Bug #1692729 reported by Adam Stokes
This bug affects 1 person
Affects          Status         Importance   Assigned to      Milestone
Canonical Juju   Fix Released   High         Andrew Wilkins

Bug Description

Using Juju 2.2rc1, built from today's develop branch (5/22/17).

Reproducer (run all in sequence):
juju bootstrap aws/us-east-1
juju deploy canonical-kubernetes
juju deploy cs:ceph-mon -n 3
juju deploy cs:ceph-osd -n 3
juju add-relation ceph-mon ceph-osd
juju add-storage ceph-osd/0 osd-devices=ebs,10G,1
juju add-storage ceph-osd/1 osd-devices=ebs,10G,1
juju add-storage ceph-osd/2 osd-devices=ebs,10G,1
juju add-relation kubernetes-master ceph-mon

Problem:
Sometimes the EBS volumes get attached; other times Juju uses the machine's loop device instead:

[adam:~] $ juju storage
[Storage]
Unit        Id             Type   Pool  Provider id            Size    Status    Message
ceph-osd/0  osd-devices/0  block  loop  volume-12-0            1.0GiB  attached
ceph-osd/1  osd-devices/1  block  ebs   vol-0aaf312a37afd4181  10GiB   attached
ceph-osd/2  osd-devices/2  block  ebs   vol-0a10b11e58c0140f0  10GiB   attached
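
A quick way to spot the fallback in output like the above is to filter on the Pool column. The sketch below is only an illustration: `loop_backed_storage` is a hypothetical helper, not part of Juju, and it assumes the whitespace-separated tabular layout shown in this report (Unit, Id, Type, Pool, Provider id, Size, Status, with an empty Message column). Other Juju versions may print a different layout.

```python
def loop_backed_storage(table_text, expected_pool="ebs"):
    """Return (unit, storage_id, pool) for rows whose pool is not expected_pool.

    Assumes the tabular `juju storage` layout from this report:
    Unit, Id, Type, Pool, Provider id, Size, Status, [Message].
    """
    mismatches = []
    for line in table_text.splitlines():
        fields = line.split()
        # Data rows start with a unit name such as "ceph-osd/0"; this skips
        # the "[Storage]" banner, the header row, and blank lines.
        if len(fields) < 7 or "/" not in fields[0]:
            continue
        unit, storage_id, _kind, pool = fields[:4]
        if pool != expected_pool:
            mismatches.append((unit, storage_id, pool))
    return mismatches


# Sample copied from the output in this report.
SAMPLE = """\
[Storage]
Unit Id Type Pool Provider id Size Status Message
ceph-osd/0 osd-devices/0 block loop volume-12-0 1.0GiB attached
ceph-osd/1 osd-devices/1 block ebs vol-0aaf312a37afd4181 10GiB attached
ceph-osd/2 osd-devices/2 block ebs vol-0a10b11e58c0140f0 10GiB attached
"""

if __name__ == "__main__":
    for unit, storage_id, pool in loop_backed_storage(SAMPLE):
        print(f"{unit}: {storage_id} is backed by '{pool}', not ebs")
```

Run against the sample, this flags only osd-devices/0 on ceph-osd/0.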

Juju status:
http://paste.ubuntu.com/24627335/

unit-ceph-osd-0.log
http://paste.ubuntu.com/24627399/

The problem is intermittent: sometimes it doesn't appear at all, and other times all of the EBS volumes fail to attach.

One additional thing to note: if we wait for canonical-kubernetes to become completely ready and only then add storage, it seems to work every time. So could this be some kind of race condition?

description: updated
Andrew Wilkins (axwalk)
Changed in juju:
status: New → Triaged
importance: Undecided → High
assignee: nobody → Andrew Wilkins (axwalk)
milestone: none → 2.2-rc1
Andrew Wilkins (axwalk)
Changed in juju:
status: Triaged → In Progress
Andrew Wilkins (axwalk) wrote :

I think the trigger for this is the unit not yet being assigned to a machine, i.e. if you add a unit and then immediately run "juju add-storage" for that unit before it has been assigned a machine, you'll get "loop" storage.
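
If that diagnosis holds, one possible workaround on affected versions is to hold off on "juju add-storage" until every unit of the application has a machine. The following is a rough sketch, not an official procedure: the function names are hypothetical, and it assumes that `juju status --format=json` lists units under `applications.<name>.units` and gives each unit a `machine` key once it has been assigned.

```python
import json
import subprocess
import time


def all_units_assigned(status, application):
    """True if `application` has at least one unit and all have a machine.

    `status` is the parsed output of `juju status --format=json`. The
    "machine" key appearing only after assignment is an assumption.
    """
    units = status.get("applications", {}).get(application, {}).get("units") or {}
    return bool(units) and all(info.get("machine") for info in units.values())


def wait_for_machines(application, timeout=600, interval=5):
    """Poll `juju status` until all units are assigned, or give up."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        raw = subprocess.run(
            ["juju", "status", "--format=json"],
            check=True, capture_output=True, text=True,
        ).stdout
        if all_units_assigned(json.loads(raw), application):
            return True
        time.sleep(interval)
    return False  # timed out; the caller decides what to do next
```

A caller would run `wait_for_machines("ceph-osd")` and only issue the three add-storage commands from the reproducer once it returns True.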

Andrew Wilkins (axwalk) wrote :

OK, I can see the problem. We're throwing away the storage constraints if the unit is not assigned to a machine when we run add-storage. We then fetch the constraints from the application's storage constraints, which in this case will be the defaults (1x1G loop device). We just need to record the constraints on the storage instance, and use those.
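
To make the failure mode concrete, here is a toy model of the behaviour described above. It is not Juju's actual code (Juju is written in Go), and every name in it is hypothetical; the only details taken from this report are the 1x1G loop default and the ebs,10G,1 request.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Constraints:
    pool: str
    size_mib: int
    count: int


# Application-level default mentioned in the comment: 1 x 1G loop device.
APP_DEFAULT = Constraints(pool="loop", size_mib=1024, count=1)


@dataclass
class StorageInstance:
    name: str
    # The proposed fix: keep the requested constraints on the instance.
    constraints: Optional[Constraints] = None


def add_storage(name, requested, record_constraints):
    """Create a storage instance for a unit that has no machine yet.

    With record_constraints=False the request is discarded (the bug).
    """
    return StorageInstance(name, requested if record_constraints else None)


def constraints_for_provisioning(instance, app_default=APP_DEFAULT):
    """Pick the constraints used once the unit finally gets a machine."""
    if instance.constraints is not None:
        return instance.constraints
    return app_default  # fallback that produced the loop device


if __name__ == "__main__":
    requested = Constraints(pool="ebs", size_mib=10 * 1024, count=1)
    buggy = add_storage("osd-devices/0", requested, record_constraints=False)
    fixed = add_storage("osd-devices/0", requested, record_constraints=True)
    print(constraints_for_provisioning(buggy).pool)
    print(constraints_for_provisioning(fixed).pool)
```

In this model the discarded request falls back to the 1024 MiB loop default, which matches the 1.0GiB loop volume seen on ceph-osd/0, while the recorded request keeps the ebs pool.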

Andrew Wilkins (axwalk) wrote :
Andrew Wilkins (axwalk)
Changed in juju:
status: In Progress → Fix Committed
Adam Stokes (adam-stokes) wrote :

Thanks Andrew, will test today

Changed in juju:
status: Fix Committed → Fix Released