Race condition: rbd device not yet available when mkfs is called

Bug #1210267 reported by Andreas Hasenack
This bug affects 1 person
Affects: rabbitmq-server (Juju Charms Collection)
Status: Fix Released
Importance: Undecided
Assigned to: Andreas Hasenack

Bug Description

When using rabbitmq-server with ceph, there is a race condition with the device setup:

2013-08-08 17:58:24 INFO juju server.go:105 worker/uniter/jujuc: running hook tool "juju-log" ["--log-level" "INFO" "ceph: Creating RBD image (rabbitmq1)."]
2013-08-08 17:58:24 DEBUG juju server.go:106 worker/uniter/jujuc: hook context id "rabbitmq-server/0:ceph-relation-changed:4966816608950231225"; dir "/var/lib/juju/agents/unit-rabbitmq-server-0/charm"
2013-08-08 17:58:24 INFO juju juju-log.go:64 rabbitmq-server/0 ceph:9: ceph: Creating RBD image (rabbitmq1).
2013-08-08 17:58:24 INFO juju server.go:105 worker/uniter/jujuc: running hook tool "juju-log" ["--log-level" "INFO" "ceph: Mapping RBD Image as a Block Device."]
2013-08-08 17:58:24 DEBUG juju server.go:106 worker/uniter/jujuc: hook context id "rabbitmq-server/0:ceph-relation-changed:4966816608950231225"; dir "/var/lib/juju/agents/unit-rabbitmq-server-0/charm"
2013-08-08 17:58:24 INFO juju juju-log.go:64 rabbitmq-server/0 ceph:9: ceph: Mapping RBD Image as a Block Device.
2013-08-08 17:58:24 INFO juju server.go:105 worker/uniter/jujuc: running hook tool "juju-log" ["--log-level" "INFO" "ceph: Formatting block device /dev/rbd/rabbitmq-server/rabbitmq1 as filesystem ext4."]
2013-08-08 17:58:24 DEBUG juju server.go:106 worker/uniter/jujuc: hook context id "rabbitmq-server/0:ceph-relation-changed:4966816608950231225"; dir "/var/lib/juju/agents/unit-rabbitmq-server-0/charm"
2013-08-08 17:58:24 INFO juju juju-log.go:64 rabbitmq-server/0 ceph:9: ceph: Formatting block device /dev/rbd/rabbitmq-server/rabbitmq1 as filesystem ext4.
2013-08-08 17:58:24 INFO juju context.go:235 worker/uniter: HOOK mke2fs 1.42 (29-Nov-2011)
2013-08-08 17:58:24 INFO juju context.go:235 worker/uniter: HOOK Could not stat /dev/rbd/rabbitmq-server/rabbitmq1 --- No such file or directory
2013-08-08 17:58:24 INFO juju context.go:235 worker/uniter: HOOK
2013-08-08 17:58:24 INFO juju context.go:235 worker/uniter: HOOK The device apparently does not exist; did you specify it correctly?
2013-08-08 17:58:24 INFO juju context.go:235 worker/uniter: HOOK Traceback (most recent call last):
2013-08-08 17:58:24 INFO juju context.go:235 worker/uniter: HOOK File "/var/lib/juju/agents/unit-rabbitmq-server-0/charm/hooks/ceph-relation-changed", line 302, in <module>
2013-08-08 17:58:24 INFO juju context.go:235 worker/uniter: HOOK utils.do_hooks(hooks)
2013-08-08 17:58:24 INFO juju context.go:235 worker/uniter: HOOK File "/var/lib/juju/agents/unit-rabbitmq-server-0/charm/hooks/lib/utils.py", line 28, in do_hooks
2013-08-08 17:58:24 INFO juju context.go:235 worker/uniter: HOOK hook_func()
2013-08-08 17:58:24 INFO juju context.go:235 worker/uniter: HOOK File "/var/lib/juju/agents/unit-rabbitmq-server-0/charm/hooks/ceph-relation-changed", line 222, in ceph_changed
2013-08-08 17:58:24 INFO juju context.go:235 worker/uniter: HOOK system_services=['rabbitmq-server'])
2013-08-08 17:58:24 INFO juju context.go:235 worker/uniter: HOOK File "/var/lib/juju/agents/unit-rabbitmq-server-0/charm/hooks/lib/ceph_utils.py", line 244, in ensure_ceph_storage
2013-08-08 17:58:24 INFO juju context.go:235 worker/uniter: HOOK make_filesystem(blk_device, fstype)
2013-08-08 17:58:24 INFO juju context.go:235 worker/uniter: HOOK File "/var/lib/juju/agents/unit-rabbitmq-server-0/charm/hooks/lib/ceph_utils.py", line 160, in make_filesystem
2013-08-08 17:58:24 INFO juju context.go:235 worker/uniter: HOOK execute(cmd)
2013-08-08 17:58:24 INFO juju context.go:235 worker/uniter: HOOK File "/var/lib/juju/agents/unit-rabbitmq-server-0/charm/hooks/lib/ceph_utils.py", line 28, in execute
2013-08-08 17:58:24 INFO juju context.go:235 worker/uniter: HOOK subprocess.check_call(cmd)
2013-08-08 17:58:24 INFO juju context.go:235 worker/uniter: HOOK File "/usr/lib/python2.7/subprocess.py", line 511, in check_call
2013-08-08 17:58:24 INFO juju context.go:235 worker/uniter: HOOK raise CalledProcessError(retcode, cmd)
2013-08-08 17:58:24 INFO juju context.go:235 worker/uniter: HOOK subprocess.CalledProcessError: Command '['mkfs', '-t', 'ext4', u'/dev/rbd/rabbitmq-server/rabbitmq1']' returned non-zero exit status 1
2013-08-08 17:58:24 ERROR juju uniter.go:352 worker/uniter: hook failed: exit status 1

In summary:
1) create pool
2) create image
3) map block storage <--- triggers the device appearing in /dev/rbd/pool/image
4) mkfs

If (4) runs too soon after (3), the device node might not exist yet, which produces the error above. Step (4) should wait until the device appears before attempting to mkfs it.
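
One way to close the gap between steps (3) and (4) is to poll for the device node before calling mkfs. The following is only a minimal sketch under stated assumptions, not the fix that was released for this bug: the function names `wait_for_device` and `make_filesystem_when_ready`, the timeout and interval values, and the use of Python 3 (`time.monotonic`) are all assumptions; the charm in the traceback runs on Python 2.7. The mkfs command line is the one shown in the traceback.

```python
import os
import subprocess
import time


def wait_for_device(path, timeout=10.0, interval=0.5):
    """Poll until `path` exists.

    Returns True if the path appeared before the timeout, False otherwise.
    `timeout` and `interval` are in seconds (illustrative defaults).
    """
    deadline = time.monotonic() + timeout
    while True:
        if os.path.exists(path):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


def make_filesystem_when_ready(blk_device, fstype="ext4", timeout=10.0,
                               interval=0.5):
    """Hypothetical wrapper: wait for the mapped device, then run mkfs."""
    if not wait_for_device(blk_device, timeout=timeout, interval=interval):
        raise RuntimeError(
            "Gave up waiting for block device %s after %s seconds"
            % (blk_device, timeout))
    # Same command as in the traceback: mkfs -t <fstype> <device>
    subprocess.check_call(["mkfs", "-t", fstype, blk_device])
```

With something like this, the hook would call `make_filesystem_when_ready("/dev/rbd/rabbitmq-server/rabbitmq1")` after mapping the image. A bounded wait keeps the hook from hanging forever if the mapping itself failed, and the explicit error makes that case easier to tell apart from an mkfs failure.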

Tags: landscape

Changed in rabbitmq-server (Juju Charms Collection):
assignee: nobody → Andreas Hasenack (ahasenack)
status: New → In Progress
Changed in rabbitmq-server (Juju Charms Collection):
status: In Progress → Fix Released
tags: added: landscape