OpenStack Compute (nova)

Bug #673756
Comment #3

Mark McLoughlin (markmc) wrote on 2012-02-02:

Ok, so there's two things allegedly going on here:

1) A volume is marked as 'available' before it is actually possible to attach it

2) Attaching a volume sometimes appears to succeed even though the attach actually failed because the volume wasn't ready

Now, in VolumeManager.create_volume(), we mark the volume as available just before returning successfully. That implies to me that the call shouldn't return until the volume is available. And, as Soren points out, that means create_export() should block until it's ready.

That means that this loop in test_002_can_attach_volume() doesn't make sense:

        for x in xrange(10):
            volume.update()
            if volume.status.startswith('available'):
                break
            time.sleep(1)
        else:
            self.fail('cannot attach volume with state %s' % volume.status)

i.e. create_volume() has already allegedly succeeded, so the volume should already have 'available' status.

To reproduce the bug, I think you just need to remove the two five second sleeps between create_volume() and attach_volume() in the test. I haven't tried reproducing yet, so it'd be nice to know what cases are failing - iSCSI only, or all volume drivers? tgtadm or ietadm? Xen/KVM? With an instance running on the volume host, or remotely, or both?

In the case of ISCSIDriver, the last thing we do in create_export() is run 'tgtadm --op new --mode=logicalunit ...'. So, we need to understand the semantics of that. Is it supposed to block until the LUN is available? If not, how do we poll for completion?
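
If it turns out the command doesn't block, something along these lines could poll for completion at the end of create_export(). This is only a sketch: wait_until() and the predicate passed to it are hypothetical names, not existing nova code, and what the predicate should actually check to decide the LUN is ready still needs to be worked out.

```python
import time


def wait_until(predicate, timeout=10.0, interval=1.0,
               sleep=time.sleep, clock=time.monotonic):
    """Call predicate() until it returns a true value or the timeout expires.

    Hypothetical helper, not part of nova. Returns True if the predicate
    succeeded and False if the timeout was reached first. The sleep and
    clock arguments exist only so the helper can be tested without waiting.
    """
    deadline = clock() + timeout
    while True:
        if predicate():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

A driver could then call something like wait_until(lun_is_ready) before returning, where lun_is_ready is an assumed function that reports whether the new LUN is visible, and raise an error instead of letting the volume be marked 'available' if it returns False.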

And we also need to figure out why attach_volume returns successfully when it has failed.