Unnecessary BDM entry is created if the same volume is attached twice to an instance

Bug #1427060 reported by Ankit Agrawal on 2015-03-02
This bug affects 3 people
Affects: OpenStack Compute (nova)
Importance: Medium
Assigned to: Rikimaru Honjo

Bug Description

When we try to attach an already attached volume to an instance, Nova raises an InvalidVolume exception but still creates an entry in the block_device_mapping table with the deleted flag set to true. Ideally, when attaching a volume fails, no database entry should be created.

Steps to reproduce:

1. Create an instance named test_vm_1.
2. Create a volume named test_volume_1.
3. Verify that the instance is in the active state and the volume is in the available state.
4. Attach the volume using the command below:
    $ nova volume-attach <instance_id> <volume_id>
5. Confirm that the volume is in 'in-use' status using the command below:
    $ cinder list
6. Execute the volume-attach command again with the same volume_id:
    $ nova volume-attach <instance_id> <volume_id>

After executing step 6, an "Invalid volume" exception is raised and the already attached volume can still be used normally, which is correct. However, if you check the block_device_mapping table with the SQL query below, you will find an additional BDM entry that should never have been created.

select * from block_device_mapping where instance_uuid='ee94830b-5d39-42a7-b8c2-6175bb77563a';
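
The reproduction can also be scripted end to end. A minimal sketch, assuming a legacy python-novaclient session and direct MySQL access to the nova database (the credentials, hostnames and pymysql dependency are illustrative assumptions, not part of this report):

# Reproduce the duplicate-BDM bug, then count leaked rows.
import pymysql
from novaclient import client as nova_client

nova = nova_client.Client('2', 'admin', 'password', 'demo',
                          'http://controller:5000/v2.0')

instance_id = 'ee94830b-5d39-42a7-b8c2-6175bb77563a'
volume_id = '<volume_id>'  # left elided, as in the report

# First attach succeeds and the volume goes 'in-use'.
nova.volumes.create_server_volume(instance_id, volume_id)

# A second attach of the same volume raises "Invalid volume"...
try:
    nova.volumes.create_server_volume(instance_id, volume_id)
except Exception as exc:
    print('second attach failed as expected: %s' % exc)

# ...but a soft-deleted BDM row is still left behind.
db = pymysql.connect(host='controller', user='nova',
                     password='password', db='nova')
with db.cursor() as cur:
    cur.execute("select count(*) from block_device_mapping "
                "where instance_uuid = %s and deleted != 0",
                (instance_id,))
    print('leaked deleted BDM rows: %d' % cur.fetchone()[0])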

jichenjc (jichenjc) wrote :

If we take a look at the following code, lines 3023 and 3024 can be moved before the reserve_block_device_name call, because they only perform a status check. I will try to submit a patch and see whether there are any concurrency traps lurking there.

3007     def _attach_volume(self, context, instance, volume_id, device,
3008                        disk_bus, device_type):
3009         """Attach an existing volume to an existing instance.
3010
3011         This method is separated to make it possible for cells version
3012         to override it.
3013         """
3014         # NOTE(vish): This is done on the compute host because we want
3015         #             to avoid a race where two devices are requested at
3016         #             the same time. When db access is removed from
3017         #             compute, the bdm will be created here and we will
3018         #             have to make sure that they are assigned atomically.
3019         volume_bdm = self.compute_rpcapi.reserve_block_device_name(
3020             context, instance, device, volume_id, disk_bus=disk_bus,
3021             device_type=device_type)
3022         try:
3023             volume = self.volume_api.get(context, volume_id)
3024             self.volume_api.check_attach(context, volume, instance=instance)
3025             self.volume_api.reserve_volume(context, volume_id)
3026             self.compute_rpcapi.attach_volume(context, instance=instance,
3027                 volume_id=volume_id, mountpoint=device, bdm=volume_bdm)
3028         except Exception:
3029             with excutils.save_and_reraise_exception():
3030                 volume_bdm.destroy()
3031
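
A minimal sketch of the reordering jichenjc describes, with the status check from lines 3023-3024 hoisted above the BDM reservation so that an already-attached volume fails before any block_device_mapping row is written (names mirror the snippet above; this is an illustration, not the proposed patch itself):

def _attach_volume(self, context, instance, volume_id, device,
                   disk_bus, device_type):
    # Check the volume first: an already-attached volume raises
    # InvalidVolume here, before any BDM row exists.
    volume = self.volume_api.get(context, volume_id)
    self.volume_api.check_attach(context, volume, instance=instance)

    # Only now create the block_device_mapping entry.
    volume_bdm = self.compute_rpcapi.reserve_block_device_name(
        context, instance, device, volume_id, disk_bus=disk_bus,
        device_type=device_type)
    try:
        self.volume_api.reserve_volume(context, volume_id)
        self.compute_rpcapi.attach_volume(context, instance=instance,
            volume_id=volume_id, mountpoint=device, bdm=volume_bdm)
    except Exception:
        with excutils.save_and_reraise_exception():
            volume_bdm.destroy()

Note the window this leaves between check_attach and reserve_volume, in which a concurrent request could still attach the volume; that is the kind of concurrency trap jichenjc mentions, and the patch was later abandoned.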

Changed in nova:
status: New → Confirmed
assignee: nobody → jichenjc (jichenjc)

Fix proposed to branch: master
Review: https://review.openstack.org/163177

Changed in nova:
status: Confirmed → In Progress
Changed in nova:
importance: Undecided → Low
melanie witt (melwitt) on 2015-04-20
tags: added: volumes
removed: ntt

Change abandoned by jichenjc (<email address hidden>) on branch: master
Review: https://review.openstack.org/163177
Reason: it seems an old bug exists and there is no better way for now.

Changed in nova:
assignee: jichenjc (jichenjc) → nobody
status: In Progress → Confirmed
Nikola Đipanov (ndipanov) wrote :

This behavior can have pretty bad consequences if we fail to delete the created BDM.

In normal operation, once the BDM record is created by reserving the device name on the compute host, we attempt to mark the volume as 'attaching' in Cinder. That call fails if the volume is already attached, and as part of the cleanup we delete the BDM record we just created.

If something goes wrong with the reply message, however (a network problem, or the API worker that accepted the attach request going away in the meantime, are some of the failures I can think of), the cleanup code never runs and we are stuck with an invalid BDM that has to be cleaned up manually. This is obviously not ideal, but it is made worse by the fact that when looking for a BDM record we commonly select records only by instance and volume IDs, which are not unique in this case; Nova may then pick the wrong record by chance, causing operations like detach and migrate to fail until the database is cleaned up manually.
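
A sketch of the lookup hazard described above: once a stale row shares its (instance_uuid, volume_id) pair with the real attachment, any query keyed on that pair can return either row (the in-memory list below stands in for the block_device_mapping table; nova's real lookup goes through the BlockDeviceMapping object):

# Two rows: one real attachment and one stale reservation that the
# failed cleanup never destroyed.
rows = [
    # (id, instance_uuid, volume_id, device_name)
    (101, 'ee94830b-...', 'vol-1', '/dev/vdb'),  # real attachment
    (102, 'ee94830b-...', 'vol-1', None),        # stale reservation
]

def get_by_volume_and_instance(volume_id, instance_uuid):
    matches = [r for r in rows
               if r[1] == instance_uuid and r[2] == volume_id]
    # Effectively a "LIMIT 1": which row wins is an accident of
    # ordering, so detach or migrate may act on the stale row.
    return matches[0]

print(get_by_volume_and_instance('vol-1', 'ee94830b-...'))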

Fix proposed to branch: master
Review: https://review.openstack.org/290793

Changed in nova:
assignee: nobody → Nikola Đipanov (ndipanov)
status: Confirmed → In Progress
Matt Riedemann (mriedem) wrote :

Per comment 4, we had some patches from Dan Smith at one point to create a uuid column in the block_device_mappings table and then use it in the objects, so we could avoid issues with bdm_update_or_create in the DB API. Those never landed because they were written for a latent boot-from-volume race in cells v1, and we decided not to fix cells v1, but they could still be useful for situations like this.

Changed in nova:
importance: Low → Medium
Matt Riedemann (mriedem) wrote :

This is Dan's BDM object change to use a uuid:

https://review.openstack.org/#/c/242603/
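
A hedged sketch of what a per-row uuid buys (the names below are assumptions based on the linked review, not the merged schema): cleanup can target the exact row it created, instead of matching on the non-unique (instance_uuid, volume_id) pair.

import uuid

class BlockDeviceMapping(object):
    def __init__(self, instance_uuid, volume_id):
        # A stable identity for this row, independent of the
        # instance/volume pair it may share with duplicate rows.
        self.uuid = str(uuid.uuid4())
        self.instance_uuid = instance_uuid
        self.volume_id = volume_id

    def destroy(self):
        # DELETE FROM block_device_mapping WHERE uuid = :uuid
        # hits exactly one row, even when duplicates exist.
        print('destroying bdm %s' % self.uuid)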

Andrey Volkov (avolkov) wrote :

https://review.openstack.org/#/c/290793 has been updated according to changes in the volume API (a reserve call is added).
The order of creating the BDM and reserving the volume is reversed, on the assumption that a remote HTTP call can fail more often than a DB request.
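
A sketch of the order described in the updated review (reconstructed from the comment above, not from the patch itself): the Cinder reservation, the call judged more likely to fail, happens before the BDM row is written, and the cleanup path undoes whichever side succeeded:

def _attach_volume(self, context, instance, volume_id, device,
                   disk_bus, device_type):
    volume = self.volume_api.get(context, volume_id)
    self.volume_api.check_attach(context, volume, instance=instance)
    # Remote HTTP call first: if Cinder rejects the reservation,
    # no block_device_mapping row has been created yet.
    self.volume_api.reserve_volume(context, volume_id)
    volume_bdm = None
    try:
        volume_bdm = self.compute_rpcapi.reserve_block_device_name(
            context, instance, device, volume_id, disk_bus=disk_bus,
            device_type=device_type)
        self.compute_rpcapi.attach_volume(context, instance=instance,
            volume_id=volume_id, mountpoint=device, bdm=volume_bdm)
    except Exception:
        with excutils.save_and_reraise_exception():
            if volume_bdm is not None:
                volume_bdm.destroy()
            # Release the Cinder reservation taken above.
            self.volume_api.unreserve_volume(context, volume_id)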

Changed in nova:
assignee: Nikola Đipanov (ndipanov) → Lee Yarwood (lyarwood)

Hi Lee Yarwood, any updates on this bug? If not, I would like to work on it.

Lee Yarwood (lyarwood) wrote :

I'm going to revisit the following stack of patches in Pike and hopefully close this out for good:

https://review.openstack.org/#/q/topic:bug/1427060+status:open

Please feel free to review and rebase these if you have time, but I'd like to retain ownership of the bug as a reminder to revisit this when the next cycle opens up.

Change abandoned by Zhang Hua (<email address hidden>) on branch: master
Review: https://review.openstack.org/453451
Reason: sorry, this patch can't stop a new dirty BDM record from being created when 'volume_bdm.destroy()' itself happens to fail - https://github.com/openstack/nova/blob/master/nova/compute/api.py#L3715 - so abandoning it

Changed in nova:
assignee: Lee Yarwood (lyarwood) → Rikimaru Honjo (honjo-rikimaru-c6)