Static Ceph mon IP addresses in connection_info can prevent VM startup

Bug #1452641 reported by Arne Wiebalck on 2015-05-07
This bug affects 16 people
Affects: OpenStack Compute (nova)  Status: In Progress  Importance: Medium  Assigned to: Corey Bryant
Affects: nova (Ubuntu)  Status: In Progress  Importance: Medium  Assigned to: Corey Bryant

Bug Description

The Cinder rbd driver extracts the IP addresses of the Ceph mon servers from the Ceph mon map when the instance/volume connection is established. This info is then stored in nova's block-device-mapping table and is never re-validated down the line.
Changing the Ceph mon servers' IP addresses will prevent the instance from booting, as the stale connection info will enter the instance's XML. One idea to fix this would be to use the information from ceph.conf directly, which should point to an alias or a load balancer.
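
For illustration, the connection_info blob that the Cinder rbd driver returns and that nova then persists in the block_device_mapping table looks roughly like the following (a simplified sketch using the pool/volume name and mon IPs from the XML quoted later in this bug; exact keys can vary by release):

~~~
# Illustrative only: approximate shape of the rbd connection_info that
# ends up stored, verbatim, in nova's block_device_mapping table.
connection_info = {
    "driver_volume_type": "rbd",
    "data": {
        "name": "volumes/volume-6d04520d-0029-499c-af81-516a7ba37a54",
        # Mon addresses captured from the monmap at attach time; these
        # are never re-validated, so a mon re-IP leaves them stale.
        "hosts": ["192.168.200.12", "192.168.200.14", "192.168.200.24"],
        "ports": ["6789", "6789", "6789"],
        "auth_enabled": True,
        "auth_username": "nova",
    },
}
~~~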

Josh Durgin (jdurgin) wrote :

Nova stores the volume connection info in its db, so updating that
would be a workaround to allow restart/migration of vms to work.
Otherwise running vms shouldn't be affected, since they'll notice any
new or deleted monitors through their existing connection to the
monitor cluster.

Perhaps the most general way to fix this would be for cinder to return
any monitor hosts listed in ceph.conf (as they are listed, so they may
be hostnames or ips) in addition to the ips from the current monmap
(the current behavior).

That way an out of date ceph.conf is less likely to cause problems,
and multiple clusters could still be used with the same nova node.
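
A minimal sketch of what Josh is suggesting, assuming a hypothetical helper inside the Cinder rbd driver (this is not the actual Cinder code): read the mon host entries from ceph.conf as written and append them to the IPs taken from the current monmap.

~~~
# Sketch only (hypothetical helper, not the real Cinder rbd driver code):
# return the monmap IPs first (current behaviour), then any hosts listed
# in ceph.conf that are not already present, as written (hostnames or IPs).
import configparser

def merged_mon_hosts(monmap_ips, ceph_conf="/etc/ceph/ceph.conf"):
    parser = configparser.ConfigParser()
    parser.read(ceph_conf)
    # ceph.conf may spell the option "mon_host" or "mon host".
    conf_value = (parser.get("global", "mon_host", fallback="")
                  or parser.get("global", "mon host", fallback=""))
    conf_hosts = [h.strip() for h in conf_value.split(",") if h.strip()]
    merged = list(monmap_ips)
    merged += [h for h in conf_hosts if h not in merged]
    return merged
~~~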

Changed in cinder:
importance: Undecided → Medium
status: New → Confirmed
Eric Harney (eharney) on 2015-05-07
tags: added: ceph

The problem with adding hosts to the list in Cinder is that those previous mon hosts might be re-used in another Ceph cluster, thereby causing an authentication error when a VM tries an incorrect mon host at boot time. (This is due to the Ceph client's behaviour of not trying another monitor after an authentication error... which I think is rather sane.)

Bin Zhou (binzhou) on 2016-03-07
Changed in cinder:
assignee: nobody → Bin Zhou (binzhou)

Unassigning due to no activity.

Changed in cinder:
assignee: Bin Zhou (binzhou) → nobody
Eric Harney (eharney) on 2016-11-08
tags: added: drivers
Changed in cinder:
assignee: nobody → Jon Bernard (jbernard)
Kevin Fox (kevpn) wrote :

How are you supposed to deal with needing to re-IP mons?

Unassigning due to no activity for > 6 months.

Changed in cinder:
assignee: Jon Bernard (jbernard) → nobody
Matt Riedemann (mriedem) wrote :

Talked about this at the Queens PTG; notes are in here:

https://etherpad.openstack.org/p/cinder-ptg-queens

Changed in nova:
status: New → Confirmed
importance: Undecided → Medium
no longer affects: cinder
tags: added: volumes
removed: drivers
Walt Boring (walter-boring) wrote :

I have a customer that is seeing something similar to this. I thought about filing a new bug, but it might be sufficient to just piggy-back on this one.

They have running VMs that boot from a Ceph volume and also have additional Ceph volumes attached.
He adds a new monitor to his ceph cluster and updates ceph.conf on all of the openstack nodes to reflect the new monitor IP.

He does a live migration to try to get nova to update the libvirt.xml, and it seems that only the disk from the volumes pool is updated, not the disk from the vms pool.

He added a patch to migration.py to fix this, but wasn't sure it was the right thing to do. I have added his patch as an attachment here.
Let me know if this might be ok, and I can submit the patch to gerrit.

This is a copy of xml after the live migrate.

    <disk type='network' device='disk'>
      <driver name='qemu' type='raw' cache='none'/>
      <auth username='nova'>
        <secret type='ceph' uuid='820ccd0b-b180-4528-93ed-76ae82edf832'/>
      </auth>
      <source protocol='rbd' name='vms/3b97914e-3f9b-410a-b3d9-6c1a83244136_disk'> <-- this one is NOT changed, old ips
        <host name='192.168.200.12' port='6789'/>
        <host name='192.168.200.14' port='6789'/>
        <host name='192.168.200.24' port='6789'/>
        <host name='192.168.240.17' port='6789'/>
        <host name='192.168.240.23' port='6789'/>
      </source>
      <backingStore/>
      <target dev='vda' bus='virtio'/>
      <alias name='virtio-disk0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </disk>
    <disk type='network' device='disk'>
      <driver name='qemu' type='raw' cache='none'/>
      <auth username='nova'>
        <secret type='ceph' uuid='820ccd0b-b180-4528-93ed-76ae82edf832'/>
      </auth>
      <source protocol='rbd' name='volumes/volume-6d04520d-0029-499c-af81-516a7ba37a54'> <-- this one is changed, new ips
        <host name='192.168.200.12' port='6789'/>
        <host name='192.168.200.14' port='6789'/>
        <host name='192.168.200.24' port='6789'/>
        <host name='192.168.210.15' port='6789'/>
        <host name='192.168.240.17' port='6789'/>
        <host name='192.168.240.23' port='6789'/>
      </source>
      <backingStore/>
      <target dev='vdb' bus='virtio'/>
      <serial>6d04520d-0029-499c-af81-516a7ba37a54</serial>
      <alias name='virtio-disk1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
    </disk>

Matt Riedemann (mriedem) wrote :

That patch is way too rbd-specific, I think. Here is a more detailed conversation we had in IRC, which also goes over some of what was discussed at the Queens PTG:

http://eavesdrop.openstack.org/irclogs/%23openstack-nova/%23openstack-nova.2018-01-04.log.html#t2018-01-04T22:26:24

Lee Yarwood (lyarwood) wrote :

~~~
<source protocol='rbd' name='vms/3b97914e-3f9b-410a-b3d9-6c1a83244136_disk'> <-- this one is NOT changed, old ips
        <host name='192.168.200.12' port='6789'/>
        <host name='192.168.200.14' port='6789'/>
        <host name='192.168.200.24' port='6789'/>
        <host name='192.168.240.17' port='6789'/>
        <host name='192.168.240.23' port='6789'/>
</source>
~~~

For ephemeral rbd images we fetch the mon ips during the initial instance creation but don't refresh this during live migration [1]. IMHO this is a separate issue from the volume connection_info refresh problem being discussed in this bug.

[1] https://github.com/openstack/nova/blob/master/nova/virt/libvirt/storage/rbd_utils.py#L163
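
For context, the ephemeral-image case collects the mon addresses once, when the instance disk is first created, by asking the cluster for its monmap; something like the following simplified sketch (not the exact nova code linked at [1]):

~~~
# Simplified sketch of how mon addresses are collected for ephemeral rbd
# disks (not the exact nova rbd_utils code): query the live monmap once
# and bake the resulting host/port pairs into the guest XML.
import json
import subprocess

def get_mon_addrs(conf="/etc/ceph/ceph.conf", user="nova"):
    out = subprocess.check_output(
        ["ceph", "mon", "dump", "--format=json", "--conf", conf, "--id", user])
    hosts, ports = [], []
    for mon in json.loads(out).get("mons", []):
        addr = mon["addr"]                  # e.g. "192.168.200.12:6789/0"
        host, _, rest = addr.rpartition(":")
        hosts.append(host)
        ports.append(rest.split("/")[0])
    return hosts, ports
~~~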

Walt Boring (walter-boring) wrote :

Thanks Lee,
  I filed a separate bug for updating the rbd images here:
https://bugs.launchpad.net/nova/+bug/1741364

Xav Paice (xavpaice) on 2018-06-07
tags: added: canonical-bootstack
Xav Paice (xavpaice) wrote :

This manifested itself again on a Mitaka cloud. We had moved the Ceph mons; existing, running instances were fine and fresh new instances were fine, but when we stopped instances via nova and then started them again, they failed to start. Editing the xml didn't fix anything, of course, because Nova overwrites the xml on machine start.

I ended up fixing the nova db:

update block_device_mapping
   set connection_info = replace(connection_info,
       '"a.b.c.d", "a.b.c.e", "a.b.c.f"',
       '"a.b.c.foo", "a.b.c.bar", "a.b.c.baz"')
 where connection_info like '%a.b.c.d%'
   and deleted_at is NULL;

The select query could have been better (don't copy me!) but you get the point.
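
A slightly safer variant of the same workaround is to load and rewrite the connection_info JSON rather than doing a raw string replace. A rough sketch (hypothetical script with made-up credentials; back up and test against a copy of the nova database first):

~~~
# Rough sketch, not a supported tool: rewrite the mon host list inside
# the connection_info JSON of non-deleted block_device_mapping rows.
import json
import pymysql

OLD_TO_NEW = {"a.b.c.d": "a.b.c.foo", "a.b.c.e": "a.b.c.bar", "a.b.c.f": "a.b.c.baz"}

conn = pymysql.connect(host="localhost", user="nova", password="secret", db="nova")
with conn.cursor() as cur:
    cur.execute("SELECT id, connection_info FROM block_device_mapping "
                "WHERE deleted_at IS NULL AND connection_info IS NOT NULL")
    for row_id, raw in cur.fetchall():
        info = json.loads(raw)
        hosts = info.get("data", {}).get("hosts", [])
        if not any(h in OLD_TO_NEW for h in hosts):
            continue
        info["data"]["hosts"] = [OLD_TO_NEW.get(h, h) for h in hosts]
        cur.execute("UPDATE block_device_mapping SET connection_info = %s "
                    "WHERE id = %s", (json.dumps(info), row_id))
conn.commit()
~~~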

Subscribing field-high because this is something that will continue to bite folks every time ceph-mon hosts are moved around.

James Page (james-page) wrote :

I guess the alternative is to update the mapping for the block device on a stop/start nova operation.

tags: added: patch
Corey Bryant (corey.bryant) wrote :

Just to summarize my understanding, and perhaps clarify for others, this bug is focused on stale connection_info for rbd volumes (not rbd images). rbd images have a related issue during live migration that is being handled in a separate bug (see comment 12 above).

Focusing now on connection_info for rbd volumes (thanks to Matt Riedemann's comments for the tips here): connection_info appears to be properly refreshed for live migration in pre_live_migration(), where _get_instance_block_device_info() is called with refresh_conn_info=True (see comment 9 above and https://github.com/openstack/nova/blob/stable/queens/nova/compute/manager.py#L5977).

Is the fix as simple as flipping refresh_conn_info=False to True for some of the other calls to _get_instance_block_device_info()? Below is an audit of the _get_instance_block_device_info() calls.

Calls to _get_instance_block_device_info() with refresh_conn_info=False:
  _destroy_evacuated_instances()
  _init_instance()
  _resume_guests_state()
  _shutdown_instance()
  _power_on()
  _do_rebuild_instance()
  reboot_instance()
  revert_resize()
  _resize_instance()
  resume_instance()
  shelve_offload_instance()
  check_can_live_migrate_source()
  _do_live_migration()
  _post_live_migration()
  post_live_migration_at_destination()
  rollback_live_migration_at_destination()

Calls to _get_instance_block_device_info() with refresh_conn_info=True:
  finish_revert_resize()
  _finish_resize()
  pre_live_migration()

Based on xavpaice's comments (see comment 13 above: "... existing, running, instances were fine, fresh new instances were fine, but when we stopped instances via nova, then started them again, they failed to start ..."), it would seem that the following should also use refresh_conn_info=True (a sketch of the change follows the list below):
  _power_on() # solves xavpaice's scenario?
  _do_rebuild_instance()
  reboot_instance()
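
As a rough illustration of the kind of change being proposed for the calls above (simplified; the actual signatures and call sites in nova/compute/manager.py vary between releases):

~~~
# Simplified illustration only, not the actual nova patch: have
# _power_on() ask Cinder for fresh connection_info instead of reusing
# the copy cached in nova's block_device_mapping table.
def _power_on(self, context, instance):
    network_info = self.network_api.get_instance_nw_info(context, instance)
    block_device_info = self._get_instance_block_device_info(
        context, instance, refresh_conn_info=True)  # previously left at the False default
    self.driver.power_on(context, instance, network_info, block_device_info)
~~~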

Xav Paice (xavpaice) wrote :

FWIW, in the cloud where we saw this, migrating the (stopped) instance also updated the connection info; it was just that migrating hundreds of instances wasn't practical.

Changed in nova:
assignee: nobody → Corey Bryant (corey.bryant)
Changed in nova (Ubuntu):
assignee: nobody → Corey Bryant (corey.bryant)
Corey Bryant (corey.bryant) wrote :

I did some initial testing with the default value of refresh_conn_info set to True in _get_instance_block_device_info(), and unfortunately an instance with an rbd volume attached still does not successfully stop/start after the ceph-mons are moved to new IP addresses.

Fix proposed to branch: master
Review: https://review.openstack.org/579004

Changed in nova:
status: Confirmed → In Progress
Changed in nova (Ubuntu):
status: New → In Progress
importance: Undecided → Medium