nova placement problems with low compute node storage available

Bug #1661772 reported by Drew Freiberger
Affects                                         Status     Importance  Assigned to  Milestone
OpenStack Compute (nova)                        Confirmed  Medium      Unassigned
OpenStack Nova Cloud Controller Charm           Invalid    Undecided   Unassigned
nova-cloud-controller (Juju Charms Collection)  Invalid    Undecided   Unassigned

Bug Description

Nova-scheduler cannot identify a placement host when the raw image size is greater than the available local storage, even when using a raw image, show_image_direct_url=1, and Ceph ephemeral storage, a configuration in which glance images should never land on local node storage.

We have the following configuration for glance:
  juju config glance api-config-flags='show_image_direct_url=true'
  juju config glance registry-config-flags='show_image_direct_url=true'

Worked around the issue by allowing storage over-commit using:
  juju config nova-cloud-controller config-flags="disk_allocation_ratio=5.0"

This workaround is not ideal when working with qcow images, which require local nova-compute node storage for conversion to raw.
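
For reference, the charm flag above corresponds to the plain nova.conf setting sketched below (on Mitaka the option lives in [DEFAULT]). With, say, 25GB actually free, the DiskFilter then treats the node as having 125GB allocatable, so the 100GB image passes the filter:

  [DEFAULT]
  # allow the scheduler's DiskFilter to over-commit local disk 5x
  disk_allocation_ratio = 5.0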

nova-cloud-controller and nova-compute logs are attached.

tags: added: canonical-bootstack config juju nova-cloud-controller
Revision history for this message
Drew Freiberger (afreiberger) wrote :

For clarity: this issue was found on Mitaka.

James Page (james-page)
Changed in nova-cloud-controller (Juju Charms Collection):
status: New → Invalid
Revision history for this message
James Page (james-page) wrote :

Pretty sure this is not a charm-specific issue, so I'm raising a bug task against nova.

Changed in charm-nova-cloud-controller:
status: New → Invalid
Revision history for this message
Andrey Volkov (avolkov) wrote :

Could you please provide logs and steps to reproduce?

Changed in nova:
status: New → Incomplete
Revision history for this message
Drew Freiberger (afreiberger) wrote :

Sorry for the lag in response. Logs aren't handy, but to reproduce:

1. Set nova's libvirt image backend to rbd (Ceph).

2. Create a 100GB raw bootable image in glance (ensuring glance is Ceph rbd-backed).

3. Fill all available hypervisors such that /var/lib/nova/instances/_base has less than 100GB free.

4. Set disk_allocation_ratio = 1.0 or less.

5. Restart the nova-scheduler/nova-conductor services and attempt to boot the image as a new server; the scheduler will fail to find a suitable hypervisor. (A CLI sketch of these steps follows below.)
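
Roughly, in CLI form (a sketch, not a verbatim transcript; the image file, image name, and flavor here are illustrative):

  # step 1, nova.conf on each compute node:
  #   [libvirt]
  #   images_type = rbd

  # step 2, upload a 100GB raw image into the rbd-backed glance:
  openstack image create --disk-format raw --container-format bare \
    --file big-image.raw big-raw-100g

  # steps 4-5: with disk_allocation_ratio = 1.0 and <100GB free locally,
  # the boot fails with "No valid host was found":
  openstack server create --image big-raw-100g --flavor m1.large test-server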

If you bump disk_allocation_ratio to something that would allow the 100GB image to fit within the free space with overcommit, then the same image will boot, and the glance rbd and nova rbd backends work together to never move data to /var/lib/nova/instances/_base for conversion (nova just creates an rbd snapshot + COW clone for the new nova rbd object).
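
You can see the zero-copy path at the rbd level (a sketch; 'glance' and 'nova' are the usual charm pool names, and the UUIDs are placeholders):

  # the glance image carries a protected snapshot, conventionally named 'snap':
  rbd -p glance snap ls <image-uuid>
  # the instance disk is a COW clone of that snapshot, so no image data is copied:
  rbd -p nova info <instance-uuid>_disk
  #   ...
  #   parent: glance/<image-uuid>@snap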

If the glance backend and the nova libvirt backend are both rbd, show_image_direct_url == true, and the image format is raw, the disk_allocation_ratio check should not be made.
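
For what it's worth, the three conditions are easy to sanity-check on a deployment (a sketch; the config paths are the stock package locations):

  openstack image show <image-uuid> -f value -c disk_format   # expect: raw
  grep show_image_direct_url /etc/glance/glance-api.conf      # expect: True
  grep images_type /etc/nova/nova.conf                        # expect: rbd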

Revision history for this message
Launchpad Janitor (janitor) wrote :

[Expired for OpenStack Compute (nova) because there has been no activity for 60 days.]

Changed in nova:
status: Incomplete → Expired
Alvaro Uria (aluria)
Changed in nova:
status: Expired → New
Revision history for this message
melanie witt (melwitt) wrote :

Unfortunately, this has been a known issue in nova for several years now, as nova historically has not had proper support for shared storage (there are many gaps).

We had the same issue for boot-from-volume servers until recently [1] and there's a bug open related to it [2].

We have been working on shared storage support, but progress has been slow. You can find some information about it in this bug: https://bugs.launchpad.net/nova/+bug/1784020 and on our Stein PTG etherpad at L219: https://etherpad.openstack.org/p/nova-ptg-stein

Because of higher priority work happening this cycle and the fact that this issue is latent, I'm going to set this bug to Medium importance.

[1] https://launchpad.net/bugs/1469179
[2] https://launchpad.net/bugs/1796737

Changed in nova:
importance: Undecided → Medium
status: New → Confirmed
tags: added: shared-storage