Ironic: Invalid hypervisor stats info while instance running

Bug #1637449 reported by Tuan on 2016-10-28
This bug affects 1 person
| Affects | Status | Importance | Assigned to | Milestone |
| Ironic | Invalid | Undecided | Unassigned | |
| OpenStack Compute (nova) | | Undecided | Tuan | |

Bug Description

Description
===========

`nova hypervisor-stats` shows incorrect resource totals for Ironic nodes.

Steps to reproduce
==================
The environment was set up following http://docs.openstack.org/developer/ironic/dev/dev-quickstart.html#deploying-ironic-with-devstack

After deploying 3 Ironic nodes, each with 1 CPU, 1024 MB of memory, and a 10 GB disk, and with 2 instances running:
#nova hypervisor-stats
+----------------------+-------+
| Property | Value |
+----------------------+-------+
| count | 3 |
| current_workload | 1 |
| disk_available_least | -10 |
| free_disk_gb | 10 |
| free_ram_mb | 1024 |
| local_gb | 10 |
| local_gb_used | 20 |
| memory_mb | 1024 |
| memory_mb_used | 2048 |
| running_vms | 2 |
| vcpus | 1 |
| vcpus_used | 2 |
+----------------------+-------+

Expected result
===============

vcpus should be 3.
memory_mb should be 3072.
local_gb should be 30.
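
The expected totals are simply the per-node values summed over the three nodes. A minimal sketch of that arithmetic follows, assuming each node contributes 1 vCPU, 1024 MB of memory, and 10 GB of disk (the disk size is inferred from the expected local_gb of 30). `NodeResources` and `expected_totals` are hypothetical names used only for illustration and are not part of nova.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeResources:
    """Resources of one Ironic node (hypothetical helper type)."""
    vcpus: int
    memory_mb: int
    local_gb: int


def expected_totals(nodes):
    """Sum per-node resources into the totals hypervisor-stats should show."""
    return {
        "vcpus": sum(n.vcpus for n in nodes),
        "memory_mb": sum(n.memory_mb for n in nodes),
        "local_gb": sum(n.local_gb for n in nodes),
    }


# Three identical nodes; 10 GB of disk per node is an assumption.
nodes = [NodeResources(vcpus=1, memory_mb=1024, local_gb=10)] * 3
totals = expected_totals(nodes)
print(totals)
```

The reported output instead shows totals equal to a single node's resources, while the *_used values match the two running instances.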

Tuan (tuanla) on 2016-10-28
Changed in ironic:
assignee: nobody → Tuan (tuanla)
Changed in nova:
assignee: nobody → Tuan (tuanla)

Fix proposed to branch: master
Review: https://review.openstack.org/391415

Changed in nova:
status: New → In Progress
joel (uestcjoel) on 2016-10-28
Changed in nova:
assignee: Tuan (tuanla) → joel (uestcjoel)
assignee: joel (uestcjoel) → nobody
Changed in nova:
assignee: nobody → Tuan (tuanla)
Dmitry Tantsur (divius) wrote :

Thanks for reporting it, I think I've seen this problem myself. However, it's not related to the Ironic service, so I'm closing the Ironic part of this bug.

Changed in ironic:
status: New → Invalid
Tuan (tuanla) on 2016-11-01
Changed in ironic:
assignee: Tuan (tuanla) → nobody
Changed in nova:
assignee: Tuan (tuanla) → Dao Cong Tien (tiendc)
Changed in nova:
assignee: Dao Cong Tien (tiendc) → Tuan (tuanla)
Vladyslav Drok (vdrok) wrote :

The logic here [0] indeed seems to be incorrect. For example, when there is
a node in the available state with instance_uuid set, the driver first
reports vcpus=vcpus_used=properties['vcpus'] and then sets vcpus=0,
leaving vcpus_used intact.
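
The sequence described above can be illustrated with a minimal sketch. This is a hypothetical simplification, not the actual driver code: `report_vcpus` and its boolean flags `has_instance` and `treated_as_unavailable` are names invented here to stand in for the two checks the driver is described as making.

```python
def report_vcpus(properties, has_instance, treated_as_unavailable):
    """Hypothetical simplification of the two-step reporting described above.

    Returns a (vcpus, vcpus_used) tuple.
    """
    vcpus = properties["vcpus"]
    vcpus_used = 0

    # Step 1: a node with an instance reports all of its vCPUs as used.
    if has_instance:
        vcpus_used = vcpus

    # Step 2: a node judged unavailable has its total zeroed,
    # but the used value from step 1 is left untouched.
    if treated_as_unavailable:
        vcpus = 0

    return vcpus, vcpus_used


# Available-state node with instance_uuid set: both checks fire.
result = report_vcpus({"vcpus": 1}, has_instance=True,
                      treated_as_unavailable=True)
print(result)
```

Under these assumptions the node contributes nothing to the total but still counts toward the used value, which is consistent with vcpus_used exceeding vcpus in the reported stats.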

My proposal here is the following:

* If there is an instance_uuid on the node, no matter what provision/power
  state it's in, consider the resources as used. In case it's an orphan,
  an admin will need to take some manual action anyway.

* If there is no instance_uuid and a node is in cleaning/clean wait after
  tear down, it is part of the normal node lifecycle, so report all
  resources as used. This means we need a way to determine whether it is a
  manual or an automated clean.

* If there is no instance_uuid, and a node:
  - has a bad power state, or
  - is in maintenance, or
  - is undergoing a manual clean,
  or actually in any other case, consider it unavailable and report
  available resources = used resources = 0. Provision state does not matter
  in this logic; all the cases that we wanted to take into account are
  described in the first two bullets.

[0] https://github.com/openstack/nova/blob/1506c36b4446f6ba1487a2d68e4b23cb3fca44cb/nova/virt/ironic/driver.py#L262
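
The three bullets of the proposal above could be sketched roughly as follows. This is only an illustration under stated assumptions, not a patch for the driver: `CleanKind` and `proposed_resources` are hypothetical names, the rules are applied in the order they are listed, and the healthy node with no instance (which the proposal does not spell out) is assumed to report its full resources with nothing used.

```python
from enum import Enum


class CleanKind(Enum):
    """Hypothetical marker for the kind of clean a node is undergoing."""
    NONE = "none"
    AUTOMATED = "automated"
    MANUAL = "manual"


def proposed_resources(total, instance_uuid, clean_kind,
                       bad_power_state, maintenance):
    """Return (available, used) for one resource, following the proposal."""
    # Bullet 1: an instance_uuid means the resources are used,
    # regardless of provision or power state.
    if instance_uuid is not None:
        return total, total

    # Bullet 2: automated clean after tear down is part of the normal
    # lifecycle, so the resources still count as used.
    if clean_kind is CleanKind.AUTOMATED:
        return total, total

    # Bullet 3: bad power state, maintenance, or manual clean makes
    # the node unavailable, so it reports nothing at all.
    if bad_power_state or maintenance or clean_kind is CleanKind.MANUAL:
        return 0, 0

    # Assumption: a healthy, free node exposes everything as available.
    return total, 0


print(proposed_resources(1, "some-uuid", CleanKind.NONE, False, False))
print(proposed_resources(1, None, CleanKind.MANUAL, False, False))
```

With rules of this shape the used value can never exceed the reported total for a node, which avoids the mismatch seen in the stats in the description.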

Change abandoned by Tuan Luong-Anh (<email address hidden>) on branch: master
Review: https://review.openstack.org/391415
