VMWARE: Intermittent problem with stats reporting

Bug #1252827 reported by Sreeram Yerrapragada on 2013-11-19
This bug affects 2 people

Affects (importance, assigned to):
OpenStack Compute (nova): High, Sabari Murugesan
Havana: High, Gary Kotton
VMwareAPI-Team: Critical, Sabari Murugesan

Bug Description

The VMware driver intermittently reports zero stats (free RAM, free disk, and VCPU information). See the following nova-compute log for details: http://162.209.83.206/logs/51404/6/screen-n-cpu.txt.gz

Excerpts from the log file:
2013-11-18 15:41:03.994 20162 WARNING nova.virt.vmwareapi.vim_util [-] Unable to retrieve value for datastore Reason: None
2013-11-18 15:41:04.029 20162 WARNING nova.virt.vmwareapi.vim_util [-] Unable to retrieve value for host Reason: None
2013-11-18 15:41:04.029 20162 WARNING nova.virt.vmwareapi.vim_util [-] Unable to retrieve value for resourcePool Reason: None
2013-11-18 15:41:04.029 20162 DEBUG nova.compute.resource_tracker [-] Hypervisor: free ram (MB): 0 _report_hypervisor_resource_view /opt/stack/nova/nova/compute/resource_tracker.py:389
2013-11-18 15:41:04.029 20162 DEBUG nova.compute.resource_tracker [-] Hypervisor: free disk (GB): 0 _report_hypervisor_resource_view /opt/stack/nova/nova/compute/resource_tracker.py:390
2013-11-18 15:41:04.030 20162 DEBUG nova.compute.resource_tracker [-] Hypervisor: VCPU information unavailable _report_hypervisor_resource_view /opt/stack/nova/nova/compute/resource_tracker.py:397

While the driver is reporting zero stats, no server can be spawned, because the scheduler filters out the host. See the scheduler log: http://162.209.83.206/logs/51404/6/screen-n-sch.txt.gz

Excerpts from the log file:
2013-11-18 15:41:52.475 DEBUG nova.filters [req-dc82a954-3cc5-4627-ae01-b3d1ec2155af InstanceActionsTestXML-tempest-716947327-user InstanceActionsTestXML-tempest-716947327-tenant] Filter AvailabilityZoneFilter returned 1 host(s) get_filtered_objects /opt/stack/nova/nova/filters.py:88
2013-11-18 15:41:52.476 DEBUG nova.scheduler.filters.ram_filter [req-dc82a954-3cc5-4627-ae01-b3d1ec2155af InstanceActionsTestXML-tempest-716947327-user InstanceActionsTestXML-tempest-716947327-tenant] (Ubuntu1204Server, domain-c26(c1)) ram:-576 disk:0 io_ops:0 instances:1 does not have 64 MB usable ram, it only has -576.0 MB usable ram. host_passes /opt/stack/nova/nova/scheduler/filters/ram_filter.py:60
2013-11-18 15:41:52.476 INFO nova.filters [req-dc82a954-3cc5-4627-ae01-b3d1ec2155af InstanceActionsTestXML-tempest-716947327-user InstanceActionsTestXML-tempest-716947327-tenant] Filter RamFilter returned 0 hosts
2013-11-18 15:41:52.477 WARNING nova.scheduler.driver [req-dc82a954-3cc5-4627-ae01-b3d1ec2155af InstanceActionsTestXML-tempest-716947327-user InstanceActionsTestXML-tempest-716947327-tenant] [instance: 1a648022-1783-4874-8b41-c3f4c89d8500] Setting instance to ERROR state.
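
To make the -576 MB figure in the RamFilter line easier to follow, here is a minimal sketch of the kind of arithmetic a RAM filter performs. It is not the actual nova code: the `HostState` class, the function names, and the allocation ratio value are assumptions made for illustration, and the host values are picked to match the `ram:-576` entry in the excerpt above (a total of 0 MB reported by the driver, with 576 MB already accounted for).

```python
from dataclasses import dataclass


@dataclass
class HostState:
    """Hypothetical stand-in for the scheduler's view of a host."""
    total_usable_ram_mb: int
    free_ram_mb: int


def usable_ram_mb(host, ram_allocation_ratio=1.5):
    # Oversubscription limit: total RAM scaled by the allocation ratio.
    memory_mb_limit = host.total_usable_ram_mb * ram_allocation_ratio
    # RAM already consumed, derived from the total and free values.
    used_ram_mb = host.total_usable_ram_mb - host.free_ram_mb
    return memory_mb_limit - used_ram_mb


def host_passes(host, requested_ram_mb, ram_allocation_ratio=1.5):
    # The host is eligible only if the request fits in the usable RAM.
    return usable_ram_mb(host, ram_allocation_ratio) >= requested_ram_mb


# Assumed values: the driver reported 0 MB total, so free RAM goes negative.
zero_stats_host = HostState(total_usable_ram_mb=0, free_ram_mb=-576)
print(usable_ram_mb(zero_stats_host))
print(host_passes(zero_stats_host, requested_ram_mb=64))
```

Because the reported total is 0, the allocation ratio has no effect and even a 64 MB request is rejected, which is consistent with "Filter RamFilter returned 0 hosts" in the log.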

Ryan Hsu (rhsu) on 2013-11-19
affects: barbican → nova
Changed in nova:
status: New → Confirmed
importance: Undecided → Critical
Changed in openstack-vmwareapi-team:
importance: Undecided → Critical
status: New → Confirmed
Changed in nova:
importance: Critical → High
Tracy Jones (tjones-i) wrote :

Going to run the debug code on one of the CI slaves to hopefully reproduce the issue.

Gary Kotton (garyk) wrote :

Moved to critical: when the problem occurs, VMs cannot be booted.

Changed in nova:
importance: High → Critical
Gary Kotton (garyk) on 2013-11-27
Changed in nova:
milestone: none → icehouse-1
assignee: nobody → Gary Kotton (garyk)
Changed in nova:
status: Confirmed → In Progress
Shawn Hartsock (hartsock) wrote :

Please tell me I don't need to ask Russell Bryant to come here and explain priorities to you.

Changed in nova:
importance: Critical → High
Changed in openstack-vmwareapi-team:
status: Confirmed → In Progress
Gary Kotton (garyk) wrote :

Yeah, an explanation of the priorities would be nice. If I was unable to deploy a VM I would consider that a critical problem, but if high is what we need to settle for then great.

The problem is addressed by https://review.openstack.org/#/c/58890/. The patch https://review.openstack.org/#/c/58705/ was a quick fix until we found the real issue.

Changed in nova:
assignee: Gary Kotton (garyk) → Sabari Kumar Murugesan (smurugesan)
dan wendlandt (danwent) wrote :

No need for a priority fight here. In terms of priority within the nova project, I believe this should be 'high', as its impact is limited to a single driver, and I think critical is reserved for general items (at least this is what I have been told; I am not sure if it is strictly enforced). For the vmwareapi project, I would consider this critical.

tags: added: havana-backport-potential
Changed in nova:
milestone: icehouse-1 → icehouse-2
Changed in openstack-vmwareapi-team:
assignee: nobody → Sabari Kumar Murugesan (smurugesan)

Reviewed: https://review.openstack.org/58890
Committed: http://github.com/openstack/nova/commit/6471776b6b25bb4062238f7c1b732b2d6999ec65
Submitter: Jenkins
Branch: master

commit 6471776b6b25bb4062238f7c1b732b2d6999ec65
Author: Sabari Kumar Murugesan <email address hidden>
Date: Wed Nov 27 16:10:59 2013 -0800

    VMware: Fix unhandled session failure issues

    VMware driver has a re-try mechanism to handle session expiration
    failures. Due to a minor bug in the exception handling module, this
    failure was unhandled.

    The patch fixes this issue and has added tests.

    Closes-Bug: #1252827
    Change-Id: Ie91adb4b4b57b7cefeed855cdbe4710da86294f0

Changed in nova:
status: In Progress → Fix Committed
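
For readers unfamiliar with the pattern the commit message refers to, the following is a rough sketch of a retry-on-session-expiration wrapper. It does not reproduce the nova driver code: `SessionExpired`, `FakeSession`, and `call_with_session_retry` are hypothetical names. The point it illustrates is that the retry only works if the expiration is raised as an exception type the wrapper actually catches; if the fault surfaces differently, the call fails without a re-login.

```python
class SessionExpired(Exception):
    """Hypothetical exception signalling an expired API session."""


class FakeSession:
    """Hypothetical session that expires once, then works after login()."""

    def __init__(self):
        self.valid = False
        self.logins = 0

    def login(self):
        self.logins += 1
        self.valid = True

    def get_stats(self):
        if not self.valid:
            raise SessionExpired("session is not authenticated")
        return {"free_ram_mb": 2048}


def call_with_session_retry(session, func, max_attempts=3):
    """Call func(session), logging in again whenever the session expired."""
    last_error = None
    for _ in range(max_attempts):
        try:
            return func(session)
        except SessionExpired as exc:
            # Only this exception type triggers a re-login and retry.
            last_error = exc
            session.login()
    raise last_error


session = FakeSession()
stats = call_with_session_retry(session, lambda s: s.get_stats())
print(stats, session.logins)
```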
Alan Pevec (apevec) on 2013-12-07
tags: removed: havana-backport-potential

Reviewed: https://review.openstack.org/60651
Committed: http://github.com/openstack/nova/commit/c2278faae1248ecbd149d0750ab1e27d53ded62d
Submitter: Jenkins
Branch: stable/havana

commit c2278faae1248ecbd149d0750ab1e27d53ded62d
Author: Sabari Kumar Murugesan <email address hidden>
Date: Wed Nov 27 16:10:59 2013 -0800

    VMware: Fix unhandled session failure issues

    VMware driver has a re-try mechanism to handle session expiration
    failures. Due to a minor bug in the exception handling module, this
    failure was unhandled.

    The patch fixes this issue and has added tests.

    Closes-Bug: #1252827
    (cherry picked from commit 6471776b6b25bb4062238f7c1b732b2d6999ec65)

    Conflicts:

     nova/tests/virt/vmwareapi/test_vmwareapi_vim_util.py

    Change-Id: I6b2e0ce664c0f6b479475a4bbc80947e5a1f9101

Thierry Carrez (ttx) on 2014-01-22
Changed in nova:
status: Fix Committed → Fix Released
Changed in openstack-vmwareapi-team:
status: In Progress → Fix Released
Thierry Carrez (ttx) on 2014-04-17
Changed in nova:
milestone: icehouse-2 → 2014.1