Spurious error from ComputeManager._run_image_cache_manager_pass

Bug #1261442 reported by Thomas Herve
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
OpenStack Compute (nova)
Fix Released
Low
jichenjc

Bug Description

I got the following errors in a tempest test at http://logs.openstack.org/97/62397/1/check/check-tempest-dsvm-full/3f7b8c3:

2013-12-16 16:07:52.189 27621 DEBUG nova.openstack.common.processutils [-] Result was 1 execute /opt/stack/new/nova/nova/openstack/common/processutils.py:172
2013-12-16 16:07:52.189 27621 ERROR nova.openstack.common.periodic_task [-] Error during ComputeManager._run_image_cache_manager_pass: Unexpected error while running command.
Command: env LC_ALL=C LANG=C qemu-img info /opt/stack/data/nova/instances/95dbf14f-3abb-42c7-94c5-dd7355ecd78a/disk
Exit code: 1
Stdout: ''
Stderr: "qemu-img: Could not open '/opt/stack/data/nova/instances/95dbf14f-3abb-42c7-94c5-dd7355ecd78a/disk': No such file or directory\n"
2013-12-16 16:07:52.189 27621 TRACE nova.openstack.common.periodic_task Traceback (most recent call last):
2013-12-16 16:07:52.189 27621 TRACE nova.openstack.common.periodic_task File "/opt/stack/new/nova/nova/openstack/common/periodic_task.py", line 180, in run_periodic_tasks
2013-12-16 16:07:52.189 27621 TRACE nova.openstack.common.periodic_task task(self, context)
2013-12-16 16:07:52.189 27621 TRACE nova.openstack.common.periodic_task File "/opt/stack/new/nova/nova/compute/manager.py", line 5210, in _run_image_cache_manager_pass
2013-12-16 16:07:52.189 27621 TRACE nova.openstack.common.periodic_task self.driver.manage_image_cache(context, filtered_instances)
2013-12-16 16:07:52.189 27621 TRACE nova.openstack.common.periodic_task File "/opt/stack/new/nova/nova/virt/libvirt/driver.py", line 4650, in manage_image_cache
2013-12-16 16:07:52.189 27621 TRACE nova.openstack.common.periodic_task self.image_cache_manager.verify_base_images(context, all_instances)
2013-12-16 16:07:52.189 27621 TRACE nova.openstack.common.periodic_task File "/opt/stack/new/nova/nova/virt/libvirt/imagecache.py", line 603, in verify_base_images
2013-12-16 16:07:52.189 27621 TRACE nova.openstack.common.periodic_task inuse_backing_images = self._list_backing_images()
2013-12-16 16:07:52.189 27621 TRACE nova.openstack.common.periodic_task File "/opt/stack/new/nova/nova/virt/libvirt/imagecache.py", line 345, in _list_backing_images
2013-12-16 16:07:52.189 27621 TRACE nova.openstack.common.periodic_task backing_file = virtutils.get_disk_backing_file(disk_path)
2013-12-16 16:07:52.189 27621 TRACE nova.openstack.common.periodic_task File "/opt/stack/new/nova/nova/virt/libvirt/utils.py", line 442, in get_disk_backing_file
2013-12-16 16:07:52.189 27621 TRACE nova.openstack.common.periodic_task backing_file = images.qemu_img_info(path).backing_file
2013-12-16 16:07:52.189 27621 TRACE nova.openstack.common.periodic_task File "/opt/stack/new/nova/nova/virt/images.py", line 56, in qemu_img_info
2013-12-16 16:07:52.189 27621 TRACE nova.openstack.common.periodic_task 'qemu-img', 'info', path)
2013-12-16 16:07:52.189 27621 TRACE nova.openstack.common.periodic_task File "/opt/stack/new/nova/nova/utils.py", line 175, in execute
2013-12-16 16:07:52.189 27621 TRACE nova.openstack.common.periodic_task return processutils.execute(*cmd, **kwargs)
2013-12-16 16:07:52.189 27621 TRACE nova.openstack.common.periodic_task File "/opt/stack/new/nova/nova/openstack/common/processutils.py", line 178, in execute
2013-12-16 16:07:52.189 27621 TRACE nova.openstack.common.periodic_task cmd=' '.join(cmd))
2013-12-16 16:07:52.189 27621 TRACE nova.openstack.common.periodic_task ProcessExecutionError: Unexpected error while running command.
2013-12-16 16:07:52.189 27621 TRACE nova.openstack.common.periodic_task Command: env LC_ALL=C LANG=C qemu-img info /opt/stack/data/nova/instances/95dbf14f-3abb-42c7-94c5-dd7355ecd78a/disk
2013-12-16 16:07:52.189 27621 TRACE nova.openstack.common.periodic_task Exit code: 1
2013-12-16 16:07:52.189 27621 TRACE nova.openstack.common.periodic_task Stdout: ''
2013-12-16 16:07:52.189 27621 TRACE nova.openstack.common.periodic_task Stderr: "qemu-img: Could not open '/opt/stack/data/nova/instances/95dbf14f-3abb-42c7-94c5-dd7355ecd78a/disk': No such file or directory\n"
2013-12-16 16:07:52.189 27621 TRACE nova.openstack.common.periodic_task

Changed in nova:
status: New → Triaged
importance: Undecided → Low
jichenjc (jichenjc)
Changed in nova:
assignee: nobody → jichencom (jichenjc)
Revision history for this message
jichenjc (jichenjc) wrote :

from the log screen-n-cpu.txt

2013-12-16 16:07:52.132</a> 27621 DEBUG nova.openstack.common.processutils [-] Running cmd (subprocess): env LC_ALL=C LANG=C qemu-img info /opt/stack/data/nova/instances/95dbf14f-3abb-42c7-94c5-dd7355ecd78a/disk execute /opt/stack/new/nova/nova/openstack/common/processutils.py:147
2013-12-16 16:07:52.181[instance: 95dbf14f-3abb-42c7-94c5-dd7355ecd78a] Deleting instance files /opt/stack/data/nova/instances/95dbf14f-3abb-42c7-94c5-dd7355ecd78a
2013-12-16 16:07:52.185 [instance: 95dbf14f-3abb-42c7-94c5-dd7355ecd78a] Deletion of /opt/stack/data/nova/instances/95dbf14f-3abb-42c7-94c5-dd7355ecd78a complete
</span><span class='DEBUG _2013-12-16_16_07_52_189'><a name='_2013-12-16_16_07_52_189' class='date' href='#_2013-12-16_16_07_52_189'>2013-12-16 16:07:52.189</a> 27621 DEBUG nova.openstack.common.processutils [-] Result was 1 execute /opt/stack/new/nova/nova/openstack/common/processutils.py:172

this is a very rare case that the instance 95dbf14f-3abb-42c7-94c5-dd7355ecd78a was deleted between check disk file and get qemu info from it , at 16:07:52.132, t's still there (because the file exist check logic before), but at 16:07:52.181, it was deleted
then the process switch back to original one , it find the file was gone and return a error

the solution with lock might be a candidate, but catch the exception and ignore it is much more simple

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/64277

Changed in nova:
status: Triaged → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/64277
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=52f9994aab8b60de2ba22579f923beb050e92bca
Submitter: Jenkins
Branch: master

commit 52f9994aab8b60de2ba22579f923beb050e92bca
Author: jichenjc <email address hidden>
Date: Sun Dec 22 06:52:27 2013 +0800

    Fix image cache periodic task concurrent access bug

    This fix catch the exception and ignore a disk file absent
    exception. It will defer the image cache clean to next round
    image cache periodic task because it's not harmful.
    Refer to detailed info, log analysis and root cause in defect.

    Change-Id: I3a431bf389658f976edf5c847c04b7e0aac496fd
    Closes-Bug: #1261442

Changed in nova:
status: In Progress → Fix Committed
Changed in nova:
milestone: none → icehouse-2
Thierry Carrez (ttx)
Changed in nova:
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in nova:
milestone: icehouse-2 → 2014.1
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.