nova-grenade-live-migration intermittently fails with "Error monitoring migration: Timed out during operation: cannot acquire state change lock (held by remoteDispatchDomainMigratePerform3Params)"

Bug #1840159 reported by Matt Riedemann on 2019-08-14
This bug affects 1 person
Affects: OpenStack Compute (nova)
Status: Confirmed
Importance: Low
Assigned to: Unassigned

Bug Description

Seen here:

https://logs.opendev.org/21/655721/14/check/nova-grenade-live-migration/2ee634d/logs/subnode-2/screen-n-cpu.txt.gz?level=TRACE#_Aug_13_10_03_49_974378

Aug 13 10:03:49.974378 ubuntu-bionic-limestone-regionone-0010083920 nova-compute[25863]: WARNING nova.virt.libvirt.driver [-] [instance: a1637e8b-6f2d-4127-9799-31cefb3f43a6] Error monitoring migration: Timed out during operation: cannot acquire state change lock (held by remoteDispatchDomainMigratePerform3Params): libvirtError: Timed out during operation: cannot acquire state change lock (held by remoteDispatchDomainMigratePerform3Params)
Aug 13 10:03:49.974378 ubuntu-bionic-limestone-regionone-0010083920 nova-compute[25863]: ERROR nova.virt.libvirt.driver [instance: a1637e8b-6f2d-4127-9799-31cefb3f43a6] Traceback (most recent call last):
Aug 13 10:03:49.974378 ubuntu-bionic-limestone-regionone-0010083920 nova-compute[25863]: ERROR nova.virt.libvirt.driver [instance: a1637e8b-6f2d-4127-9799-31cefb3f43a6] File "/opt/stack/old/nova/nova/virt/libvirt/driver.py", line 8052, in _live_migration
Aug 13 10:03:49.974378 ubuntu-bionic-limestone-regionone-0010083920 nova-compute[25863]: ERROR nova.virt.libvirt.driver [instance: a1637e8b-6f2d-4127-9799-31cefb3f43a6] finish_event, disk_paths)
Aug 13 10:03:49.974378 ubuntu-bionic-limestone-regionone-0010083920 nova-compute[25863]: ERROR nova.virt.libvirt.driver [instance: a1637e8b-6f2d-4127-9799-31cefb3f43a6] File "/opt/stack/old/nova/nova/virt/libvirt/driver.py", line 7857, in _live_migration_monitor
Aug 13 10:03:49.974378 ubuntu-bionic-limestone-regionone-0010083920 nova-compute[25863]: ERROR nova.virt.libvirt.driver [instance: a1637e8b-6f2d-4127-9799-31cefb3f43a6] info = guest.get_job_info()
Aug 13 10:03:49.974378 ubuntu-bionic-limestone-regionone-0010083920 nova-compute[25863]: ERROR nova.virt.libvirt.driver [instance: a1637e8b-6f2d-4127-9799-31cefb3f43a6] File "/opt/stack/old/nova/nova/virt/libvirt/guest.py", line 709, in get_job_info
Aug 13 10:03:49.974378 ubuntu-bionic-limestone-regionone-0010083920 nova-compute[25863]: ERROR nova.virt.libvirt.driver [instance: a1637e8b-6f2d-4127-9799-31cefb3f43a6] stats = self._domain.jobStats()
Aug 13 10:03:49.974378 ubuntu-bionic-limestone-regionone-0010083920 nova-compute[25863]: ERROR nova.virt.libvirt.driver [instance: a1637e8b-6f2d-4127-9799-31cefb3f43a6] File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 190, in doit
Aug 13 10:03:49.974378 ubuntu-bionic-limestone-regionone-0010083920 nova-compute[25863]: ERROR nova.virt.libvirt.driver [instance: a1637e8b-6f2d-4127-9799-31cefb3f43a6] result = proxy_call(self._autowrap, f, *args, **kwargs)
Aug 13 10:03:49.974378 ubuntu-bionic-limestone-regionone-0010083920 nova-compute[25863]: ERROR nova.virt.libvirt.driver [instance: a1637e8b-6f2d-4127-9799-31cefb3f43a6] File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 148, in proxy_call
Aug 13 10:03:49.974378 ubuntu-bionic-limestone-regionone-0010083920 nova-compute[25863]: ERROR nova.virt.libvirt.driver [instance: a1637e8b-6f2d-4127-9799-31cefb3f43a6] rv = execute(f, *args, **kwargs)
Aug 13 10:03:49.974378 ubuntu-bionic-limestone-regionone-0010083920 nova-compute[25863]: ERROR nova.virt.libvirt.driver [instance: a1637e8b-6f2d-4127-9799-31cefb3f43a6] File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 129, in execute
Aug 13 10:03:49.974378 ubuntu-bionic-limestone-regionone-0010083920 nova-compute[25863]: ERROR nova.virt.libvirt.driver [instance: a1637e8b-6f2d-4127-9799-31cefb3f43a6] six.reraise(c, e, tb)
Aug 13 10:03:49.974378 ubuntu-bionic-limestone-regionone-0010083920 nova-compute[25863]: ERROR nova.virt.libvirt.driver [instance: a1637e8b-6f2d-4127-9799-31cefb3f43a6] File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 83, in tworker
Aug 13 10:03:49.974378 ubuntu-bionic-limestone-regionone-0010083920 nova-compute[25863]: ERROR nova.virt.libvirt.driver [instance: a1637e8b-6f2d-4127-9799-31cefb3f43a6] rv = meth(*args, **kwargs)
Aug 13 10:03:49.974378 ubuntu-bionic-limestone-regionone-0010083920 nova-compute[25863]: ERROR nova.virt.libvirt.driver [instance: a1637e8b-6f2d-4127-9799-31cefb3f43a6] File "/usr/local/lib/python2.7/dist-packages/libvirt.py", line 1403, in jobStats
Aug 13 10:03:49.974378 ubuntu-bionic-limestone-regionone-0010083920 nova-compute[25863]: ERROR nova.virt.libvirt.driver [instance: a1637e8b-6f2d-4127-9799-31cefb3f43a6] if ret is None: raise libvirtError ('virDomainGetJobStats() failed', dom=self)
Aug 13 10:03:49.974378 ubuntu-bionic-limestone-regionone-0010083920 nova-compute[25863]: ERROR nova.virt.libvirt.driver [instance: a1637e8b-6f2d-4127-9799-31cefb3f43a6] libvirtError: Timed out during operation: cannot acquire state change lock (held by remoteDispatchDomainMigratePerform3Params)
Aug 13 10:03:49.974378 ubuntu-bionic-limestone-regionone-0010083920 nova-compute[25863]: ERROR nova.virt.libvirt.driver [instance: a1637e8b-6f2d-4127-9799-31cefb3f43a6]
Aug 13 10:03:49.977652 ubuntu-bionic-limestone-regionone-0010083920 nova-compute[25863]: ERROR nova.compute.manager [-] [instance: a1637e8b-6f2d-4127-9799-31cefb3f43a6] Live migration failed.: libvirtError: Timed out during operation: cannot acquire state change lock (held by remoteDispatchDomainMigratePerform3Params)
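
For triage it can help to pull the lock holder out of messages like the ones above, so that hits can be grouped by which libvirt API call was holding the job lock. The following is only a minimal sketch: the regular expression is derived from the message text in this log, and the helper name lock_holder is hypothetical, not part of nova or libvirt.

```python
import re

# Pattern for the libvirt job-lock timeout message seen in the log above.
# Derived from this one log excerpt; other libvirt versions may word it differently.
LOCK_TIMEOUT_RE = re.compile(
    r"cannot acquire state change lock \(held by (?P<holder>\w+)\)"
)


def lock_holder(log_line):
    """Return the API name reported as holding the lock, or None if no match."""
    match = LOCK_TIMEOUT_RE.search(log_line)
    return match.group("holder") if match else None


sample = (
    "libvirtError: Timed out during operation: cannot acquire state change "
    "lock (held by remoteDispatchDomainMigratePerform3Params)"
)
print(lock_holder(sample))
```

Here the holder is the migration "perform" call itself, which suggests the jobStats() query from the monitor timed out while waiting on the migration job it was trying to observe.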

I thought we had another bug for this but I couldn't find one. This doesn't show up often, only 1 hit in 10 days:

http://logstash.openstack.org/#/dashboard/file/logstash.json?query=message:%5C%22Error%20monitoring%20migration:%20Timed%20out%20during%20operation:%20cannot%20acquire%20state%20change%20lock%20(held%20by%20remoteDispatchDomainMigratePerform3Params)%5C%22%20AND%20tags:%5C%22screen-n-cpu.txt%5C%22&from=10d
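
The query in that link is hard to read in its percent-encoded form. As a rough sketch, it can be decoded with the Python standard library; this assumes only that the dashboard keeps its parameters in the URL fragment after the "?", which is how the link above is laid out.

```python
from urllib.parse import parse_qs, urlsplit

url = (
    "http://logstash.openstack.org/#/dashboard/file/logstash.json"
    "?query=message:%5C%22Error%20monitoring%20migration:%20Timed%20out"
    "%20during%20operation:%20cannot%20acquire%20state%20change%20lock"
    "%20(held%20by%20remoteDispatchDomainMigratePerform3Params)%5C%22"
    "%20AND%20tags:%5C%22screen-n-cpu.txt%5C%22&from=10d"
)

# Everything after "#" is the fragment; the parameters follow the "?" inside it.
fragment = urlsplit(url).fragment
params = parse_qs(fragment.partition("?")[2])

query = params["query"][0]   # the decoded search expression
window = params["from"][0]   # the look-back window
print(query)
print(window)
```

The decoded query matches the exact "Error monitoring migration" message in files tagged screen-n-cpu.txt, over a window of 10d.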

Matt Riedemann (mriedem) wrote:

The libvirt and qemu versions involved here (same on both nodes):

ii libvirt0:amd64 4.0.0-1ubuntu8.12

ii qemu-system 1:2.11+dfsg-1ubuntu7.15

tags: added: gate-failure libvirt live-migration
Matt Riedemann (mriedem) on 2019-08-15
Changed in nova:
status: New → Confirmed
importance: Undecided → Low