Dumpxml for running Xen domains locks up with libvirt 1.0.6-0ubuntu1

Bug #1191782 reported by Stefan Bader
This bug affects 1 person
Affects: libvirt (Ubuntu)
Status: Fix Released
Importance: High
Assigned to: Stefan Bader

Bug Description

After upgrading a Saucy Xen host to libvirt 1.0.6-0ubuntu1 I had issues getting a Precise/12.04 machine (which runs virt-manager) to connect.
Downgrading the Saucy libvirt packages to 1.0.5-0ubuntu1 allowed me to connect again.

With additional debugging, this could be traced to the same operation that the following commands perform:

#> virsh -c xen+ssh://root@host dumpxml 0
or
#> virsh -c xen:/// dumpxml 0 (on the host)

Dumping the XML for another domU that is defined but not running works, so the problem seems limited to running domains.
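A hang like this is easier to report when the command is run under a timeout. The following is only a minimal sketch, assuming Python 3 on the machine that runs virsh; the helper names probe and dumpxml_cmd are made up for illustration and are not part of libvirt. It wraps the dumpxml command from above and classifies the outcome.

```python
import subprocess


def probe(cmd, timeout=10.0):
    """Run cmd and classify the result as 'ok', 'failed' or 'hung'."""
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run() kills the child process before re-raising.
        return "hung"
    return "ok" if result.returncode == 0 else "failed"


def dumpxml_cmd(domain, uri="xen:///"):
    """Build the virsh command line used in this report."""
    return ["virsh", "-c", uri, "dumpxml", str(domain)]
```

On an affected host, probe(dumpxml_cmd(0)) would be expected to come back as "hung" rather than "ok". The 10 second default is an arbitrary choice, and for a remote host the uri argument would be the xen+ssh one.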

Stefan Bader (smb) wrote :

I should add that using virsh from the Precise machine did connect. So it seems to be something specific to virt-manager.

Changed in libvirt (Ubuntu):
importance: Undecided → High
Changed in virt-manager (Ubuntu):
importance: Undecided → High
Serge Hallyn (serge-hallyn) wrote :

Stefan, could you please show us how exactly you are initiating the connection? Are you using ssh and logging in as root? As the user? Using ssl?

Serge Hallyn (serge-hallyn) wrote :

(Note, I tried to reproduce this with virt-manager from precise, to a new saucy server install, connecting as root over ssh, but without success)

Stefan Bader (smb) wrote :

The connection is set up just like you did (xen+ssh://root@host). Clicking on the config (I ran virt-manager with --debug in the foreground this time) seems to get the capabilities, but then blocks on getting info for domain 0.

I found that I can actually reproduce this with virsh as well: connect to a Xen host and run "dumpxml 0", and it locks up. After downgrading to the older libvirt, the same command produces output.

Stefan Bader (smb) wrote :

And yeah, I realized that I should probably have pointed out that I am trying to connect to a Xen host. Always those people never giving the full information... :-P

Serge Hallyn (serge-hallyn) wrote : adding xen to subject

Thanks, Stefan. So this is xen-only I gather. I'll go ahead and
change the subject and mark this confirmed.

 summary: "Timeout connecting 12.04 virt-manager to libvirt 1.0.6-0ubuntu1 + xen in Saucy"

summary: - Timeout connecting 12.04 virt-manager to libvirt 1.0.6-0ubuntu1 in Saucy
+ Timeout connecting 12.04 virt-manager to libvirt 1.0.6-0ubuntu1 + xen in
+ Saucy
Stefan Bader (smb) wrote : Re: Timeout connecting 12.04 libvirt to libvirt 1.0.6-0ubuntu1 + xen in Saucy

I changed the summary from virt-manager to libvirt, as this can be reproduced with virsh against a Xen host. I would also remove the "affects" entry for virt-manager.

summary: - Timeout connecting 12.04 virt-manager to libvirt 1.0.6-0ubuntu1 + xen in
+ Timeout connecting 12.04 libvirt to libvirt 1.0.6-0ubuntu1 + xen in
Saucy
Changed in virt-manager (Ubuntu):
status: New → Invalid
Changed in libvirt (Ubuntu):
status: New → Confirmed
Stefan Bader (smb)
description: updated
description: updated
no longer affects: virt-manager (Ubuntu)
Stefan Bader (smb) wrote :

Further info: The same problem exists when connecting directly from the Saucy Xen host (that rules out any Precise-Saucy problem). So

#> virsh -c xen:/// dumpxml 0

locks up. Even with libvirtd logging set to debug, there does not seem to be any related output. Interestingly dumpxml on defined but not running domains works, while any running domain cannot be queried. I believe for running domains the hypervisor subdriver gets used while for the defined ones the xm subdriver is responsible.

summary: - Timeout connecting 12.04 libvirt to libvirt 1.0.6-0ubuntu1 + xen in
- Saucy
+ Dumpxml for running Xen domains locks up with libvirt 1.0.6-0ubuntu1
description: updated
description: updated
Stefan Bader (smb) wrote :

Lazily copying from an email written to mailing list:

The problem is the mutex lock on xenUnifiedPrivatePtr which is held around
xenDomainUsedCpus.

xenUnifiedDomainGetXMLDesc
  ...
  xenUnifiedLock(priv);
  cpus = xenDomainUsedCpus(dom);
  xenUnifiedUnlock(priv);
  ...

Unfortunately, the introduction of virDomainDefPtr added the following call paths:

xenDomainUsedCpus
  ...
  nb_vcpu = xenUnifiedDomainGetMaxVcpus(dom);
    return xenUnifiedDomainGetVcpusFlags(...)
      ...
      if (!(def = xenGetDomainDefForDom(dom)))
        return xenGetDomainDefForUUID(dom->conn, dom->uuid);
          ...
          ret = xenHypervisorLookupDomainByUUID(conn, uuid);
            ...
            xenUnifiedLock(priv);
            name = xenStoreDomainGetName(conn, id);
            xenUnifiedUnlock(priv);
  ...
  if ((ncpus = xenUnifiedDomainGetVcpus(dom, cpuinfo, nb_vcpu,
    ...
    if (!(def = xenGetDomainDefForDom(dom)))
      [again like above]

Right now, running the GetXMLDesc command for an active Xen domain will lock up
right in the xenUnifiedDomainGetMaxVcpus call. But any subcall leading to a call
to xenGetDomainDefForDom while holding the xenUnifiedPrivatePtr lock will have
the same fate.
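The pattern described above, a caller holding a non-recursive mutex while a nested call tries to take the same mutex, can be modelled in a few lines. This is a toy Python model and not libvirt code: the function names only echo the ones in the trace, threading.Lock stands in for the mutex inside xenUnifiedPrivatePtr, and the acquire timeout is an assumption added so that the demo returns instead of hanging the way the real driver does.

```python
import threading

# Toy stand-in for the xenUnifiedPrivatePtr mutex. threading.Lock is
# non-recursive: a thread that already holds it cannot take it again.
priv_lock = threading.Lock()


def lookup_domain_by_uuid(timeout=0.2):
    """Models xenHypervisorLookupDomainByUUID, which takes the lock itself."""
    # The real code would block forever; the timeout keeps this demo finite.
    if not priv_lock.acquire(timeout=timeout):
        return None  # lock already held: this is the deadlock case
    try:
        return "Domain-0"
    finally:
        priv_lock.release()


def used_cpus():
    """Models xenDomainUsedCpus, which now reaches the UUID lookup."""
    return lookup_domain_by_uuid()


def get_xml_desc():
    """Models xenUnifiedDomainGetXMLDesc: lock held around used_cpus()."""
    with priv_lock:
        return used_cpus()
```

Calling lookup_domain_by_uuid() on its own succeeds, while get_xml_desc() gets None back from the nested lookup because the lock is already held by the same thread.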

I assume the lock around the xenDomainUsedCpus call is there to ensure all
accesses to the private pointer see consistent data. Otherwise it would be
possible to simply release the lock before the GetMaxVcpus and GetVcpus calls.

If that lock cannot be dropped, it feels like a much more painful rework is needed.
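To illustrate the "release the lock before the calls" option mentioned above, here is a hedged sketch in the same toy Python model. It is not the content of the eventual fix, which is not shown in this report; the snapshot dictionary and the function get_xml_desc_narrowed are hypothetical names used only for the illustration.

```python
import threading

priv_lock = threading.Lock()  # toy model of the xenUnifiedPrivatePtr mutex


def lookup_domain_by_uuid(timeout=0.2):
    """Takes the lock on its own, like xenHypervisorLookupDomainByUUID."""
    if not priv_lock.acquire(timeout=timeout):
        return None  # lock already held: the real driver would hang here
    try:
        return "Domain-0"
    finally:
        priv_lock.release()


def get_xml_desc_narrowed():
    """Hold the lock only for work that does not re-enter the driver."""
    with priv_lock:
        snapshot = {"id": 0}  # hypothetical private state read under the lock
    # The lock is released here, so the nested lookup can take it itself.
    snapshot["name"] = lookup_domain_by_uuid()
    return snapshot
```

The trade-off is the one raised above: once the lock is released, the private data may change between the snapshot and the nested lookup, so this only works if the callers can tolerate that.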

Stefan Bader (smb)
Changed in libvirt (Ubuntu):
assignee: nobody → Stefan Bader (stefan-bader-canonical)
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package libvirt - 1.0.6-0ubuntu4

---------------
libvirt (1.0.6-0ubuntu4) saucy; urgency=low

  * ubuntu-xen-fix-api-deadlocks.patch (LP: #1191782)
    Fix the deadlocks in the xen driver when doing a dumpxml for active
    domains.
  * ubuntu-libxl-qemu-nopath.patch
    Create libxl configurations without paths for qemu-dm and hvmloader.
    The Xen toolstack can figure this out.
  * ubuntu-xen-hypervisor-4.3.patch
    Update the xen driver to handle the new sysctl and domctl versions
    in Xen-4.3.
  * Add apparmor definitions to execute scripts in /etc/xen/scripts as
    the libxl driver calls out to them (with the xen/xm driver this was
    done by the xen toolstack and communication with that was through
    a socket).
 -- Stefan Bader <email address hidden> Tue, 16 Jul 2013 10:59:11 +0200

Changed in libvirt (Ubuntu):
status: Confirmed → Fix Released