Ubuntu
libvirt package

juju bootstrap hangs for local environment under precise on vmware

Precise (12.04)
Bug #920454 reported by Brad Crittenden on 2012-01-23

This bug affects 5 people

Affects	Status	Importance	Assigned to	Milestone
libvirt	Fix Released	High	redhat-bugs #796451	
pyjuju	Fix Released	High	Unassigned	pyjuju 0.6.1
libvirt (Ubuntu)	Fix Released	Undecided	Unassigned	
libvirt (Ubuntu Precise)	Won't Fix	Medium	Unassigned	

Bug Description

With precise, 'juju bootstrap -e local' fails to start networking. The call to 'net.start()' hangs.

This problem occurs only when running Precise as a VMware guest. Precise on bare metal works, and Oneiric on VMware works.

Tags: local

Kapil Thangavelu (hazmat) on 2012-01-26

tags:	added: local

In Red Hat Bugzilla #796451, Matt (matt-redhat-bugs) wrote on 2012-02-22:

Created attachment 565119
gdb backtrace of virsh

Description of problem:
I am using FC16 as a VMware Workstation 8 guest with Intel VT-x virtualisation so that I can test KVM. After installing libvirt & qemu-kvm I am unable to connect to the local hypervisor with virsh (or virt-manager, for that matter).

Running fallback Gnome desktop environment and latest updates

Have tried disabling auth (set to none) in libvirtd.conf and disabling SELinux (setenforce 0). Also tried as both a standard user and root.

Version-Release number of selected component (if applicable):
* FC16 stock with all updates (also tested with testing updates)
* Kernel 3.2.6-3.fc16.x86_64
* libvirt 0.9.6-4.fc16

How reproducible:
Have reproduced on another system, using fresh FC16 install as VMware Workstation 8 guest. Same results.

Steps to Reproduce:
1. Install FC16 as VMware guest with Intel VT-x virtualisation
2. Install qemu-kvm & libvirt
3. Type virsh --connect qemu:///system

Actual results:
Process hangs until ^C

Expected results:
Virsh prompt connected to local hypervisor

Additional info:
In the hope that it is useful, I have attached a gdb backtrace while it is hanging. I ran debuginfo-install libvirt then:

virsh --connect qemu:///system &
gdb
attach [processid]
backtrace
See attachment for backtrace

In Red Hat Bugzilla #796451, Dave (dave-redhat-bugs) wrote on 2012-02-22:

Can you provide a backtrace of all the libvirtd threads with bt -a when this problem is occurring?
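
Note for readers reproducing this: plain gdb has no `bt -a`; a backtrace of every thread is normally obtained with `thread apply all bt`. The following is a minimal Python sketch of that request, assuming gdb is installed and that you are allowed to attach to the libvirtd process (usually as root). The helper names are hypothetical and not part of libvirt or juju.

```python
import subprocess


def gdb_all_threads_argv(pid):
    """Build a gdb command line that attaches to pid, prints a
    backtrace for every thread, and exits (batch mode)."""
    return ["gdb", "-p", str(int(pid)), "-batch", "-ex", "thread apply all bt"]


def capture_backtrace(pid):
    """Run gdb against pid and return whatever it printed on stdout.

    Assumption: gdb is on PATH; attaching to a system daemon
    generally requires root privileges.
    """
    result = subprocess.run(
        gdb_all_threads_argv(pid),
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout
```

Calling `capture_backtrace(pid_of_libvirtd)` while virsh is hung should yield text comparable to the "libvirtd backtrace (all threads)" attachment mentioned later in the thread.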

In Red Hat Bugzilla #796451, Cole (cole-redhat-bugs) wrote on 2012-02-22:

And when you reproduce the hang, is dmidecode running?

ps axwww | grep dmide

In Red Hat Bugzilla #796451, Matt (matt-redhat-bugs) wrote on 2012-02-22:

Hi,

1. Attachment created: backtrace of libvirtd attached
I did not fully understand your instructions, but I hope this is the information you need; let me know if you want anything more. The gdb commands that I used are in the attachment.

2. Results of ps axwww | grep dmide:

1484 ? S 0:00 /usr/sbin/dmidecode -q -t 0,1,4,17

Matt
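
The dmidecode check above can also be scripted. This is a rough Python sketch that scans `ps ax`-style text for a lingering dmidecode process; it assumes the five-column layout shown in the comment (PID, TTY, STAT, TIME, COMMAND), and the function name is hypothetical.

```python
def find_processes(ps_output, needle="dmidecode"):
    """Return (pid, command) pairs from `ps ax`-style output whose
    command line contains needle."""
    matches = []
    for line in ps_output.splitlines():
        # Columns: PID TTY STAT TIME COMMAND (the command may contain spaces)
        fields = line.split(None, 4)
        if len(fields) < 5 or not fields[0].isdigit():
            continue  # header line or something we cannot parse
        pid, command = int(fields[0]), fields[4]
        # Skip the grep process itself if the text came from `ps | grep`
        if needle in command and not command.startswith("grep"):
            matches.append((pid, command))
    return matches
```

Feeding it the line reported above returns a single entry for PID 1484, which is the sign that libvirtd is still waiting on its dmidecode child.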

In Red Hat Bugzilla #796451, Matt (matt-redhat-bugs) wrote on 2012-02-22:

#10

Created attachment 565125
libvirtd backtrace (all threads)

In Red Hat Bugzilla #796451, Cole (cole-redhat-bugs) wrote on 2012-02-22:

#11

Yeah, I've heard of this issue before: the dmidecode hang in VMware guests. I think there's a patch upstream for it.

Eric, do you know more about this?

In Red Hat Bugzilla #796451, Dave (dave-redhat-bugs) wrote on 2012-02-22:

#12

Matt, that's what I was looking for. I have the same thought Cole did which is that this is dmidecode related.

In Red Hat Bugzilla #796451, Dave (dave-redhat-bugs) wrote on 2012-02-22:

#13

Are you willing to try building upstream libvirt to see if it makes the problem go away? I'm not convinced it's fixed upstream yet, but if you can repro this at will and test builds I'm sure we can figure it out.

In Red Hat Bugzilla #796451, Matt (matt-redhat-bugs) wrote on 2012-02-22:

#14

Sure Dave. Can you provide me some high-level instructions, or point me to a site that might have something similar?

Thanks,

Matt

In Red Hat Bugzilla #796451, Eric (eric-redhat-bugs) wrote on 2012-02-22:

#15

bug 783453 is another example of a dmidecode hang; F16 does not (yet) have the two patches mentioned in that bug:

commit 06b9c5b9231ef4dbd4b5ff69564305cd4f814879
Author: Michal Privoznik <email address hidden>
Date: Tue Jan 3 18:40:55 2012 +0100

    virCommand: Properly handle POLLHUP

    It is a good practise to set revents to zero before doing any poll().
    Moreover, we should check if event we waited for really occurred or
    if any of fds we were polling on didn't encountered hangup.

commit d19149dda888d36cea58b6cdf7446f98bd1bf734
Author: Laszlo Ersek <email address hidden>
Date: Tue Jan 24 15:55:19 2012 +0100

    virCommandProcessIO(): make poll() usage more robust

    POLLIN and POLLHUP are not mutually exclusive. Currently the following
    seems possible: the child writes 3K to its stdout or stderr pipe, and
    immediately closes it. We get POLLIN|POLLHUP (I'm not sure that's possible
    on Linux, but SUSv4 seems to allow it). We read 1K and throw away the
    rest.
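
To illustrate the behaviour the second commit message describes, here is a small Python sketch (not libvirt's C code) of a reader that treats POLLHUP as "keep reading until EOF" rather than "stop now". It assumes a platform where `select.poll` is available, such as Linux; the function name is hypothetical.

```python
import os
import select


def drain_fd(fd, chunk_size=1024):
    """Read everything from fd using poll(), without dropping data
    that is still buffered when the writer has already hung up."""
    poller = select.poll()
    # POLLHUP and POLLERR are reported even if not requested.
    poller.register(fd, select.POLLIN)
    chunks = []
    while True:
        for _fd, revents in poller.poll():
            # POLLIN and POLLHUP may arrive together: read in both cases
            # and let an empty read (EOF) decide when we are done.
            if revents & (select.POLLIN | select.POLLHUP):
                chunk = os.read(fd, chunk_size)
                if not chunk:
                    return b"".join(chunks)
                chunks.append(chunk)
            elif revents & (select.POLLERR | select.POLLNVAL):
                raise OSError("poll() reported an error on fd %d" % fd)
```

Writing 3K into a pipe, closing the write end, and then calling `drain_fd` on the read end mirrors the scenario in the commit message: the reader has to loop past the first 1K instead of stopping at the hangup.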

But it is not certain whether those two patches are all that's needed, or whether we need yet a third patch backported to the F16 build.

In Red Hat Bugzilla #796451, Matt (matt-redhat-bugs) wrote on 2012-02-22:

#16

After a bit of investigation - I am currently building the fc17 version of libvirt from src RPM.

In Red Hat Bugzilla #796451, Matt (matt-redhat-bugs) wrote on 2012-02-22:

#17

During the build of libvirt-0.9.10-1 from the fc17 source repo, the test for virsh-all hung. It seems that dmidecode was the issue again: the build continued once I had terminated the dmidecode process.

Once the new RPM was installed - and once I had disabled TLS auth :) - the problem is solved. Both virsh and virt-manager connect without issue.

P.S.
There was a sanlock >= 0.8 dependency that I ignored for now as I don't have shared storage.
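
Killing the stuck dmidecode by hand, as described above, can be approximated in a script by bounding how long a helper command may run. The sketch below is a caller-side guard only, written under that assumption; it is not the fix that went into libvirt (which corrected the poll() handling), and the function name is hypothetical.

```python
import subprocess
import sys


def run_with_timeout(argv, timeout_s):
    """Run argv and return its stdout, or None if it did not finish
    within timeout_s seconds."""
    try:
        done = subprocess.run(
            argv,
            capture_output=True,
            text=True,
            timeout=timeout_s,
            check=False,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run() kills and reaps the child before raising.
        return None
    return done.stdout
```

For example, `run_with_timeout(["dmidecode", "-q", "-t", "0,1,4,17"], 10)` would return None instead of blocking forever if the helper never exits.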

In Red Hat Bugzilla #796451, Dave (dave-redhat-bugs) wrote on 2012-02-23:

#18

So now the question is, are the two patches Eric mentioned sufficient, or is there some other required commit? Osier, I'm about to go offline for the day, would you mind spinning an F16 test build with just the two patches and see if it still fixes the problem?

In Red Hat Bugzilla #796451, Osier (osier-redhat-bugs) wrote on 2012-02-23:

#19

(In reply to comment #13)
> So now the question is, are the two patches Eric mentioned sufficient, or is
> there some other required commit? Osier, I'm about to go offline for the day,
> would you mind spinning an F16 test build with just the two patches and see if
> it still fixes the problem?

Let me do it.

In Red Hat Bugzilla #796451, Osier (osier-redhat-bugs) wrote on 2012-02-23:

#20

(In reply to comment #14)
> (In reply to comment #13)
> > So now the question is, are the two patches Eric mentioned sufficient, or is
> > there some other required commit? Osier, I'm about to go offline for the day,
> > would you mind spinning an F16 test build with just the two patches and see if
> > it still fixes the problem?
>
> Let me do it.

Tested with VMware Workstation 8 and an FC16 guest: the problem is resolved with just those two patches applied in the test build.

In Red Hat Bugzilla #796451, Fedora (fedora-redhat-bugs) wrote on 2012-03-04:

#21

libvirt-0.9.6-5.fc16 has been submitted as an update for Fedora 16.
https://admin.fedoraproject.org/updates/libvirt-0.9.6-5.fc16

In Red Hat Bugzilla #796451, Fedora (fedora-redhat-bugs) wrote on 2012-03-06:

#22

Package libvirt-0.9.6-5.fc16:
* should fix your issue,
* was pushed to the Fedora 16 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing libvirt-0.9.6-5.fc16'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2012-3067/libvirt-0.9.6-5.fc16
then log in and leave karma (feedback).

In Red Hat Bugzilla #796451, Fedora (fedora-redhat-bugs) wrote on 2012-03-17:

#23

libvirt-0.9.6-5.fc16 has been pushed to the Fedora 16 stable repository. If problems still persist, please make note of it in this bug report.

In Red Hat Bugzilla #796451, Garrett (garrett-redhat-bugs) wrote on 2012-04-26:

#24

I apologize for the noise, devs. I'm posting this to benefit those searching for RHEL solutions to this very problem. :)

This problem with libvirt exists in RHEL 6.2, and I stumbled upon it while preparing for RHCSA/RHCE recertification. My study environment consists of VMware Workstation 8.0.2-591240 and RHEL 6.2.

This is fixed in RHEL 6.3 beta as of 2012/04/25.

Clint Byrum (clint-fewbar) wrote on 2012-05-27:

3 people affected, this one deserves some investigation. I suspect that the libvirt networking is fighting with vmware's guest additions.

Changed in juju:
status:	New → Confirmed
importance:	Undecided → High

Sidnei da Silva (sidnei) wrote on 2012-06-25:

See the upstream bug, especially this comment:

https://bugzilla.redhat.com/show_bug.cgi?id=796451#c10

Clint Byrum (clint-fewbar) wrote on 2012-06-25:

Looks like we can cherry-pick those two patches into precise and solve this bug for VMware users.

Changed in libvirt (Ubuntu):
status:	New → Fix Released
Changed in libvirt (Ubuntu Precise):
status:	New → Triaged
importance:	Undecided → Medium
Changed in juju:
status:	Confirmed → Triaged

Brad Crittenden (bac) wrote on 2012-09-18:

This bug still exists in precise but is fixed in quantal.

Kapil Thangavelu (hazmat) wrote on 2013-01-31:

In addition to being fixed in quantal, we're also no longer using libvirt networking. Marking resolved.

Changed in juju:
status:	Triaged → Fix Released
milestone:	none → 0.6.1

Bug Watch Updater (bug-watch-updater) on 2017-10-28

Changed in libvirt:
importance:	Unknown → High
status:	Unknown → Fix Released

Steve Langasek (vorlon) wrote on 2021-10-14:

#25

The Precise Pangolin has reached end of life, so this bug will not be fixed for that release.

Changed in libvirt (Ubuntu Precise):
status:	Triaged → Won't Fix

Remote bug watches

redhat-bugs #796451 [CLOSED ERRATA]
