[natty] virt-manager uses 100% CPU

Bug #707313 reported by Id2ndR on 2011-01-25
This bug affects 12 people
Affects                       Importance  Assigned to
pygtk (Ubuntu)                Undecided   Unassigned
pygtk (Ubuntu Natty)          Undecided   Unassigned
virt-manager (Ubuntu)         High        Unassigned
virt-manager (Ubuntu Natty)   High        Unassigned

Bug Description

Binary package hint: virt-manager

When starting virt-manager, the python process consumes 200% CPU (two cores) for about 3 seconds, then drops to 100% CPU (one core).
Whether or not any VMs are running makes no difference.
If I quit virt-manager, it exits normally. The application works normally except for the permanent CPU consumption.

ProblemType: Bug
DistroRelease: Ubuntu 11.04
Package: virt-manager 0.8.4-8ubuntu1
ProcVersionSignature: Ubuntu 2.6.37-12.26-generic 2.6.37
Uname: Linux 2.6.37-12-generic x86_64
Architecture: amd64
Date: Tue Jan 25 10:05:33 2011
InstallationMedia: Ubuntu 11.04 "Natty Narwhal" - Alpha amd64+mac (20101223)
PackageArchitecture: all
ProcEnviron:
 LANGUAGE=fr_FR:fr:en_GB:en
 PATH=(custom, user)
 LANG=fr_FR.UTF-8
 LC_MESSAGES=fr_FR.utf8
 SHELL=/bin/bash
SourcePackage: virt-manager

Id2ndR (id2ndr) wrote :
Changed in virt-manager (Ubuntu):
status: New → Confirmed
Clint Byrum (clint-fewbar) wrote :

Marking Confirmed. I have experienced this too, and it seems to have started only after I updated about 24 hours ago.

It seems to be polling the sockets for each defined VM over and over with a 0 millisecond timeout. According to man 2 poll, a negative timeout is infinite, but it does not state that 0 is, so I can only assume this is an error in some library call that should be specifying a timeout.

$ sudo strace -p 20110
Process 20110 attached - interrupt to quit
restart_syscall(<... resuming interrupted call ...>) = 1
rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0
rt_sigprocmask(SIG_SETMASK, [], [], 8) = 0
recvfrom(28, "\0\0\0\4\0\240\0\363\0\20\0\v\0\0\0\7P\1\1\0\0\0\250\250\250\t\302>g\17\0\0"..., 4096, 0, NULL, NULL) = 131
rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0
rt_sigprocmask(SIG_SETMASK, [], [], 8) = 0
read(4, 0x2e28f44, 4096) = -1 EAGAIN (Resource temporarily unavailable)
poll([{fd=5, events=POLLIN}, {fd=4, events=POLLIN}, {fd=14, events=POLLIN|POLLPRI}, {fd=16, events=POLLIN|POLLPRI}, {fd=17, events=POLLIN|POLLPRI}, {fd=10, events=POLLIN|POLLPRI}, {fd=22, events=POLLIN}, {fd=19, events=POLLIN}, {fd=27, events=POLLIN}, {fd=26, events=POLLIN}, {fd=23, events=POLLIN}, {fd=18, events=POLLIN}], 12, 0) = 0 (Timeout)
rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0
rt_sigprocmask(SIG_SETMASK, [], [], 8) = 0
rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0
rt_sigprocmask(SIG_SETMASK, [], [], 8) = 0
poll([{fd=4, events=POLLIN|POLLOUT}], 1, -1) = 1 ([{fd=4, revents=POLLOUT}])
writev(4, [{"H\2\266\0uF`\5\207+`\5\20\0\v\0\240\0\363\0\0\30`\5\250\250\250\377\250\250\250\377"..., 728}, {NULL, 0}, {"", 0}], 3) = 728
read(4, 0x2e28f44, 4096) = -1 EAGAIN (Resource temporarily unavailable)
poll([{fd=5, events=POLLIN}, {fd=4, events=POLLIN}, {fd=14, events=POLLIN|POLLPRI}, {fd=16, events=POLLIN|POLLPRI}, {fd=17, events=POLLIN|POLLPRI}, {fd=10, events=POLLIN|POLLPRI}, {fd=22, events=POLLIN}, {fd=19, events=POLLIN}, {fd=27, events=POLLIN}, {fd=26, events=POLLIN}, {fd=23, events=POLLIN}, {fd=18, events=POLLIN}], 12, 0) = 0 (Timeout)
poll([{fd=4, events=POLLIN|POLLOUT}], 1, -1) = 1 ([{fd=4, revents=POLLOUT}])
writev(4, [{"5\30\4\0v\335`\5%+`\5\20\0\v\0>\0\7\0uF`\5%+`\5\207+`\5"..., 52}, {NULL, 0}, {"", 0}], 3) = 52
read(4, 0x2e28f44, 4096) = -1 EAGAIN (Resource temporarily unavailable)
read(4, 0x2e28f44, 4096) = -1 EAGAIN (Resource temporarily unavailable)
poll([{fd=5, events=POLLIN}, {fd=4, events=POLLIN}, {fd=14, events=POLLIN|POLLPRI}, {fd=16, events=POLLIN|POLLPRI}, {fd=17, events=POLLIN|POLLPRI}, {fd=10, events=POLLIN|POLLPRI}, {fd=22, events=POLLIN}, {fd=19, events=POLLIN}, {fd=27, events=POLLIN}, {fd=26, events=POLLIN}, {fd=23, events=POLLIN}, {fd=18, events=POLLIN}], 12, 0) = 0 (Timeout)
rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0
rt_sigprocmask(SIG_SETMASK, [], [], 8) = 0
rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0
rt_sigprocmask(SIG_SETMASK, [], [], 8) = 0
poll([{fd=4, events=POLLIN|POLLOUT}], 1, -1) = 1 ([{fd=4, revents=POLLOUT}])
writev(4, [{"H\2\246\0uF`\5\207+`\5\20\0\n\0\260\0\364\0\0\30`\5\250\250\250\377\250\250\250\377"..., 664}, {NULL, 0}, {"", 0}], 3) = 664
read(4, 0x2e28f44, 4096...

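The repeated zero-timeout poll() calls in that trace are the suspicious part. As a minimal, hypothetical illustration (not virt-manager code) of how the timeout argument changes behaviour, using Python's select module:

import select
import socket

# A connected socket pair stands in for the libvirt/VM sockets seen above.
a, b = socket.socketpair()
poller = select.poll()
poller.register(a, select.POLLIN)

# timeout=0: poll() returns immediately with an empty list when nothing is
# readable.  Calling this in a tight loop, as the strace suggests, burns a
# full core without ever blocking.
print(poller.poll(0))      # []

# timeout=None maps to a negative timeout at the syscall level: the call
# blocks until a registered fd becomes readable, using no CPU while waiting.
b.send(b"ping")
print(poller.poll(None))   # [(a.fileno(), select.POLLIN)]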

Changed in virt-manager (Ubuntu):
importance: Undecided → Medium
importance: Medium → High
Clint Byrum (clint-fewbar) wrote :

I think this may be related to pygtk... note the backtrace (repeated many times, always stuck at this point):

0x00007fe7811dc4b3 in __poll (fds=<value optimized out>, nfds=<value optimized out>, timeout=<value optimized out>)
    at ../sysdeps/unix/sysv/linux/poll.c:87
87 in ../sysdeps/unix/sysv/linux/poll.c
(gdb) bt
#0 0x00007fe7811dc4b3 in __poll (fds=<value optimized out>, nfds=<value optimized out>, timeout=<value optimized out>)
    at ../sysdeps/unix/sysv/linux/poll.c:87
#1 0x00007fe77e6f7260 in g_main_context_poll (context=0x2742530, block=<value optimized out>, dispatch=1, self=<value optimized out>)
    at /build/buildd/glib2.0-2.27.91/glib/gmain.c:3404
#2 g_main_context_iterate (context=0x2742530, block=<value optimized out>, dispatch=1, self=<value optimized out>)
    at /build/buildd/glib2.0-2.27.91/glib/gmain.c:3086
#3 0x00007fe77e6f7b3a in g_main_loop_run (loop=0x30f1490) at /build/buildd/glib2.0-2.27.91/glib/gmain.c:3299
#4 0x00007fe77ce0adf7 in gtk_main () from /usr/lib/libgtk-x11-2.0.so.0
#5 0x00007fe77d4e8f04 in ?? () from /usr/lib/pymodules/python2.7/gtk-2.0/gtk/_gtk.so
#6 0x00000000004abfc0 in call_function (f=<value optimized out>, throwflag=<value optimized out>) at ../Python/ceval.c:3996
#7 PyEval_EvalFrameEx (f=<value optimized out>, throwflag=<value optimized out>) at ../Python/ceval.c:2665
#8 0x00000000004acffd in fast_function (f=<value optimized out>, throwflag=<value optimized out>) at ../Python/ceval.c:4098
#9 call_function (f=<value optimized out>, throwflag=<value optimized out>) at ../Python/ceval.c:4033
#10 PyEval_EvalFrameEx (f=<value optimized out>, throwflag=<value optimized out>) at ../Python/ceval.c:2665
#11 0x00000000004ae006 in PyEval_EvalCodeEx (co=0x2136630, globals=<value optimized out>, locals=<value optimized out>, args=<value optimized out>,
    argcount=0, kws=0x0, kwcount=0, defs=0x0, defcount=0, closure=0x0) at ../Python/ceval.c:3252
#12 0x00000000004ae112 in PyEval_EvalCode (co=<value optimized out>, globals=<value optimized out>, locals=<value optimized out>)
    at ../Python/ceval.c:666
#13 0x00000000004d35a4 in run_mod (fp=0x217da70, filename=<value optimized out>, closeit=1, flags=0x7fff58544e80) at ../Python/pythonrun.c:1346
#14 PyRun_FileExFlags (fp=0x217da70, filename=<value optimized out>, closeit=1, flags=0x7fff58544e80) at ../Python/pythonrun.c:1332
#15 PyRun_SimpleFileExFlags (fp=0x217da70, filename=<value optimized out>, closeit=1, flags=0x7fff58544e80) at ../Python/pythonrun.c:936
#16 0x0000000000418ce9 in Py_Main (argc=<value optimized out>, argv=<value optimized out>) at ../Modules/main.c:599
#17 0x00007fe781127d1e in __libc_start_main (main=0x417ff0 <main>, argc=2, ubp_av=0x7fff58544fa8, init=<value optimized out>,
    fini=<value optimized out>, rtld_fini=<value optimized out>, stack_end=0x7fff58544f98) at libc-start.c:226
#18 0x0000000000417f29 in _start ()

python-gtk2:
  Installed: 2.22.0-0ubuntu1
  Candidate: 2.22.0-0ubuntu1
  Version table:
 *** 2.22.0-0ubuntu1 0
        500 http://us.archive.ubuntu.com/ubuntu/ natty/main amd64 Packages
        100 /var/lib/dpkg/status

I believe this is a dupe of:
https://bugs.launchpad.net/ubuntu/+source/pygtk/+bug/664920

Which has more information.

Marc Deslauriers (mdeslaur) wrote :

I don't believe it is the same issue. I rebuilt pygtk packages with the patch in gnome bug #640738, and although they fix the issue in that bug, they don't seem to fix the issue with virt-manager.

Oops, sorry about the noise, the symptoms and location were extremely similar: python + gtk + main loop + poll = 100% cpu!
Maybe the cause is similar too? (poll misused?)

Changed in virt-manager (Ubuntu Natty):
milestone: none → natty-alpha-3
Marc Deslauriers (mdeslaur) wrote :

I think the poll is a red herring... the new version of virt-manager actually does have a 1000ms timeout when I strace it.

So, oddly, the issue doesn't pop up with virt-manager --debug... and looking through the code, when --debug is not used, it calls this:

def drop_tty():
    # We fork and setsid so that we drop the controlling
    # tty. This prevents libvirt's SSH tunnels from prompting
    # for user input if SSH keys/agent aren't configured.
    if os.fork() != 0:
        os._exit(0)

    os.setsid()

As soon as it forks, CPU goes up to 100%.
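
As a self-contained restatement of the same fork/setsid pattern (illustrative only, not virt-manager code), the following script writes out the surviving child's PID so the detached process can be found and attached to with top, strace -p or gdb -p:

import os
import time

def drop_tty():
    # Same pattern as virt-manager's drop_tty(): the parent exits immediately
    # and the child carries on, detached from the controlling terminal.
    if os.fork() != 0:
        os._exit(0)
    os.setsid()

if __name__ == "__main__":
    drop_tty()
    # In virt-manager the child is the process that goes on to run the GTK
    # main loop, so its PID is the one to trace.
    with open("/tmp/drop_tty_child.pid", "w") as handle:
        handle.write("%d\n" % os.getpid())
    time.sleep(60)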

Marc Deslauriers (mdeslaur) wrote :

Whoops, I typoed the bug number in the changelog...anyway:

This bug was fixed in the package virt-manager - 0.8.6-1ubuntu2

---------------
virt-manager (0.8.6-1ubuntu2) natty; urgency=low

  * debian/patches/fork_before_using_gconf.patch: fork earlier to resolve
    cpu usage issue and appindicator issue. (LP: #720724, LP: #707303)
 -- Marc Deslauriers <email address hidden> Thu, 24 Feb 2011 21:15:18 -0500
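
A rough sketch of the ordering the patch name suggests (hypothetical simplification; the real change is in debian/patches/fork_before_using_gconf.patch): fork and drop the tty first, and only afterwards let the surviving child initialize GConf/GTK, presumably so that no GUI state created before the fork is inherited by the child.

import os
import sys

def drop_tty():
    # virt-manager's existing helper: the parent exits, the child drops the tty.
    if os.fork() != 0:
        os._exit(0)
    os.setsid()

def main():
    if "--debug" not in sys.argv:
        drop_tty()              # fork *before* any GUI setup...
    # ...then import and initialize gtk/gconf only in the surviving child.
    # (Placeholder for the real virt-manager startup code.)

if __name__ == "__main__":
    main()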

Changed in virt-manager (Ubuntu Natty):
status: Confirmed → Fix Released
Changed in pygtk (Ubuntu Natty):
status: New → Invalid
Jonathan Rogers (jonner) wrote :

This bug may have been fixed in Natty, but it's back in Oneiric. I've opened a new bug:
https://bugs.launchpad.net/ubuntu/+source/virt-manager/+bug/874474
