Xorg unresponsive after screensaver unlock with Dual Head

Bug #780424 reported by Druid
This bug affects 1 person
Affects: nvidia-graphics-drivers (Ubuntu)
Status: New
Importance: Low
Assigned to: Unassigned

Bug Description

Binary package hint: xorg

As background, I'm running a multi-GPU, multi-session, multi-head setup with 3 screens on two NVIDIA Quadro NVS 290 (G86GL) cards (one screen on one card, two on the other). The machine is a quad core with 8G RAM and 8G swap. Each screen runs its own GNOME session (i.e. I can't move programs between screens, and each screen has full menus).

Occasionally, after unlocking the screensaver, my mouse cursor flickers rapidly between screens and the machine is totally unresponsive at the console.

Logging in via ssh, I can see load and CPU usage rapidly increase. Memory usage also goes through the roof until all memory and swap are consumed, at which point the oom-killer kicks in and kills Xorg (or possibly Xorg crashes), and I can log back in (with my session lost).

On the mistaken assumption that I didn't have enough swap, I increased it (from 2G to 10G). All this did was allow the machine to stay a little more responsive for longer; Xorg still continued to consume memory. Last time it got to 11.2G of virtual memory and a load average of 24 before I rebooted the machine. As a comparison, right now Xorg is at 153m with load averages all at ~0.5.

While I realise this report may not include enough detail as is, this is a reasonably regular (and very annoying) bug for me. If any further details can be collected at the time it happens, please let me know; I can (within reason) run anything as needed.
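One way to collect further detail while the problem is happening is to sample Xorg's memory counters over ssh. The following is only a minimal sketch, not something from this report: the function names (`parse_vm_fields`, `find_pids`, `log_memory`) are hypothetical, and it assumes a Linux kernel that exposes `/proc/<pid>/comm` and the `VmSize`, `VmRSS` and `VmSwap` lines in `/proc/<pid>/status`.

```python
import os
import time

# Fields of /proc/<pid>/status to record (values are reported in kB).
FIELDS = ("VmSize", "VmRSS", "VmSwap")


def parse_vm_fields(status_text):
    """Extract the selected Vm* fields (in kB) from /proc/<pid>/status text."""
    values = {}
    for line in status_text.splitlines():
        key, _, rest = line.partition(":")
        if key in FIELDS:
            # Lines look like "VmRSS:\t 3565432 kB"; keep the number only.
            values[key] = int(rest.split()[0])
    return values


def find_pids(name):
    """Return the PIDs whose /proc/<pid>/comm equals `name`."""
    pids = []
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        try:
            with open(f"/proc/{entry}/comm") as f:
                if f.read().strip() == name:
                    pids.append(int(entry))
        except OSError:
            continue  # process exited or is not readable
    return pids


def log_memory(name="Xorg", interval_s=5.0, samples=10):
    """Print one timestamped line per matching process per sample."""
    for _ in range(samples):
        stamp = time.strftime("%H:%M:%S")
        for pid in find_pids(name):
            try:
                with open(f"/proc/{pid}/status") as f:
                    fields = parse_vm_fields(f.read())
            except OSError:
                continue
            print(stamp, pid, fields, flush=True)
        time.sleep(interval_s)
```

Calling `log_memory("Xorg", interval_s=5, samples=720)` from an ssh session and redirecting the output to a file would give roughly an hour of samples to attach to the report.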

Druid (3-launchpad-fastdruid-co-uk) wrote :

Output of top just after it locks up:

top - 17:46:14 up 3 days, 7:18, 18 users, load average: 7.68, 4.75, 2.34
Tasks: 299 total, 4 running, 292 sleeping, 3 stopped, 0 zombie
Cpu0 : 2.3%us, 14.5%sy, 0.9%ni, 81.6%id, 0.6%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu1 : 2.4%us, 15.7%sy, 0.9%ni, 80.6%id, 0.4%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu2 : 3.4%us, 14.9%sy, 0.9%ni, 80.0%id, 0.7%wa, 0.0%hi, 0.1%si, 0.0%st
Cpu3 : 3.4%us, 15.7%sy, 0.9%ni, 79.5%id, 0.4%wa, 0.0%hi, 0.1%si, 0.0%st
Mem: 8193768k total, 8118668k used, 75100k free, 316k buffers
Swap: 7960852k total, 3910980k used, 4049872k free, 19380k cached

  PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
 1800 sharpd 20 0 363m 8588 8588 R 121 0.1 46:30.21 compiz
 1447 root 20 0 5401m 3.4g 1504 R 101 43.1 192:28.90 Xorg
 2637 sharpd 20 0 1755m 1.0g 1.0g S 99 13.3 2403:05 VirtualBox
   47 root 20 0 0 0 0 D 12 0.0 0:20.94 kswapd0
19178 sharpd 20 0 3617m 2.7g 2.7g S 5 34.2 194:08.78 VirtualBox
  295 root 20 0 0 0 0 D 1 0.0 0:06.25 usb-storage
 1685 sharpd 20 0 19412 1076 676 R 1 0.0 0:00.67 top
    1 root 20 0 23896 296 260 S 0 0.0 0:03.10 init
    2 root 20 0 0 0 0 S 0 0.0 0:00.00 kthreadd
    3 root 20 0 0 0 0 S 0 0.0 0:18.85 ksoftirqd/0
    4 root RT 0 0 0 0 S 0 0.0 0:00.81 migration/0
    5 root RT 0 0 0 0 S 0 0.0 0:00.00 watchdog/0
    6 root RT 0 0 0 0 S 0 0.0 0:00.28 migration/1
    7 root 20 0 0 0 0 S 0 0.0 0:31.50 ksoftirqd/1
    8 root RT 0 0 0 0 S 0 0.0 0:00.00 watchdog/1
    9 root RT 0 0 0 0 S 0 0.0 0:01.20 migration/2
   10 root 20 0 0 0 0 S 0 0.0 10:10.17 ksoftirqd/2
   11 root RT 0 0 0 0 S 0 0.0 0:00.00 watchdog/2

Druid (3-launchpad-fastdruid-co-uk) wrote :

Output of top a short while later (note I've increased swap to 13G and it has all been used; Xorg is now at 14.8G!)

top - 17:59:59 up 3 days, 7:32, 18 users, load average: 17.28, 10.08, 6.69
Tasks: 306 total, 2 running, 281 sleeping, 3 stopped, 20 zombie
Cpu0 : 2.5%us, 14.5%sy, 0.9%ni, 81.4%id, 0.7%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu1 : 2.5%us, 15.7%sy, 0.9%ni, 80.4%id, 0.4%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu2 : 3.4%us, 14.9%sy, 0.9%ni, 79.8%id, 0.9%wa, 0.0%hi, 0.1%si, 0.0%st
Cpu3 : 3.6%us, 15.7%sy, 0.9%ni, 79.3%id, 0.4%wa, 0.0%hi, 0.1%si, 0.0%st
Mem: 8193768k total, 8146352k used, 47416k free, 2336k buffers
Swap: 13058448k total, 13058128k used, 320k free, 17452k cached

  PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
 1800 sharpd 20 0 363m 8048 8048 R 110 0.1 62:13.09 compiz
 2637 sharpd 20 0 1755m 1.0g 1.0g S 99 13.3 2409:06 VirtualBox
 1447 root 20 0 14.8g 3.6g 944 D 11 46.1 202:14.59 Xorg
19178 sharpd 20 0 3617m 2.7g 2.7g S 6 34.2 195:26.96 VirtualBox
 2260 root 20 0 19412 1364 860 R 4 0.0 0:00.05 top
 1622 root 20 0 55424 2024 1240 D 2 0.0 0:00.87 polkitd
    1 root 20 0 23896 452 0 S 0 0.0 0:03.10 init
    2 root 20 0 0 0 0 S 0 0.0 0:00.00 kthreadd
    3 root 20 0 0 0 0 S 0 0.0 0:19.11 ksoftirqd/0
    4 root RT 0 0 0 0 S 0 0.0 0:00.96 migration/0
    5 root RT 0 0 0 0 S 0 0.0 0:00.00 watchdog/0
    6 root RT 0 0 0 0 S 0 0.0 0:00.28 migration/1
    7 root 20 0 0 0 0 S 0 0.0 0:31.73 ksoftirqd/1
    8 root RT 0 0 0 0 S 0 0.0 0:00.00 watchdog/1
    9 root RT 0 0 0 0 S 0 0.0 0:01.22 migration/2
   10 root 20 0 0 0 0 S 0 0.0 10:11.75 ksoftirqd/2
   11 root RT 0 0 0 0 S 0 0.0 0:00.00 watchdog/2
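For a rough sense of scale, the two top snapshots above give Xorg's VIRT at 5401m (17:46:14) and 14.8g (17:59:59). The sketch below works out the average growth rate between them; it assumes top's `m` and `g` suffixes mean MiB and GiB, and the helper name `to_mib` is made up for illustration.

```python
from datetime import datetime


def to_mib(value):
    """Convert a top-style size such as '5401m' or '14.8g' to MiB."""
    units = {"m": 1.0, "g": 1024.0}
    return float(value[:-1]) * units[value[-1]]


# Timestamps taken from the header lines of the two top snapshots.
t0 = datetime.strptime("17:46:14", "%H:%M:%S")
t1 = datetime.strptime("17:59:59", "%H:%M:%S")
elapsed_s = (t1 - t0).total_seconds()

# Xorg VIRT values from the two snapshots.
growth_mib = to_mib("14.8g") - to_mib("5401m")
rate = growth_mib / elapsed_s

print(f"{growth_mib:.0f} MiB in {elapsed_s:.0f} s = {rate:.1f} MiB/s")
# prints: 9754 MiB in 825 s = 11.8 MiB/s
```

Over the same interval RES only moved from 3.4g to 3.6g, so nearly all of the growth ended up in swap.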

Druid (3-launchpad-fastdruid-co-uk) wrote :

messages file.

Druid (3-launchpad-fastdruid-co-uk) wrote :

I see nothing of interest in it but just in case!

Druid (3-launchpad-fastdruid-co-uk) wrote :

This is an strace of X while it's stuck.

Bryce Harrington (bryce) wrote :

High X memory bugs are almost always caused by a runaway client program. There's nothing in any of the files you attached to suggest it is really an X bug.

You seem to have several pieces of non-standard software installed, such as VirtualBox and Chrome. Do a bit more analysis of which applications are running when this bug happens, and try killing them off one by one.

Changed in xorg (Ubuntu):
importance: Undecided → Low
status: New → Incomplete
Druid (3-launchpad-fastdruid-co-uk) wrote :

Thanks for the response.

As if by magic it did it again today.
I killed off chrome and virtualbox with no response. Killed off the screensaver with again no change. Compiz and X just sat at 100% cpu and zero response to the mouse/keyboard on the screen.

I don't tend to run much, lots of gnome-terminal's, firefox, opera, chrome and virtualbox are pretty much all that's ever running.

I'm not sure if it is the same issue or not but I found this:
http://www.nvnews.net/vbulletin/showthread.php?t=162243

Which at least partly matches both the hardware (multi nvidia GPU, multi-monitor), Ubuntu and behaviour but I can't trigger it with the script the OP has posted.

For reference I'm using Nvidia driver version: 260.19.06

I can/will run anything required to try and tie down the issue further.

Bryce Harrington (bryce) wrote : Re: [Bug 780424] Re: Xorg unresponsive after screensaver unlock with Dual Head

On Wed, Jul 06, 2011 at 04:49:42PM -0000, Druid wrote:
> Thanks for the response.
>
> As if by magic it did it again today.
> I killed off chrome and virtualbox with no response. Killed off the screensaver with again no change. Compiz and X just sat at 100% cpu and zero response to the mouse/keyboard on the screen.
>
> I don't tend to run much, lots of gnome-terminal's, firefox, opera,
> chrome and virtualbox are pretty much all that's ever running.
>
> I'm not sure if it is the same issue or not but I found this:
> http://www.nvnews.net/vbulletin/showthread.php?t=162243
>
> Which at least partly matches both the hardware (multi nvidia GPU,
> multi-monitor), Ubuntu and behaviour but I can't trigger it with the
> script the OP has posted.

Careful; these bugs can look a lot alike when considering just symptoms.
I don't see high xorg memory or cpu mentioned there, so it
may be a "regular" gpu lockup. GPU lockups don't usually involve high
cpu or memory.

> For reference I'm using Nvidia driver version: 260.19.06
>
> I can/will run anything required to try and tie down the issue further.

You might find http://wiki.ubuntu.com/X/Troubleshooting useful.

Launchpad Janitor (janitor) wrote :

[Expired for xorg (Ubuntu) because there has been no activity for 60 days.]

Changed in xorg (Ubuntu):
status: Incomplete → Expired
Druid (3-launchpad-fastdruid-co-uk) wrote :

No activity because there has been no developer action, not because the bug has gone away.

I've tried turning off compiz and it still does it.

Changed in xorg (Ubuntu):
status: Expired → New
Timo Aaltonen (tjaalton) wrote :

I suspect the nvidia blob is leaking texture memory, which then shows up as the Xorg process consuming the memory.

affects: xorg (Ubuntu) → nvidia-graphics-drivers (Ubuntu)