Bug #754711 “[Dell Studio XPS 1340] Doesn't enter suspend mode” : Bugs : linux package : Ubuntu

Daniel Manrique (roadmr) wrote on 2011-04-08:

#1

AcpiTables.txt (224.0 KiB, text/plain; charset="utf-8")
AlsaDevices.txt (526 bytes, text/plain; charset="utf-8")
AplayDevices.txt (265 bytes, text/plain; charset="utf-8")
BootDmesg.txt (59.8 KiB, text/plain; charset="utf-8")
Card0.Amixer.values.txt (2.2 KiB, text/plain; charset="utf-8")
Card0.Codecs.codec.0.txt (11.0 KiB, text/plain; charset="utf-8")
Card0.Codecs.codec.3.txt (3.2 KiB, text/plain; charset="utf-8")
CurrentDmesg.txt (26.2 KiB, text/plain; charset="utf-8")
Dependencies.txt (1.9 KiB, text/plain; charset="utf-8")
IwConfig.txt (278 bytes, text/plain; charset="utf-8")
Lspci.txt (16.7 KiB, text/plain; charset="utf-8")
PciMultimedia.txt (571 bytes, text/plain; charset="utf-8")
ProcCpuinfo.txt (1.5 KiB, text/plain; charset="utf-8")
ProcCpuinfo_.txt (1.5 KiB, text/plain; charset="utf-8")
ProcInterrupts.txt (1.5 KiB, text/plain; charset="utf-8")
ProcModules.txt (2.5 KiB, text/plain; charset="utf-8")
RfKill.txt (120 bytes, text/plain; charset="utf-8")
UdevDb.txt (122.5 KiB, text/plain; charset="utf-8")
UdevLog.txt (293.1 KiB, text/plain; charset="utf-8")
WifiSyslog.txt (176.0 KiB, text/plain; charset="utf-8")

description: updated

Ara Pulido (ara) wrote on 2011-04-08:

#2

Can you guys have a look at this regression, please?

Changed in linux (Ubuntu):
assignee:	nobody → Canonical Platform QA Team (canonical-platform-qa)
importance:	Undecided → High

Daniel Manrique (roadmr) wrote on 2011-04-08:

#3

Tested this mainline kernel:

http://kernel.ubuntu.com/~kernel-ppa/mainline/daily/current/linux-image-2.6.39-999-generic_2.6.39-999.201104080911_i386.deb

With this one, the system enters suspend mode, but upon trying to resume, the system is unresponsive: backlight doesn't come on, there is no display, keyboard is unresponsive, and system doesn't respond to pings on the network.

Brian Murray (brian-murray) wrote on 2011-04-13:

#4

From dmesg:

[ 74.522347] PM: suspend of drv:scsi dev:host0 complete after 1597.165 msecs
[ 74.522499] PM: suspend of drv:pci dev:0000:00:09.0 complete after 1270.188 msecs
[ 74.532679] vmap allocation for size 1052672 failed: use vmalloc=<size> to increase size.
[ 74.576773] [drm] nouveau 0000:02:00.0: ... failed: -12
[ 74.576776] [drm] nouveau 0000:02:00.0: Re-enabling acceleration..
[ 74.576794] pci_legacy_suspend(): nouveau_pci_suspend+0x0/0x360 [nouveau] returns -12
[ 74.576800] pm_op(): pci_pm_suspend+0x0/0x100 returns -12
[ 74.576805] PM: suspend of drv:nouveau dev:0000:02:00.0 complete after 1602.740 msecs
[ 74.576809] PM: Device 0000:02:00.0 failed to suspend async: error -12
[ 74.576859] PM: suspend of drv:ahci dev:0000:00:0b.0 complete after 1346.055 msecs
[ 74.960080] HDA Intel 0000:00:08.0: PCI INT A disabled
[ 74.976069] HDA Intel 0000:00:08.0: power state changed by ACPI to D3
[ 74.976076] PM: suspend of drv:HDA Intel dev:0000:00:08.0 complete after 453.879 msecs
[ 75.453244] [drm] nouveau 0000:03:00.0: And we're gone!
[ 75.453271] nouveau 0000:03:00.0: PCI INT A disabled
[ 75.468020] PM: suspend of drv:nouveau dev:0000:03:00.0 complete after 2493.957 msecs
[ 75.468056] PM: Some devices failed to suspend
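
For anyone triaging a similar log, the failing device can be pulled out mechanically. The following is a minimal sketch, not a tool used in this bug: the function name `failed_devices` and the regular expression are assumptions made for illustration, and the pattern is modelled only on the "failed to suspend" line shown above. On Linux, error 12 is ENOMEM, which is consistent with the vmap allocation failure logged just before it.

```python
import errno
import re

# Matches the PM core's summary line for a device whose suspend failed.
# The pattern is an assumption based on the single line seen in this log.
FAILED = re.compile(r"PM: Device (\S+) failed to suspend(?: async)?: error (-?\d+)")


def failed_devices(dmesg_text):
    """Return (device, errno_name) pairs for devices that failed to suspend."""
    results = []
    for line in dmesg_text.splitlines():
        match = FAILED.search(line)
        if match:
            code = abs(int(match.group(2)))
            results.append((match.group(1), errno.errorcode.get(code, "unknown")))
    return results


# Two lines copied from the dmesg excerpt above.
sample = """\
[ 74.576805] PM: suspend of drv:nouveau dev:0000:02:00.0 complete after 1602.740 msecs
[ 74.576809] PM: Device 0000:02:00.0 failed to suspend async: error -12
"""
print(failed_devices(sample))
```

Run against a full dmesg capture, this would list every device the PM core reported as failing, not just the first.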

Changed in linux (Ubuntu):
status:	New → Triaged

Brian Murray (brian-murray) on 2011-04-13

Changed in linux (Ubuntu):
assignee:	Canonical Platform QA Team (canonical-platform-qa) → Canonical Kernel Team (canonical-kernel-team)

Seth Forshee (sforshee) wrote on 2011-04-14:

#5

This failure is due to the lack of a sufficiently large virtual address range available in the vmalloc area to satisfy what the driver asks for at suspend to store GPU objects. A quick Google search shows a lot of reports of nouveau being a vmalloc hog in normal operation, even before suspend is considered, so it may be doing itself in here. Or there could be other drivers contributing to heavy vmalloc usage. I don't know if there are any tools to analyze how the kernel vmalloc space is being used; I'll look to see if I can find any.

Realistically the only options are probably to either reduce vmalloc usage or increase vmalloc size. The simple solution is to follow the advice of the kernel and pass vmalloc=<size> on the command-line (maybe start with 128M and go from there). Another (much more complicated) potential solution would be to see if it's possible to unmap the driver mmio space at suspend and remap it at resume.

Running a 64-bit kernel should also take care of the problem.

Daniel Manrique (roadmr) wrote on 2011-04-14:

#6

Thanks for the suggestions Seth, I tested and here's what I got:

1- I tried increasing vmalloc (went as far up as 256M). The system then freezes when I run pm-suspend: it doesn't suspend, but the keyboard and mouse stop responding, and the only remaining option is to reboot.

2- Same behavior when using the 64-bit kernel (in fact, a whole new 64-bit installation): it freezes, becomes unresponsive, and I have to reboot.

3- I then installed the proprietary nvidia drivers:

[ 18.486] (II) NVIDIA(0): Creating default Display subsection in Screen section
[ 19.144] (II) NVIDIA(0): NVIDIA GPU GeForce 9400M G (C79) at PCI:3:0:0 (GPU-0)
[ 19.168] (II) NVIDIA(0): Assigned Display Device: DFP-0
[ 19.168] (II) NVIDIA(0): Validated modes:
[ 19.168] (II) NVIDIA(0): "nvidia-auto-select"

With these proprietary drivers, the system suspends successfully and comes back from resume with some garbling on the screen. Maximizing the terminal (F11) basically "sweeps" the display and makes it usable, although the background itself turns white. So it's better and possibly usable; it might need some work, but more importantly, it confirms your diagnosis about nouveau keeping the system from suspending successfully.

Let me know if more testing is needed.

Thanks again,
- Daniel

Seth Forshee (sforshee) wrote on 2011-04-14:

#7

/proc/vmallocinfo shows all the vmalloc mappings. We could take a look at that to get an idea of what's consuming the address space. The VmallocTotal, VmallocUsed, and VmallocChunk fields in /proc/meminfo would also be useful to look at.
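
As an illustration of pulling those three fields out of a saved meminfo dump, here is a minimal sketch. The helper name `vmalloc_fields` is hypothetical, and the sample values are the ones reported later in this thread for the vesa boot; on a live system the text would come from reading /proc/meminfo instead.

```python
import re


def vmalloc_fields(meminfo_text):
    """Return the Vmalloc* fields of a /proc/meminfo dump as a dict of kB values."""
    fields = {}
    for line in meminfo_text.splitlines():
        match = re.match(r"(Vmalloc\w+):\s+(\d+)\s+kB", line)
        if match:
            fields[match.group(1)] = int(match.group(2))
    return fields


# Excerpt in /proc/meminfo format; the Vmalloc values are taken from a
# later comment in this thread, the other line is only there as noise.
sample = """\
SwapCached: 0 kB
VmallocTotal:     122880 kB
VmallocUsed:       22356 kB
VmallocChunk:      93500 kB
"""
print(vmalloc_fields(sample))
```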

Seth Forshee (sforshee) wrote on 2011-04-14:

#8

Do you get any kind of panic message or anything else when the system freezes after you increase vmalloc? Try running pm-suspend from within vt1 to see if anything shows up on the screen. You might also try using magic sysrq when it's frozen; try alt-sysrq-p to get a dump of the current task's state.

The freeze may well be a different problem.

Daniel Manrique (roadmr) wrote on 2011-04-14:

#9

vmalloc.txt (12.9 KiB, text/plain)

Hi, I'm attaching vmallocinfo and meminfo.

Daniel Manrique (roadmr) wrote on 2011-04-14:

#10

meminfo.txt (1.1 KiB, text/plain)

Seth Forshee (sforshee) wrote on 2011-04-14:

#11

Ah, you're using a 64-bit kernel now.

Some of the biggest vmalloc areas are coming from nouveau.

0xffffc90002480000-0xffffc90002c81000 8392704 nouveau_load+0xa7/0x550 [nouveau] phys=ae000000 ioremap
0xffffc90002d00000-0xffffc90004d01000 33558528 nouveau_load+0x2f1/0x550 [nouveau] phys=ac000000 ioremap
0xffffc90006100000-0xffffc90006901000 8392704 nouveau_load+0xa7/0x550 [nouveau] phys=aa000000 ioremap
0xffffc90006980000-0xffffc90008981000 33558528 nouveau_load+0x2f1/0x550 [nouveau] phys=cc000000 ioremap

That's over 80 MB from nouveau (and I omitted several others of a couple of pages each). Some other areas are using sizable amounts as well, the most notable of which is audio. meminfo shows the amount of vmalloc used as 138284 kB, and if it's anywhere near that on a 32-bit install it's not hard to see why it might start having problems.
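
The per-driver total can be reproduced from a vmallocinfo dump with a few lines of Python. This is a rough sketch under stated assumptions: the function name `bytes_by_owner` is made up, and the line format is inferred only from the entries quoted in this thread (address range, size in bytes, caller, optional module name in brackets). Entries are attributed to the bracketed module when there is one, otherwise to the calling symbol.

```python
import re
from collections import defaultdict

# address-range, size in bytes, caller, optional "[module]"
LINE = re.compile(r"^0x[0-9a-f]+-0x[0-9a-f]+\s+(\d+)\s+(\S+)(?:\s+\[(\w+)\])?")


def bytes_by_owner(vmallocinfo_text):
    """Sum vmalloc area sizes per module (or per calling symbol if no module)."""
    totals = defaultdict(int)
    for line in vmallocinfo_text.splitlines():
        match = LINE.match(line.strip())
        if not match:
            continue
        size, caller, module = int(match.group(1)), match.group(2), match.group(3)
        owner = module or caller.split("+")[0]
        totals[owner] += size
    return dict(totals)


# The four nouveau entries quoted above.
sample = """\
0xffffc90002480000-0xffffc90002c81000 8392704 nouveau_load+0xa7/0x550 [nouveau] phys=ae000000 ioremap
0xffffc90002d00000-0xffffc90004d01000 33558528 nouveau_load+0x2f1/0x550 [nouveau] phys=ac000000 ioremap
0xffffc90006100000-0xffffc90006901000 8392704 nouveau_load+0xa7/0x550 [nouveau] phys=aa000000 ioremap
0xffffc90006980000-0xffffc90008981000 33558528 nouveau_load+0x2f1/0x550 [nouveau] phys=cc000000 ioremap
"""
totals = bytes_by_owner(sample)
print(totals["nouveau"], "bytes =", round(totals["nouveau"] / 2**20, 1), "MiB")
```

For these four entries the sum is 83902464 bytes, about 80 MiB, matching the figure quoted above.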

Daniel Manrique (roadmr) wrote on 2011-04-15:

#12

dmesg after suspend/resume using the vesa videodriver and xforcevesa nomodeset (122.5 KiB, text/plain)

OK, so I reinstalled the 32-bit kernel. However, I'm unable to go to a vt to see if it gives a panic message while suspending. When I press, say, ctrl+alt+f1, the graphical cursor disappears but the rest of the desktop stays visible. It looks like the display didn't get reset to text mode, so I can't see what I'm typing (and certainly no debugging messages). If I press alt+f7 I "go back" to graphical mode, the cursor reappears and the screen is responsive again.

The only way I found to get to a console was to use xforcevesa nomodeset, though of course that probably changes things in other ways (like not using the nouveau driver). The system enters suspend and of course, upon resuming, the screen is blank (with no backlight). However other than that, the system appears to have recovered, as I was able to ssh in and recover a dmesg file I'm attaching.

meminfo says the following about vmalloc:
VmallocTotal: 122880 kB
VmallocUsed: 22356 kB
VmallocChunk: 93500 kB

The top entries in vmalloc (sorted by size) are:

0xf84e5000-0xf8508000 143360 kvmalloc+0x3f/0x50 pages=34 vmalloc
0xf85d6000-0xf85f9000 143360 kvmalloc+0x3f/0x50 pages=34 vmalloc
0xf82ec000-0xf8314000 163840 module_alloc_update_bounds+0x19/0x70 pages=39 vmalloc
0xf84a9000-0xf84d6000 184320 module_alloc_update_bounds+0x19/0x70 pages=44 vmalloc
0xf842a000-0xf846a000 262144 module_alloc_update_bounds+0x19/0x70 pages=63 vmalloc
0xf83d6000-0xf8420000 303104 module_alloc_update_bounds+0x19/0x70 pages=73 vmalloc
0xf85fa000-0xf8693000 626688 module_alloc_update_bounds+0x19/0x70 pages=152 vmalloc
0xf8201000-0xf82b2000 724992 sys_swapon+0x428/0x8a0 pages=176 vmalloc
0xf9380000-0xf94b1000 1249280 0xf80361a9 phys=cd000000 ioremap
0xff000000-0xff400000 4194304 pcpu_get_vm_areas+0x0/0x4c0 vmalloc
0xf8711000-0xf8b12000 4198400 snd_malloc_sgbuf_pages+0x1a5/0x202 [snd_page_alloc] vmap
0xf8b13000-0xf8f14000 4198400 snd_malloc_sgbuf_pages+0x1a5/0x202 [snd_page_alloc] vmap
0xf8f15000-0xf9316000 4198400 snd_malloc_sgbuf_pages+0x1a5/0x202 [snd_page_alloc] vmap

Seth Forshee (sforshee) wrote on 2011-04-15:

#13

That's really strange that you can't switch vts. We may be looking at more than one bug here. Does the same thing happen if you run 'sudo chvt 1' in a console?

Are the meminfo and vmalloc dumps from a boot with xforcevesa nomodeset? Because it really isn't using much of the vmalloc area, so I'd be surprised to see the vmap failures from the original "won't suspend" problem in that situation.

It might be best to try to attack the various issues one at a time. Starting with the vmalloc problem, I guess the first thing is for you to tell me whether the meminfo/vmalloc information you just supplied is from an "xforcevesa nomodeset" boot or not. If it is, I'd be interested to see the same information with the default kernel command-line.

The next step is probably to test the 2.6.38.3 mainline build since that's closest to natty's kernel and will tell us which direction we need to start looking to track down the regression. You can grab this build at:

http://kernel.ubuntu.com/~kernel-ppa/mainline/v2.6.38.3-natty/

Daniel Manrique (roadmr) wrote on 2011-04-15:

#14

Hi Seth,

Yes, I apologize, I guess I panicked about being unable to switch consoles so I started doing nonsense.

So let's go one step at a time:

Here's vmalloc (top offenders) and meminfo from a boot without xforcevesa and nomodeset (i.e. using the nouveau driver as during the observed failures).

0xf845e000-0xf8481000 143360 kvmalloc+0x3f/0x50 pages=34 vmalloc
0xf8482000-0xf84a5000 143360 kvmalloc+0x3f/0x50 pages=34 vmalloc
0xf8303000-0xf832b000 163840 module_alloc_update_bounds+0x19/0x70 pages=39 vmalloc
0xf84da000-0xf8507000 184320 module_alloc_update_bounds+0x19/0x70 pages=44 vmalloc
0xf83ef000-0xf842f000 262144 module_alloc_update_bounds+0x19/0x70 pages=63 vmalloc
0xf839b000-0xf83e5000 303104 module_alloc_update_bounds+0x19/0x70 pages=73 vmalloc
0xf86a4000-0xf873d000 626688 module_alloc_update_bounds+0x19/0x70 pages=152 vmalloc
0xf8201000-0xf82b2000 724992 sys_swapon+0x428/0x8a0 pages=176 vmalloc
0xf87fc000-0xf89fd000 2101248 drm_ht_create+0x54/0xd0 [drm] pages=512 vmalloc
0xfb602000-0xfb803000 2101248 drm_ht_create+0x54/0xd0 [drm] pages=512 vmalloc
0xfed76000-0xfef77000 2101248 snd_malloc_sgbuf_pages+0x1a5/0x202 [snd_page_alloc] vmap
0xfb300000-0xfb601000 3149824 ttm_bo_kmap+0xff/0x120 [ttm] phys=d000c000 ioremap
0xfe180000-0xfe571000 4132864 ttm_bo_kmap+0xff/0x120 [ttm] phys=b000c000 ioremap
0xff000000-0xff400000 4194304 pcpu_get_vm_areas+0x0/0x4c0 vmalloc
0xfe572000-0xfe973000 4198400 snd_malloc_sgbuf_pages+0x1a5/0x202 [snd_page_alloc] vmap
0xfe974000-0xfed75000 4198400 snd_malloc_sgbuf_pages+0x1a5/0x202 [snd_page_alloc] vmap
0xf8a00000-0xf9201000 8392704 nouveau_load+0x9c/0x4f0 [nouveau] phys=ae000000 ioremap
0xfb880000-0xfc081000 8392704 nouveau_load+0x9c/0x4f0 [nouveau] phys=aa000000 ioremap
0xf9280000-0xfb281000 33558528 nouveau_load+0x2a9/0x4f0 [nouveau] phys=ac000000 ioremap
0xfc100000-0xfe101000 33558528 nouveau_load+0x2a9/0x4f0 [nouveau] phys=cc000000 ioremap

VmallocTotal: 122880 kB
VmallocUsed: 112688 kB
VmallocChunk: 4088 kB

I will test the mainline build you suggested and report back as soon as I have something.

Finally, for the vt switching problem, I ran the command you suggested and I had the same behavior (i.e. can't switch to the vt). However, I've seen this on one other system (a Dell Vostro 3400 with dual graphics which is using the intel driver), so I think I'll do some more testing about that and report it as a different bug.

Thanks so much for your help!
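
To put the numbers in this comment side by side with the failed request from the dmesg excerpt in #4 (1052672 bytes), here is a small arithmetic sketch. The function name `vmalloc_headroom` and its return keys are hypothetical. Note that the meminfo snapshot was taken on a running desktop, so it says nothing certain about how much contiguous space was left at the moment suspend actually failed.

```python
def vmalloc_headroom(total_kb, used_kb, chunk_kb, request_bytes):
    """Summarize how tight the vmalloc area is relative to one request."""
    request_kb = request_bytes / 1024
    return {
        "free_kb": total_kb - used_kb,
        "used_pct": round(100.0 * used_kb / total_kb, 1),
        "request_kb": request_kb,
        # True only means the request fits the largest free chunk
        # at the time the snapshot was taken.
        "fits_largest_chunk": request_kb <= chunk_kb,
    }


# VmallocTotal / VmallocUsed / VmallocChunk from this comment (nouveau boot),
# and the allocation size from the "vmap allocation ... failed" line in #4.
nouveau_boot = vmalloc_headroom(122880, 112688, 4088, 1052672)
print(nouveau_boot)
```

With the nouveau driver loaded, roughly 92% of the 120 MB vmalloc area is already in use, compared with about 18% in the earlier xforcevesa boot.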

Daniel Manrique (roadmr) wrote on 2011-04-15:

#15

I tested the v2.6.38.3-natty mainline kernel. The system boots and is usable and enters suspend mode, but upon attempting to resume it is unresponsive: the backlight doesn't come up, there is no display, the keyboard is unresponsive and the system doesn't respond to pings on the network. At some point I saw capslock flashing but it stopped after a bit.

I also tested v2.6.35.12-maverick. With this kernel the system boots, but when trying to enter graphical mode it becomes unresponsive: no keyboard, network ping or display (screen is black). Maybe some Ubuntu-specific modifications are what enabled the actual shipped Maverick kernel to work?

Seth Forshee (sforshee) on 2011-04-15

Changed in linux (Ubuntu):
assignee:	Canonical Kernel Team (canonical-kernel-team) → Seth Forshee (sforshee)
status:	Triaged → In Progress

Seth Forshee (sforshee) wrote on 2011-04-15:

#16

Okay, that vmalloc information is more like what I expected. Space is pretty tight.

It's interesting that the 2.6.38.3 mainline build doesn't have the same problems with suspend. So to start we can try to figure out what's different there that's making suspend fail in natty. Can you grab the vmalloc and meminfo dumps with the mainline kernel so we can check if anything there is significantly different? And I'll scan our patches on top of mainline to see if anything jumps out as potentially related.

Thanks!

Daniel Manrique (roadmr) wrote on 2011-04-15:

#17

vmalloc-mainline.txt (9.3 KiB, text/plain)

Hi,
Seeing as how the kernel shipped with Maverick worked, whereas the mainline 2.6.35 one doesn't, maybe some Ubuntu-specific patch enables things to work correctly on that one.

In any case, I'm currently with this kernel:

Linux 200912-4906 2.6.38-02063803-generic #201104150912 SMP Fri Apr 15 10:37:38 UTC 2011 i686 i686 i386 GNU/Linux

meminfo is manageable so I'm posting that here:

MemTotal: 2314600 kB
MemFree: 1905148 kB
Buffers: 27732 kB
Cached: 204272 kB
SwapCached: 0 kB
Active: 156192 kB
Inactive: 188916 kB
Active(anon): 113824 kB
Inactive(anon): 1876 kB
Active(file): 42368 kB
Inactive(file): 187040 kB
Unevictable: 0 kB
Mlocked: 0 kB
HighTotal: 1448712 kB
HighFree: 1102132 kB
LowTotal: 865888 kB
LowFree: 803016 kB
SwapTotal: 2880508 kB
SwapFree: 2880508 kB
Dirty: 60 kB
Writeback: 0 kB
AnonPages: 113140 kB
Mapped: 42952 kB
Shmem: 2600 kB
Slab: 24348 kB
SReclaimable: 13192 kB
SUnreclaim: 11156 kB
KernelStack: 2344 kB
PageTables: 2992 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 4037808 kB
Committed_AS: 989152 kB
VmallocTotal: 122880 kB
VmallocUsed: 102720 kB
VmallocChunk: 10812 kB
HardwareCorrupted: 0 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 4096 kB
DirectMap4k: 12280 kB
DirectMap4M: 897024 kB

I'm also attaching vmalloc-mainline.txt which is a bit larger, but I thought it would be good to have the complete file here.

Let me know when any more information or testing is needed, remember this machine lives for testing so we can be as invasive with it as necessary.

Thanks again for all your help!

Kate Stewart (kate.stewart) on 2011-04-16

Changed in linux (Ubuntu Natty):
milestone:	none → ubuntu-11.04
milestone:	ubuntu-11.04 → natty-updates
Changed in linux (Ubuntu Oneiric):
status:	New → In Progress
importance:	Undecided → High

Seth Forshee (sforshee) wrote on 2011-04-18:

#18

@Daniel, you don't happen to have the full /proc/vmallocinfo dump for the 32-bit natty kernel, do you? That would help with determining what's using more vmalloc space in natty versus mainline.

Daniel Manrique (roadmr) wrote on 2011-04-18:

#19

vmalloc-natty-32bit.txt (9.3 KiB, text/plain)

Hi Seth,

Here it is, produced with this kernel:

Linux 200912-4906 2.6.38-8-generic #42-Ubuntu SMP Mon Apr 11 03:31:50 UTC 2011 i686 i686 i386 GNU/Linux

Seth Forshee (sforshee) wrote on 2011-04-18:

#20

The vmap differences can be almost completely attributed to a patch we're carrying to the snd-hda-intel driver to increase the audio buffer size to "improve the audio experience." I'll look into it but I'd guess there's a good reason for the patch.

Moving on to the next problem. I'd like to see what's going on with the suspend hang when you increase the vmalloc size, but having working vt's might be helpful. Did you file a bug for the vt problem?

Since you don't have vt's, you could try booting in recovery mode (boot with no_console_suspend) and run pm-suspend to see if it works there and if you get any interesting output. You can also try some of the steps in the following wiki pages to see if they yield anything useful.

https://wiki.ubuntu.com/DebuggingKernelSuspend
https://wiki.ubuntu.com/DebuggingKernelSuspendHibernateResume

Seth Forshee (sforshee) on 2011-04-19

Changed in linux (Ubuntu Natty):
status:	In Progress → Incomplete

Daniel Manrique (roadmr) wrote on 2011-04-19:

#21

dmesg-4906.txt (122.6 KiB, text/plain)

Hi Seth,

I did the DebuggingKernelSuspend procedure and this is what popped out:

[ 1.100693] Magic number: 0:846:402
[ 1.100695] hash matches /build/buildd/linux-2.6.38/drivers/base/power/main.c:535
[ 1.100720] pci 0000:03:00.0: hash matches

I'm attaching the entire dmesg from that run to this comment.

I also tried no_console_suspend in combination with vmalloc=256M. When I issue pm-suspend from a terminal (still can't switch to a vt) the system "freezes" (same behavior as in #6). I see no useful messages :(

Changed in linux (Ubuntu Natty):
status:	Incomplete → In Progress

Seth Forshee (sforshee) wrote on 2011-04-20:

#22

Daniel,

Thanks for testing. If that PCI id is accurate then it seems we're still looking for some kind of problem with the nouveau driver. When you see the hang, is your caps lock LED blinking?

I think my wording for one of my suggested test cases was confusing. I think it would be useful if you could boot into recovery mode by holding left-shift when booting to get the grub menu and selecting the "recovery mode" boot option. Also modify the kernel command-line when you boot into recovery mode to include "vmalloc=256M no_console_suspend". This should boot you to a text-mode terminal where you can run pm-suspend without the graphical UI in the way. Chances are that you'll either get working suspend-resume or won't see any useful output, but it's worth a try.

Daniel Manrique (roadmr) wrote on 2011-04-20:

#23

4096-c.jpg (93.7 KiB, image/jpeg)

Hi Seth,

I repeated the suspend to get the system to hang again; there's no flashing caps lock :(

Also, I tried your suggestion to boot in single-user text mode. There seems to be some problem initializing the display, I'm attaching a picture of what I see, and nothing I type shows up. What's odd is that the system is "alive", again, I can blind-type and I can even issue commands to reboot the system, bring up X (which comes up just fine) or make it suspend in that state. Of course, this is not useful as I still can't see actual text on the console :( I tried using the nosplash kernel parameter but there was no change in the display behavior.

Seth Forshee (sforshee) wrote on 2011-04-21:

#24

Boy, that machine just has all sorts of problems with graphics. Kind of makes it hard to decide what to pound on first.

I checked, and we aren't carrying any patches to nouveau in the natty kernel. So I'm a bit puzzled why you see different behavior with natty (with vmalloc size increased) versus mainline unless it's due to something external to nouveau. Looking closer at the pm_trace it seems that it really only traces device resume and not suspend, so it's a bit interesting that pm_trace showed anything at all. Makes me wonder if something went wrong elsewhere in suspend and then the machine hung in the nouveau code while trying to back out of the failed suspend.

Probably the most effective thing to do at this point is to open upstream bug reports against the issues you see when running a mainline build. If you don't mind filing the bugs yourself it might be easier since you actually have the hardware, otherwise I can do it. The appropriate location for the upstream bug reports is:

https://bugs.freedesktop.org/

Daniel Ricao Canelhas (daniel-canelhas) wrote on 2011-05-02:

#25

This seems similar to:
suspend hibernation not working on dell 1749
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/748994

Ara Pulido (ara) wrote on 2011-05-23:

#26

Seth, any updates on this bug?

Seth Forshee (sforshee) wrote on 2011-05-23:

#27

No updates. My suggestion in comment #24 was to start working with upstream on the issue. I offered to help with filing the bug upstream if needed but haven't seen any response from Daniel. If possible I think it's more efficient for him to interface directly since I don't have the hardware, but I'm willing to be an intermediary or at least file the initial bug report.

Seth Forshee (sforshee) wrote on 2011-05-23:

#28

Daniel, if you do file an upstream bug also be sure to link to it here. I certainly intend to follow the progress and provide support as needed.

In freedesktop.org Bugzilla #37549, Daniel Manrique (roadmr) wrote on 2011-05-24:

#50

This problem was originally reported here:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/754711

The system works correctly with the Ubuntu-patched 2.6.35 Kernel as shipped with Ubuntu 10.10, but fails with the 2.6.38 kernel from Ubuntu 11.04.

While tracking down this problem, I tested several mainline kernels, all of which also fail to suspend/resume on this system, as I describe below.

Steps to reproduce:
- try to enter suspend mode (I used sudo pm-suspend)

Expected result:
- The system enters sleep mode and upon pressing the power switch, resumes successfully

Actual result:
I tried this with the following kernels (all mainline):

2.6.39-999.201104080911 - The system enters suspend mode, but upon trying to resume, the system is unresponsive: backlight doesn't come on, there is no display, keyboard is unresponsive, and system doesn't respond to pings on the network.

v2.6.38.3-natty mainline kernel - The system boots and is usable and enters suspend mode, but upon attempting to resume it is unresponsive: the backlight doesn't come up, there is no display, the keyboard is unresponsive and the system doesn't respond to pings on the network. At some point I saw capslock flashing but it stopped after a bit.

v2.6.35.12-maverick mainline kernel, with this kernel the system boots but when trying to enter graphical mode becomes unresponsive, no keyboard, network ping or display (screen is black).

Here is some relevant information on the system, please let me know if any more tests are needed.

dmi.bios.date: 09/08/2009
dmi.bios.vendor: Dell Inc.
dmi.bios.version: A11
dmi.board.name: 0K183D
dmi.board.vendor: Dell Inc.
dmi.board.version: A11
dmi.chassis.asset.tag: 1234567890
dmi.chassis.type: 8
dmi.chassis.vendor: Dell Inc.
dmi.chassis.version: A11
dmi.modalias: dmi:bvnDellInc.:bvrA11:bd09/08/2009:svnDellInc.:pnStudioXPS1340:pvrA11:rvnDellInc.:rn0K183D:rvrA11:cvnDellInc.:ct8:cvrA11:
dmi.product.name: Studio XPS 1340
dmi.product.version: A11
dmi.sys.vendor: Dell Inc.

GraphicsCard:
02:00.0 VGA compatible controller [0300]: nVidia Corporation G98 [GeForce 9200M GS] [10de:06e8] (rev a1) (prog-if 00 [VGA controller])
Subsystem: Dell Device [1028:0271]
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0, Cache Line Size: 64 bytes
Interrupt: pin A routed to IRQ 23
Region 0: Memory at ae000000 (32-bit, non-prefetchable) [size=16M]
Region 1: Memory at d0000000 (64-bit, prefetchable) [size=256M]
Region 3: Memory at ac000000 (64-bit, non-prefetchable) [size=32M]
Region 5: I/O ports at 4000 [size=128]
Capabilities: <access denied>
Kernel driver in use: nouveau
Kernel modules: nouveau, nvidiafb

03:00.0 VGA compatible controller [0300]: nVidia Corporation C79 [GeForce 9400M G] [10de:0866] (rev b1) (prog-if 00 [VGA controller])
Subsystem: Dell Device [1028:0271]
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0, Cache Line Size: 64 bytes
...