NetworkManager: page allocation failure. order:3, mode:0x4020

Bug #655413 reported by Phillip Susi
This bug affects 5 people
Affects          Status         Importance   Assigned to   Milestone
linux (Fedora)   Fix Released   Medium
linux (Ubuntu)   Invalid        Undecided    Unassigned

Bug Description

After resuming from suspend today, network manager was showing no connection and no wireless signal, which was odd since I am on a desktop with no wireless card. I checked dmesg and it seems that there had been a kernel OOPS in network manager:

Restarting tasks ... done.
NetworkManager: page allocation failure. order:3, mode:0x4020
Pid: 1076, comm: NetworkManager Not tainted 2.6.35-22-generic #33-Ubuntu
Call Trace:

 [<ffffffff81108253>] __alloc_pages_slowpath+0x583/0x590
 [<ffffffff813d69f0>] ? ata_scsi_rw_xlat+0x0/0x200
 [<ffffffff811083fa>] __alloc_pages_nodemask+0x19a/0x1f0
 [<ffffffff8113fb02>] kmalloc_large_node+0x62/0xb0
 [<ffffffff8114367c>] __kmalloc_node_track_caller+0x13c/0x1f0
 [<ffffffff8148fcd6>] ? __netdev_alloc_skb+0x36/0x60
 [<ffffffff8148f9c3>] __alloc_skb+0x83/0x170
 [<ffffffff81049964>] ? scale_rt_power+0x24/0x70
 [<ffffffff8148fcd6>] __netdev_alloc_skb+0x36/0x60
 [<ffffffffa002f1fc>] rtl8169_rx_fill+0xbc/0x260 [r8169]
 [<ffffffff81010de0>] ? nommu_map_page+0x0/0xc0
 [<ffffffffa002fe63>] rtl8169_init_ring+0x73/0xb0 [r8169]
 [<ffffffffa0030141>] rtl8169_open+0x1b1/0x460 [r8169]
 [<ffffffff8149b297>] __dev_open+0xa7/0xf0
 [<ffffffff814996e1>] __dev_change_flags+0xa1/0x180
 [<ffffffff8149b1a8>] dev_change_flags+0x28/0x70
 [<ffffffff814a8c35>] do_setlink+0x1e5/0x840
 [<ffffffff8148b4fc>] ? sock_rmalloc+0x3c/0xa0
 [<ffffffff8148f98f>] ? __alloc_skb+0x4f/0x170
 [<ffffffff812ccef4>] ? nla_parse+0x34/0x110
 [<ffffffff814a968e>] rtnl_setlink+0x11e/0x170
 [<ffffffff814a86c7>] rtnetlink_rcv_msg+0x177/0x290
 [<ffffffff814a8550>] ? rtnetlink_rcv_msg+0x0/0x290
 [<ffffffff814c2089>] netlink_rcv_skb+0xa9/0xd0
 [<ffffffff814a8535>] rtnetlink_rcv+0x25/0x40
 [<ffffffff814c1cee>] netlink_unicast+0x2de/0x2f0
 [<ffffffff814c2aee>] netlink_sendmsg+0x1fe/0x2e0
 [<ffffffff81488023>] sock_sendmsg+0xf3/0x120
 [<ffffffff81488023>] ? sock_sendmsg+0xf3/0x120
 [<ffffffff81486ae5>] ? move_addr_to_kernel+0x65/0x70
 [<ffffffff81492db8>] ? verify_iovec+0x88/0xe0
 [<ffffffff81488b10>] sys_sendmsg+0x240/0x3a0
 [<ffffffff8116a11f>] ? destroy_inode+0x2f/0x60
 [<ffffffff81488228>] ? sys_sendto+0x178/0x180
 [<ffffffff811542e9>] ? __fput+0x199/0x210
 [<ffffffff8148921e>] ? sys_recvmsg+0x6e/0x80
 [<ffffffff8100a0f2>] system_call_fastpath+0x16/0x1b

 Mem-Info:
 Node 0 DMA per-cpu:
 CPU 0: hi: 0, btch: 1 usd: 0
 CPU 1: hi: 0, btch: 1 usd: 0
 Node 0 DMA32 per-cpu:
 CPU 0: hi: 186, btch: 31 usd: 39
 CPU 1: hi: 186, btch: 31 usd: 162
 active_anon:79068 inactive_anon:39694 isolated_anon:0
  active_file:162226 inactive_file:157712 isolated_file:0
  unevictable:4 dirty:27 writeback:0 unstable:0
  free:9229 slab_reclaimable:10662 slab_unreclaimable:5553
  mapped:18263 shmem:1642 pagetables:7742 bounce:0
 Node 0 DMA free:8032kB min:40kB low:48kB high:60kB active_anon:0kB inactive_anon:0kB active_file:3540k
 lowmem_reserve[]: 0 2003 2003 2003
 Node 0 DMA32 free:29008kB min:5704kB low:7128kB high:8556kB active_anon:316272kB inactive_anon:158776k
 lowmem_reserve[]: 0 0 0 0
 Node 0 DMA: 10*4kB 9*8kB 7*16kB 2*32kB 1*64kB 2*128kB 3*256kB 1*512kB 2*1024kB 2*2048kB 0*4096kB = 803
 Node 0 DMA32: 888*4kB 1742*8kB 690*16kB 7*32kB 0*64kB 0*128kB 1*256kB 0*512kB 0*1024kB 0*2048kB 0*4096
 321554 total pagecache pages
 0 pages in swap cache
 Swap cache stats: add 0, delete 0, find 0/0
 Free swap = 0kB
 Total swap = 0kB
 523984 pages RAM
 9977 pages reserved
 320787 pages shared
 351164 pages non-shared

Revision history for this message
In , A. (a.-redhat-bugs) wrote :

Description of problem:
Network adapter "disappears" after resuming from ACPI suspend. NetworkManager doesn't see the device, and the device doesn't show up in kinfocenter either.

Version-Release number of selected component (if applicable):
kernel-2.6.33.6-147.2.4.fc13.x86_64

How reproducible:
Every time.

Steps to Reproduce:
1. Switch on computer
2. Ascertain that networking works
3. Suspend
4. Resume (switch it on again)
5. See how there is no networking

Actual results:
No networking

Expected results:
Everything works normally

Additional info:

From /var/log/messages:

Sep 1 06:54:09 localhost NetworkManager[1234]: <info> wake requested (sleeping: yes enabled: yes)
Sep 1 06:54:09 localhost NetworkManager[1234]: <info> waking up and re-enabling...
Sep 1 06:54:09 localhost NetworkManager[1234]: <info> (eth2): now managed
Sep 1 06:54:09 localhost NetworkManager[1234]: <info> (eth2): device state change: 1 -> 2 (reason 2)
Sep 1 06:54:09 localhost NetworkManager[1234]: <info> (eth2): bringing up device.
Sep 1 06:54:09 localhost kernel: NetworkManager: page allocation failure. order:3, mode:0x4020
Sep 1 06:54:09 localhost kernel: Pid: 1234, comm: NetworkManager Not tainted 2.6.33.6-147.2.4.fc13.x86_64 #1

From dmesg:

NetworkManager: page allocation failure. order:3, mode:0x4020
Pid: 1234, comm: NetworkManager Not tainted 2.6.33.6-147.2.4.fc13.x86_64 #1
Call Trace:
 [<ffffffff810c6d88>] __alloc_pages_nodemask+0x5ad/0x630
 [<ffffffff810f4058>] kmalloc_large_node+0x5a/0x97
 [<ffffffff810f5803>] __kmalloc_node_track_caller+0x2c/0x119
 [<ffffffff81381651>] ? __netdev_alloc_skb+0x2f/0x4c
 [<ffffffff81381224>] __alloc_skb+0x7b/0x16b
 [<ffffffff81381651>] __netdev_alloc_skb+0x2f/0x4c
 [<ffffffffa015d51b>] rtl8169_rx_fill+0xa3/0x14f [r8169]
 [<ffffffffa015f4f7>] rtl8169_init_ring+0x6c/0x99 [r8169]
 [<ffffffffa015fcf3>] rtl8169_open+0x7a/0x194 [r8169]
 [<ffffffff8138adfd>] dev_open+0x98/0xd3
 [<ffffffff8138a35c>] dev_change_flags+0xb9/0x179
 [<ffffffff81393925>] do_setlink+0x26c/0x33d
 [<ffffffff811c2f5b>] ? avc_has_perm+0x57/0x69
 [<ffffffff81393af3>] rtnl_setlink+0xfd/0x110
 [<ffffffff81393372>] rtnetlink_rcv_msg+0x1c1/0x1de
 [<ffffffff813931b1>] ? rtnetlink_rcv_msg+0x0/0x1de
 [<ffffffff813a433c>] netlink_rcv_skb+0x3e/0x8f
 [<ffffffff813931aa>] rtnetlink_rcv+0x21/0x28
 [<ffffffff813a411b>] netlink_unicast+0xe6/0x14f
 [<ffffffff813a4e22>] netlink_sendmsg+0x254/0x263
 [<ffffffff813793b1>] __sock_sendmsg+0x59/0x64
 [<ffffffff813796ae>] sock_sendmsg+0xa3/0xbc
 [<ffffffff813796ae>] ? sock_sendmsg+0xa3/0xbc
 [<ffffffff8137833b>] ? might_fault+0x1c/0x1e
 [<ffffffff81382ce0>] ? copy_from_user+0x2a/0x2c
 [<ffffffff813830b2>] ? verify_iovec+0x4f/0x8d
 [<ffffffff8137997e>] sys_sendmsg+0x217/0x29b
 [<ffffffff8137972f>] ? sockfd_lookup_light+0x1b/0x53
 [<ffffffff81379712>] ? fput_light+0xd/0xf
 [<ffffffff8137b274>] ? sys_sendto+0x120/0x14d
 [<ffffffff81109057>] ? path_put+0x1d/0x22
 [<ffffffff81095cff>] ? audit_syscall_entry+0x119/0x145
 [<ffffffff81009b02>] system_call_fastpath+0x16/0x1b

Revision history for this message
In , A. (a.-redhat-bugs) wrote :

I forgot. Here is some data about my hardware:
# dmidecode
<SNIP>
Handle 0x0002, DMI type 2, 15 bytes
Base Board Information
        Manufacturer: ASUSTeK Computer INC.
        Product Name: M4A88T-M
<SNIP>
Handle 0x0004, DMI type 4, 40 bytes
Processor Information
        Socket Designation: AM3
<SNIP>
Handle 0x002D, DMI type 10, 6 bytes
On Board Device Information
        Type: Ethernet
        Status: Enabled
        Description: To Be Filled By O.E.M.
<SNIP>
#

Revision history for this message
In , Stanislaw (stanislaw-redhat-bugs) wrote :

After resume, the rtl8169 network driver is not able to allocate memory and fails to initialize, hence the "disappears" effect. I will prepare a patch that changes the allocation strategy, which should fix the problem.

Revision history for this message
In , Stanislaw (stanislaw-redhat-bugs) wrote :

Created attachment 446284
f13-r8169-alloc-fix.patch

Could you please test this patch? Please build a debug kernel, since it catches bugs when an improper allocation method is used. If you cannot build the kernel yourself, I will prepare packages tomorrow. Thanks.

Revision history for this message
In , Stanislaw (stanislaw-redhat-bugs) wrote :

http://koji.fedoraproject.org/koji/taskinfo?taskID=2459629
Please test kernel-debug when it finishes building.

Revision history for this message
In , Stanislaw (stanislaw-redhat-bugs) wrote :

These koji builds are automatically removed after about a week, so please test soon.

Revision history for this message
In , A. (a.-redhat-bugs) wrote :

Created attachment 446963
excerpt of /var/log/messages, using the testing/debugging kernel

Sorry for any lateness in reporting. I installed it yesterday, and while the computer suspends properly, it hangs upon resume. I wanted to test a few times, and can confirm that it hangs every time. Therefore, I cannot say whether or not the patch works for the network; I never get that far.

I am attaching an excerpt of /var/log/messages.

File starts with successful boot at 15:06:14. Resume at about the middle of the file (about line 750), at 16:17.

Revision history for this message
In , Stanislaw (stanislaw-redhat-bugs) wrote :

OK. I will build a 2.6.33 kernel with the patch, which should not have resume problems on your system. In the meantime, could you provide the info described
in https://wiki.ubuntu.com/DebuggingKernelSuspend on the 2.6.34 kernel?

Revision history for this message
In , Stanislaw (stanislaw-redhat-bugs) wrote :

(In reply to comment #7)
> problems on your system. On a while, could you provide info described
> in https://wiki.ubuntu.com/DebuggingKernelSuspend on 2.6.34 kernel ?

Oops, sorry, that info is not needed; I just looked at the logs. There is some problem
with the graphics driver. Maybe updating Xorg will help?

Revision history for this message
In , A. (a.-redhat-bugs) wrote :

All the updates that were released have been applied, so I am not sure how I am supposed to figure out if Xorg should be updated. Maybe you should CC some Xorg person on this thread?

Do note that the graphics problem had been somewhat diagnosed earlier in the following bug report, which I have now updated with the latest info from our bug here. Sadly, no one has responded yet to that other bug report: https://bugzilla.redhat.com/show_bug.cgi?id=622737 .

Revision history for this message
In , Stanislaw (stanislaw-redhat-bugs) wrote :

(In reply to comment #9)
> All the updates that were released have been applied, so I am not sure how I am
> supposed to figure out if Xorg should be updated.

If you did
# yum --enablerepo=updates-testing update
that's the updates I was talking about.

> May be you should CC in some
> Xorg person in this thread?
I will reassign your other bug to Xorg, but first I have to think more about it. Fedora graphics is a mixed kernel/user-space monster, and quite frequently it is not clear where the problem is.

Revision history for this message
In , Stanislaw (stanislaw-redhat-bugs) wrote :

Here is 2.6.33 kernel build with proposed fix http://koji.fedoraproject.org/koji/taskinfo?taskID=2466611

Revision history for this message
In , Stanislaw (stanislaw-redhat-bugs) wrote :

Resume problems on 2.6.34 seem to be caused by the radeon driver. I just changed the topic; maybe someone will pick up this bug.

What about that problem?

Note: to test on 2.6.34 you can log in as root on a virtual terminal (use Ctrl+Alt+F2 from X; to go back to X use Alt+F1, or Alt+F{Number} to log in on a different VT). Then run "init 3" ("init 5" turns X on again), then "pm-suspend". If suspend/resume still does not work, you can boot the kernel with the radeon.modeset=0 parameter (add it in /boot/grub/grub.conf).

Revision history for this message
In , A. (a.-redhat-bugs) wrote :

OK, I tried your tip, and when resuming, I could not get my screen back, at all. However, one little thing did run better: the keyboard wasn't locked, and I managed to switch terminals, log in as root and type reboot, all without a screen.

I also wonder: I generally boot with nomodeset (because there is another problem with edid, so I don't get my full screen resolution unless doing so - the subject of a separate, older bug report of mine). Does adding radeon.modeset=0 add anything? I tried with and without, and had the same result every time, so I don't know whether the radeon parameter added anything.

Anyway, given that even without X I have this problem, I believe we must conclude this is a kernel issue.

Revision history for this message
In , Stanislaw (stanislaw-redhat-bugs) wrote :

(In reply to comment #13)
> I also wonder: I generally boot with nomodeset (because there is another
> problem with edid, so I don't get my full screen resolution unless doing so -
> the subject of a separate, older bug report of mine). Does adding
> radeon.modeset=0 add anything?

In new kernels radeon.modeset=0 is the replacement for nomodeset (for ATI devices only); it should have the same effect. For example, the resolution of the virtual terminal (switched by Ctrl+Alt+Fn) should be different. Also, "radeon kernel modesetting enabled" is printed in dmesg when radeon.modeset=1 (the default).

What about the 2.6.33 kernel from comment 11 and the problem with r8169 memory allocations? IIRC, on 2.6.33 the radeon driver works fine on your system.

Revision history for this message
In , A. (a.-redhat-bugs) wrote :

What do you mean? It was under 2.6.33 that I first reported the problem. It wasn't good then, either.

Revision history for this message
In , Stanislaw (stanislaw-redhat-bugs) wrote :

In comment 0 we have r8169 allocation failures on resume on the 2.6.33.6-147.2.4.fc13.x86_64 kernel.

Does kernel-2.6.33.6-147.bz629158.fc13.x86_64 from
http://koji.fedoraproject.org/koji/taskinfo?taskID=2466611
help with that issue without causing any other problems?

Revision history for this message
In , A. (a.-redhat-bugs) wrote :

Is that kernel different from the plain vanilla kernel-2.6.33.6-147 testing or debug kernels, both of which I tried?

Revision history for this message
In , Stanislaw (stanislaw-redhat-bugs) wrote :

Yes, it has the patch from comment 3 applied.

Revision history for this message
In , Stanislaw (stanislaw-redhat-bugs) wrote :

I just posted patches which should fix the problem from comment 0:
http://marc.info/?l=linux-netdev&m=128524323702376&w=2
http://marc.info/?l=linux-netdev&m=128524323702378&w=2

Since you have other, worse problems with suspend/resume in bug 622737 (and I don't have much time :-() I will not backport those patches to current Fedora; however, the bug will be fixed in future releases.

Revision history for this message
In , Stanislaw (stanislaw-redhat-bugs) wrote :

The upstream r8169 driver maintainer would like to know if the patch really fixes the problem. Did you test kernel-2.6.33.6-147.bz629158? If not, will you test it if I build the kernel with the patch again (the previous koji build was removed)?

Revision history for this message
In , Stanislaw (stanislaw-redhat-bugs) wrote :

*** Bug 566389 has been marked as a duplicate of this bug. ***

Revision history for this message
In , Stanislaw (stanislaw-redhat-bugs) wrote :

Let's reopen, since more people are interested in fixing this problem.

Can someone test the patch from comment 3 or the upstream patches from comment 19?

Revision history for this message
In , Stanislaw (stanislaw-redhat-bugs) wrote :

*** Bug 567256 has been marked as a duplicate of this bug. ***

Revision history for this message
In , Serguei (serguei-redhat-bugs) wrote :

(In reply to comment #21)
> *** Bug 566389 has been marked as a duplicate of this bug. ***

Please, read my reply here: https://bugzilla.redhat.com/show_bug.cgi?id=566389#c20

Revision history for this message
In , A. (a.-redhat-bugs) wrote :

I will gladly test the patch, but am right now going through a drive repair and have to travel for a few days. Will try next week.

Revision history for this message
In , Neal (neal-redhat-bugs) wrote :

Is there a kernel build with the proposed patch, or do I need to get kernel srpm and rebuild myself?

Revision history for this message
In , Stanislaw (stanislaw-redhat-bugs) wrote :

Created attachment 450501
f12-r8169-alloc-fix.patch

The same fix for F-12.

http://koji.fedoraproject.org/koji/taskinfo?taskID=2496825

Revision history for this message
In , James (james-redhat-bugs) wrote :

(In reply to comment #3)
> Created attachment 446284 [details]
> f13-r8169-alloc-fix.patch

I'm currently using the F14 (2.6.35) series kernel on F13. Two of the hunks in this patch fail to apply cleanly to this version, but it builds OK. Not seen any r8169 PAFs yet, but "the night is young"...

Revision history for this message
In , Stanislaw (stanislaw-redhat-bugs) wrote :

(In reply to comment #29)
> I'm currently using the F14 (2.6.35) series kernel on F13. Two of the hunks in
> this patch fail to apply cleanly to this version, but it builds OK. Not seen
> any r8169 PAFs yet, but "the night is young"...

You should look at drivers/net/r8169.c.rej and integrate the remaining hunks by hand. Anyway, these two hunks are not directly related to the bug (I just checked), so I guess everything should be fine.

Revision history for this message
In , Stanislaw (stanislaw-redhat-bugs) wrote :

Created attachment 451212
f14-r8169-alloc-fix.patch

The same fix for Fedora 14

Revision history for this message
In , A. (a.-redhat-bugs) wrote :

(In reply to comment #19)
> I just posted patches which should fix problem from comment 0
> http://marc.info/?l=linux-netdev&m=128524323702376&w=2
> http://marc.info/?l=linux-netdev&m=128524323702378&w=2
>
> Since you have other, worse problems with suspend/resume with bug 622737 (and I
> don't have much time :-() I will not backport that patches to current fedora,
> however bug will be fixed in future releases.

I was offline for a few days, as I had a really dicey issue that required a reinstall (grub was totally corrupted and I couldn't figure out what was wrong; I couldn't even get to the boot menu, even though that part was all right and grub was theoretically in charge), so I didn't do anything until yesterday. I now have the 2.6.34.7-56.fc13.x86_64 kernel installed, and it works well. This problem seems solved, apparently thanks to you! If any part of the problem reappears, I will report.

Oh, I should also mention that so far, I am still using nomodeset, because otherwise the max screen resolution isn't available, so I do not know whether removing nomodeset will negatively influence your patch. Please let me know whether you want me to test this, too.

Cheers,

Revision history for this message
In , Stanislaw (stanislaw-redhat-bugs) wrote :

(In reply to comment #32)
> now have the 2.6.34.7-56.fc13.x86_64 kernel install, and it works well.
Hmm, this kernel does not include the fix, try 2.6.34.7-58.bz629158.fc13 from
http://koji.fedoraproject.org/koji/taskinfo?taskID=2496181

Revision history for this message
In , Stanislaw (stanislaw-redhat-bugs) wrote :

Can anyone confirm that the problem is fixed in the test kernels/patches I prepared?

Revision history for this message
In , Stanislaw (stanislaw-redhat-bugs) wrote :

Guys, please tell me whether the test kernels fix the problem or not; otherwise this bug will not be fixed.

Revision history for this message
In , Neal (neal-redhat-bugs) wrote :

The last page allocation failure message I see is from Sept 21.

Sep 29 08:44:31 Installed: kernel-devel-2.6.34.7-58.bz629158.fc13.x86_64

So, maybe it's fixed?

Before the fix (installed Sept 29), it had been 8 days since the last occurrence.

Now it's been about another 8 days, and no occurrence.

I use it every day.

Revision history for this message
In , Serguei (serguei-redhat-bugs) wrote :

(In reply to comment #35)
> Guys, please give me info if test kernels fix the problem or not, otherwise
> this bug will not be fixed.

Have you sent an announcement to this thread?

https://bugzilla.redhat.com/show_bug.cgi?id=566389

Maybe people there just don't receive this news...

I have just downloaded

kernel-2.6.32.23-170.bz629158.fc12.i686.rpm
kernel-headers-2.6.32.23-170.bz629158.fc12.i686.rpm
kernel-devel-2.6.32.23-170.bz629158.fc12.i686.rpm

So, I will install this stuff and try for a week. Please wait...

Revision history for this message
In , A. (a.-redhat-bugs) wrote :

OK, I am back online after going through a move and a (not so smooth) ISP change. The stock kernels *seem* to work, but that's deceptive. Basically, I can't figure it out. Much of the time the network works upon resume, but at other times it doesn't come back up. I just tried downloading your test kernels to test them over the weekend, but they are no longer up. Can you put 'em back up? I will test.

Revision history for this message
In , Stanislaw (stanislaw-redhat-bugs) wrote :

I should have copied these test kernels from koji to another site, ah ... I will not build another scratch kernel; I would rather try to get the patches upstream and into Fedora. Since we have Neal's confirmation, I can now proceed.

Revision history for this message
In , Serguei (serguei-redhat-bugs) wrote :

Stanislaw, sorry but I could not test the kernel for F-12 because the kernel-firmware package is absent, so yum refuses to install the new kernel. Could you push your fixes to updates-testing repository?

When it is ready, could you please make an announcement in previous thread? :

https://bugzilla.redhat.com/show_bug.cgi?id=566389

This is to make sure that people who started to report this bug could also test your fix.

Revision history for this message
Jeremy Foshee (jeremyfoshee) wrote :

Hi Phillip,

Please be sure to confirm this issue exists with the latest development release of Ubuntu. ISO CD images are available from http://cdimage.ubuntu.com/daily/current/ . If the issue remains, please run the following command from a Terminal (Applications->Accessories->Terminal). It will automatically gather and attach updated debug information to this report.

apport-collect -p linux 655413

Also, if you could test the latest upstream kernel available that would be great. It will allow additional upstream developers to examine the issue. Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Once you've tested the upstream kernel, please remove the 'needs-upstream-testing' tag. This can be done by clicking on the yellow pencil icon next to the tag located at the bottom of the bug description and deleting the 'needs-upstream-testing' text. Please let us know your results.

Thanks in advance.

    [This is an automated message. Apologies if it has reached you inappropriately; please just reply to this message indicating so.]

tags: added: needs-kernel-logs
tags: added: needs-upstream-testing
tags: added: kj-triage
Changed in linux (Ubuntu):
status: New → Incomplete
Revision history for this message
In , Fedora (fedora-redhat-bugs) wrote :

kernel-2.6.35.6-45.fc14 has been submitted as an update for Fedora 14.
https://admin.fedoraproject.org/updates/kernel-2.6.35.6-45.fc14

Revision history for this message
In , Fedora (fedora-redhat-bugs) wrote :

kernel-2.6.34.7-61.fc13 has been submitted as an update for Fedora 13.
https://admin.fedoraproject.org/updates/kernel-2.6.34.7-61.fc13

Revision history for this message
In , Fedora (fedora-redhat-bugs) wrote :

kernel-2.6.32.23-170.fc12 has been submitted as an update for Fedora 12.
https://admin.fedoraproject.org/updates/kernel-2.6.32.23-170.fc12

Revision history for this message
In , Fedora (fedora-redhat-bugs) wrote :

kernel-2.6.35.6-45.fc14 has been pushed to the Fedora 14 stable repository. If problems still persist, please make note of it in this bug report.

Revision history for this message
In , Fedora (fedora-redhat-bugs) wrote :

kernel-2.6.34.7-61.fc13 has been pushed to the Fedora 13 stable repository. If problems still persist, please make note of it in this bug report.

Revision history for this message
In , Fedora (fedora-redhat-bugs) wrote :

kernel-2.6.32.23-170.fc12 has been pushed to the Fedora 12 stable repository. If problems still persist, please make note of it in this bug report.

Revision history for this message
In , A. (a.-redhat-bugs) wrote :

(In reply to comment #42)
> kernel-2.6.34.7-61.fc13 has been submitted as an update for Fedora 13.
> https://admin.fedoraproject.org/updates/kernel-2.6.34.7-61.fc13

I am using this kernel, and while most of the time the problem is gone, occasionally, the network still fails to come back up after resume. This was diagnosed both on an i686 and an x86_64 system.

Revision history for this message
In , Serguei (serguei-redhat-bugs) wrote :

Suggested fix is not working:

Linux xxxxxxxxxxx 2.6.32.23-170.fc12.i686 #1 SMP Mon Sep 27 17:58:16 UTC 2010 i686 i686 i386 GNU/Linux

NetworkManager: page allocation failure. order:3, mode:0x4020
Pid: 15744, comm: NetworkManager Tainted: P 2.6.32.23-170.fc12.i686 #1
Call Trace:
 [<c07946c6>] ? printk+0x14/0x16
 [<c04aac75>] __alloc_pages_nodemask+0x44c/0x4ac
 [<c04aace9>] __get_free_pages+0x14/0x26
 [<c04d01f2>] __kmalloc_track_caller+0x37/0x127
 [<c0706ea6>] ? __netdev_alloc_skb+0x1b/0x36
 [<c0706800>] __alloc_skb+0x4e/0x10d
 [<c0706ea6>] __netdev_alloc_skb+0x1b/0x36
 [<f7f45435>] rtl8169_rx_fill+0x93/0x12d [r8169]
 [<f7f459c0>] rtl8169_init_ring+0x58/0x84 [r8169]
 [<f7f47f68>] rtl8169_open+0x6e/0x15e [r8169]
 [<c070ec58>] dev_open+0x8b/0xc5
 [<c070e4b2>] dev_change_flags+0xa9/0x158
 [<c07166d8>] do_setlink+0x242/0x2e8
 [<c071677e>] ? rtnl_setlink+0x0/0xee
 [<c071685b>] rtnl_setlink+0xdd/0xee
 [<c0702f00>] ? sk_wait_data+0x6a/0x9a
 [<c071677e>] ? rtnl_setlink+0x0/0xee
 [<c07161e2>] rtnetlink_rcv_msg+0x190/0x1a6
 [<c05bff23>] ? might_fault+0x1e/0x20
 [<c0724470>] ? netlink_sendmsg+0x152/0x228
 [<c0716052>] ? rtnetlink_rcv_msg+0x0/0x1a6
 [<c0723b7f>] netlink_rcv_skb+0x35/0x7b
 [<c071604b>] rtnetlink_rcv+0x20/0x27
 [<c07239a3>] netlink_unicast+0xc3/0x11e
 [<c0724539>] netlink_sendmsg+0x21b/0x228
 [<c06fffff>] __sock_sendmsg+0x4a/0x53
 [<c0700678>] sock_sendmsg+0xbb/0xd1
 [<c04547a1>] ? autoremove_wake_function+0x0/0x34
 [<c04547a1>] ? autoremove_wake_function+0x0/0x34
 [<c05bff23>] ? might_fault+0x1e/0x20
 [<c05c0096>] ? copy_from_user+0x32/0x11a
 [<c070817a>] ? verify_iovec+0x43/0x71
 [<c070081a>] sys_sendmsg+0x18c/0x1f0
 [<c07014c5>] ? sys_recvmsg+0x1c2/0x1e1
 [<c04a5dab>] ? find_get_page+0x22/0x7c
 [<c04bc20b>] ? handle_mm_fault+0x47a/0x93e
 [<c079505b>] ? schedule+0x817/0x864
 [<c0701afa>] sys_socketcall+0x163/0x195
 [<c040ac82>] ? syscall_trace_leave+0xaa/0xbd
 [<c040367c>] syscall_call+0x7/0xb
Mem-Info:
DMA per-cpu:
CPU 0: hi: 0, btch: 1 usd: 0
CPU 1: hi: 0, btch: 1 usd: 0
Normal per-cpu:
CPU 0: hi: 186, btch: 31 usd: 155
CPU 1: hi: 186, btch: 31 usd: 59
HighMem per-cpu:
CPU 0: hi: 186, btch: 31 usd: 63
CPU 1: hi: 186, btch: 31 usd: 82
active_anon:316984 inactive_anon:119250 isolated_anon:0
 active_file:120833 inactive_file:111248 isolated_file:0
 unevictable:0 dirty:17 writeback:0 unstable:0
 free:48183 slab_reclaimable:30602 slab_unreclaimable:7930
 mapped:34850 shmem:1391 pagetables:3286 bounce:0
DMA free:3488kB min:64kB low:80kB high:96kB active_anon:0kB inactive_anon:0kB active_file:480kB inactive_file:96kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15864kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:3812kB slab_unreclaimable:556kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 861 3029 3029
Normal free:158048kB min:3720kB low:4648kB high:5580kB active_anon:76520kB inactive_anon:152480kB active_file:138912kB inactive_file:127416kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:881880kB mlocked:0kB dirty:4kB writeback:0kB mapped:14316k...


Revision history for this message
In , Stanislaw (stanislaw-redhat-bugs) wrote :

(In reply to comment #48)
> Suggested fix is not working:
>
> Linux xxxxxxxxxxx 2.6.32.23-170.fc12.i686 #1 SMP Mon Sep 27 17:58:16 UTC 2010
> i686 i686 i386 GNU/Linux
>
> NetworkManager: page allocation failure. order:3, mode:0x4020

mode:0x4020 means an atomic allocation, so the fix was not there.

I checked the 2.6.32.23-170 sources and indeed the patch was not there. It was dropped because the patch was merged upstream into the -stable kernels, but the fix was removed too early. Anyway, the current 2.6.32.25 kernel has this fix.
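
For readers wondering how "order:3, mode:0x4020" translates into "atomic allocation", here is a rough Python sketch that decodes the two fields. The bit values are an assumption taken from include/linux/gfp.h as it looked in 2.6.3x-era kernels (they differ in later releases), and the helper names are made up for illustration.

```python
# Assumed GFP bit values for 2.6.3x-era kernels (include/linux/gfp.h).
# Later kernels renumbered these, so check the header for your version.
GFP_BITS = {
    0x10: "__GFP_WAIT",    # caller is allowed to sleep
    0x20: "__GFP_HIGH",    # may use emergency reserves
    0x40: "__GFP_IO",      # may start physical I/O
    0x80: "__GFP_FS",      # may call into the filesystem
    0x100: "__GFP_COLD",
    0x200: "__GFP_NOWARN",
    0x4000: "__GFP_COMP",  # compound page metadata
    0x8000: "__GFP_ZERO",
}


def decode_gfp(mode):
    """Return the names of the known flag bits set in a gfp mode value."""
    return [name for bit, name in sorted(GFP_BITS.items()) if mode & bit]


def can_sleep(mode):
    """An allocation without __GFP_WAIT cannot sleep, i.e. it is atomic."""
    return bool(mode & 0x10)


def order_bytes(order, page_size=4096):
    """Size of one physically contiguous block of the given order."""
    return page_size << order


if __name__ == "__main__":
    mode = 0x4020
    print(decode_gfp(mode), "can sleep:", can_sleep(mode))
    print("order 3 =", order_bytes(3), "bytes")
```

Under these assumptions 0x4020 is __GFP_HIGH plus __GFP_COMP with no __GFP_WAIT bit, which is why the failure message identifies a kernel that still uses the atomic path, and order 3 asks for 8 contiguous 4 KiB pages.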

Revision history for this message
In , Stanislaw (stanislaw-redhat-bugs) wrote :

(In reply to comment #47)
> (In reply to comment #42)
> > kernel-2.6.34.7-61.fc13 has been submitted as an update for Fedora 13.
> > https://admin.fedoraproject.org/updates/kernel-2.6.34.7-61.fc13
>
> I am using this kernel, and while most of the time the problem is gone,
> occasionally, the network still fails to come back up after resume. This was
> diagnosed both on an i686 and an x86_64 system.

kernel-2.6.34.7-61 has the fix, hmm. This can be a different problem, or indeed the patch does not fix the allocation issues, as was suggested by Serguei.

Please attach dmesg from when the problem happens.
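
Before attaching a log, it can help to check whether it contains the failure at all. The following is a minimal sketch, assuming the message format seen in the traces in this report; the function name is hypothetical and the input is any saved dmesg or /var/log/messages text.

```python
import re

# Matches lines such as:
#   NetworkManager: page allocation failure. order:3, mode:0x4020
# optionally preceded by a syslog timestamp and "kernel:" prefix.
FAILURE_RE = re.compile(
    r"(?P<comm>\S+): page allocation failure\. "
    r"order:(?P<order>\d+), mode:(?P<mode>0x[0-9a-fA-F]+)"
)


def find_allocation_failures(log_text):
    """Return a list of (command, order, mode) tuples found in the log."""
    hits = []
    for line in log_text.splitlines():
        match = FAILURE_RE.search(line)
        if match:
            hits.append((
                match.group("comm"),
                int(match.group("order")),
                int(match.group("mode"), 16),
            ))
    return hits


if __name__ == "__main__":
    sample = (
        "Sep 1 06:54:09 localhost NetworkManager[1234]: <info> (eth2): bringing up device.\n"
        "Sep 1 06:54:09 localhost kernel: NetworkManager: page allocation failure. order:3, mode:0x4020\n"
    )
    for comm, order, mode in find_allocation_failures(sample):
        print(comm, order, hex(mode))
```

If the list is empty after a failed resume, the problem is probably a different one from the allocation failure discussed here.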

Revision history for this message
In , Serguei (serguei-redhat-bugs) wrote :

(In reply to comment #49)
> (In reply to comment #48)
> > Suggested fix is not working:
> >
> > Linux xxxxxxxxxxx 2.6.32.23-170.fc12.i686 #1 SMP Mon Sep 27 17:58:16 UTC 2010
> > i686 i686 i386 GNU/Linux
> >
> > NetworkManager: page allocation failure. order:3, mode:0x4020
>
> mode: 0x4020 mean atomic allocation, so fix was not there.
>
> I checked 2.6.32.23-170 sources and indeed patch was not there. It was dropped
> because patch was merged upstream to -stable kernels, but the fix was removed
> too early. Anyway current 2.6.32.25 kernel have this fix.

# yum update
Loaded plugins: aliases, auto-update-debuginfo, changelog, dellsysidplugin2, downloadonly, fastestmirror, filter-data,
              : keys, kmdl, list-data, merge-conf, post-transaction-actions, priorities, protectbase, refresh-packagekit,
              : remove-with-leaves, rpm-warm-cache, security, show-leaves, tsflags, upgrade-helper, verify, versionlock
Loading mirror speeds from cached hostfile
.....

Skipping filters plugin, no data
0 packages excluded due to repository protections
Skipping security plugin, no data
Setting up Update Process
No Packages marked for Update

So, where is the current kernel with this fix?

Revision history for this message
Christian Reis (kiko) wrote :

Likely to be a dupe of 667478; there are two proposed patches here:

  http://marc.info/?l=linux-netdev&m=128524323702376&w=2
  http://marc.info/?l=linux-netdev&m=128524323702378&w=2

I'm not sure if they are in mainline yet; testing Maverick is likely to be the best next step, which I should do this weekend.

Revision history for this message
Christian Reis (kiko) wrote :

Note the comment here http://www.spinics.net/lists/netdev/msg140231.html -- basically, the kernel is failing to find sufficient contiguous memory for the network driver.
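
To illustrate the "contiguous memory" point, here is a small Python sketch that reads a per-order free list like the "Node 0 DMA32:" line in the bug description and reports how much of the free memory sits in blocks large enough for an order-3 request. The line is truncated in the report, so only the complete entries are used; the helper names are hypothetical, and the allocator's watermark checks are not modeled.

```python
import re


def parse_buddy_line(line, page_kb=4):
    """Map allocation order to free block count for 'N*SIZEkB' entries."""
    counts = {}
    for count, size_kb in re.findall(r"(\d+)\*(\d+)kB", line):
        # A block of SIZE kB holds SIZE/page_kb pages, a power of two.
        order = (int(size_kb) // page_kb).bit_length() - 1
        counts[order] = int(count)
    return counts


def free_kb_at_or_above(counts, min_order, page_kb=4):
    """Total free kB held in blocks of at least the given order."""
    return sum(n * (page_kb << order)
               for order, n in counts.items() if order >= min_order)


if __name__ == "__main__":
    # Complete entries copied from the DMA32 line in the bug description.
    dma32 = ("Node 0 DMA32: 888*4kB 1742*8kB 690*16kB 7*32kB 0*64kB "
             "0*128kB 1*256kB 0*512kB 0*1024kB 0*2048kB")
    counts = parse_buddy_line(dma32)
    total_kb = free_kb_at_or_above(counts, 0)
    large_kb = free_kb_at_or_above(counts, 3)
    print(f"{large_kb} kB of {total_kb} kB free is in order>=3 blocks")
```

Almost all of the free memory in that snapshot is in 4 kB to 16 kB fragments, which is consistent with the fragmentation explanation, although the snapshot alone does not show exactly why the request was refused.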

Revision history for this message
Per Ångström (autark) wrote :

My experience is that the bug does not show up until after quite a few suspend/resume cycles. I have sometimes been able to restore the network connection by repeatedly enabling/disabling networking and unplugging/replugging the network cable, but that will only work until the next suspend.

tags: added: maverick
Revision history for this message
Per Ångström (autark) wrote :

Linux kernel 2.6.35-23-generic #40-Ubuntu SMP Wed Nov 17 22:14:33 UTC 2010 x86_64 GNU/Linux

Confirming bug.

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Revision history for this message
Christian Reis (kiko) wrote :

Again, are this and bug 667478 not dupes?

Revision history for this message
Per Ångström (autark) wrote : Re: [Bug 655413]

On 2010-11-27 03:06, Christian Reis wrote:
> Again, are this and bug 667478 not dupes?
Yes, most probably. This one has a better description, IMO.

Revision history for this message
In , Stanislaw (stanislaw-redhat-bugs) wrote :

(In reply to comment #51)
> # yum update
[snip]
> No Packages marked for Update
>
> So, where is the current kernel with this fix?

I don't know why the repositories (still!) are not updated. You can download the latest kernels directly from koji: http://koji.fedoraproject.org/koji/packageinfo?packageID=8

Revision history for this message
In , Serguei (serguei-redhat-bugs) wrote :

As of today I'm testing kernel 2.6.32.26-175.fc12.i686 with all workarounds removed. Please stand by.

Revision history for this message
Per Ångström (autark) wrote :

I haven't seen this problem since I upgraded to kernel 2.6.35-23-generic #41. I now have an uptime of 9 days with frequent suspend/resume cycles every day.

Revision history for this message
Hélio Nunes (dedalu-dedalu) wrote :

With 2.6.35-23-generic #41-Ubuntu SMP, I have had the problem at least once: the interface became completely unusable even after a reboot, and came back up only after I unplugged the power cord...

Revision history for this message
Per Ångström (autark) wrote :

I got it now, after 14 days' uptime. Could recover by disabling/enabling networking some five times. It will probably become more frequent from now, and increasingly difficult to recover. I may have to reboot soon.

@Hélio Nunes: I find it strange that a plain reboot doesn't work for you.

Revision history for this message
In , John (john-redhat-bugs) wrote :

I also suffered from this bug for a while and have a question. Why does the driver throw an ENOMEM if it cannot allocate a full complement of 256 packet/data buffers? I mean, suppose that in init_ring / rx_fill, the loop has allocated, say, 255 buffers successfully, and then fails with nomem on the 256'th. Why does it not simply continue on and use the 255 it allocated? Why fail the entire device open?

After all the various fixes, this is still the case today (2.6.37-rc5)

The chip certainly does not insist that there must be 256 rx descriptors in the chain passed to it. I have verified that in the grub netboot context. And I've been running with 128 on my linux kernel for a while now.

Maybe this is moot if the bug really has been fixed - I don't know. Has it definitively been fixed, or is it still being assessed?
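
The "use what you managed to allocate" idea can be sketched in userspace as follows. This is not the r8169 code: NUM_RX_DESC mirrors the constant named in this thread, while MIN_RX_DESC, rx_fill_partial, rx_ring_free and budget_alloc are hypothetical names made up for the illustration, and malloc stands in for the kernel's skb allocation.

```c
#include <stdlib.h>

#define NUM_RX_DESC 256  /* ring size the driver is said to insist on */
#define MIN_RX_DESC 16   /* hypothetical floor below which open still fails */

struct rx_ring {
    void *buf[NUM_RX_DESC];
    unsigned int filled;   /* number of descriptors that own a buffer */
};

typedef void *(*alloc_fn)(size_t);

/* Test allocator: succeeds alloc_budget times, then returns NULL. */
static unsigned int alloc_budget;

static void *budget_alloc(size_t size)
{
    if (alloc_budget == 0)
        return NULL;
    alloc_budget--;
    return malloc(size);
}

/* Release every buffer the ring currently owns. */
static void rx_ring_free(struct rx_ring *ring)
{
    while (ring->filled > 0)
        free(ring->buf[--ring->filled]);
}

/*
 * Fill the ring until it is full or an allocation fails, and keep
 * whatever was obtained. Returns 0 if at least MIN_RX_DESC buffers
 * were allocated; otherwise frees everything and returns -1.
 */
static int rx_fill_partial(struct rx_ring *ring, size_t buf_sz, alloc_fn alloc)
{
    unsigned int i;

    ring->filled = 0;
    for (i = 0; i < NUM_RX_DESC; i++) {
        ring->buf[i] = alloc(buf_sz);
        if (!ring->buf[i])
            break;
        ring->filled++;
    }
    if (ring->filled >= MIN_RX_DESC)
        return 0;

    rx_ring_free(ring);
    return -1;
}
```

With alloc_budget set to 255, rx_fill_partial returns 0 and leaves ring.filled at 255 instead of failing the whole open. A real driver would also have to program the hardware with the shorter descriptor chain, which this sketch does not attempt.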

Revision history for this message
In , Stanislaw (stanislaw-redhat-bugs) wrote :

(In reply to comment #54)
> I also suffered from this bug for a while and have a question. Why does the
> driver throw an ENOMEM if it cannot allocate a full complement of 256
> packet/data buffers?
The driver could use a smaller ring buffer, but it would need to be rewritten to use a variable instead of the hard-coded NUM_RX_DESC.

> After all the various fixes, this is still the case today (2.6.37-rc5)
Hmm, can you add a comment and attach dmesg to https://bugzilla.kernel.org/show_bug.cgi?id=19752 ? If we still fail to allocate in non-atomic mode, this seems to be an issue with the allocator, not the driver. Anyway, dmesg should show some interesting information.

Revision history for this message
In , John (john-redhat-bugs) wrote :

> The driver could use a smaller ring buffer, but it would need to be rewritten
> to use a variable instead of the hard-coded NUM_RX_DESC.

Yes, that's what I did, and with a couple of other minor changes including
new module param to specify num_rx_buffs to (try to) alloc at open, this
has been working fine for some time. It seems to me to be an improvement
even after all other fixes but don't know if actually needed.
I can send you my patch if you are interested.

> > After all the various fixes, this is still the case today (2.6.37-rc5)
> Hmm, can you add comment and attach dmesg to
> https://bugzilla.kernel.org/show_bug.cgi?id=19752 . If still we fail

Sorry, I did not make clear - when I said "this is still the case today" the
"this" I am referring to is the driver logic (insist on 256), not occurrence
of problem. I do not know whether the problem itself exists in latest level
of driver, and was too lazy to try my failure scenario on latest kernel build
because this thread says it is still under assessment and to stand by. I am
still on older level (2.6.33) but can do that some time if no-one else has.

I did not see there is a kernel bugzilla on this until you mentioned it -
(search didn't find it) - maybe move discussion there.

Revision history for this message
In , Serguei (serguei-redhat-bugs) wrote :

From yum.log:

Dec 03 08:08:08 Installed: kernel-2.6.32.26-175.fc12.i686

No more issues since then. All quirks removed. Will continue to test.

Revision history for this message
In , Stanislaw (stanislaw-redhat-bugs) wrote :

(In reply to comment #56)
> Yes, that's what I did, and with a couple of other minor changes including
> new module param to specify num_rx_buffs to (try to) alloc at open, this
> has been working fine for some time. It seems to me to be an improvement

If you think the patch is needed, rebase it onto the current upstream code and post it to the netdev mailing list and the maintainer.

Revision history for this message
In , Neal (neal-redhat-bugs) wrote :

I haven't seen this error for quite a long time now with any kernel, but maybe that's because I now have 4 GB of RAM.

Revision history for this message
In , John (john-redhat-bugs) wrote :

Thanks for the updates - sounds as though it is really fixed now. In which case my patch is obsoleted I think. If anyone finds some reason to want it, feel free to request it from me.

Revision history for this message
In , Serguei (serguei-redhat-bugs) wrote :

(In reply to comment #57)
> From yum.log:
>
> Dec 03 08:08:08 Installed: kernel-2.6.32.26-175.fc12.i686
>
> No more issues since then. All quirks removed. Will continue to test.

Fix confirmed:

$ uptime
 21:32:26 up 29 days, 21:11, 6 users, load average: 0.57, 0.68, 0.49
$ uname -a
Linux quantumpoint 2.6.32.26-175.fc12.i686 #1 SMP Wed Dec 1 21:52:04 UTC 2010 i686 i686 i386 GNU/Linux

No more problems.

Revision history for this message
penalvch (penalvch) wrote :

Phillip Susi, this bug was reported a while ago and there hasn't been any activity in it recently. We were wondering if this is still an issue? Can you try with the latest development release of Ubuntu? ISO CD images are available from http://cdimage.ubuntu.com/releases/ .

If it remains an issue, could you run the following command in the development release from a Terminal (Applications->Accessories->Terminal). It will automatically gather and attach updated debug information to this report.

apport-collect -p linux <replace-with-bug-number>

Also, if you could test the latest upstream kernel available that would be great. It will allow additional upstream developers to examine the issue. Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please do not test the kernel in the daily folder, but the one all the way at the bottom. Once you've tested the upstream kernel, please remove the 'needs-upstream-testing' tag. This can be done by clicking on the yellow pencil icon next to the tag located at the bottom of the bug description and deleting the 'needs-upstream-testing' text. As well, please comment on which kernel version specifically you tested.

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

If you are unable to test the mainline kernel, for example it will not boot, please add the tag: 'kernel-unable-to-test-upstream', and comment as to why specifically you were unable to test it.

Please let us know your results. Thanks in advance.

Changed in linux (Ubuntu):
status: Confirmed → Incomplete
Revision history for this message
Phillip Susi (psusi) wrote :

I have not experienced the issue in the last few releases.

Changed in linux (Ubuntu):
status: Incomplete → Invalid
Changed in linux (Fedora):
importance: Unknown → Medium
status: Unknown → Fix Released