BUG: unable to handle kernel NULL pointer dereference at 00000000000000d0; RIP: 0010:[<ffffffffa001a1ea>] [<ffffffffa001a1ea>] e1000_clean_tx_irq+0xfa/0x3e0 [e1000]

Bug #1009545 reported by Roman Kagan
This bug affects 2 people
Affects           Status         Importance   Assigned to   Milestone
linux (Ubuntu)    Fix Released   Medium       Unassigned
Oneiric           Fix Released   Undecided    Brad Figg

Bug Description

[ 400.216494] BUG: unable to handle kernel NULL pointer dereference at 00000000000000d0
[ 400.220072] IP: [<ffffffffa001a1ea>] e1000_clean_tx_irq+0xfa/0x3e0 [e1000]
[ 400.220072] PGD 36c99067 PUD 36f67067 PMD 0
[ 400.220072] Oops: 0000 [#1] SMP
[ 400.220072] CPU 0
[ 400.220072] Modules linked in: nls_utf8 isofs vesafb snd_intel8x0 snd_ac97_codec ac97_bus snd_pcm snd_timer snd soundcore psmouse snd_page_alloc serio_raw uvcvideo videodev v4l2_compat_ioctl32 virtio_balloon shpchp lp parport floppy ahci libahci e1000 virtio_pci virtio_ring virtio
[ 400.220072]
[ 400.220072] Pid: 0, comm: swapper Not tainted 3.0.0-20-generic #34-Ubuntu Parallels Software International Inc. Parallels Virtual Platform/Parallels Virtual Platform
[ 400.220072] RIP: 0010:[<ffffffffa001a1ea>] [<ffffffffa001a1ea>] e1000_clean_tx_irq+0xfa/0x3e0 [e1000]
[ 400.220072] RSP: 0018:ffff88003fc03d60 EFLAGS: 00010246
[ 400.220072] RAX: 0000000000000000 RBX: ffff88003c13bd90 RCX: 00000000000000d9
[ 400.220072] RDX: 00000000000000d9 RSI: ffffc900004e11e8 RDI: ffff88003ba630f0
[ 400.220072] RBP: ffff88003fc03e10 R08: ffffc900004df000 R09: ffff88003dc000d0
[ 400.220072] R10: 0000000000000001 R11: 0000000000000293 R12: ffff880036e0d600
[ 400.220072] R13: 0000000000000003 R14: ffffc900004df001 R15: 00000000000000d9
[ 400.220072] FS: 0000000000000000(0000) GS:ffff88003fc00000(0000) knlGS:0000000000000000
[ 400.220072] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 400.220072] CR2: 00000000000000d0 CR3: 000000003647b000 CR4: 00000000000006f0
[ 400.220072] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 400.220072] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 400.220072] Process swapper (pid: 0, threadinfo ffffffff81c00000, task ffffffff81c0b020)
[ 400.220072] Stack:
[ 400.220072] ffff88003fc03d70 ffffffff815f46ae ffff88003fc03da0 ffffffff81151b68
[ 400.220072] ffff88003c1a35b0 ffffea0000d11970 ffff88003e002900 0000000000000282
[ 400.220072] ffff88003fc03df0 0000000000000282 ffff88003cb05500 ffff88003b8c5000
[ 400.220072] Call Trace:
[ 400.220072] <IRQ>
[ 400.220072] [<ffffffff815f46ae>] ? _raw_spin_lock+0xe/0x20
[ 400.220072] [<ffffffff81151b68>] ? add_partial+0x58/0x90
[ 400.220072] [<ffffffffa001a50d>] e1000_clean+0x3d/0xc0 [e1000]
[ 400.220072] [<ffffffff814e32d4>] net_rx_action+0x134/0x290
[ 400.220072] [<ffffffffa0017768>] ? e1000_intr+0xa8/0x140 [e1000]
[ 400.220072] [<ffffffff810660d8>] __do_softirq+0xa8/0x210
[ 400.220072] [<ffffffff8102beb5>] ? native_apic_msr_write+0x35/0x40
[ 400.220072] [<ffffffff8102aa62>] ? ack_apic_level+0x72/0x190
[ 400.220072] [<ffffffff815fdd5c>] call_softirq+0x1c/0x30
[ 400.220072] [<ffffffff8100c355>] do_softirq+0x65/0xa0
[ 400.220072] [<ffffffff810664be>] irq_exit+0x8e/0xb0
[ 400.220072] [<ffffffff815fe5b3>] do_IRQ+0x63/0xe0
[ 400.220072] [<ffffffff815f4bd3>] common_interrupt+0x13/0x13
[ 400.220072] <EOI>
[ 400.220072] [<ffffffff81088625>] ? sched_clock_local+0x25/0x90
[ 400.220072] [<ffffffff81031e7b>] ? native_safe_halt+0xb/0x10
[ 400.220072] [<ffffffff81012273>] default_idle+0x53/0x1d0
[ 400.220072] [<ffffffff8100920b>] cpu_idle+0xab/0x100
[ 400.220072] [<ffffffff815c209e>] rest_init+0x72/0x74
[ 400.220072] [<ffffffff81cd0c2b>] start_kernel+0x3d4/0x3df
[ 400.220072] [<ffffffff81cd0388>] x86_64_start_reservations+0x132/0x136
[ 400.220072] [<ffffffff81cd0140>] ? early_idt_handlers+0x140/0x140
[ 400.220072] [<ffffffff81cd0459>] x86_64_start_kernel+0xcd/0xdc
[ 400.220072] Code: f6 75 5e 44 89 f9 48 89 cb 4d 8b 74 24 20 48 8d 34 89 48 c1 e3 04 49 03 1c 24 44 3b 7d c8 49 8d 34 f6 41 0f 94 c6 75 a5 48 8b 06 <8b> 90 d0 00 00 00 48 8b 88 d8 00 00 00 0f b7 4c 11 04 8b 50 68
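
The `<8b>` marker in the Code: line above shows where the faulting instruction starts: `8b 90 d0 00 00 00`. As a rough sketch of how to read it by hand (assuming standard x86-64 encoding; the helper names below are made up for illustration), opcode `8b` is `MOV r32, r/m32`, the ModRM byte `90` selects a base register plus a 32-bit displacement, and the displacement is `0xd0`. With RAX equal to 0 in the register dump, the load targets address `0xd0`, which matches CR2.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The three bit fields of an x86 ModRM byte. */
struct modrm {
    unsigned mod; /* addressing mode: 2 means base register + disp32 */
    unsigned reg; /* destination register number (2 is edx/rdx) */
    unsigned rm;  /* base register number (0 is eax/rax) */
};

/* Hypothetical helper: split a ModRM byte into its fields. */
struct modrm decode_modrm(uint8_t byte)
{
    struct modrm m;

    m.mod = (byte >> 6) & 0x3;
    m.reg = (byte >> 3) & 0x7;
    m.rm  = byte & 0x7;
    return m;
}

/* Hypothetical helper: read a little-endian 32-bit displacement. */
uint32_t read_disp32(const uint8_t *p)
{
    assert(p != NULL);
    return (uint32_t)p[0] |
           ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) |
           ((uint32_t)p[3] << 24);
}

/* Effective address for mod == 2: base + sign-extended disp32. */
uint64_t effective_address(uint64_t base, uint32_t disp)
{
    return base + (uint64_t)(int64_t)(int32_t)disp;
}
```

Feeding in the bytes from the oops, `decode_modrm(0x90)` gives mod 2, reg 2, rm 0, that is `mov 0xd0(%rax),%edx`, and `effective_address(0, 0xd0)` gives `0xd0`. This is consistent with the driver reading a field at offset 0xd0 from a NULL pointer.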

The problem has been addressed upstream:

commit 31c15a2f24ebdab14333d9bf5df49757842ae2ec
Author: Dean Nelson <email address hidden>
Date: Thu Aug 25 14:39:24 2011 +0000

    e1000: save skb counts in TX to avoid cache misses

    Virtual Machines with emulated e1000 network adapter running on Parallels'
    server were seeing kernel panics due to the e1000 driver dereferencing an
    unexpected NULL pointer retrieved from buffer_info->skb.

    The problem has been addressed for the e1000e driver, but not for the e1000.
    Since the two drivers share similar code in the affected area, a port of the
    following e1000e driver commit solves the issue for the e1000 driver:

    commit 9ed318d546a29d7a591dbe648fd1a2efe3be1180
    Author: Tom Herbert <email address hidden>
    Date: Wed May 5 14:02:27 2010 +0000

        e1000e: save skb counts in TX to avoid cache misses

        In e1000_tx_map, precompute number of segements and bytecounts which
        are derived from fields in skb; these are stored in buffer_info. When
        cleaning tx in e1000_clean_tx_irq use the values in the associated
        buffer_info for statistics counting, this eliminates cache misses
        on skb fields.

    Signed-off-by: Dean Nelson <email address hidden>
    Acked-by: Jeff Kirsher <email address hidden>
    Signed-off-by: David S. Miller <email address hidden>
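
The idea described in the commit message can be sketched in plain C. This is only an illustration, not the driver code: the `fake_*` types and functions are hypothetical stand-ins for the kernel's `sk_buff` and `buffer_info`, and the byte-count formula is an assumption about how the counts are derived. The point is that the statistics are computed at map time, while the skb is known to be valid, so the cleanup path never has to dereference `buffer_info->skb`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct sk_buff. */
struct fake_skb {
    unsigned int len;        /* total packet length in bytes */
    unsigned int headlen;    /* length of the linear (header) part */
    unsigned short gso_segs; /* segment count, 0 if not segmented */
};

/* Hypothetical stand-in for the driver's per-descriptor buffer_info. */
struct fake_tx_buffer {
    struct fake_skb *skb;    /* may be NULL by the time TX is cleaned */
    unsigned int segs;       /* saved at map time */
    unsigned int bytecount;  /* saved at map time */
};

struct fake_tx_stats {
    unsigned long packets;
    unsigned long bytes;
};

/* Map time: the skb is valid, so derive and store the counts here. */
void fake_tx_map(struct fake_tx_buffer *bi, struct fake_skb *skb)
{
    unsigned int segs;

    assert(bi != NULL && skb != NULL);
    segs = skb->gso_segs ? skb->gso_segs : 1;
    bi->skb = skb;
    bi->segs = segs;
    /* Assumed formula: headers are repeated for every extra segment. */
    bi->bytecount = (segs - 1) * skb->headlen + skb->len;
}

/* Clean time: use only the saved counts, never touch bi->skb fields. */
void fake_clean_tx(struct fake_tx_buffer *bi, struct fake_tx_stats *st)
{
    assert(bi != NULL && st != NULL);
    st->packets += bi->segs;
    st->bytes += bi->bytecount;
    bi->skb = NULL;
}
```

With this layout, `fake_clean_tx` still accounts correctly when `bi->skb` has already become NULL, which is the situation that crashed the unpatched driver.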

The commit applies (with path adjustments) to the latest Ubuntu kernel in the series:

# git describe
Ubuntu-3.0.0-21.35
# git show 31c15a2f24ebdab14333d9bf5df49757842ae2ec | sed 's@ethernet/intel/@@' | git apply --check --verbose
Checking patch drivers/net/e1000/e1000.h...
Checking patch drivers/net/e1000/e1000_main.c...
Hunk #1 succeeded at 2798 (offset -50 lines).
Hunk #2 succeeded at 2899 (offset -50 lines).
Hunk #3 succeeded at 3579 (offset -50 lines).

Brad Figg (brad-figg) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 1009545

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
tags: added: oneiric
Roman Kagan (rkagan) wrote : Re: NULL pointer dereference in e1000_clean_tx_irq

The virtual machine where this bug happened was destroyed during the automated test run; I only have its serial console log.

Anyway, since the problem has been accepted and fixed upstream, I think it's justified to mark it "Confirmed".

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
penalvch (penalvch)
summary: - NULL pointer dereference in e1000_clean_tx_irq
+ BUG: unable to handle kernel NULL pointer dereference at
+ 00000000000000d0; RIP: 0010:[<ffffffffa001a1ea>] [<ffffffffa001a1ea>]
+ e1000_clean_tx_irq+0xfa/0x3e0 [e1000]
Changed in linux (Ubuntu):
importance: Undecided → Medium
status: Confirmed → Triaged
Joseph Salisbury (jsalisbury) wrote :

@rkagan

I see this patch is in v3.2 upstream stable:

git describe --contains 31c15a2f24ebdab14333d9bf5df49757842ae2ec
v3.2-rc1~129^2~378

Do you know if it will be requested for inclusion in upstream 3.0 stable?

tags: added: kernel-da-key
Roman Kagan (rkagan) wrote :

Not that I've heard of.

With my Parallels hat on, do you want me to make such a request?

Joseph Salisbury (jsalisbury) wrote :

That would be great if you could. Then the fix will make its way into Oneiric through the SRU process.

Brad Figg (brad-figg)
Changed in linux (Ubuntu Oneiric):
assignee: nobody → Brad Figg (brad-figg)
status: New → In Progress
Brad Figg (brad-figg) wrote :

@rkagan,

You will find test kernels at: http://people.canonical.com/~bradf/lp1009545/

Please test the appropriate one for your installation and report back here whether it fixes your issue.

Thank you for your help.

Roman Kagan (rkagan) wrote :

The patch (after a bit of misunderstanding, see http://thread.gmane.org/gmane.linux.kernel/1309397) has been queued up to 3.0-stable. Here's a snippet of the notification:

This is a note to let you know that I've just added the patch titled

    e1000: save skb counts in TX to avoid cache misses

to the 3.0-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     e1000-save-skb-counts-in-tx-to-avoid-cache-misses.patch
and it can be found in the queue-3.0 subdirectory.

Roman Kagan (rkagan) wrote :

AFAICS the patch is included in the published update. I haven't received any reports of this issue since then.

I guess this bug can be marked as resolved.

Thanks!

Changed in linux (Ubuntu Oneiric):
status: In Progress → Fix Released
Changed in linux (Ubuntu):
status: Triaged → Fix Released