Comment 4 for bug 1201869

Chris J Arges (arges) wrote : Re: poor networking throughput across an OpenStack Neutron router on 3.5/3.8 kernels

Looks like with these patches applied we get the following crash:

PID: 6752   TASK: ffff8817d41e1700  CPU: 13  COMMAND: "ip"
#0 [ffff8817d2be33f0] machine_kexec at ffffffff8103bbda
#1 [ffff8817d2be3460] crash_kexec at ffffffff810bc028
#2 [ffff8817d2be3530] oops_end at ffffffff8169f098
#3 [ffff8817d2be3560] no_context at ffffffff8168436f
#4 [ffff8817d2be35b0] __bad_area_nosemaphore at ffffffff81684551
#5 [ffff8817d2be3610] bad_area at ffffffff816845ca
#6 [ffff8817d2be3640] do_page_fault at ffffffff816a1f54
#7 [ffff8817d2be3750] page_fault at ffffffff8169e4e5
    [exception RIP: netif_carrier_off+9]
    RIP: ffffffff8159f379  RSP: ffff8817d2be3808  RFLAGS: 00010292
    RAX: 0000000000000001  RBX: ffff8817f3e36000  RCX: ffffffff81ca9948
    RDX: 0000000000000000  RSI: 0000000000000282  RDI: 0000000000000000
    RBP: ffff8817d2be3808   R8: ffff880c0fcee400   R9: ffff8817d2be38e8
    R10: 0000000000001a60  R11: 0000000000000246  R12: ffff8817f3e36000
    R13: ffff8817d2be3858  R14: ffff8817fd32be00  R15: 0000000000000000
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
#8 [ffff8817d2be3810] veth_close at ffffffffa02221a2 [veth]
#9 [ffff8817d2be3830] __dev_close_many at ffffffff8157e3be
#10 [ffff8817d2be3850] dev_close_many at ffffffff8157e500
#11 [ffff8817d2be3890] rollback_registered_many at ffffffff8157e658
#12 [ffff8817d2be38c0] unregister_netdevice_many at ffffffff8157e7eb
#13 [ffff8817d2be38e0] rtnl_dellink at ffffffff8159148b
#14 [ffff8817d2be3a30] rtnetlink_rcv_msg at ffffffff81591be4
#15 [ffff8817d2be3ab0] netlink_rcv_skb at ffffffff815aab39
#16 [ffff8817d2be3ae0] rtnetlink_rcv at ffffffff8158fa35
#17 [ffff8817d2be3b00] netlink_unicast at ffffffff815aa4ad
#18 [ffff8817d2be3b50] netlink_sendmsg at ffffffff815aa810
#19 [ffff8817d2be3be0] sock_sendmsg at ffffffff81568e37
#20 [ffff8817d2be3d60] ___sys_sendmsg at ffffffff8156adcc
#21 [ffff8817d2be3f00] __sys_sendmsg at ffffffff8156c7b9
#22 [ffff8817d2be3f70] sys_sendmsg at ffffffff8156c819
#23 [ffff8817d2be3f80] system_call_fastpath at ffffffff816a65e9
    RIP: 00007f7dd23fae30  RSP: 00007fff98b25f58  RFLAGS: 00010293
    RAX: 000000000000002e  RBX: ffffffff816a65e9  RCX: 0000000000000011
    RDX: 0000000000000000  RSI: 00007fff98b21ef0  RDI: 0000000000000005
    RBP: 0000000000000000   R8: 0000000000000000   R9: 0000000000000000
    R10: 00007fff98b21c60  R11: 0000000000000246  R12: ffffffff8156c819
    R13: ffff8817d2be3f78  R14: 0000000000000000  R15: 00007fff98b25fe8
    ORIG_RAX: 000000000000002e  CS: 0033  SS: 002b

A patch already exists that addresses this crash:

Commit 2efd32ee1b60b0b31404ca47c1ce70e5a5d24ebc
Author: Eric Dumazet <email address hidden>
Date:   Thu Jan 10 08:32:45 2013 +0000

    veth: fix a NULL deref in netif_carrier_off

    In commit d0e2c55e7c94 (veth: avoid a NULL deref in veth_stats_one)
    we now clear the peer pointers in veth_dellink()

    veth_close() must therefore make sure the peer pointer is set.
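
For illustration, here is a minimal user-space sketch of the guard that commit message describes. It is not the kernel code: struct model_dev and the model_* functions are hypothetical stand-ins for struct net_device / veth_priv, netif_carrier_off(), veth_dellink() and veth_close(). The assert in model_carrier_off() stands in for the page fault in the trace above, where RDI (the first argument on x86-64) is 0 at netif_carrier_off+9.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a veth device and its private data. */
struct model_dev {
    bool carrier_ok;          /* models the carrier state of the device */
    struct model_dev *peer;   /* other end of the pair; NULL once unlinked */
};

/* Models netif_carrier_off(): it dereferences dev, so dev must not be NULL.
 * In the kernel a NULL dev here is the page fault seen in the backtrace. */
static void model_carrier_off(struct model_dev *dev)
{
    assert(dev != NULL);
    dev->carrier_ok = false;
}

/* Models the dellink behaviour described in the commit message:
 * the peer pointers of both ends are cleared. */
static void model_dellink(struct model_dev *dev)
{
    struct model_dev *peer = dev->peer;

    dev->peer = NULL;
    if (peer)
        peer->peer = NULL;
}

/* Models a close handler that checks the peer pointer before using it. */
static int model_close(struct model_dev *dev)
{
    model_carrier_off(dev);
    if (dev->peer)            /* the guard: skip the peer if already cleared */
        model_carrier_off(dev->peer);
    return 0;
}
```

Without the `if (dev->peer)` check, closing a device after its peer pointer was cleared would pass NULL to model_carrier_off(), which is the sequence the backtrace shows via rtnl_dellink -> ... -> veth_close -> netif_carrier_off.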