vRouter: kernel crash in vr_flow_lookup

Bug #1556363 reported by Anand H. Krishnan on 2016-03-12
This bug affects 2 people
Affects             Status           Importance   Assigned to
Juniper Openstack   Status tracked in Trunk
  R2.20             Fix Committed    High         Anand H. Krishnan
  R2.21.x           Fix Committed    High         Anand H. Krishnan
  R2.22.x           Fix Committed    High         Anand H. Krishnan
  R3.0              Fix Committed    High         Anand H. Krishnan
  R3.1              Fix Committed    High         Anand H. Krishnan
  Trunk             Fix Committed    High         Anand H. Krishnan

Bug Description

One of the compute nodes went down with the trace pasted below. Analysis showed that vRouter was trying to form a flow for an ICMP error packet that had been generated in response to another ICMP error. vRouter does not form flow keys for such packets, on the expectation that the upstream flow calls will drop them. However, no error is returned to those upstream calls to indicate that the packet should be dropped, so the flow module tries to allocate a new flow entry with an uninitialized key. Because the key length in an uninitialized key can be arbitrarily large, this results in memory corruption.
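
The failure mode described above can be illustrated with a minimal sketch in C. This is not the vRouter source: every type and function name here (sk_packet, sk_flow_key, sk_form_flow_key, sk_handle_packet) is hypothetical and heavily simplified. It only shows the general idea of the fix, namely that the key-forming routine reports an error for an ICMP error about an ICMP error, and that the caller checks that result before using the key.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified definitions; not the real vRouter structures. */
#define SK_PROTO_ICMP 1

enum sk_icmp_type {
    SK_ICMP_DEST_UNREACH  = 3,
    SK_ICMP_ECHO          = 8,
    SK_ICMP_TIME_EXCEEDED = 11,
};

struct sk_packet {
    uint8_t  proto;
    uint8_t  icmp_type;
    uint32_t src_ip, dst_ip;
    const struct sk_packet *inner;  /* packet quoted by an ICMP error, or NULL */
};

struct sk_flow_key {
    uint8_t  key_len;
    uint8_t  proto;
    uint32_t src_ip, dst_ip;
};

/* Only two error types are modelled in this sketch. */
static bool sk_is_icmp_error(const struct sk_packet *p)
{
    return p->proto == SK_PROTO_ICMP &&
           (p->icmp_type == SK_ICMP_DEST_UNREACH ||
            p->icmp_type == SK_ICMP_TIME_EXCEEDED);
}

/*
 * Returns 0 when *key has been fully initialized, -1 when no key can be
 * formed and the packet must be dropped. On -1 the key is left untouched,
 * so the caller must not use it.
 */
static int sk_form_flow_key(const struct sk_packet *p, struct sk_flow_key *key)
{
    if (sk_is_icmp_error(p)) {
        /* An ICMP error quoting another ICMP error: report it, do not
         * silently return success with an uninitialized key. */
        if (p->inner == NULL || sk_is_icmp_error(p->inner))
            return -1;
        p = p->inner;  /* key the flow on the packet that caused the error */
    }

    memset(key, 0, sizeof(*key));
    key->proto   = p->proto;
    key->src_ip  = p->src_ip;
    key->dst_ip  = p->dst_ip;
    key->key_len = (uint8_t)sizeof(*key);
    return 0;
}

/* Caller in the style of a flow lookup: checks the result before using the key. */
static const char *sk_handle_packet(const struct sk_packet *p)
{
    struct sk_flow_key key;

    if (sk_form_flow_key(p, &key) != 0)
        return "drop";
    return "lookup";
}
```

In this sketch, passing an ICMP error whose inner packet is itself an ICMP error to sk_handle_packet() takes the "drop" branch, so the stack-allocated key is never read while uninitialized.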

[162923.869234] general protection fault: 0000 [#1] SMP
[162923.874937] Modules linked in: ipt_MASQUERADE iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack veth ipmi_si act_police cls_u32 sch_ingress cls_fw sch_sfq sch_htb mpt3sas mpt2sas raid_class scsi_transport_sas mptctl mptbase ip6table_filter ip6_tables iptable_filter ip_tables ebtable_nat ebtables x_tables dell_rbu nbd ib_iser rdma_cm iw_cm ib_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi dm_crypt 8021q garp stp mrp llc ipmi_devintf dcdbas dm_multipath scsi_dh x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd mei_me shpchp mei lpc_ich acpi_power_meter mac_hid nfsd vrouter(OX) auth_rpcgss nfs_acl nfs virtio_rng lockd sunrpc vhost_net vhost macvtap
[162923.955481] macvlan kvm_intel fscache kvm bonding btrfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq raid1 raid0 multipath linear bnx2x ahci mdio megaraid_sas libahci libcrc32c wmi [last unloaded: ipmi_si]
[162923.979042] CPU: 48 PID: 5427 Comm: contrail-vroute Tainted: G OX 3.13.0-77-generic #121-Ubuntu
[162923.989944] Hardware name: Dell Inc. PowerEdge R630/0CNCJW, BIOS 1.3.6 06/03/2015
[162923.998458] task: ffff8827959eb000 ti: ffff882797722000 task.ti: ffff882797722000
[162924.006970] RIP: 0010:[<ffffffffa040ba4e>] [<ffffffffa040ba4e>] vr_flow_lookup+0xee/0x250 [vrouter]
[162924.017365] RSP: 0000:ffff8827df703778 EFLAGS: 00010286
[162924.023427] RAX: 8827df703a98ffff RBX: ffff88278f6e0b40 RCX: 000000000007e392
[162924.031552] RDX: 0000000038190037 RSI: ffff88278f6e0b40 RDI: ffff881dba565800
[162924.039676] RBP: ffff8827df7037b0 R08: ffff8827df703784 R09: ffff881bf9559980
[162924.047800] R10: ffff8827df007900 R11: 0000000000000001 R12: ffffffffa0418a00
[162924.055923] R13: ffff8827df703a98 R14: 0000000000000001 R15: ffff881dba565828
[162924.064058] FS: 00007f04277fd700(0000) GS:ffff8827df700000(0000) knlGS:0000000000000000
[162924.073250] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[162924.079798] CR2: 00007f0408bd64a8 CR3: 0000004f91cfe000 CR4: 00000000001427e0
[162924.087922] Stack:
[162924.090290] 381900370000b82d 0000b82ddf703868 ffff881dba565828 ffff881c1141006c
[162924.098753] ffff8827df703a98 ffffffffa0418a00 00000000000000fe ffff8827df7037f8
[162924.107216] ffffffffa04079b2 0002000100000000 00003790ffff0000 ffff8827df708827
[162924.115674] Call Trace:
[162924.118527] <IRQ>
[162924.120779]
[162924.122578] [<ffffffffa04079b2>] vr_inet_flow_lookup+0xc2/0x210 [vrouter]
[162924.128839] [<ffffffffa040bc0a>] vr_flow_forward+0x5a/0x60 [vrouter]
[162924.136174] [<ffffffffa0400777>] nh_output+0xb7/0x140 [vrouter]
[162924.143018] [<ffffffffa04089b1>] vr_mpls_input+0x161/0x280 [vrouter]
[162924.150348] [<ffffffffa0407344>] vr_gre_input+0x174/0x1e0 [vrouter]
[162924.157580] [<ffffffffa040816e>] vr_ip_rcv+0x2ce/0x360 [vrouter]
[162924.164527] [<ffffffffa03ff8d8>] nh_l3_rcv+0x68/0x70 [vrouter]
[162924.171274] [<ffffffffa0400704>] nh_output+0x44/0x140 [vrouter]
[162924.178116] [<ffffffffa0406c56>] vr_forward+0x126/0x3b0 [vrouter]
[162924.185153] [<ffffffff81629bf8>] ? dev_hard_start_xmit+0x318/0x560
[162924.192290] [<ffffffffa040828e>] vr_ip_input+0x8e/0xa0 [vrouter]
[162924.199231] [<ffffffffa0403142>] vr_l3_input+0x52/0x60 [vrouter]
[162924.206174] [<ffffffffa040330c>] vr_fabric_input+0x10c/0x150 [vrouter]
[162924.213697] [<ffffffffa040408f>] eth_rx+0xdf/0x140 [vrouter]
[162924.220250] [<ffffffffa03fb086>] linux_rx_handler+0x256/0x7a0 [vrouter]
[162924.227880] [<ffffffff8120710b>] ? ep_poll_callback+0x11b/0x170
[162924.234724] [<ffffffffa03fbb49>] pkt_rps_dev_rx_handler+0xa9/0xf0 [vrouter]
[162924.242729] [<ffffffff81627ff2>] __netif_receive_skb_core+0x262/0x850
[162924.250153] [<ffffffff816285f8>] __netif_receive_skb+0x18/0x60
[162924.256896] [<ffffffff81628663>] netif_receive_skb+0x23/0x90
[162924.263447] [<ffffffffa03fb0ff>] linux_rx_handler+0x2cf/0x7a0 [vrouter]
[162924.271065] [<ffffffff81627ff2>] __netif_receive_skb_core+0x262/0x850
[162924.278491] [<ffffffff8101b700>] ? recalibrate_cpu_khz+0x10/0x10
[162924.285428] [<ffffffff816285f8>] __netif_receive_skb+0x18/0x60
[162924.292171] [<ffffffff81628663>] netif_receive_skb+0x23/0x90
[162924.298720] [<ffffffff816290b0>] napi_gro_receive+0x80/0xb0
[162924.305184] [<ffffffffa00c4fc1>] bnx2x_rx_int+0x1021/0x18a0 [bnx2x]
[162924.312420] [<ffffffff810a558d>] ? enqueue_entity+0x2ad/0xbb0
[162924.319068] [<ffffffff8165e6c8>] ? ip_local_deliver_finish+0xa8/0x210
[162924.326491] [<ffffffff810a3320>] ? update_curr+0x80/0x180
[162924.332750] [<ffffffff8108dbd2>] ? run_posix_cpu_timers+0x42/0x5c0
[162924.339883] [<ffffffff81360062>] ? cfq_close_cooperator+0x162/0x190
[162924.347118] [<ffffffffa00c593a>] bnx2x_poll+0xfa/0x3e0 [bnx2x]
[162924.353861] [<ffffffff816289e2>] net_rx_action+0x152/0x250
[162924.360217] [<ffffffff8106cd2c>] __do_softirq+0xec/0x2c0
[162924.366376] [<ffffffff8106d275>] irq_exit+0x105/0x110
[162924.372246] [<ffffffff81738026>] do_IRQ+0x56/0xc0
[162924.377726] [<ffffffff8172d62d>] common_interrupt+0x6d/0x6d
[162924.384176] <EOI>
[162924.386425]
[162924.388212] [<ffffffff81735d1d>] ? system_call_fastpath+0x1a/0x1f
[162924.393691] Code: 5e 41 5f 5d c3 0f 1f 84 00 00 00 00 00 80 43 3a 01 66 83 7b 20 01 75 c4 48 8b 43 18 48 85 c0 0f 84 50 01 00 00 41 be 01 00 00 00 <f0> 44 0f c1 70 04 be 08 00 00 00 41 83 fe 02 0f 86 cd 00 00 00
[162924.415752] RIP [<ffffffffa040ba4e>] vr_flow_lookup+0xee/0x250 [vrouter]
[162924.423481] RSP <ffff8827df703778>

Review in progress for https://review.opencontrail.org/18358
Submitter: Anand H. Krishnan (<email address hidden>)

Review in progress for https://review.opencontrail.org/18369
Submitter: Anand H. Krishnan (<email address hidden>)

Review in progress for https://review.opencontrail.org/18370
Submitter: Anand H. Krishnan (<email address hidden>)

Review in progress for https://review.opencontrail.org/18371
Submitter: Anand H. Krishnan (<email address hidden>)

Reviewed: https://review.opencontrail.org/18358
Committed: http://github.org/Juniper/contrail-vrouter/commit/f589486960ced8866e261dd2b84ba1820c43022b
Submitter: Zuul
Branch: R2.21.x

commit f589486960ced8866e261dd2b84ba1820c43022b
Author: Anand H. Krishnan <email address hidden>
Date: Sat Mar 12 09:53:33 2016 +0530

Drop ICMP error packets for ICMP errors

In case of ICMP error packets for ICMP errors, we were not initializing
flow key and trying to form a flow out of that key, resulting in wrong
key length and corrupted flow entry(s).

We will drop such packets.

Change-Id: Idae46a7e128482ad89da8b5bd1bd0ef6b17ef28e
Closes-BUG: #1556363

Changed in juniperopenstack:
importance: Undecided → High

Reviewed: https://review.opencontrail.org/18369
Committed: http://github.org/Juniper/contrail-vrouter/commit/a2e664811d3b61643fe57f6196b2572385ba6e63
Submitter: Zuul
Branch: R3.0

commit a2e664811d3b61643fe57f6196b2572385ba6e63
Author: Anand H. Krishnan <email address hidden>
Date: Sat Mar 12 09:53:33 2016 +0530

Drop ICMP error packets for ICMP errors

In case of ICMP error packets for ICMP errors, we were not initializing
flow key and trying to form a flow out of that key, resulting in wrong
key length and corrupted flow entry(s).

We will drop such packets.

Change-Id: Idae46a7e128482ad89da8b5bd1bd0ef6b17ef28e
Closes-BUG: #1556363

Reviewed: https://review.opencontrail.org/18371
Committed: http://github.org/Juniper/contrail-vrouter/commit/869bd5e71b0ffcea8044d04f7b33b83b790d8f78
Submitter: Zuul
Branch: R2.22.x

commit 869bd5e71b0ffcea8044d04f7b33b83b790d8f78
Author: Anand H. Krishnan <email address hidden>
Date: Sat Mar 12 09:53:33 2016 +0530

Drop ICMP error packets for ICMP errors

In case of ICMP error packets for ICMP errors, we were not initializing
flow key and trying to form a flow out of that key, resulting in wrong
key length and corrupted flow entry(s).

We will drop such packets.

Change-Id: Idae46a7e128482ad89da8b5bd1bd0ef6b17ef28e
Closes-BUG: #1556363

Reviewed: https://review.opencontrail.org/18370
Committed: http://github.org/Juniper/contrail-vrouter/commit/0171abbd439b3c0a6f660f00f9eff24b5e357ea1
Submitter: Zuul
Branch: R2.20

commit 0171abbd439b3c0a6f660f00f9eff24b5e357ea1
Author: Anand H. Krishnan <email address hidden>
Date: Sat Mar 12 09:53:33 2016 +0530

Drop ICMP error packets for ICMP errors

In case of ICMP error packets for ICMP errors, we were not initializing
flow key and trying to form a flow out of that key, resulting in wrong
key length and corrupted flow entry(s).

We will drop such packets.

Change-Id: Idae46a7e128482ad89da8b5bd1bd0ef6b17ef28e
Closes-BUG: #1556363

Review in progress for https://review.opencontrail.org/18933
Submitter: Anand H. Krishnan (<email address hidden>)

Reviewed: https://review.opencontrail.org/18933
Committed: http://github.org/Juniper/contrail-vrouter/commit/4e91f850524b67c5dc38dcf762ba584103a981cf
Submitter: Zuul
Branch: master

commit 4e91f850524b67c5dc38dcf762ba584103a981cf
Author: Anand H. Krishnan <email address hidden>
Date: Sat Mar 12 09:53:33 2016 +0530

Drop ICMP error packets for ICMP errors

In case of ICMP error packets for ICMP errors, we were not initializing
flow key and trying to form a flow out of that key, resulting in wrong
key length and corrupted flow entry(s).

We will drop such packets.

Change-Id: Idae46a7e128482ad89da8b5bd1bd0ef6b17ef28e
Closes-BUG: #1556363

Changed in juniperopenstack:
milestone: r2.21 → r3.1.0.0-fcs

Review in progress for https://review.opencontrail.org/21744
Submitter: Anand H. Krishnan (<email address hidden>)

Reviewed: https://review.opencontrail.org/21744
Committed: http://github.org/Juniper/contrail-vrouter/commit/354f7e27278b6565f39b3ae116442cd83bb355ad
Submitter: Zuul
Branch: master

commit 354f7e27278b6565f39b3ae116442cd83bb355ad
Author: Anand H. Krishnan <email address hidden>
Date: Thu Jul 7 16:30:49 2016 +0530

Do not create new flows for ICMP errors

Bug fix for 1556363 have reintroduced the issue of creating new flows
for ICMP errors. It looks like a cherry-pick/merge problem, since the
issue is present only in the mainline branch.

The process of forming the flow key is a recursive call for ICMP error,
since we look into the inner packet that caused the ICMP error. Once we
process the inner packet and form the flow key, we should return
immediately rather than continuing with the parent function, since the
parent will start using the ICMP error packet for formation of the flow
key.

Change-Id: I41c92ad44477c771e694661e625c39ff529bbe4a
Closes-BUG: #1556363
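
The early-return behaviour described in this commit message can be sketched as follows. This is an illustration only, not the actual patch: ex_packet, ex_flow_key and ex_form_key are hypothetical names, and the return_after_inner flag exists purely so that the intended and the regressed behaviour can be compared side by side in one function.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified definitions; not the real vRouter structures. */
#define EX_PROTO_ICMP 1
#define EX_PROTO_TCP  6

struct ex_packet {
    uint8_t  proto;
    bool     is_icmp_error;
    uint32_t src_ip, dst_ip;
    const struct ex_packet *inner;  /* packet quoted by an ICMP error, or NULL */
};

struct ex_flow_key {
    uint8_t  proto;
    uint32_t src_ip, dst_ip;
};

/*
 * Forms a flow key for p. For an ICMP error the function recurses into the
 * inner packet. With return_after_inner == true it returns as soon as the
 * inner key is formed (the intended behaviour). With false it falls through
 * and overwrites the key from the ICMP error packet itself, which models the
 * regression described in the commit message.
 */
static int ex_form_key(const struct ex_packet *p, struct ex_flow_key *key,
                       bool return_after_inner)
{
    int ret;

    if (p->is_icmp_error) {
        /* No usable inner packet, or an error about an error: drop. */
        if (p->inner == NULL || p->inner->is_icmp_error)
            return -1;

        ret = ex_form_key(p->inner, key, return_after_inner);
        if (ret != 0 || return_after_inner)
            return ret;
        /* Regressed path: continue in the parent with the outer packet. */
    }

    key->proto  = p->proto;
    key->src_ip = p->src_ip;
    key->dst_ip = p->dst_ip;
    return 0;
}
```

With return_after_inner set, an ICMP error quoting a TCP packet yields a key with the TCP protocol and the inner addresses, so the error is matched against the existing flow. Without it, the key ends up describing the ICMP error packet, which is how a new flow would come to be created for the error.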

Review in progress for https://review.opencontrail.org/22140
Submitter: Anand H. Krishnan (<email address hidden>)

Reviewed: https://review.opencontrail.org/22140
Committed: http://github.org/Juniper/contrail-vrouter/commit/7c6319a47d4663a3e34aaa5e6098a9e937ab8363
Submitter: Zuul
Branch: R3.1

commit 7c6319a47d4663a3e34aaa5e6098a9e937ab8363
Author: Anand H. Krishnan <email address hidden>
Date: Thu Jul 7 16:30:49 2016 +0530

Do not create new flows for ICMP errors

Bug fix for 1556363 have reintroduced the issue of creating new flows
for ICMP errors. It looks like a cherry-pick/merge problem, since the
issue is present only in the mainline branch.

The process of forming the flow key is a recursive call for ICMP error,
since we look into the inner packet that caused the ICMP error. Once we
process the inner packet and form the flow key, we should return
immediately rather than continuing with the parent function, since the
parent will start using the ICMP error packet for formation of the flow
key.

Change-Id: I41c92ad44477c771e694661e625c39ff529bbe4a
Closes-BUG: #1556363

Review in progress for https://review.opencontrail.org/22313
Submitter: Anand H. Krishnan (<email address hidden>)

Review in progress for https://review.opencontrail.org/22314
Submitter: Anand H. Krishnan (<email address hidden>)

Reviewed: https://review.opencontrail.org/22313
Committed: http://github.org/Juniper/contrail-vrouter/commit/4928317cf1975fce73912935d9a6f02afbdf26fb
Submitter: Zuul
Branch: R3.1

commit 4928317cf1975fce73912935d9a6f02afbdf26fb
Author: Anand H. Krishnan <email address hidden>
Date: Fri Jul 22 10:11:36 2016 +0530

Do not create new flows for ICMP6 errors

Bug fix for 1556363 have reintroduced the issue of creating new flows
for ICMP errors. It looks like a cherry-pick/merge problem, since the
issue is present only in the mainline branch.

The process of forming the flow key is a recursive call for ICMP error,
since we look into the inner packet that caused the ICMP error. Once we
process the inner packet and form the flow key, we should return
immediately rather than continuing with the parent function, since the
parent will start using the ICMP error packet for formation of the flow
key.

Change-Id: I601886aeb3fa4a50a057f2c982f8c7b19dd7e3d1
Closes-Bug: #1556363

Reviewed: https://review.opencontrail.org/22314
Committed: http://github.org/Juniper/contrail-vrouter/commit/3d6d49be286d7ace48d5eba1c7253b4076797f57
Submitter: Zuul
Branch: master

commit 3d6d49be286d7ace48d5eba1c7253b4076797f57
Author: Anand H. Krishnan <email address hidden>
Date: Fri Jul 22 10:11:36 2016 +0530

Do not create new flows for ICMP6 errors

Bug fix for 1556363 have reintroduced the issue of creating new flows
for ICMP errors. It looks like a cherry-pick/merge problem, since the
issue is present only in the mainline branch.

The process of forming the flow key is a recursive call for ICMP error,
since we look into the inner packet that caused the ICMP error. Once we
process the inner packet and form the flow key, we should return
immediately rather than continuing with the parent function, since the
parent will start using the ICMP error packet for formation of the flow
key.

Change-Id: I601886aeb3fa4a50a057f2c982f8c7b19dd7e3d1
Closes-Bug: #1556363
