vhost guest network randomly drops under stress (kvm)

Bug #1711251 reported by bugproxy
This bug affects 2 people
Affects                            Status        Importance  Assigned to            Milestone
The Ubuntu-power-systems project   Fix Released  High        Canonical Kernel Team
linux (Ubuntu)                     Fix Released  High        Joseph Salisbury
Zesty                              Fix Released  High        Joseph Salisbury

Bug Description

== SRU Justification ==

A vhost performance patch was introduced in the 4.10 kernel upstream, and is currently included in the Zesty 4.10 kernel:

commit 809ecb9bca6a9424ccd392d67e368160f8b76c92
Author: Jason Wang <email address hidden>
Date: Mon Dec 12 14:46:49 2016 +0800

    vhost: cache used event for better performance

--

However, I recently hit a functional issue linked to this patch that causes random guests to lose their network connection under stress. The issue is not architecture-specific and is more likely to be hit under high network stress (e.g. many uperf instances).

The patch author has now reverted this patch upstream:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/vhost?id=8d65843c44269c21e95c98090d9bb4848d473853

which reads:
"
Revert "vhost: cache used event for better performance"
This reverts commit 809ecb9bca6a9424ccd392d67e368160f8b76c92. Since it
was reported to break vhost_net. We want to cache used event and use
it to check for notification. The assumption was that guest won't move
the event idx back, but this could happen in fact when 16 bit index
wraps around after 64K entries.

Signed-off-by: Jason Wang <email address hidden>
Acked-by: Michael S. Tsirkin <email address hidden>
Signed-off-by: David S. Miller <email address hidden>
"
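
The wraparound described in the revert message can be illustrated with a small sketch. It is not the reverted kernel code: it only ports the 16-bit arithmetic of the vring_need_event() helper from the virtio ring header to Python, and the function name need_event and all index values are made up for illustration. It shows that a stale cached event index taken just before the wrap gives a different answer than the guest's current one.

```python
MASK16 = 0xFFFF  # virtio ring indices are 16 bits wide and wrap modulo 65536


def need_event(event_idx: int, new_idx: int, old_idx: int) -> bool:
    """Illustrative port of the vring_need_event() arithmetic.

    Returns True when old_idx <= event_idx < new_idx (modulo 2**16), i.e.
    the update from old_idx to new_idx passed the index the guest asked
    to be notified about.
    """
    return ((new_idx - event_idx - 1) & MASK16) < ((new_idx - old_idx) & MASK16)


# Hypothetical values: the host cached the event index just before the wrap,
# then the guest advanced it by 4 entries, which wrapped it to a small number.
cached_event = 65534
fresh_event = (cached_event + 4) & MASK16  # 2: numerically "moved back"

old_idx, new_idx = 1, 3  # the host has just marked entries 1 and 2 as used

print(need_event(fresh_event, new_idx, old_idx))   # True: guest must be notified
print(need_event(cached_event, new_idx, old_idx))  # False: notification skipped
```

With the stale value the notification is skipped, which is consistent with a guest queue that stalls until something else kicks it.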

I am requesting that this revert of the problematic patch be pulled into Ubuntu Zesty (anything 4.10+).

---uname output---
Linux p82qvirt 4.10.0-32-generic #36~16.04.1-Ubuntu SMP Wed Aug 9 09:19:19 UTC 2017 ppc64le ppc64le ppc64le GNU/Linux

Machine Type = 8247-22L

---Steps to Reproduce---
 I can recreate the scenario with the following setup:
 - on a 20-core host, start 20 single-core VMs
 - assign a single Linux bridge to all guests, using virtio
 - start a uperf benchmark between each guest pair (10 pairs in total) using a high number of uperf nprocs (32)


bugproxy (bugproxy)
tags: added: architecture-ppc64le bugnameltc-157775 severity-high targetmilestone-inin16043
Changed in ubuntu:
assignee: nobody → Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage)
affects: ubuntu → linux (Ubuntu)
Frank Heimes (fheimes)
Changed in ubuntu-power-systems:
assignee: nobody → Canonical Kernel Team (canonical-kernel-team)
importance: Undecided → High
tags: added: kernel-da-key
Changed in linux (Ubuntu):
status: New → Triaged
importance: Undecided → High
Frank Heimes (fheimes)
Changed in ubuntu-power-systems:
status: New → Triaged
Joseph Salisbury (jsalisbury) wrote :

I built a Zesty test kernel with a revert of commit 809ecb9bca. The test kernel can be downloaded from:

http://kernel.ubuntu.com/~jsalisbury/lp1711251

Can you test this kernel and see if it resolves this bug?

Thanks in advance!

Changed in linux (Ubuntu):
status: Triaged → In Progress
assignee: Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage) → Joseph Salisbury (jsalisbury)
Frank Heimes (fheimes)
Changed in ubuntu-power-systems:
status: Triaged → In Progress
bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2017-08-23 12:05 EDT-------
(In reply to comment #4)
> I built a Zesty test kernel with a revert of commit 809ecb9bca. The test
> kernel can be downloaded from:
>
> http://kernel.ubuntu.com/~jsalisbury/lp1711251
>
> Can you test this kernel and see if it resolves this bug?
>
> Thanks in advance!

Thanks! I ran multiple rounds of VM-to-VM network stress tests using this host kernel and found no issues; it resolves the network drop issue.

Changed in linux (Ubuntu Zesty):
importance: Undecided → High
status: New → In Progress
description: updated
Changed in linux (Ubuntu Zesty):
assignee: nobody → Joseph Salisbury (jsalisbury)
Alexey Dushechkin (alexey.dushechkin) wrote :

I believe I hit this bug too, on Xenial with kernel 4.10.0-26-generic on amd64. We use multiple multiqueue virtual adapters per VM, so only partial connectivity loss occurs.

With systemtap I can see that, on the host, last_used_event from the patch equals 0x0 on one of the queues, and the guest receives 0x1 as the return value for that queue from __dev_queue_xmit, which I believe is NET_XMIT_DROP.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.12.0-13.14

---------------
linux (4.12.0-13.14) artful; urgency=low

  * linux: 4.12.0-13.14 -proposed tracker (LP: #1714687)

  * vhost guest network randomly drops under stress (kvm) (LP: #1711251)
    - Revert "vhost: cache used event for better performance"

  * EDAC sbridge: Failed to register device with error -22. (LP: #1714112)
    - [Config] CONFIG_EDAC_GHES=n

  * Artful update to v4.12.10 stable release (LP: #1714525)
    - sparc64: remove unnecessary log message
    - bonding: require speed/duplex only for 802.3ad, alb and tlb
    - bonding: ratelimit failed speed/duplex update warning
    - af_key: do not use GFP_KERNEL in atomic contexts
    - dccp: purge write queue in dccp_destroy_sock()
    - dccp: defer ccid_hc_tx_delete() at dismantle time
    - ipv4: fix NULL dereference in free_fib_info_rcu()
    - net_sched/sfq: update hierarchical backlog when drop packet
    - net_sched: remove warning from qdisc_hash_add
    - bpf: fix bpf_trace_printk on 32 bit archs
    - net: igmp: Use ingress interface rather than vrf device
    - openvswitch: fix skb_panic due to the incorrect actions attrlen
    - ptr_ring: use kmalloc_array()
    - ipv4: better IP_MAX_MTU enforcement
    - nfp: fix infinite loop on umapping cleanup
    - tun: handle register_netdevice() failures properly
    - sctp: fully initialize the IPv6 address in sctp_v6_to_addr()
    - tipc: fix use-after-free
    - ipv6: reset fn->rr_ptr when replacing route
    - ipv6: repair fib6 tree in failure case
    - tcp: when rearming RTO, if RTO time is in past then fire RTO ASAP
    - net/mlx4_core: Enable 4K UAR if SRIOV module parameter is not enabled
    - irda: do not leak initialized list.dev to userspace
    - net: sched: fix NULL pointer dereference when action calls some targets
    - net_sched: fix order of queue length updates in qdisc_replace()
    - bpf, verifier: add additional patterns to evaluate_reg_imm_alu
    - bpf: fix mixed signed/unsigned derived min/max value bounds
    - bpf/verifier: fix min/max handling in BPF_SUB
    - Input: trackpoint - add new trackpoint firmware ID
    - Input: elan_i2c - add ELAN0602 ACPI ID to support Lenovo Yoga310
    - Input: ALPS - fix two-finger scroll breakage in right side on ALPS touchpad
    - KVM: s390: sthyi: fix sthyi inline assembly
    - KVM: s390: sthyi: fix specification exception detection
    - KVM: x86: simplify handling of PKRU
    - KVM, pkeys: do not use PKRU value in vcpu->arch.guest_fpu.state
    - KVM: x86: block guest protection keys unless the host has them enabled
    - ALSA: usb-audio: Add delay quirk for H650e/Jabra 550a USB headsets
    - ALSA: core: Fix unexpected error at replacing user TLV
    - ALSA: hda - Add stereo mic quirk for Lenovo G50-70 (17aa:3978)
    - ALSA: firewire: fix NULL pointer dereference when releasing uninitialized
      data of iso-resource
    - ALSA: firewire-motu: destroy stream data surely at failure of card
      initialization
    - ARCv2: SLC: Make sure busy bit is set properly for region ops
    - ARCv2: PAE40: Explicitly set MSB counterpart of SLC region ops addresses
    - ARCv2: PAE40: set MSB even if !CONFIG_ARC_HAS_...

Changed in linux (Ubuntu):
status: In Progress → Fix Released
Stefan Bader (smb)
Changed in linux (Ubuntu Zesty):
status: In Progress → Fix Committed
Manoj Iyer (manjo)
Changed in ubuntu-power-systems:
status: In Progress → Fix Committed
Kleber Sacilotto de Souza (kleber-souza) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-zesty' to 'verification-done-zesty'. If the problem still exists, change the tag 'verification-needed-zesty' to 'verification-failed-zesty'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-zesty
Kleber Sacilotto de Souza (kleber-souza) wrote :

Hello IBM,

Could you please verify the fix for this issue with the Zesty kernel that is currently in -proposed pocket?

Thank you.

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2017-09-29 14:20 EDT-------
I am working to verify now. I updated to artful since the fix is listed in (4.12.0-13.14) artful-proposed. Will post results soon.

Po-Hsu Lin (cypressyew) wrote :

Hello IBM,

Do you have any update on this test result?
Thanks!

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2017-10-06 14:16 EDT-------
OK, the 4.13 -proposed kernel in artful resolves this issue after a round of successful uperf tests, but I mistakenly updated to artful and didn't test on zesty 17.04, sorry! I will install zesty and reconfirm.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.10.0-37.41

---------------
linux (4.10.0-37.41) zesty; urgency=low

  * CVE-2017-1000255
    - SAUCE: powerpc/64s: Use emergency stack for kernel TM Bad Thing program
      checks
    - SAUCE: powerpc/tm: Fix illegal TM state in signal handler

linux (4.10.0-36.40) zesty; urgency=low

  * linux: 4.10.0-36.40 -proposed tracker (LP: #1718143)

  * Neighbour confirmation broken, breaks ARP cache aging (LP: #1715812)
    - sock: add sk_dst_pending_confirm flag
    - net: add dst_pending_confirm flag to skbuff
    - sctp: add dst_pending_confirm flag
    - tcp: replace dst_confirm with sk_dst_confirm
    - net: add confirm_neigh method to dst_ops
    - net: use dst_confirm_neigh for UDP, RAW, ICMP, L2TP
    - net: pending_confirm is not used anymore

  * SRIOV: warning if unload VFs (LP: #1715073)
    - PCI: Lock each enable/disable num_vfs operation in sysfs
    - PCI: Disable VF decoding before pcibios_sriov_disable() updates resources

  * Kernel has troule recognizing Corsair Strafe RGB keyboard (LP: #1678477)
    - usb: quirks: add delay init quirk for Corsair Strafe RGB keyboard

  * CVE-2017-14106
    - tcp: initialize rcv_mss to TCP_MIN_MSS instead of 0

  * [CIFS] Fix maximum SMB2 header size (LP: #1713884)
    - CIFS: Fix maximum SMB2 header size

  * Middle button of trackpoint doesn't work (LP: #1715271)
    - Input: trackpoint - assume 3 buttons when buttons detection fails

  * Drop GPL from of_node_to_nid() export to match other arches (LP: #1709179)
    - powerpc: Drop GPL from of_node_to_nid() export to match other arches

  * vhost guest network randomly drops under stress (kvm) (LP: #1711251)
    - Revert "vhost: cache used event for better performance"

  * arm64 arch_timer fixes (LP: #1713821)
    - Revert "UBUNTU: SAUCE: arm64: arch_timer: Enable CNTVCT_EL0 trap if
      workaround is enabled"
    - arm64: arch_timer: Enable CNTVCT_EL0 trap if workaround is enabled
    - clocksource/arm_arch_timer: Fix arch_timer_mem_find_best_frame()
    - clocksource/drivers/arm_arch_timer: Fix read and iounmap of incorrect
      variable
    - clocksource/drivers/arm_arch_timer: Fix mem frame loop initialization
    - clocksource/drivers/arm_arch_timer: Avoid infinite recursion when ftrace is
      enabled

  * Touchpad not detected (LP: #1708852)
    - Input: elan_i2c - add ELAN0608 to the ACPI table

 -- Thadeu Lima de Souza Cascardo <email address hidden> Fri, 06 Oct 2017 16:45:48 -0300

Changed in linux (Ubuntu Zesty):
status: Fix Committed → Fix Released
Changed in ubuntu-power-systems:
status: Fix Committed → Fix Released
bugproxy (bugproxy)
tags: added: targetmilestone-inin1704
removed: targetmilestone-inin16043
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2017-10-13 18:16 EDT-------
The zesty 4.10.0-37.41 kernel completed multiple KVM uperf stress tests with no further issues; we can close this bug as fixed.

tags: added: verification-done-zesty
removed: verification-needed-zesty