tap interface drops many packets on highload systems

Bug #1423631 reported by Michael Kazakov
This bug affects 7 people
Affects            Status      Importance  Assigned to  Milestone
linux (Ubuntu)     Confirmed   Medium      Unassigned
qemu (Ubuntu)      Incomplete  Medium      Unassigned
qemu-kvm (Ubuntu)  Incomplete  Medium      Unassigned

Bug Description

I use qemu-kvm with OpenStack. On a heavily loaded hypervisor, the tap interface of a guest with heavy network traffic drops many TX packets.
Network options of qemu-system-x86_64: "... -netdev tap,fd=30,id=hostnet0,vhost=on,vhostfd=31 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=fa:16:3e:86:67:7b,bus=pci.0,addr=0x3..."

Network domain config:

    <interface type='bridge'>
      <mac address='fa:16:3e:86:67:7b'/>
      <source bridge='qbre4009073-0b'/>
      <target dev='tape4009073-0b'/>
      <model type='virtio'/>
      <alias name='net0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </interface>

tape4009073-0b Link encap:Ethernet HWaddr fe:16:3e:86:67:7b
          inet6 addr: fe80::fc16:3eff:fe86:677b/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
          RX packets:1587622634 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1484106438 errors:0 dropped:460259 overruns:0 carrier:0
          collisions:0 txqueuelen:500
          RX bytes:877878711500 (877.8 GB) TX bytes:3071846828531 (3.0 TB)
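
For scale, the counters above work out to roughly 0.03% of transmitted packets being dropped. A minimal Python sketch of that arithmetic follows. The names `tx_drop_ratio` and `read_tx_counters` are illustrative and not part of the original report; the helper assumes the standard Linux `/sys/class/net/<iface>/statistics/` counter files.

```python
from pathlib import Path


def tx_drop_ratio(tx_dropped: int, tx_packets: int) -> float:
    """Return dropped TX packets as a fraction of transmitted TX packets."""
    if tx_packets == 0:
        return 0.0
    return tx_dropped / tx_packets


def read_tx_counters(iface: str, sysfs_root: str = "/sys/class/net") -> tuple:
    """Read (tx_dropped, tx_packets) for iface from its sysfs statistics files."""
    stats = Path(sysfs_root) / iface / "statistics"
    dropped = int((stats / "tx_dropped").read_text().strip())
    packets = int((stats / "tx_packets").read_text().strip())
    return dropped, packets


if __name__ == "__main__":
    # Counters copied from the ifconfig output in this report.
    ratio = tx_drop_ratio(460259, 1484106438)
    print(f"{ratio:.4%}")
```

Sampling `read_tx_counters` twice and comparing the two readings would show whether drops are still accumulating, rather than only the lifetime total.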
---
AlsaDevices:
 total 0
 crw-rw---- 1 root audio 116, 1 Feb 18 12:06 seq
 crw-rw---- 1 root audio 116, 33 Feb 18 12:06 timer
AplayDevices: Error: [Errno 2] No such file or directory
ApportVersion: 2.14.1-0ubuntu3.7
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
CRDA: Error: [Errno 2] No such file or directory
DistroRelease: Ubuntu 14.04
InstallationDate: Installed on 2014-07-01 (234 days ago)
InstallationMedia: Ubuntu-Server 14.04 LTS "Trusty Tahr" - Release amd64 (20140416.2)
MachineType: Dell Inc. PowerEdge M620
Package: qemu-kvm
PciMultimedia:

ProcEnviron:
 TERM=xterm
 PATH=(custom, no user)
 LANG=en_US.UTF-8
 LC_MESSAGES=en_US.UTF-8
 SHELL=/bin/bash
ProcFB: 0 VESA VGA
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.16.0-30-generic root=UUID=5965239d-820a-49a1-9d8e-23a4c1f87ce6 ro
ProcVersionSignature: Ubuntu 3.16.0-30.40~14.04.1-generic 3.16.7-ckt3
RelatedPackageVersions:
 linux-restricted-modules-3.16.0-30-generic N/A
 linux-backports-modules-3.16.0-30-generic N/A
 linux-firmware 1.127.11
RfKill: Error: [Errno 2] No such file or directory
Tags: trusty
Uname: Linux 3.16.0-30-generic x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups:

_MarkForUpload: True
dmi.bios.date: 01/21/2014
dmi.bios.vendor: Dell Inc.
dmi.bios.version: 2.2.7
dmi.board.name: 0T36VK
dmi.board.vendor: Dell Inc.
dmi.board.version: A01
dmi.chassis.type: 25
dmi.chassis.vendor: Dell Inc.
dmi.chassis.version: PowerEdge M1000e
dmi.modalias: dmi:bvnDellInc.:bvr2.2.7:bd01/21/2014:svnDellInc.:pnPowerEdgeM620:pvr:rvnDellInc.:rn0T36VK:rvrA01:cvnDellInc.:ct25:cvrPowerEdgeM1000e:
dmi.product.name: PowerEdge M620
dmi.sys.vendor: Dell Inc.

summary: - tap interface drors may packages on highload sysstems
+ tap interface drops many packages on highload systems
summary: - tap interface drops many packages on highload systems
+ tap interface drops many packets on highload systems
Serge Hallyn (serge-hallyn) wrote :

Please update the bug with system information by doing

apport-collect 1423631

Brad Figg (brad-figg) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 1423631

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Michael Kazakov (gnomino) wrote : BootDmesg.txt

apport information

tags: added: apport-collected trusty
description: updated
Michael Kazakov (gnomino) wrote : CurrentDmesg.txt

apport information

Michael Kazakov (gnomino) wrote : IwConfig.txt

apport information

Michael Kazakov (gnomino) wrote : Lspci.txt

apport information

Michael Kazakov (gnomino) wrote : Lsusb.txt

apport information

Michael Kazakov (gnomino) wrote : ProcCpuinfo.txt

apport information

Michael Kazakov (gnomino) wrote : ProcInterrupts.txt

apport information

Michael Kazakov (gnomino) wrote : ProcModules.txt

apport information

Michael Kazakov (gnomino) wrote : UdevDb.txt

apport information

Michael Kazakov (gnomino) wrote : UdevLog.txt

apport information

Michael Kazakov (gnomino) wrote : WifiSyslog.txt

apport information

Changed in qemu-kvm (Ubuntu):
status: New → Incomplete
status: Incomplete → Confirmed
Changed in linux (Ubuntu):
status: Incomplete → New
Brad Figg (brad-figg) wrote : Status changed to Confirmed

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed
Chris J Arges (arges)
description: updated
Chris J Arges (arges) wrote :

Can you explain more about your setup?
Which version of openstack are you running?
How have you set up your instance?
Which operating system / kernel version are you running in your instance?
How are you generating outbound traffic from the instance?
Is the outbound traffic going through other equipment which may be dropping those packets?

In my Trusty/Icehouse deployment, I deployed instances, observed the tapXXXXX interfaces on the compute node, and saw no large number of dropped TX packets.

Changed in qemu-kvm (Ubuntu):
status: Confirmed → Incomplete
importance: Undecided → Medium
Changed in qemu (Ubuntu):
importance: Undecided → Medium
status: New → Incomplete
Changed in linux (Ubuntu):
importance: Undecided → Medium
Michael Kazakov (gnomino) wrote :

I found the cause of the problem.
The tap interface of an instance has a short TX queue of 500 packets (txqueuelen:500). Under high network load (15-25 Kpps) and high CPU load on the guest, the queue sometimes overflows and the tap interface starts dropping packets. I think the overflow occurs during micro-freezes (<1000 ms) of the virtual machine. I solved the problem by increasing the TX queue length to 10000.
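
As a rough check on that explanation, the following Python sketch estimates how long a guest can stall before a tap queue of a given length overflows. It assumes packets arrive at a steady rate and the guest drains nothing during the stall, which is a simplification; `stall_budget_ms` is a name made up for this illustration.

```python
def stall_budget_ms(queue_len: int, packets_per_second: int) -> float:
    """Milliseconds of guest stall a queue can absorb before it overflows.

    Assumes a constant arrival rate and no draining during the stall.
    """
    if packets_per_second <= 0:
        raise ValueError("packets_per_second must be positive")
    return queue_len * 1000 / packets_per_second


if __name__ == "__main__":
    # Queue lengths and packet rates taken from the comment above.
    for queue_len in (500, 10000):
        for pps in (15000, 25000):
            print(queue_len, pps, round(stall_budget_ms(queue_len, pps), 1))
```

Under these assumptions a 500-packet queue absorbs only about 20 to 33 ms of stall at 15-25 Kpps, while a 10000-packet queue covers roughly 400 to 667 ms.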

sean redmond (sean-redmond1) wrote :

I also see this bug, but only on Windows guests.

sean redmond (sean-redmond1) wrote :

I was able to resolve this with the below udev rule:

# cat /etc/udev/rules.d/60-tap.rules
KERNEL=="tap*", RUN+="/sbin/ip link set %k txqueuelen 10000"
#

I used the below to reload the udev rules:

udevadm control --reload-rules

I then used the below to apply the rules to already created interfaces:

udevadm trigger --attr-match=subsystem=net
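
To confirm that the rule took effect, the current queue length of each tap device can be read back from sysfs. This is a minimal Python sketch, assuming the standard Linux `/sys/class/net/<iface>/tx_queue_len` attribute; the function name `tap_queue_lengths` and the `sysfs_root` parameter are illustrative only.

```python
from pathlib import Path


def tap_queue_lengths(sysfs_root: str = "/sys/class/net") -> dict:
    """Map each tap* interface name to its current tx_queue_len value."""
    lengths = {}
    for iface in sorted(Path(sysfs_root).glob("tap*")):
        attr = iface / "tx_queue_len"
        if attr.is_file():
            lengths[iface.name] = int(attr.read_text().strip())
    return lengths


if __name__ == "__main__":
    for name, qlen in tap_queue_lengths().items():
        # Flag devices whose queue is still shorter than the rule's 10000.
        note = "" if qlen >= 10000 else "  (below 10000)"
        print(f"{name}: {qlen}{note}")
```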

Satish Patel (satish-txt) wrote :

We are having the same issue on CentOS 7. I have increased the TX queue length to 10000, but I am still seeing TX drops on the tap interface.

Question:

Can I increase the TX queue length on the interface with "ifconfig <tap_interface> txqueuelen 10000" without rebooting the machine, or do I have to do it at boot time?

Satish Patel (satish-txt) wrote :

This is a fact: you can't get high PPS with a tap interface, because every packet is handled in the host kernel, which doesn't scale.

In the end I migrated all my compute nodes to SR-IOV and am much happier now.

Brad Figg (brad-figg)
tags: added: ubuntu-certified