Dropped packets on EC2, "xen_netfront: xennet: skb rides the rocket: x slots"

Bug #1317811 reported by Stéphan Kochen
This bug affects 32 people
Affects                Status        Importance  Assigned to  Milestone
linux (Ubuntu)         Fix Released  Medium      Unassigned
linux (Ubuntu Trusty)  Fix Released  Medium      Unassigned
linux (Ubuntu Utopic)  Fix Released  Medium      Unassigned

Bug Description

Running Ubuntu 14.04 LTS on EC2, we see a lot of the following in the kernel log:

    xen_netfront: xennet: skb rides the rocket: 19 slots

Each of these messages corresponds to a dropped TX packet, and eventually causes our application's connections to break and time out.

The problem appears when network load increases. We have Node.js processes doing pubsub with a Redis server, and these are most visibly affected, showing frequent connection loss. The processes talk to each other using the private addresses EC2 allocates to the machines.

Notably, the default MTU on the network interface seems to have gone up from 1500 on 13.10 to 9000 on 14.04 LTS. Reducing the MTU back to 1500 seems to drastically reduce dropped packets. (Can't say for certain whether it completely eliminates the problem.)

The machines we run are started from ami-896c96fe.

ProblemType: Bug
DistroRelease: Ubuntu 14.04
Package: linux-image-3.13.0-24-generic 3.13.0-24.46
ProcVersionSignature: User Name 3.13.0-24.46-generic 3.13.9
Uname: Linux 3.13.0-24-generic x86_64
AlsaDevices:
 total 0
 crw-rw---- 1 root audio 116, 1 May 9 09:01 seq
 crw-rw---- 1 root audio 116, 33 May 9 09:01 timer
AplayDevices: Error: [Errno 2] No such file or directory: 'aplay'
ApportVersion: 2.14.1-0ubuntu3
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord'
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
CRDA: Error: [Errno 2] No such file or directory: 'iw'
Date: Fri May 9 09:11:18 2014
Ec2AMI: ami-896c96fe
Ec2AMIManifest: (unknown)
Ec2AvailabilityZone: eu-west-1c
Ec2InstanceType: c3.large
Ec2Kernel: aki-52a34525
Ec2Ramdisk: unavailable
IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig'
Lspci:

Lsusb: Error: command ['lsusb'] failed with exit code 1: unable to initialize libusb: -99
PciMultimedia:

ProcFB:

ProcKernelCmdLine: root=LABEL=cloudimg-rootfs ro console=hvc0
RelatedPackageVersions:
 linux-restricted-modules-3.13.0-24-generic N/A
 linux-backports-modules-3.13.0-24-generic N/A
 linux-firmware N/A
RfKill: Error: [Errno 2] No such file or directory: 'rfkill'
SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
---
AlsaDevices:
 total 0
 crw-rw---- 1 root audio 116, 1 May 9 09:54 seq
 crw-rw---- 1 root audio 116, 33 May 9 09:54 timer
AplayDevices: Error: [Errno 2] No such file or directory
ApportVersion: 2.14.1-0ubuntu3
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
CRDA: Error: [Errno 2] No such file or directory
CurrentDmesg: [ 24.724129] init: plymouth-upstart-bridge main process ended, respawning
DistroRelease: Ubuntu 14.04
Ec2AMI: ami-896c96fe
Ec2AMIManifest: (unknown)
Ec2AvailabilityZone: eu-west-1c
Ec2InstanceType: c3.large
Ec2Kernel: aki-52a34525
Ec2Ramdisk: unavailable
IwConfig: Error: [Errno 2] No such file or directory
Lspci:

Lsusb: Error: command ['lsusb'] failed with exit code 1: unable to initialize libusb: -99
Package: linux (not installed)
PciMultimedia:

ProcFB:

ProcKernelCmdLine: root=LABEL=cloudimg-rootfs ro console=hvc0
ProcVersionSignature: User Name 3.13.0-24.46-generic 3.13.9
RelatedPackageVersions:
 linux-restricted-modules-3.13.0-24-generic N/A
 linux-backports-modules-3.13.0-24-generic N/A
 linux-firmware N/A
RfKill: Error: [Errno 2] No such file or directory
Tags: trusty ec2-images
Uname: Linux 3.13.0-24-generic x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups: adm audio cdrom dialout dip floppy netdev plugdev sudo video
_MarkForUpload: True

break-fix: - 97a6d1bb2b658ac85ed88205ccd1ab809899884d
break-fix: - 11d3d2a16cc1f05c6ece69a4392e99efb85666a6

Revision history for this message
Brad Figg (brad-figg) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 1317811

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Revision history for this message
Stéphan Kochen (stephank) wrote :

The machine is no longer running, but I can run apport-collect from a similar machine. The only difference is that we've since added a line to our startup script to reduce the MTU to 1500.

tags: added: apport-collected
description: updated
Revision history for this message
Stéphan Kochen (stephank) wrote : BootDmesg.txt

apport information

Revision history for this message
Stéphan Kochen (stephank) wrote : ProcCpuinfo.txt

apport information

Revision history for this message
Stéphan Kochen (stephank) wrote : ProcEnviron.txt

apport information

Revision history for this message
Stéphan Kochen (stephank) wrote : ProcInterrupts.txt

apport information

Revision history for this message
Stéphan Kochen (stephank) wrote : ProcModules.txt

apport information

Revision history for this message
Stéphan Kochen (stephank) wrote : UdevDb.txt

apport information

Revision history for this message
Stéphan Kochen (stephank) wrote : UdevLog.txt

apport information

Revision history for this message
Stéphan Kochen (stephank) wrote : WifiSyslog.txt

apport information

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Changed in linux (Ubuntu):
importance: Undecided → Medium
Revision history for this message
Stéphan Kochen (stephank) wrote :

For what it's worth, the MTU appears to differ per instance type. At least c3.large has an MTU of 9000 by default, while m1.small has a normal MTU of 1500.

Revision history for this message
Stefan Bader (smb) wrote :

It could be interesting to find out whether the issue does not occur on an m1.small (although that could still result from setup differences other than the MTU). I am not sure how AWS manages to make the instance come up with a different MTU either. In my experiments I had a normal bridge on the host set to 9000 and the guest still had 1500, though I do not know in detail how the network is set up in EC2 (could be openvswitch).
Generally the issue is that something seems to produce packets with a large data buffer. One slot in the xen-netfront driver is a 4K page, and the limit is 18 slots. Anything above that causes the observed message and the packet to be dropped. The host side has another limit of (usually) 20 slots, above which it assumes a malicious guest and disrupts the connection. But since the guest already drops anything above 18 slots, the host should never see that number.
Unfortunately I do not understand the network code that deeply, so I will have to ask upstream. As far as I understand, a socket buffer can consist of multiple fragments (a kind of scatter-gather list). There is a definition in the code that limits the number of fragments based on a maximum frame size of 64K. This results in 17 frags (for 4K pages that is 16, plus 1 to handle data not starting at a page boundary). The Xen driver counts the length of the memory area in all frags (if data in a frag starts at an offset, that is added; the code does this for every frag, so the question is whether in theory each frag may have an offset, because that might add up to more than one page). To the number of pages needed for the frags, the driver then adds the number of pages (can that be more than one?) needed for the header. If that is bigger than 18 (17 for frags + 1 for the header?), the "rides the rocket" error happens.
This leaves a few question marks for me: the memory associated with a frag can be a compound page, so I would think the length might be greater than 4K. I have no clue yet how compound pages come into play exactly. Is the 64K limit still enforced via the limit on the number of frags? Can each frag's data begin at some offset (and so end up with more than one page of overall overhead)? Apparently the header can start at some offset, too. So worst case (assuming the header length is less than 4K), if the offset is quite big, the header could end up requiring 2 pages. Then, if the frag data happens to use up its 17-page limit, we would just end up hitting the 19-page failure size.
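
To make that accounting concrete, here is a minimal standalone model of the rule described above (a sketch of my reading of it, not the xen-netfront source; the constants follow the description, the helper name is made up):

    /* Model: one slot per 4K page touched by each buffer. */
    #include <stdio.h>

    #define PAGE_SIZE     4096UL
    #define MAX_SKB_FRAGS (65536UL / PAGE_SIZE + 1)  /* 16 + 1 = 17 frags */
    #define SLOT_LIMIT    (MAX_SKB_FRAGS + 1)        /* 17 frags + 1 header = 18 */

    /* Pages (slots) spanned by a buffer of len bytes that starts at an
     * offset within its first 4K page. */
    static unsigned long pages_needed(unsigned long offset, unsigned long len)
    {
        return (offset % PAGE_SIZE + len + PAGE_SIZE - 1) / PAGE_SIZE;
    }

    int main(void)
    {
        /* 6076 bytes starting 2828 bytes into a page span 8904 bytes from
         * the page boundary, i.e. 3 pages, although the data alone would
         * fit in 2. Offsets are what inflate the slot count. */
        printf("slot limit: %lu\n", SLOT_LIMIT);
        printf("6076 bytes @ offset 2828 -> %lu slots\n",
               pages_needed(2828, 6076));
        return 0;
    }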

Revision history for this message
Stefan Bader (smb) wrote :

Thinking about this, I could build a debug kernel to which I add code to print out the layout of the socket buffer when the size check fails. Stéphan, would you be able to run that on a setup that shows the failures?

Revision history for this message
Stéphan Kochen (stephank) wrote :

I can't comment on the driver implementation details, but I can give some further details about our experience.

The app in question was a second-screen app for the Dutch public broadcasting network for the Eurovision Song Contest. The app was live for the two semi-finals on Tuesday the 6th and Thursday the 8th, as well as the finals on Saturday the 10th. Load was lowest on the Thursday, when the Netherlands did not perform, and highest on Saturday during the finals. We ran c3.large instances for all shows.

We first noticed the issue during the first run on Tuesday.

Shortly before the second run on Thursday, we identified the high MTU setting as a possible cause, and changed it to 1500 on half of our machines in the redundant setup. There was a clear difference in connection stability between these machines.

For the third run on Saturday, we had all machines on the normal MTU of 1500, having adjusted our startup scripts to force the setting. We had zero connection issues that night and clean kernel logs, even though that night saw the highest network load of all three.

We have several m1.small instances running 24/7 as well, and these have clean kernel logs, but their network load is quite low. The MTU on these has always been untouched, and is a normal 1500, apparently by default.

In the instance type list, EC2 shows Compute Optimized instances as having Enhanced Networking. Even though we don't qualify for it, perhaps the networking setup is different for these instances. https://aws.amazon.com/ec2/instance-types/

About a custom kernel, we'd have to look into deploying it, or reproducing the issue on a smaller test setup. I'd prefer looking into the latter, because maybe we can reproduce it between just two instances with stress tools.

Revision history for this message
Stefan Bader (smb) wrote :

Thanks for the additional info. The relation to MTU size definitely sounds quite plausible. The check is on outgoing traffic from the guest, which I would expect to be affected by the MTU together with GSO support. And yes, preferably we find a reproducer that does not require a production system to suffer, and ideally one I could run on a local test system to understand the host side.
I will try to figure out more details from a stock c3.large if that is possible, and maybe whether something like iperf can trigger it there.

Revision history for this message
Stéphan Kochen (stephank) wrote :

So I have a smaller test case. Basically, install Redis (from apt) on one machine, and Node.js (binaries from nodejs.org) with the scripts below on the other. Run pub.js once and sub.js twice; this quickly triggers the error. The first argument to each script is the address of the Redis machine; I use the internal 10.0.0.0/8 address.

https://gist.github.com/stephank/764e3414d57bc3bcb6b3

I initially tried to do this using openbsd-inetd echo and several netcat processes, but that doesn't seem to trigger it. Maybe it's something specific about the way Redis distributes pubsub messages to its subscribers?

Revision history for this message
Stéphan Kochen (stephank) wrote :

If you'd like me to run this on EC2, I can give it a try. A custom kernel would simply be a replacement package?

Revision history for this message
Stefan Bader (smb) wrote :

Yes, the kernel would be a set of dpkg files to be installed via 'dpkg -i'. Of course I still have to code that up. If I can reproduce it with your instructions locally, even better (it would cut down turnaround times). Otherwise I can start up some EC2 instances, too. Good to have a simple way to trigger it, unlike some other issues that only happen under production conditions.

Revision history for this message
Stefan Bader (smb) wrote :

Good news, the reproducer works on my local system, too. Thanks. :)

Stefan Bader (smb)
Changed in linux (Ubuntu):
assignee: nobody → Stefan Bader (smb)
status: Confirmed → In Progress
Revision history for this message
Stefan Bader (smb) wrote :

So with the added debugging, running the reproducer with the outside bridge (and thus the vifs) and the PV guest's eth0 set to 9001 (as seen on EC2), I get the following (format is <length>@<offset>):

[ 698.108119] xen_netfront: xennet: skb rides the rocket: 19 slots
[ 698.108134] header 1490@238 -> 1 slots
[ 698.108139] frag #0 1614@2164 -> + 1 pages
[ 698.108143] frag #1 3038@1296 -> + 2 pages
[ 698.108147] frag #2 6076@1852 -> + 2 pages
[ 698.108151] frag #3 6076@292 -> + 2 pages
[ 698.108156] frag #4 6076@2828 -> + 3 pages
[ 698.108160] frag #5 3038@1268 -> + 2 pages
[ 698.108164] frag #6 2272@1824 -> + 1 pages
[ 698.108168] frag #7 3804@0 -> + 1 pages
[ 698.108172] frag #8 6076@264 -> + 2 pages
[ 698.108177] frag #9 3946@2800 -> + 2 pages
[ 698.108180] frags adding 18 slots

So multiple frags can point to a compound page and also start at an offset. Each frag costs DIV_ROUND_UP(offset + length, PAGE_SIZE) slots: frag #4 above, 6076@2828, spans 8904 bytes from its page boundary and hence 3 pages, and the 18 frag slots plus 1 header slot give the 19 total. Which means either the assumption about the size required to handle N frags is wrong, or whatever creates that buffer...

Revision history for this message
Stefan Bader (smb) wrote :

Playing around with this, I actually found an even simpler way to trigger the issue:

PV guest #1: Install redis-server (and enable eth0 ip in config)
PV guest #2: Install redis-tools and run 'redis-benchmark -q -h <PV guest #1 IP> -d 1000'

The MTU size turns out to be irrelevant; this even happens with 1500 during the batch request tests. What does make a difference is preventing scatter-gather, as was reported in another bug about this (on any host that sees the "rides the rocket" message):

sudo ethtool -K eth0 sg off

I discussed the issue upstream, and the driver should handle this case without dropping the request. It might be a bit complicated, so I cannot give an ETA on the fix right now.

Revision history for this message
Stéphan Kochen (stephank) wrote :

Thanks for the continued help fixing this!

I couldn't reproduce it using redis-benchmark on EC2, but that's okay.

Scatter/gather IO is solely a performance flag in the driver? As in, it won't affect applications?

The only effect I noticed after disabling it is that it's apparently required for jumbo framing:

    vif vif-0 eth0: Reducing MTU because no SG offload

And it dropped to 1500. But I can live with that.

Also, do you have a link to the upstream discussion?

Revision history for this message
Stefan Bader (smb) wrote :

Oh, ok. It does work quite well on my local guests, which come up with a 1500 MTU. Maybe the EC2 guests would need a bigger data size value than 1000. But yeah, as long as I have some way to verify whatever comes up to fix this, it is ok.

Yes, the loss of jumbo frames was expected. As long as high throughput is not critical it is at least good enough as a work-around.

About the upstream discussion: http://www.spinics.net/lists/netdev/msg282340.html

Basically it looks like the problem was kind of known, but probably did not happen often enough. Or it is actually complicated to fix. It appears that other drivers will not have this issue, as long as their limit is on the actual transfer size and not on the number of pages required to accommodate the frags/scatter-gather list. Unfortunately Xen has a limit there that guests have to impose, because otherwise the host-side driver would shut down the connection completely.

Revision history for this message
Ran Rubisntein (ran-cld) wrote :

I am getting this error on Ubuntu 14.04 with latest kernel 3.13.0-30-generic running on c3.2xlarge instances on EC2 PV.

Changing MTU to 1500 didn't help.

Any other suggestions? We are getting 10-20 dropped packets a day (out of millions).

Revision history for this message
Stéphan Kochen (stephank) wrote :

As Stefan Bader mentions in #22, the current workaround is:

    sudo ethtool -K eth0 sg off

Revision history for this message
Stefan Bader (smb) wrote :

Right. Unfortunately a real fix, without the need to disable scatter-gather, is unlikely to happen soon. None of the approaches discussed so far has found everybody's agreement, as none of them would be perfect.

Revision history for this message
Carl Hörberg (carl-hoerberg) wrote :

HVM instances do not seem to have this issue, only PV/paravirtual instances.

Revision history for this message
Stefan Bader (smb) wrote :

HVM instances would have the same issue when using PV network drivers (which they usually do, for performance). However, one also needs to cause fragmented skbs which contain multiple compound-page fragments, and that depends on many factors which may not always be easy to meet.

By now, there actually seems to be a work-around that has been applied upstream in v3.17. Looks like we have to pick the following (or actually get it in via the stable process):

commit 97a6d1bb2b658ac85ed88205ccd1ab809899884d
Author: Zoltan Kiss <email address hidden>
Date: Mon Aug 11 18:32:23 2014 +0100

    xen-netfront: Fix handling packets on compound pages with skb_linearize
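
As a rough illustration of the idea behind that commit (a standalone sketch of my reading of it, not the actual patch): when the fragments of an skb would need too many slots, copy the data into one linear buffer and recount, since a contiguous buffer of the same length spans far fewer pages. Using the fragment layout from the debug output in an earlier comment:

    #include <stdio.h>

    #define PAGE_SIZE  4096UL
    #define SLOT_LIMIT 18UL  /* 17 frag slots + 1 header slot */

    static unsigned long pages_needed(unsigned long offset, unsigned long len)
    {
        return (offset % PAGE_SIZE + len + PAGE_SIZE - 1) / PAGE_SIZE;
    }

    int main(void)
    {
        /* {offset, length} pairs from the "19 slots" debug dump above */
        static const unsigned long frags[][2] = {
            {2164, 1614}, {1296, 3038}, {1852, 6076}, {292, 6076},
            {2828, 6076}, {1268, 3038}, {1824, 2272}, {0, 3804},
            {264, 6076}, {2800, 3946},
        };
        unsigned long i, slots = 1, total = 1490;  /* header 1490@238: 1 slot */

        for (i = 0; i < sizeof(frags) / sizeof(frags[0]); i++) {
            slots += pages_needed(frags[i][0], frags[i][1]);
            total += frags[i][1];
        }
        printf("fragmented: %lu slots (limit %lu)\n", slots, SLOT_LIMIT);

        /* After linearizing, the same bytes sit in one contiguous buffer
         * (assumed page-aligned here for simplicity). */
        printf("linearized: %lu slots\n", pages_needed(0, total));
        return 0;
    }

With these numbers the fragmented layout needs 19 slots, one over the limit, while the linearized copy needs only 11, so the skb can be sent instead of dropped (unless the copy itself fails, e.g. under memory pressure).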

Changed in linux (Ubuntu Trusty):
importance: Undecided → Medium
status: New → Triaged
Changed in linux (Ubuntu Utopic):
importance: Undecided → Medium
status: New → Triaged
Stefan Bader (smb)
description: updated
tags: added: kernel-bug-break-fix
Andy Whitcroft (apw)
Changed in linux (Ubuntu Trusty):
status: Triaged → Confirmed
Changed in linux (Ubuntu Utopic):
status: Triaged → Confirmed
Changed in linux (Ubuntu):
status: In Progress → Fix Committed
Andy Whitcroft (apw)
Changed in linux (Ubuntu):
status: Fix Committed → Fix Released
Revision history for this message
Diego Rodriguez (habaner0) wrote :

I'm still seeing this issue in Ubuntu 14.04 on Ec2, despite using the latest kernel release:

 3.13.0-44-generic #73-Ubuntu SMP Tue Dec 16 00:22:43 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

Here are some of the logs I found:

kern.log:1634:Jan 15 00:22:59 staging-cool-load-balancer-20150114-10-36-6-94 kernel: [10070.280631] xen_netfront: xennet: skb rides the rocket: 22 slots
kern.log:3523:Jan 15 20:01:23 staging-cool-load-balancer-20150114-10-36-6-94 kernel: [80773.747470] xen_netfront: xennet: skb rides the rocket: 19 slots
kern.log:3524:Jan 15 20:01:23 staging-cool-load-balancer-20150114-10-36-6-94 kernel: [80773.791014] xen_netfront: xennet: skb rides the rocket: 19 slots
kern.log:3525:Jan 15 20:02:14 staging-cool-load-balancer-20150114-10-36-6-94 kernel: [80824.734485] xen_netfront: xennet: skb rides the rocket: 19 slots
kern.log:3526:Jan 15 20:02:22 staging-cool-load-balancer-20150114-10-36-6-94 kernel: [80833.403077] xen_netfront: xennet: skb rides the rocket: 19 slots
kern.log:3871:Jan 15 23:39:20 staging-cool-load-balancer-20150114-10-36-6-94 kernel: [93850.874250] xen_netfront: xennet: skb rides the rocket: 20 slots
kern.log:3872:Jan 15 23:39:20 staging-cool-load-balancer-20150114-10-36-6-94 kernel: [93851.452640] xen_netfront: xennet: skb rides the rocket: 19 slots
kern.log:3873:Jan 15 23:39:20 staging-cool-load-balancer-20150114-10-36-6-94 kernel: [93851.453131] xen_netfront: xennet: skb rides the rocket: 20 slots
kern.log:3874:Jan 15 23:39:21 staging-cool-load-balancer-20150114-10-36-6-94 kernel: [93851.695471] xen_netfront: xennet: skb rides the rocket: 19 slots

Revision history for this message
Stefan Bader (smb) wrote :

Not surprising, as we held back for Trusty and Utopic after being told that there was a regression. And as the task status shows, this only became fixed in current development (Vivid). But now that both parts are there, it is time to get it back into stable.

Stefan Bader (smb)
description: updated
Revision history for this message
Jon Schewe (jpschewe) wrote :

So will this be fixed in 14.04 at all? I just upgraded to kernel 3.13.0-44 and I'm seeing more of these messages than before. This is on a system that does NAT and DNS.

Revision history for this message
Brian Scholl (btscholl) wrote :

Just chiming in with Jon: I'm using 14.04.1 LTS on an EC2 hs1.8xlarge with kernel 3.16.0-29 and I can still reliably produce this error. I thought this was fixed in 3.14+ but no such luck. Under a particular load the server becomes unresponsive to network requests.

I've tried turning off tso and sg on eth0, but this did not resolve the issue. I'm not sure if there is another feature causing this in my configuration, but I'd be willing to test for it if someone could point me at documentation.

Also if there are any logs I can provide to help diagnose this issue please let me know, I'm really eager to see this bug resolved.

Revision history for this message
Stefan Bader (smb) wrote :

This will be fixed in Utopic and Trusty. It was only delayed because the upstream fix was found to cause another regression just about when it would have been picked up. I have now re-submitted it, together with the fix for the regression, to be picked up by our stable trees.

Revision history for this message
dragosr (dragosr) wrote :

Any updates on when the fix will come out?

Revision history for this message
Durzo (durzo) wrote :

also waiting on this

Revision history for this message
Stefan Bader (smb) wrote :

Unfortunately I cannot speed up the process. The fixes have been picked into our stable trees and get a chance to move over to the distro trees next week (which would get them into the next update). Meanwhile you can work around it by disabling scatter-gather (see comment #22).

Andy Whitcroft (apw)
Changed in linux (Ubuntu Utopic):
status: Confirmed → Fix Committed
Seth Forshee (sforshee)
Changed in linux (Ubuntu Trusty):
status: Confirmed → Fix Committed
Revision history for this message
Brad Figg (brad-figg) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-trusty' to 'verification-done-trusty'.

If verification is not done within 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-trusty
Revision history for this message
Stéphan Kochen (stephank) wrote :

I believe my test case is flawed, so I cannot verify with certainty whether the issue is fixed or not. This is the same test case as I used before, for which I posted code in a gist: https://gist.github.com/stephank/764e3414d57bc3bcb6b3

Here's what I tried:

 - I started two new c3.large machines from ami-69e76c1e (eu-west-1 HVM 64-bit trusty with instance store)

 - I downloaded io.js 1.2.0 on machine A, together with the pub.js and sub.js scripts from my gist.

 - I installed redis-server on machine B and reconfigured redis to bind to the internal IP (in 10.x.x.x)

 - The machines were initially running linux-virtual 3.13.0.45.52. I reproduced the issue in this setup by running sub.js twice, then pub.js once on machine A, connecting them to redis on machine B. The 'rides the rocket' message showed up in the logs, and the subs lost their connection.

 - I enabled trusty-proposed on both machines with a pin, and selectively upgraded linux-virtual on both. Then rebooted both. The kernel on both machines is now linux-virtual 3.13.0.46.53.

 - I ran the same test again, sub.js twice, pub.js once on machine A, connecting to machine B. There were no 'rides the rocket' messages, but the subs still lose their connections. I sporadically get 'net_ratelimit: x callbacks suppressed', but not on every test run.

 - I disabled scatter/gather on both machines, which also dropped their MTU to 1500, and ran the test again several times. There were no more 'net_ratelimit' messages, but the subs still lose their connections.

 - I installed redis-server on machine A the same way, listening on the internal IP, and ran the same test on machine A, but this time connecting to itself on the internal IP. The test now runs indefinitely. (But this probably doesn't touch the driver.)

So I'm not sure what to take away from this. I suppose I could continue by trying to fix my test case to run properly without scatter/gather, before enabling it again. Or find a way to trigger it using a different test, such as with redis-benchmark.

Stefan, is it sufficient verification if your own testing now shows it fixed?

Revision history for this message
Heikki Hannikainen (hessu) wrote :

Installed 3.13.0-46.75 on 6 VMs which exhibited this problem daily. I'll confirm tomorrow evening whether it's gone away.

Revision history for this message
Stefan Bader (smb) wrote :

I usually only run a redis-benchmark (with -d1000), which triggers the fragmentation on the server side. The benchmark itself never complained, though I could verify that with the old kernel the tx dropped count in ifconfig went up. And with the proposed kernel (I picked Utopic/3.16, since you and Heikki look at 3.13) there were no dropped packets.

I realize now that the way the message was changed, from a normal ratelimited one to a debug ratelimited one, is a bit useless, as this results in a lot of "callbacks suppressed" without showing a single line of what that might be. But basically there will be one of those messages whenever the fragments do not fit into the 19 pages of ring buffer. Just before the change this always meant the packet was dropped; now it depends on whether the skb could be serialized (which might happen under memory pressure).

I would say we wait for the feedback from Heikki and call it good if his VMs survive.

Revision history for this message
Stefan Bader (smb) wrote :

Doh! What I meant was that the serialization of the skb may fail under memory pressure.

Revision history for this message
Joni-Pekka Kurronen (joni-kurronen) wrote :

hi,

I have had two mysterious problems: bacula stops, giving connection loss due to too-big packets as the reason (tried to change the MTU, no success); the second has been aiccu, which from time to time does not recover after the connection comes up again and needs a service stop & start...

I just ran sudo ethtool -K eth0 sg off and bacula seems to work now... If all backups are done by tomorrow morning, this might have been the corrective action. The backup causes around 17-40 MB/s of continuous traffic on gigabit, plus internet usage, at the server/router/4G-dongle connection point.

Any ideas how to test? Could this explain my problem?

Revision history for this message
Heikki Hannikainen (hessu) wrote :

Since yesterday morning I've had 3.13.0-46.75 running on 6 VMs. Those VMs haven't had any "xen_netfront: xennet: skb rides the rocket" messages. Meanwhile, 8 other VMs with 3.13.0-44 did have these errors.

So, looks good to me. Not absolute proof, but looks good.

Revision history for this message
Joni-Pekka Kurronen (joni-kurronen) wrote :

I can confirm that ethtool -K eth0 sg off did correct the bacula backup problem:

- a bacula-sd to bacula-fd communication error that stops the backup process, saying "Error: bsock.c:427 Write error sending reset by peer"

- so far no IPv6 traffic jams with aiccu, but single missing packets should not stop aiccu?

This is Ubuntu 14.04 LTS with 3.13.0-39-generic, in a twin-server configuration (suricata, logstash, NFQUEUE, shoreline, keepalived, haproxy, mariadb-galera-cluster, aiccu IPv6, ...); one of the twin servers has a 4G (Huawei E398) dongle and a wifi (hostapd) dongle and does firewall, routing, ... It has 4 cores, so it is never busy...

So could these missing packets really stop a lengthy process like a backup, where 0.1T to 0.7T is transferred, in a way that other mechanisms cannot correct or hide?

Revision history for this message
Joni-Pekka Kurronen (joni-kurronen) wrote :

Crazy idea: could there be situations where the sender increases the MTU beyond the receiver side's?

Fundamental question: I tried to understand what ethtool -K eth0 sg off does at the protocol level. Can anyone explain? It looks like medicine at the moment.

Revision history for this message
Stefan Bader (smb) wrote :

I think that is at least enough data to claim this issue verified. @Jon, you could check whether you have the same issue by looking at the dmesg and ifconfig output on the sender side. If you see the "rides the rocket" messages together with an incrementing tx drop count, that is this bug. And it should go away, even without disabling sg, when using the proposed kernel on the sender.

And though it probably should not cause such bad effects, higher-layer applications can expect the TCP layer to correct dropped packets. That retry may well happen, but possibly ends up (by trying to avoid copying data in memory) with the same (or a similarly) fragmented send buffer. And in the end all retries are dropped, too.
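
If watching ifconfig by hand is tedious, the same counter can be read from sysfs. A small helper along these lines (the interface name and poll interval are illustrative; /sys/class/net/<dev>/statistics/tx_dropped is the standard Linux statistics node that ifconfig's "TX ... dropped" reflects):

    /* Poll the TX drop counter and report increments. */
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        const char *path = "/sys/class/net/eth0/statistics/tx_dropped";
        unsigned long long prev = 0, cur = 0;
        int first = 1;

        for (;;) {
            FILE *f = fopen(path, "r");
            if (!f || fscanf(f, "%llu", &cur) != 1) {
                perror(path);
                return 1;
            }
            fclose(f);
            if (first || cur != prev)  /* first pass prints the baseline */
                printf("tx_dropped: %llu (+%llu)\n", cur, cur - prev);
            first = 0;
            prev = cur;
            sleep(5);
        }
    }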

tags: added: verification-done-trusty
removed: verification-needed-trusty
tags: added: verification-done-utopic
Changed in linux (Ubuntu):
assignee: Stefan Bader (smb) → nobody
Revision history for this message
Stefan Bader (smb) wrote :

Turning off scatter-gather disables the use of fragments in send buffers. So for the xen-netfront driver there is no chance that those fragments end up requiring more than the 19 pages it can handle in one transaction.

Revision history for this message
Heikki Hannikainen (hessu) wrote :

On the other hand, turning off scatter-gather caused a rather constant 0.1% packet loss (transmit drops) in my setup (trusty on Xen): VPN gateways, with NAT and firewalling, and relatively high throughput at times. Enabling scatter-gather removes the constant small packet loss, but then I do get the "rides the rocket" events.

Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 3.13.0-46.75

---------------
linux (3.13.0-46.75) trusty; urgency=low

  [ Seth Forshee ]

  * Release Tracking Bug
    - LP: #1419963

  [ Andy Whitcroft ]

  * [Debian] arm64 -- build ubuntu drivers
    - LP: #1411284
  * hyper-v -- fix comment handing in /etc/network/interfaces
    - LP: #1413020

  [ Kamal Mostafa ]

  * [Packaging] force "dpkg-source -I -i" behavior

  [ Upstream Kernel Changes ]

  * Revert "[SCSI] mpt2sas: Remove phys on topology change."
    - LP: #1419838
  * Revert "[SCSI] mpt3sas: Remove phys on topology change"
    - LP: #1419838
  * Btrfs: fix transaction abortion when remounting btrfs from RW to RO
    - LP: #1411320
  * Btrfs: fix a crash of clone with inline extents's split
    - LP: #1413129
  * net/mlx4_en: Add VXLAN ndo calls to the PF net device ops too
    - LP: #1407760
  * KVM: x86: SYSENTER emulation is broken
    - LP: #1414651
    - CVE-2015-0239
  * powerpc/xmon: Fix another endiannes issue in RTAS call from xmon
    - LP: #1415919
  * ipv6: fix swapped ipv4/ipv6 mtu_reduced callbacks
    - LP: #1404558, #1419837
  * usb: gadget: at91_udc: move prepare clk into process context
    - LP: #1419837
  * KVM: x86: Fix far-jump to non-canonical check
    - LP: #1419837
  * x86/tls: Validate TLS entries to protect espfix
    - LP: #1419837
  * userns: Check euid no fsuid when establishing an unprivileged uid
    mapping
    - LP: #1419837
  * userns: Document what the invariant required for safe unprivileged
    mappings.
    - LP: #1419837
  * userns: Only allow the creator of the userns unprivileged mappings
    - LP: #1419837
  * x86_64, switch_to(): Load TLS descriptors before switching DS and ES
    - LP: #1419837
  * isofs: Fix infinite looping over CE entries
    - LP: #1419837
  * batman-adv: Calculate extra tail size based on queued fragments
    - LP: #1419837
  * KEYS: close race between key lookup and freeing
    - LP: #1419837
  * isofs: Fix unchecked printing of ER records
    - LP: #1419837
  * x86_64, vdso: Fix the vdso address randomization algorithm
    - LP: #1419837
  * groups: Consolidate the setgroups permission checks
    - LP: #1419837
  * userns: Don't allow setgroups until a gid mapping has been setablished
    - LP: #1419837
  * userns: Don't allow unprivileged creation of gid mappings
    - LP: #1419837
  * move d_rcu from overlapping d_child to overlapping d_alias
    - LP: #1419837
  * deal with deadlock in d_walk()
    - LP: #1419837
  * Linux 3.13.11-ckt14
    - LP: #1419837
  * gre: fix the inner mac header in nbma tunnel xmit path
    - LP: #1419838
  * netlink: Always copy on mmap TX.
    - LP: #1419838
  * netlink: Don't reorder loads/stores before marking mmap netlink frame
    as available
    - LP: #1419838
  * in6: fix conflict with glibc
    - LP: #1419838
  * tg3: tg3_disable_ints using uninitialized mailbox value to disable
    interrupts
    - LP: #1419838
  * batman-adv: Unify fragment size calculation
    - LP: #1419838
  * batman-adv: avoid NULL dereferences and fix if check
    - LP: #1419838
  * net: Fix stacked vlan offload features computation
    - LP: #1419838
  * net: Reset secmark when scrubbing packet
    - L...

Changed in linux (Ubuntu Trusty):
status: Fix Committed → Fix Released
Andy Whitcroft (apw)
Changed in linux (Ubuntu Utopic):
status: Fix Committed → Fix Released
tags: removed: kernel-bug-break-fix
Revision history for this message
Jyothikumar (jyothikumar) wrote :

xen_netfront: xennet: skb rides the rocket: 19 slots

When I ran dmesg on the Linux machine I got this output. Please suggest what I have to do.

I'm using kernel:

3.10.0-123.8.1.el7.x86_64

Revision history for this message
Jyothikumar (jyothikumar) wrote :

 ifconfig
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
        inet 10.50.3.37 netmask 255.255.255.192 broadcast 10.50.3.63
        inet6 fe80::c36:5fff:fea1:893b prefixlen 64 scopeid 0x20<link>
        ether 0e:36:5f:a1:89:3b txqueuelen 1000 (Ethernet)
        RX packets 10653395098 bytes 10991518671473 (9.9 TiB)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 5091742032 bytes 8916018268199 (8.1 TiB)
        TX errors 0 dropped 190064 overruns 0 carrier 0 collisions 0

tx dropped 190064

Revision history for this message
Jyothikumar (jyothikumar) wrote :

 ifconfig
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
        inet 10.50.3.37 netmask 255.255.255.192 broadcast 10.50.3.63
        inet6 fe80::c36:5fff:fea1:893b prefixlen 64 scopeid 0x20<link>
        ether 0e:36:5f:a1:89:3b txqueuelen 1000 (Ethernet)
        RX packets 10654784591 bytes 10992813714262 (9.9 TiB)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 5092521006 bytes 8917322665845 (8.1 TiB)
        TX errors 0 dropped 190075 overruns 0 carrier 0 collisions 0

One minute later: dropped 190075
