3.16.0.48 kernel in Hyper-V causes hv_netvsc problem

Bug #1491957 reported by Lőrinc on 2015-09-03
This bug affects 14 people
Affects                    Importance  Assigned to
linux-lts-utopic (Ubuntu)  High        Joseph Salisbury
Trusty                     High        Joseph Salisbury

Bug Description

I have Ubuntu 14.04 LTS running in Hyper-V. It worked well. Today I upgraded the kernel with apt from 3.16.0.46 to 3.16.0.48, and after the reboot I have network connection problems and a lot of error messages on screen like the ones below.

Sep 3 18:48:07 RailGun kernel: [ 71.437967] hv_netvsc vmbus_0_16 eth0: unable to send receive completion pkt (tid 2068)...retrying 4
Sep 3 18:48:07 RailGun kernel: [ 71.438276] hv_netvsc vmbus_0_16 eth0: unable to send receive completion pkt (tid 2068)...give up retrying
Sep 3 18:48:07 RailGun kernel: [ 71.438596] hv_netvsc vmbus_0_16 eth0: unable to send receive completion pkt (tid 2069)...retrying 1
Sep 3 18:48:07 RailGun kernel: [ 71.439016] hv_netvsc vmbus_0_16 eth0: unable to send receive completion pkt (tid 2069)...retrying 2
Sep 3 18:48:07 RailGun kernel: [ 71.439434] hv_netvsc vmbus_0_16 eth0: unable to send receive completion pkt (tid 2069)...retrying 3
Sep 3 18:48:07 RailGun kernel: [ 71.439847] hv_netvsc vmbus_0_16 eth0: unable to send receive completion pkt (tid 2069)...retrying 4
Sep 3 18:48:07 RailGun kernel: [ 71.440155] hv_netvsc vmbus_0_16 eth0: unable to send receive completion pkt (tid 2069)...give up retrying
Sep 3 18:48:07 RailGun kernel: [ 71.465593] hv_netvsc vmbus_0_16 eth0: unable to send receive completion pkt (tid 206a)...retrying 1
Sep 3 18:48:07 RailGun kernel: [ 71.466165] hv_netvsc vmbus_0_16 eth0: unable to send receive completion pkt (tid 206a)...retrying 2
Sep 3 18:48:07 RailGun kernel: [ 71.466827] hv_netvsc vmbus_0_16 eth0: unable to send receive completion pkt (tid 206a)...retrying 3
Sep 3 18:48:07 RailGun kernel: [ 71.467508] hv_netvsc vmbus_0_16 eth0: unable to send receive completion pkt (tid 206a)...retrying 4
Sep 3 18:48:07 RailGun kernel: [ 71.468056] hv_netvsc vmbus_0_16 eth0: unable to send receive completion pkt (tid 206a)...give up retrying
Sep 3 18:48:07 RailGun kernel: [ 71.474507] hv_netvsc vmbus_0_16 eth0: unable to send receive completion pkt (tid 206b)...retrying 1
Sep 3 18:48:07 RailGun kernel: [ 71.474998] hv_netvsc vmbus_0_16 eth0: unable to send receive completion pkt (tid 206b)...retrying 2
Sep 3 18:48:07 RailGun kernel: [ 71.475419] hv_netvsc vmbus_0_16 eth0: unable to send receive completion pkt (tid 206b)...retrying 3
Sep 3 18:48:07 RailGun kernel: [ 71.475833] hv_netvsc vmbus_0_16 eth0: unable to send receive completion pkt (tid 206b)...retrying 4
Sep 3 18:48:07 RailGun kernel: [ 71.476145] hv_netvsc vmbus_0_16 eth0: unable to send receive completion pkt (tid 206b)...give up retrying
Sep 3 18:48:07 RailGun kernel: [ 71.483356] hv_netvsc vmbus_0_16 eth0: unable to send receive completion pkt (tid 206c)...retrying 1
Sep 3 18:48:07 RailGun kernel: [ 71.483823] hv_netvsc vmbus_0_16 eth0: unable to send receive completion pkt (tid 206c)...retrying 2
Sep 3 18:48:07 RailGun kernel: [ 71.484266] hv_netvsc vmbus_0_16 eth0: unable to send receive completion pkt (tid 206c)...retrying 3
Sep 3 18:48:07 RailGun kernel: [ 71.484683] hv_netvsc vmbus_0_16 eth0: unable to send receive completion pkt (tid 206c)...retrying 4
Sep 3 18:48:07 RailGun kernel: [ 71.484993] hv_netvsc vmbus_0_16 eth0: unable to send receive completion pkt (tid 206c)...give up retrying
Sep 3 18:48:07 RailGun kernel: [ 71.491738] hv_netvsc vmbus_0_16 eth0: unable to send receive completion pkt (tid 206d)...retrying 1
Sep 3 18:48:07 RailGun kernel: [ 71.492223] hv_netvsc vmbus_0_16 eth0: unable to send receive completion pkt (tid 206d)...retrying 2
Sep 3 18:48:07 RailGun kernel: [ 71.492645] hv_netvsc vmbus_0_16 eth0: unable to send receive completion pkt (tid 206d)...retrying 3
Sep 3 18:48:07 RailGun kernel: [ 71.493060] hv_netvsc vmbus_0_16 eth0: unable to send receive completion pkt (tid 206d)...retrying 4
Sep 3 18:48:07 RailGun kernel: [ 71.493369] hv_netvsc vmbus_0_16 eth0: unable to send receive completion pkt (tid 206d)...give up retrying
Sep 3 18:48:07 RailGun kernel: [ 71.493689] hv_netvsc vmbus_0_16 eth0: unable to send receive completion pkt (tid 206e)...retrying 1
Sep 3 18:48:07 RailGun kernel: [ 71.494109] hv_netvsc vmbus_0_16 eth0: unable to send receive completion pkt (tid 206e)...retrying 2
Sep 3 18:48:07 RailGun kernel: [ 71.494527] hv_netvsc vmbus_0_16 eth0: unable to send receive completion pkt (tid 206e)...retrying 3
Sep 3 18:48:07 RailGun kernel: [ 71.494941] hv_netvsc vmbus_0_16 eth0: unable to send receive completion pkt (tid 206e)...retrying 4
Sep 3 18:48:07 RailGun kernel: [ 71.495249] hv_netvsc vmbus_0_16 eth0: unable to send receive completion pkt (tid 206e)...give up retrying
Sep 3 18:48:07 RailGun kernel: [ 71.509010] hv_netvsc vmbus_0_16 eth0: unable to send receive completion pkt (tid 206f)...retrying 1
Sep 3 18:48:07 RailGun kernel: [ 71.509467] hv_netvsc vmbus_0_16 eth0: unable to send receive completion pkt (tid 206f)...retrying 2
Sep 3 18:48:07 RailGun kernel: [ 71.509892] hv_netvsc vmbus_0_16 eth0: unable to send receive completion pkt (tid 206f)...retrying 3
Sep 3 18:48:07 RailGun kernel: [ 71.510308] hv_netvsc vmbus_0_16 eth0: unable to send receive completion pkt (tid 206f)...retrying 4
Sep 3 18:48:07 RailGun kernel: [ 71.510617] hv_netvsc vmbus_0_16 eth0: unable to send receive completion pkt (tid 206f)...give up retrying
Sep 3 18:48:07 RailGun kernel: [ 71.510938] hv_netvsc vmbus_0_16 eth0: unable to send receive completion pkt (tid 2070)...retrying 1
Sep 3 18:48:07 RailGun kernel: [ 71.511358] hv_netvsc vmbus_0_16 eth0: unable to send receive completion pkt (tid 2070)...retrying 2
Sep 3 18:48:07 RailGun kernel: [ 71.511828] hv_netvsc vmbus_0_16 eth0: unable to send receive completion pkt (tid 2070)...retrying 3
Sep 3 18:48:07 RailGun kernel: [ 71.512245] hv_netvsc vmbus_0_16 eth0: unable to send receive completion pkt (tid 2070)...retrying 4
Sep 3 18:48:07 RailGun kernel: [ 71.512669] hv_netvsc vmbus_0_16 eth0: unable to send receive completion pkt (tid 2070)...give up retrying
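
The log above repeats one pattern: each transaction id (tid) is retried up to four times and then abandoned. A minimal sketch for summarizing such a log per tid follows. The regular expression is inferred only from the lines quoted in this report, not from the driver source, and the function name `summarize` is hypothetical.

```python
import re
from collections import defaultdict

# Pattern inferred from the hv_netvsc messages quoted in this report
# (an assumption; other kernel versions may word the message differently).
LINE_RE = re.compile(
    r"unable to send receive completion pkt \(tid (?P<tid>[0-9a-f]+)\)"
    r"\.\.\.(?:retrying (?P<attempt>\d+)|give up retrying)"
)


def summarize(lines):
    """Return {tid: {"retries": highest attempt seen, "gave_up": bool}}."""
    summary = defaultdict(lambda: {"retries": 0, "gave_up": False})
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # not an hv_netvsc completion message
        entry = summary[match.group("tid")]
        if match.group("attempt") is not None:
            entry["retries"] = max(entry["retries"], int(match.group("attempt")))
        else:
            entry["gave_up"] = True
    return dict(summary)


# Three lines copied from the log in this report.
sample = [
    "Sep 3 18:48:07 RailGun kernel: [ 71.439847] hv_netvsc vmbus_0_16 eth0: unable to send receive completion pkt (tid 2069)...retrying 4",
    "Sep 3 18:48:07 RailGun kernel: [ 71.440155] hv_netvsc vmbus_0_16 eth0: unable to send receive completion pkt (tid 2069)...give up retrying",
    "Sep 3 18:48:07 RailGun kernel: [ 71.465593] hv_netvsc vmbus_0_16 eth0: unable to send receive completion pkt (tid 206a)...retrying 1",
]
result = summarize(sample)
```

Feeding the function the lines of a kernel log file should show how many transactions were given up on, which may help when comparing kernels.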

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 1491957

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Joseph Salisbury (jsalisbury) wrote :

Does this issue go away if you boot back into the prior kernel?

Changed in linux (Ubuntu):
importance: Undecided → Medium
tags: added: kernel-da-key kernel-hyper-v
Lőrinc (as-5) wrote :

Joseph, yes, I moved back to the prior kernel.
And now it works again.

Lőrinc (as-5) wrote :

Because I moved back to the previous kernel, I cannot create more logs. If you really need the output of "apport-collect 1491957", maybe I can upgrade to the new kernel again. However, since I removed it, apt upgrade no longer offers it, so please tell me how I can reinstall the kernel after removing it.

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Lőrinc (as-5) wrote :

Due to the nature of the issue I am unable to run "apport-collect 1491957", because I changed back to the previous kernel.

Changed in linux (Ubuntu):
importance: Medium → High
affects: linux (Ubuntu) → linux-lts-utopic (Ubuntu)
Changed in linux-lts-utopic (Ubuntu Trusty):
status: New → Confirmed
importance: Undecided → High
Joseph Salisbury (jsalisbury) wrote :

I started a kernel bisect between 3.16.0-46 and v3.16.0-48. The kernel bisect will require testing of about 5 test kernels.

I built the first test kernel, up to the following commit:
0875e364f729b44ad8b0e196e04c2595a2f9bb51

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1491957

Can you test that kernel and report back if it has the bug or not? I will build the next test kernel based on your test results.

Thanks in advance
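
For readers unfamiliar with bisecting, the estimate of about 5 test kernels follows from binary search: each test halves the range of candidate commits. The sketch below illustrates the idea only. The number of commits between the two kernels is not stated in this report, so the 32 commits and their names are hypothetical, as is the function name `bisect_first_bad`.

```python
def bisect_first_bad(commits, is_bad):
    """Find the first bad commit by binary search.

    Assumes commits are ordered oldest to newest, the last commit is known
    to be bad, and every commit after the first bad one is also bad.
    Returns (first_bad_commit, number_of_tests_performed).
    """
    lo, hi = 0, len(commits) - 1  # commits[hi] is known bad
    tests = 0
    while lo < hi:
        mid = (lo + hi) // 2
        tests += 1  # one test kernel built and booted per iteration
        if is_bad(commits[mid]):
            hi = mid  # regression is at mid or earlier
        else:
            lo = mid + 1  # regression is after mid
    return commits[lo], tests


# Hypothetical range of 32 commits in which commit 20 introduced the bug.
commits = [f"c{i:02d}" for i in range(32)]
first_bad, tests = bisect_first_bad(commits, lambda c: int(c[1:]) >= 20)
```

With 32 candidates the search needs 5 tests, which is consistent with the estimate above if the real range is of that order.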

Joseph Salisbury (jsalisbury) wrote :

Note, for the test kernel, you need to install both the linux-image and the linux-image-extra .deb packages.
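
A small sketch of a check that both packages are present before installing is shown below. It only builds a `dpkg -i` command line and does not run it. The file names are placeholders, not the actual names in the download directory, and the helper name `dpkg_install_command` is hypothetical.

```python
import os


def dpkg_install_command(deb_files):
    """Build a `sudo dpkg -i` command, requiring both kernel packages."""
    names = {path: os.path.basename(path) for path in deb_files}
    extras = [p for p, n in names.items() if n.startswith("linux-image-extra-")]
    images = [
        p for p, n in names.items()
        if n.startswith("linux-image-") and p not in extras
    ]
    if not images or not extras:
        raise ValueError("need both linux-image and linux-image-extra .deb files")
    return ["sudo", "dpkg", "-i", *images, *extras]


# Placeholder file names (assumption): substitute the downloaded files.
command = dpkg_install_command([
    "linux-image-extra-EXAMPLE-generic_amd64.deb",
    "linux-image-EXAMPLE-generic_amd64.deb",
])
```

The resulting list can be passed to `subprocess.run` or typed by hand. Taking a Hyper-V snapshot first, as mentioned later in this thread, gives a way back.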

Changed in linux-lts-utopic (Ubuntu Trusty):
status: Confirmed → In Progress
Changed in linux-lts-utopic (Ubuntu):
status: Confirmed → In Progress
assignee: nobody → Joseph Salisbury (jsalisbury)
Changed in linux-lts-utopic (Ubuntu Trusty):
assignee: nobody → Joseph Salisbury (jsalisbury)
Aljebro (launch4ad) wrote :

Same problem. I updated 9 hours ago and now 4 of 8 Ubuntu Server 14.04 LTS machines on Hyper-V have crashed (no network) with the same message. Returned to 3.16.0-46.

Lőrinc (as-5) wrote :

Joseph, it is a live server, but I can test on it. I am not a Linux genius, though, so I can start if you show me exactly what I should do from the command prompt. (wget? apt?)
One more thing: I currently have this:
Ubuntu 14.04.3 LTS (GNU/Linux 3.16.0-46-generic x86_64)

So it is not 14.04.1, which is the version I see in the filenames here: http://kernel.ubuntu.com/~jsalisbury/lp1491957/

I can create a snapshot from the Hyper-V console, so you do not need to describe how to revert the changes.

Matthias R. Wiora (matthias-z) wrote :

same issue here

# cat /proc/version
Linux version 3.16.0-48-generic (buildd@lcy01-10) (gcc version 4.8.2 (Ubuntu 4.8.2-19ubuntu1) ) #64~14.04.1-Ubuntu SMP Thu Aug 20 23:03:57 UTC 2015

running on Windows Server 2012 R2 Hyper-V - latest updates installed.

Kretov Michael (coolmiha) wrote :

Same problem on 3.16.0-48-generic, but 3.16.0-46-generic works perfectly.

Mi Tom (crazy-k) wrote :

Same problem after update to 3.16.0-48-generic.
The virtual machine loses network connectivity before the 'hv_netvsc vmbus_0_16 eth0: unable to send receive completion pkt' message appears.
ifconfig -a shows lots of dropped frames in eth0.

Running on Hyper-V Server 2012 R2 having latest updates installed.
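
The dropped-frame counters that ifconfig reports can also be read from sysfs, which makes it easier to compare two points in time. The following is a rough sketch, assuming the standard Linux `/sys/class/net/<iface>/statistics/` layout; the function names are hypothetical.

```python
from pathlib import Path


def read_drop_counters(iface="eth0", sysfs_root="/sys/class/net"):
    """Read rx_dropped and tx_dropped for an interface from sysfs."""
    stats = Path(sysfs_root) / iface / "statistics"
    return {
        name: int((stats / name).read_text().strip())
        for name in ("rx_dropped", "tx_dropped")
    }


def drops_increased(before, after):
    """Return only the counters that grew, with the size of the increase."""
    return {
        name: after[name] - before[name]
        for name in before
        if after[name] > before[name]
    }
```

Calling `read_drop_counters()` twice a few seconds apart and passing both results to `drops_increased` should show whether drops are still accumulating on eth0.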

Joseph Salisbury (jsalisbury) wrote :

There is a test kernel available here:
http://people.canonical.com/~henrix/lp1492146/v1/amd64/

Can folks affected by this bug test this kernel?

Thanks in advance!

Joseph Salisbury (jsalisbury) wrote :

The test kernel in comment #13 has a revert of all hyper-v patches due to bug 1492146 .

I also built a test kernel with the current lts-backport-utopic branch of Trusty with a cherry pick of the following commit:
 6f19e12 igb: flush when in xmit_more mode and under descriptor pressure

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1491957

If possible, can folks also test this kernel?

Thanks in advance

Angel (angelm) on 2015-09-09
no longer affects: hwe-next
Matthias R. Wiora (matthias-z) wrote :

running test - will report

# cat /proc/version
Linux version 3.16.0-48-generic (root@gloin) (gcc version 4.8.4 (Ubuntu 4.8.4-2ubuntu1~14.04) ) #64~14.04.1 SMP Tue Sep 8 22:01:57 UTC 2015

Matthias R. Wiora (matthias-z) wrote :

Unfortunately, the issue recurs.

Matthias R. Wiora (matthias-z) wrote :

seems to be fixed with Kernel 3.16.0-49-generic, which has been released to General Availability.

Keno Medenbach (k-medenbach) wrote :

I can confirm that this issue did not reoccur with kernel 3.16.0-49-generic.
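
To summarize the reports so far for the 3.16.0 series: -46 works, -48 shows the problem, and -49 is reported fixed. A small sketch that classifies a `uname -r` style release string against those reports follows. It encodes only what commenters in this thread observed and says nothing about other versions; the function name `classify` is hypothetical.

```python
import re

# ABI numbers taken from the reports in this thread (3.16.0 series only).
REPORTED_WORKING = {46, 49}
REPORTED_AFFECTED = {48}


def classify(release):
    """Classify a release string such as '3.16.0-48-generic'."""
    match = re.fullmatch(r"3\.16\.0-(\d+)-generic", release)
    if match is None:
        return "unknown"  # other series are not covered by these reports
    abi = int(match.group(1))
    if abi in REPORTED_AFFECTED:
        return "affected"
    if abi in REPORTED_WORKING:
        return "reported working"
    return "unknown"
```

On a running system, `classify(platform.release())` with Python's standard `platform` module gives the answer for the booted kernel.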

Changed in linux-lts-utopic (Ubuntu):
status: In Progress → Fix Released
Changed in linux-lts-utopic (Ubuntu Trusty):
status: In Progress → Fix Released
Joseph Salisbury (jsalisbury) wrote :

The commits that caused this bug were introduced by the fixes for bug 1454892.

I've created a new test kernel for bug 1454892, but I would like to ensure it does not introduce this regression again. Could folks affected by this bug test my new test kernel? It can be downloaded from:

http://kernel.ubuntu.com/~jsalisbury/lp1454892/lts-backport-utopic/

Note, with this test kernel you would need to install both the linux-image and linux-image-extra .deb packages.

Thanks in advance!

Michal Fiala (michal-fiala) wrote :

We are using Hyper-V on Windows 2012 R2 (updated on 3.12.2015) and Ubuntu 14.04 LTS (updated 3.12.2015), kernel version 3.19.0-33-generic, and the problem still exists.
We are also using integration services:

linux-cloud-tools-3.19.0-33-generic
linux-cloud-tools-common
linux-cloud-tools-generic-lts-vivid
linux-lts-vivid-cloud-tools-3.19.0-33

errors in kernel.log

Dec 6 08:22:41 sdgat02 kernel: [290029.720506] hv_netvsc vmbus_0_13 eth0: unable to send receive completion pkt (tid 2422)...give up retrying
Dec 6 08:22:41 sdgat02 kernel: [290029.721019] hv_netvsc vmbus_0_13 eth0: unable to send receive completion pkt (tid 2423)...retrying 1
Dec 6 08:22:41 sdgat02 kernel: [290029.721374] hv_netvsc vmbus_0_13 eth0: unable to send receive completion pkt (tid 2423)...retrying 2
Dec 6 08:22:41 sdgat02 kernel: [290029.721738] hv_netvsc vmbus_0_13 eth0: unable to send receive completion pkt (tid 2423)...retrying 3
Dec 6 08:22:41 sdgat02 kernel: [290029.722269] hv_netvsc vmbus_0_13 eth0: unable to send receive completion pkt (tid 2423)...retrying 4

and the network interface is not working.

What is the current state of this issue?

Thanks
