Azure Instance never recovered during series of instance reboots.
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
linux-azure (Ubuntu) | Fix Released | High | Colin Ian King |
Bug Description
Description: During SRU testing of various Azure instances, an instance sometimes fails to respond after a system reboot. SRU testing restarts a given instance only once, after it has prepared all of the files to be tested.
Series: Disco
Instance Size: Basic_A3
Region: (Default) US-WEST-2
Kernel Version: 4.18.0-1013-azure #13-Ubuntu SMP Thu Feb 28 22:54:16 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
I initiated a series of tests that rebooted Azure cloud instances 50 times. During the 49th reboot, an instance failed to return. The console output showed the following scrolling endlessly. I have also seen this failure in cases where the instance had been restarted only a handful of times (>5).
[ 84.247704]
[ 84.247704]
[ 84.247704]
[ 84.247704]
[ 84.247704]
[ 84.247704]
[ 84.247704]
[ 84.247704]
In another test attempt I saw the following failure:
ERROR ExtHandler /proc/net/route contains no routes
ERROR ExtHandler /proc/net/route contains no routes
ERROR ExtHandler /proc/net/route contains no routes
ERROR ExtHandler /proc/net/route contains no routes
ERROR ExtHandler /proc/net/route contains no routes
ERROR ExtHandler /proc/net/route contains no routes
ERROR ExtHandler /proc/net/route contains no routes
Both of these failures broke networking. Each was seen two or three times, which may explain why in some cases an instance never recovers from a reboot.
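To confirm the second failure from a test harness or a serial console session, the routing table can be checked directly. The following is a minimal sketch, not the Azure agent's own check: the function names `count_routes` and `has_default_route` are hypothetical, and it assumes only the standard /proc/net/route layout (one header line, then one whitespace-separated line per route, with the destination in the second column as hex).

```python
def count_routes(route_table_text: str) -> int:
    """Count route entries in text formatted like /proc/net/route."""
    lines = [ln for ln in route_table_text.splitlines() if ln.strip()]
    # The first non-empty line is the column header (Iface, Destination, ...).
    return max(len(lines) - 1, 0)


def has_default_route(route_table_text: str) -> bool:
    """Return True if any entry has destination 00000000 (the default route)."""
    rows = [ln.split() for ln in route_table_text.splitlines() if ln.strip()]
    for fields in rows[1:]:  # skip the header row
        if len(fields) > 1 and fields[1] == "00000000":
            return True
    return False


def read_route_table(path: str = "/proc/net/route") -> str:
    """Read the kernel routing table; the path is only valid on Linux."""
    with open(path, "r", encoding="ascii") as handle:
        return handle.read()
```

On an affected instance, `count_routes(read_route_table())` would be expected to return 0, which matches the "contains no routes" message above.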
description: | updated |
Changed in linux-azure (Ubuntu): | |
assignee: | nobody → Colin Ian King (colin-king) |
description: | updated |
Changed in linux-azure (Ubuntu): | |
importance: | Undecided → High |
status: | New → In Progress |
tags: | added: kernel-hyper-v |
Changed in linux-azure (Ubuntu): | |
status: | In Progress → Incomplete |
The "hyperv_fb: Unable to send packet via vmbus" message is from synthvid_send() in drivers/video/fbdev/hyperv_fb.c (the Microsoft Hyper-V Synthetic Video Frame Buffer Driver). This error occurs when vmbus_sendpacket() fails to send a packet via the write ring buffer (see hv_ringbuffer_write()).
This failure occurs for one of two reasons:
1. channel->rescind is set (non-zero), i.e. the channel has been rescinded
2. the ring buffer is full
Adding some more debug output to the error paths in hv_ringbuffer_write() may shed more light on which of these is happening.
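The two error paths can be illustrated with a small userspace model. This is only a sketch, not kernel code: `ChannelModel` and `ringbuffer_write_model` are hypothetical names, and the return values (-ENODEV for a channel marked as rescinded, -EAGAIN for a full ring) and the `<=` space check reflect my reading of the upstream function and should be verified against the kernel tree being debugged. The point is that a distinct message on each path tells the two causes apart.

```python
import errno
from dataclasses import dataclass


@dataclass
class ChannelModel:
    """Hypothetical stand-in for the parts of a vmbus channel used here."""
    rescind: bool
    bytes_avail_to_write: int


def ringbuffer_write_model(channel, total_bytes_to_write, log=print):
    """Model the two failure paths, logging which one was taken."""
    if channel.rescind:
        # Path 1: the channel has been rescinded, so nothing can be sent.
        log("write failed: channel rescinded")
        return -errno.ENODEV
    if channel.bytes_avail_to_write <= total_bytes_to_write:
        # Path 2: not enough free space in the outgoing ring buffer.
        log(
            "write failed: ring full "
            f"(avail={channel.bytes_avail_to_write}, need={total_bytes_to_write})"
        )
        return -errno.EAGAIN
    # Success: consume the space in the model.
    channel.bytes_avail_to_write -= total_bytes_to_write
    return 0
```

In the kernel itself the equivalent would be a debug print on each error return; a rate-limited variant is advisable because the message was seen scrolling endlessly on the console.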