Ubuntu
linux-aws package

Activity log for bug #1864041

Date	Who	What changed	Old value	New value	Message
2020-02-20 14:04:32	Francis Ginther	bug			added bug
2020-02-20 14:09:16	Francis Ginther	description	The xen_netfront device is sometimes unresponsive after a hibernate and resume event. This is limited to the c4, c5, m4, m5, r4, r5 instance families, all of which are xen based, and support hibernation. When the issue occurrs, the instance is inaccessible without a full restart. Debugging by running a process which outputs regularly to the serial console shows that the instance is still running. A workaround is to build the xen_netfront module separately and restart the module and networking during the resume handler. For example: modprobe -r xen_netfront modprobe xen_netfront systemctl restart systemd-networkd With this workaround in place, the unresponsive issue is no longer observed.	The xen_netfront device is sometimes unresponsive after a hibernate and resume event. This is limited to the c4, c5, m4, m5, r4, r5 instance families, all of which are xen based, and support hibernation. When the issue occurrs, the instance is inaccessible without a full restart. Debugging by running a process which outputs regularly to the serial console shows that the instance is still running. A workaround is to build the xen_netfront module separately and restart the module and networking during the resume handler. For example: modprobe -r xen_netfront modprobe xen_netfront systemctl restart systemd-networkd With this workaround in place, the unresponsive issue is no longer observed. To reproduce this problem: 1) Launch an c4, c5, m4, m5, r4, r5 instance type with a 5.0 or 5.3 kernel with on-demand hibernation support enabled. 2) Start a long-running process which generates messages to the serial console 3) Begin observing these messages on the console (using the AWS UI or CLI to grab a screenshot). 4) Suspend and resume the instance, continuing to refresh the console screenshot. 5) The screenshot should continue to show updates even if ssh access is no longer working.
2020-02-20 15:07:07	Francis Ginther	tags		apport-collected bionic ec2-images
2020-02-20 15:07:09	Francis Ginther	description	The xen_netfront device is sometimes unresponsive after a hibernate and resume event. This is limited to the c4, c5, m4, m5, r4, r5 instance families, all of which are xen based, and support hibernation. When the issue occurrs, the instance is inaccessible without a full restart. Debugging by running a process which outputs regularly to the serial console shows that the instance is still running. A workaround is to build the xen_netfront module separately and restart the module and networking during the resume handler. For example: modprobe -r xen_netfront modprobe xen_netfront systemctl restart systemd-networkd With this workaround in place, the unresponsive issue is no longer observed. To reproduce this problem: 1) Launch an c4, c5, m4, m5, r4, r5 instance type with a 5.0 or 5.3 kernel with on-demand hibernation support enabled. 2) Start a long-running process which generates messages to the serial console 3) Begin observing these messages on the console (using the AWS UI or CLI to grab a screenshot). 4) Suspend and resume the instance, continuing to refresh the console screenshot. 5) The screenshot should continue to show updates even if ssh access is no longer working.	The xen_netfront device is sometimes unresponsive after a hibernate and resume event. This is limited to the c4, c5, m4, m5, r4, r5 instance families, all of which are xen based, and support hibernation. When the issue occurrs, the instance is inaccessible without a full restart. Debugging by running a process which outputs regularly to the serial console shows that the instance is still running. A workaround is to build the xen_netfront module separately and restart the module and networking during the resume handler. For example: modprobe -r xen_netfront modprobe xen_netfront systemctl restart systemd-networkd With this workaround in place, the unresponsive issue is no longer observed. 
To reproduce this problem: 1) Launch an c4, c5, m4, m5, r4, r5 instance type with a 5.0 or 5.3 kernel with on-demand hibernation support enabled. 2) Start a long-running process which generates messages to the serial console 3) Begin observing these messages on the console (using the AWS UI or CLI to grab a screenshot). 4) Suspend and resume the instance, continuing to refresh the console screenshot. 5) The screenshot should continue to show updates even if ssh access is no longer working. --- ProblemType: Bug ApportVersion: 2.20.9-0ubuntu7.9 Architecture: amd64 DistroRelease: Ubuntu 18.04 Ec2AMI: ami-0edf3b95e26a682df Ec2AMIManifest: (unknown) Ec2AvailabilityZone: us-west-2a Ec2InstanceType: m4.large Ec2Kernel: unavailable Ec2Ramdisk: unavailable Package: linux-aws 4.15.0.1058.59 PackageArchitecture: amd64 ProcEnviron: TERM=screen PATH=(custom, no user) XDG_RUNTIME_DIR=<set> LANG=C.UTF-8 SHELL=/bin/bash ProcVersionSignature: User Name 5.0.0-1025.28-aws 5.0.21 Tags: bionic ec2-images Uname: Linux 5.0.0-1025-aws x86_64 UpgradeStatus: No upgrade log present (probably fresh install) UserGroups: adm audio cdrom dialout dip floppy lxd netdev plugdev sudo video _MarkForUpload: True
2020-02-20 15:07:09	Francis Ginther	attachment added		Dependencies.txt https://bugs.launchpad.net/bugs/1864041/+attachment/5329809/+files/Dependencies.txt
2020-02-20 15:07:11	Francis Ginther	attachment added		ProcCpuinfoMinimal.txt https://bugs.launchpad.net/bugs/1864041/+attachment/5329810/+files/ProcCpuinfoMinimal.txt
2020-02-25 18:37:15	Balint Reczey	bug task added		ec2-hibinit-agent (Ubuntu)
2020-02-25 18:38:22	Balint Reczey	nominated for series		Ubuntu Bionic
2020-02-25 18:38:22	Balint Reczey	bug task added		linux-aws (Ubuntu Bionic)
2020-02-25 18:38:22	Balint Reczey	bug task added		ec2-hibinit-agent (Ubuntu Bionic)
2020-02-25 18:38:22	Balint Reczey	nominated for series		Ubuntu Focal
2020-02-25 18:38:22	Balint Reczey	bug task added		linux-aws (Ubuntu Focal)
2020-02-25 18:38:22	Balint Reczey	bug task added		ec2-hibinit-agent (Ubuntu Focal)
2020-02-25 18:38:22	Balint Reczey	nominated for series		Ubuntu Eoan
2020-02-25 18:38:22	Balint Reczey	bug task added		linux-aws (Ubuntu Eoan)
2020-02-25 18:38:22	Balint Reczey	bug task added		ec2-hibinit-agent (Ubuntu Eoan)
2020-02-25 18:38:38	Balint Reczey	ec2-hibinit-agent (Ubuntu Focal): status	New	In Progress
2020-02-25 18:50:38	Balint Reczey	bug			added subscriber Balint Reczey
2020-03-12 15:02:15	Launchpad Janitor	ec2-hibinit-agent (Ubuntu Focal): status	In Progress	Fix Released
2020-03-23 15:52:08	Balint Reczey	description	The xen_netfront device is sometimes unresponsive after a hibernate and resume event. This is limited to the c4, c5, m4, m5, r4, r5 instance families, all of which are xen based, and support hibernation. When the issue occurrs, the instance is inaccessible without a full restart. Debugging by running a process which outputs regularly to the serial console shows that the instance is still running. A workaround is to build the xen_netfront module separately and restart the module and networking during the resume handler. For example: modprobe -r xen_netfront modprobe xen_netfront systemctl restart systemd-networkd With this workaround in place, the unresponsive issue is no longer observed. To reproduce this problem: 1) Launch an c4, c5, m4, m5, r4, r5 instance type with a 5.0 or 5.3 kernel with on-demand hibernation support enabled. 2) Start a long-running process which generates messages to the serial console 3) Begin observing these messages on the console (using the AWS UI or CLI to grab a screenshot). 4) Suspend and resume the instance, continuing to refresh the console screenshot. 5) The screenshot should continue to show updates even if ssh access is no longer working. 
--- ProblemType: Bug ApportVersion: 2.20.9-0ubuntu7.9 Architecture: amd64 DistroRelease: Ubuntu 18.04 Ec2AMI: ami-0edf3b95e26a682df Ec2AMIManifest: (unknown) Ec2AvailabilityZone: us-west-2a Ec2InstanceType: m4.large Ec2Kernel: unavailable Ec2Ramdisk: unavailable Package: linux-aws 4.15.0.1058.59 PackageArchitecture: amd64 ProcEnviron: TERM=screen PATH=(custom, no user) XDG_RUNTIME_DIR=<set> LANG=C.UTF-8 SHELL=/bin/bash ProcVersionSignature: User Name 5.0.0-1025.28-aws 5.0.21 Tags: bionic ec2-images Uname: Linux 5.0.0-1025-aws x86_64 UpgradeStatus: No upgrade log present (probably fresh install) UserGroups: adm audio cdrom dialout dip floppy lxd netdev plugdev sudo video _MarkForUpload: True	[Impact] The xen_netfront device is sometimes unresponsive after a hibernate and resume event. This is limited to the c4, c5, m4, m5, r4, r5 instance families, all of which are xen based, and support hibernation. When the issue occurrs, the instance is inaccessible without a full restart. Debugging by running a process which outputs regularly to the serial console shows that the instance is still running. [Test Case] 1) Launch an c4, c5, m4, m5, r4, r5 instance type with a 5.0 or 5.3 kernel with on-demand hibernation support enabled. 2) Start a long-running process which generates messages to the serial console 3) Begin observing these messages on the console (using the AWS UI or CLI to grab a screenshot). 4) Suspend and resume the instance, continuing to refresh the console screenshot. 5) The screenshot should continue to show updates even if ssh access is no longer working. [Regression Potential] The workaround in ec2-hibinit-agent is reloading the xen_netfront kernel module before restarting systemd-networkd. If the kernel module is removed (for example when hitting LP: #1615381) the module reloading fails and the instance can not restore network connections. 
This is expected to a be very rare situation and the module reload is the best workaround the Kernel Team found to mitigate the original issue. [Original Bug Text] The xen_netfront device is sometimes unresponsive after a hibernate and resume event. This is limited to the c4, c5, m4, m5, r4, r5 instance families, all of which are xen based, and support hibernation. When the issue occurrs, the instance is inaccessible without a full restart. Debugging by running a process which outputs regularly to the serial console shows that the instance is still running. A workaround is to build the xen_netfront module separately and restart the module and networking during the resume handler. For example: modprobe -r xen_netfront modprobe xen_netfront systemctl restart systemd-networkd With this workaround in place, the unresponsive issue is no longer observed. To reproduce this problem: 1) Launch an c4, c5, m4, m5, r4, r5 instance type with a 5.0 or 5.3 kernel with on-demand hibernation support enabled. 2) Start a long-running process which generates messages to the serial console 3) Begin observing these messages on the console (using the AWS UI or CLI to grab a screenshot). 4) Suspend and resume the instance, continuing to refresh the console screenshot. 5) The screenshot should continue to show updates even if ssh access is no longer working. 
--- ProblemType: Bug ApportVersion: 2.20.9-0ubuntu7.9 Architecture: amd64 DistroRelease: Ubuntu 18.04 Ec2AMI: ami-0edf3b95e26a682df Ec2AMIManifest: (unknown) Ec2AvailabilityZone: us-west-2a Ec2InstanceType: m4.large Ec2Kernel: unavailable Ec2Ramdisk: unavailable Package: linux-aws 4.15.0.1058.59 PackageArchitecture: amd64 ProcEnviron: TERM=screen PATH=(custom, no user) XDG_RUNTIME_DIR=<set> LANG=C.UTF-8 SHELL=/bin/bash ProcVersionSignature: User Name 5.0.0-1025.28-aws 5.0.21 Tags: bionic ec2-images Uname: Linux 5.0.0-1025-aws x86_64 UpgradeStatus: No upgrade log present (probably fresh install) UserGroups: adm audio cdrom dialout dip floppy lxd netdev plugdev sudo video _MarkForUpload: True
2020-03-23 16:07:10	Balint Reczey	description	[Impact] The xen_netfront device is sometimes unresponsive after a hibernate and resume event. This is limited to the c4, c5, m4, m5, r4, r5 instance families, all of which are xen based, and support hibernation. When the issue occurrs, the instance is inaccessible without a full restart. Debugging by running a process which outputs regularly to the serial console shows that the instance is still running. [Test Case] 1) Launch an c4, c5, m4, m5, r4, r5 instance type with a 5.0 or 5.3 kernel with on-demand hibernation support enabled. 2) Start a long-running process which generates messages to the serial console 3) Begin observing these messages on the console (using the AWS UI or CLI to grab a screenshot). 4) Suspend and resume the instance, continuing to refresh the console screenshot. 5) The screenshot should continue to show updates even if ssh access is no longer working. [Regression Potential] The workaround in ec2-hibinit-agent is reloading the xen_netfront kernel module before restarting systemd-networkd. If the kernel module is removed (for example when hitting LP: #1615381) the module reloading fails and the instance can not restore network connections. This is expected to a be very rare situation and the module reload is the best workaround the Kernel Team found to mitigate the original issue. [Original Bug Text] The xen_netfront device is sometimes unresponsive after a hibernate and resume event. This is limited to the c4, c5, m4, m5, r4, r5 instance families, all of which are xen based, and support hibernation. When the issue occurrs, the instance is inaccessible without a full restart. Debugging by running a process which outputs regularly to the serial console shows that the instance is still running. A workaround is to build the xen_netfront module separately and restart the module and networking during the resume handler. 
For example: modprobe -r xen_netfront modprobe xen_netfront systemctl restart systemd-networkd With this workaround in place, the unresponsive issue is no longer observed. To reproduce this problem: 1) Launch an c4, c5, m4, m5, r4, r5 instance type with a 5.0 or 5.3 kernel with on-demand hibernation support enabled. 2) Start a long-running process which generates messages to the serial console 3) Begin observing these messages on the console (using the AWS UI or CLI to grab a screenshot). 4) Suspend and resume the instance, continuing to refresh the console screenshot. 5) The screenshot should continue to show updates even if ssh access is no longer working. --- ProblemType: Bug ApportVersion: 2.20.9-0ubuntu7.9 Architecture: amd64 DistroRelease: Ubuntu 18.04 Ec2AMI: ami-0edf3b95e26a682df Ec2AMIManifest: (unknown) Ec2AvailabilityZone: us-west-2a Ec2InstanceType: m4.large Ec2Kernel: unavailable Ec2Ramdisk: unavailable Package: linux-aws 4.15.0.1058.59 PackageArchitecture: amd64 ProcEnviron: TERM=screen PATH=(custom, no user) XDG_RUNTIME_DIR=<set> LANG=C.UTF-8 SHELL=/bin/bash ProcVersionSignature: User Name 5.0.0-1025.28-aws 5.0.21 Tags: bionic ec2-images Uname: Linux 5.0.0-1025-aws x86_64 UpgradeStatus: No upgrade log present (probably fresh install) UserGroups: adm audio cdrom dialout dip floppy lxd netdev plugdev sudo video _MarkForUpload: True	[Impact] The xen_netfront device is sometimes unresponsive after a hibernate and resume event. This is limited to the c4, c5, m4, m5, r4, r5 instance families, all of which are xen based, and support hibernation. When the issue occurrs, the instance is inaccessible without a full restart. Debugging by running a process which outputs regularly to the serial console shows that the instance is still running. [Test Case] 1) Launch an c4, c5, m4, m5, r4, r5 instance type with a 5.0 or 5.3 kernel with on-demand hibernation support enabled. 
2) Start a long-running process which generates messages to the serial console 3) Begin observing these messages on the console (using the AWS UI or CLI to grab a screenshot). 4) Suspend and resume the instance, continuing to refresh the console screenshot. 5) The screenshot should continue to show updates even if ssh access is no longer working. [Regression Potential] The workaround in ec2-hibinit-agent is reloading the xen_netfront kernel module before restarting systemd-networkd. If the kernel module is removed (for example when hitting LP: #1615381) the module reloading fails and the instance can not restore network connections. This is expected to a be very rare situation and the module reload is the best workaround the Kernel Team found to mitigate the original issue. The workaround also adds a 2 second delay before reloading the modules to let things settle a bit after resuming. The 2 seconds is very short compared to the overall time needed resuming an instance. [Original Bug Text] The xen_netfront device is sometimes unresponsive after a hibernate and resume event. This is limited to the c4, c5, m4, m5, r4, r5 instance families, all of which are xen based, and support hibernation. When the issue occurrs, the instance is inaccessible without a full restart. Debugging by running a process which outputs regularly to the serial console shows that the instance is still running. A workaround is to build the xen_netfront module separately and restart the module and networking during the resume handler. For example: modprobe -r xen_netfront modprobe xen_netfront systemctl restart systemd-networkd With this workaround in place, the unresponsive issue is no longer observed. To reproduce this problem: 1) Launch an c4, c5, m4, m5, r4, r5 instance type with a 5.0 or 5.3 kernel with on-demand hibernation support enabled. 
2) Start a long-running process which generates messages to the serial console 3) Begin observing these messages on the console (using the AWS UI or CLI to grab a screenshot). 4) Suspend and resume the instance, continuing to refresh the console screenshot. 5) The screenshot should continue to show updates even if ssh access is no longer working. --- ProblemType: Bug ApportVersion: 2.20.9-0ubuntu7.9 Architecture: amd64 DistroRelease: Ubuntu 18.04 Ec2AMI: ami-0edf3b95e26a682df Ec2AMIManifest: (unknown) Ec2AvailabilityZone: us-west-2a Ec2InstanceType: m4.large Ec2Kernel: unavailable Ec2Ramdisk: unavailable Package: linux-aws 4.15.0.1058.59 PackageArchitecture: amd64 ProcEnviron: TERM=screen PATH=(custom, no user) XDG_RUNTIME_DIR=<set> LANG=C.UTF-8 SHELL=/bin/bash ProcVersionSignature: User Name 5.0.0-1025.28-aws 5.0.21 Tags: bionic ec2-images Uname: Linux 5.0.0-1025-aws x86_64 UpgradeStatus: No upgrade log present (probably fresh install) UserGroups: adm audio cdrom dialout dip floppy lxd netdev plugdev sudo video _MarkForUpload: True
2020-03-23 17:28:27	Łukasz Zemczak	ec2-hibinit-agent (Ubuntu Eoan): status	New	Fix Committed
2020-03-23 17:28:33	Łukasz Zemczak	bug			added subscriber Ubuntu Stable Release Updates Team
2020-03-23 17:28:35	Łukasz Zemczak	bug			added subscriber SRU Verification
2020-03-23 17:28:41	Łukasz Zemczak	tags	apport-collected bionic ec2-images	apport-collected bionic ec2-images verification-needed verification-needed-eoan
2020-03-23 17:29:40	Łukasz Zemczak	ec2-hibinit-agent (Ubuntu Bionic): status	New	Fix Committed
2020-03-23 17:29:47	Łukasz Zemczak	tags	apport-collected bionic ec2-images verification-needed verification-needed-eoan	apport-collected bionic ec2-images verification-needed verification-needed-bionic verification-needed-eoan
2020-04-02 11:49:36	Francis Ginther	tags	apport-collected bionic ec2-images verification-needed verification-needed-bionic verification-needed-eoan	apport-collected bionic ec2-images verification-done-bionic verification-needed verification-needed-eoan
2020-04-07 19:23:56	Launchpad Janitor	ec2-hibinit-agent (Ubuntu Bionic): status	Fix Committed	Fix Released
2020-04-07 19:24:06	Brian Murray	removed subscriber Ubuntu Stable Release Updates Team
2020-04-22 12:51:11	Francis Ginther	tags	apport-collected bionic ec2-images verification-done-bionic verification-needed verification-needed-eoan	apport-collected bionic ec2-images id-5e459f823f8a2435d44842eb verification-done-bionic verification-needed verification-needed-eoan
2020-07-02 20:33:04	Francis Ginther	tags	apport-collected bionic ec2-images id-5e459f823f8a2435d44842eb verification-done-bionic verification-needed verification-needed-eoan	apport-collected bionic ec2-images id-5e459f823f8a2435d44842eb verification-done-bionic verification-done-eoan verification-needed
2020-07-07 18:24:51	Launchpad Janitor	ec2-hibinit-agent (Ubuntu Eoan): status	Fix Committed	Fix Released
2020-08-18 17:01:36	Brian Murray	linux-aws (Ubuntu Eoan): status	New	Won't Fix
2020-10-20 16:09:12	Balint Reczey	nominated for series		Ubuntu Xenial
2020-10-20 16:09:12	Balint Reczey	bug task added		linux-aws (Ubuntu Xenial)
2020-10-20 16:09:12	Balint Reczey	bug task added		ec2-hibinit-agent (Ubuntu Xenial)
2020-10-20 16:09:28	Balint Reczey	ec2-hibinit-agent (Ubuntu Xenial): status	New	Incomplete
2020-10-20 19:01:30	Balint Reczey	ec2-hibinit-agent (Ubuntu Xenial): status	Incomplete	Invalid
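
The workaround recorded in the description entries above (wait briefly after resume, reload xen_netfront, then restart systemd-networkd) could be sketched as the shell function below. This is only an illustration, not the actual ec2-hibinit-agent code, which this log does not show. The names reload_xen_netfront, run, DRY_RUN and SETTLE_SECONDS are hypothetical; the three commands and the 2 second delay are taken from the bug description. It assumes xen_netfront is built as a loadable module, as the original report requires.

```shell
#!/bin/sh
# Hypothetical sketch of a resume-time workaround for an unresponsive
# xen_netfront device. Function and variable names are illustrative only.

# Delay before reloading the module; the description mentions 2 seconds.
SETTLE_SECONDS="${SETTLE_SECONDS:-2}"

# Run a command, or only print it when DRY_RUN=1 (useful off-instance).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Reload the network frontend module, then restart networking.
# The chain stops at the first failing step and returns its status, so a
# failed unload or load (for example, the module is missing) is visible
# to the caller instead of being silently ignored.
reload_xen_netfront() {
    run sleep "$SETTLE_SECONDS" &&
    run modprobe -r xen_netfront &&
    run modprobe xen_netfront &&
    run systemctl restart systemd-networkd
}
```

A resume handler would call reload_xen_netfront as root. Setting DRY_RUN=1 beforehand prints the four commands without executing them. As the [Regression Potential] text notes, if the module cannot be reloaded the instance cannot restore its network connections, so a caller may want to log a non-zero return status to the serial console.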
