2018-05-14 08:22:38 |
KJ Tsanaktsidis |
bug |
|
|
added bug |
2018-05-14 08:22:38 |
KJ Tsanaktsidis |
attachment added |
|
Dmesg output during soft lockup https://bugs.launchpad.net/bugs/1771075/+attachment/5139122/+files/soft_lockup.log |
|
2018-05-14 08:23:02 |
KJ Tsanaktsidis |
attachment added |
|
Dmesg output after general protection fault https://bugs.launchpad.net/ubuntu/+source/linux-gcp/+bug/1771075/+attachment/5139125/+files/protection_fault.log |
|
2018-05-14 08:23:32 |
KJ Tsanaktsidis |
affects |
linux-gcp (Ubuntu) |
linux (Ubuntu) |
|
2018-05-14 08:30:07 |
Ubuntu Kernel Bot |
linux (Ubuntu): status |
New |
Incomplete |
|
2018-05-14 08:47:07 |
KJ Tsanaktsidis |
tags |
amd64 apport-bug uec-images xenial |
amd64 apport-bug apport-collected uec-images xenial |
|
2018-05-14 08:47:09 |
KJ Tsanaktsidis |
description |
We've run into an issue where upgrading the kernel from the 4.10 series to the 4.13 series on Ubuntu 16.04 hosts that make heavy use of inotify causes kernel panics and lockups in inotify-related code. Our particular workload hit these roughly once every 30 minutes when serving production traffic. Unfortunately, I have so far been unable to reproduce the issue in a simulated load-testing environment.
When the issue occurs, we get dmesg entries like "BUG: soft lockup - CPU#0 stuck for 22s!" or "General protection fault: 0000 [#1] SMP PTI". In the soft lockup case, the host stays up but all I/O operations stall indefinitely (e.g. typing "sync" into the console hangs forever). In the protection fault case, the system reboots. I've attached dmesg output from both cases to this bug report.
We have noticed the issue with the following kernels:
- linux-image-4.13.0-1013-gcp
- linux-image-4.13.0-1015-gcp
- linux-image-4.13.0-36-generic
We did _not_ have the issue with
- linux-image-4.10.0-32-generic
I've submitted this bug report from a system that should be configured identically to the production hosts that were having issues (the affected hosts were immediately rolled back to 4.10).
This bug appears to have been fixed upstream as of 4.17-rc3 in this commit: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d90a10e2444ba5a351fa695917258ff4c5709fa5
I would guess that perhaps this patch should be backported into both the 4.13 HWE and GCP Ubuntu kernel series?
Thanks,
KJ
ProblemType: Bug
DistroRelease: Ubuntu 16.04
Package: linux-image-4.13.0-1013-gcp 4.13.0-1013.17
ProcVersionSignature: Ubuntu 4.13.0-1013.17-gcp 4.13.16
Uname: Linux 4.13.0-1013-gcp x86_64
ApportVersion: 2.20.1-0ubuntu2.16
Architecture: amd64
Date: Mon May 14 07:58:29 2018
ProcEnviron:
TERM=xterm-256color
PATH=(custom, no user)
LANG=en_US.UTF-8
SHELL=/bin/bash
SourcePackage: linux-gcp
UpgradeStatus: No upgrade log present (probably fresh install) |
We've run into an issue where upgrading the kernel from the 4.10 series to the 4.13 series on Ubuntu 16.04 hosts that make heavy use of inotify causes kernel panics and lockups in inotify-related code. Our particular workload hit these roughly once every 30 minutes when serving production traffic. Unfortunately, I have so far been unable to reproduce the issue in a simulated load-testing environment.
When the issue occurs, we get dmesg entries like "BUG: soft lockup - CPU#0 stuck for 22s!" or "General protection fault: 0000 [#1] SMP PTI". In the soft lockup case, the host stays up but all I/O operations stall indefinitely (e.g. typing "sync" into the console hangs forever). In the protection fault case, the system reboots. I've attached dmesg output from both cases to this bug report.
We have noticed the issue with the following kernels:
- linux-image-4.13.0-1013-gcp
- linux-image-4.13.0-1015-gcp
- linux-image-4.13.0-36-generic
We did _not_ have the issue with
- linux-image-4.10.0-32-generic
I've submitted this bug report from a system that should be configured identically to the production hosts that were having issues (the affected hosts were immediately rolled back to 4.10).
This bug appears to have been fixed upstream as of 4.17-rc3 in this commit: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d90a10e2444ba5a351fa695917258ff4c5709fa5
I would guess that perhaps this patch should be backported into both the 4.13 HWE and GCP Ubuntu kernel series?
Thanks,
KJ
ProblemType: Bug
DistroRelease: Ubuntu 16.04
Package: linux-image-4.13.0-1013-gcp 4.13.0-1013.17
ProcVersionSignature: Ubuntu 4.13.0-1013.17-gcp 4.13.16
Uname: Linux 4.13.0-1013-gcp x86_64
ApportVersion: 2.20.1-0ubuntu2.16
Architecture: amd64
Date: Mon May 14 07:58:29 2018
ProcEnviron:
TERM=xterm-256color
PATH=(custom, no user)
LANG=en_US.UTF-8
SHELL=/bin/bash
SourcePackage: linux-gcp
UpgradeStatus: No upgrade log present (probably fresh install)
---
AlsaDevices:
total 0
crw-rw---- 1 root audio 116, 1 May 10 07:57 seq
crw-rw---- 1 root audio 116, 33 May 10 07:57 timer
AplayDevices: Error: [Errno 2] No such file or directory
ApportVersion: 2.20.1-0ubuntu2.16
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
CRDA: N/A
DistroRelease: Ubuntu 16.04
IwConfig: Error: [Errno 2] No such file or directory
Lsusb: Error: command ['lsusb'] failed with exit code 1:
MachineType: Google Google Compute Engine
Package: linux (not installed)
PciMultimedia:
ProcEnviron:
TERM=xterm-256color
PATH=(custom, no user)
LANG=en_US.UTF-8
SHELL=/bin/bash
ProcFB:
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.10.0-32-generic root=UUID=73ea38ed-7fcd-4871-8afa-17d36f4e4bfc ro scsi_mod.use_blk_mq=Y console=ttyS0
ProcVersionSignature: Ubuntu 4.10.0-32.36~16.04.1-generic 4.10.17
RelatedPackageVersions:
linux-restricted-modules-4.10.0-32-generic N/A
linux-backports-modules-4.10.0-32-generic N/A
linux-firmware N/A
RfKill: Error: [Errno 2] No such file or directory
Tags: xenial uec-images xenial uec-images
Uname: Linux 4.10.0-32-generic x86_64
UnreportableReason: The report belongs to a package that is not installed.
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups:
WifiSyslog:
_MarkForUpload: False
dmi.bios.date: 01/01/2011
dmi.bios.vendor: Google
dmi.bios.version: Google
dmi.board.asset.tag: 98BEC19B-1DEB-1A9F-1146-C6E4D8577ADB
dmi.board.name: Google Compute Engine
dmi.board.vendor: Google
dmi.chassis.type: 1
dmi.chassis.vendor: Google
dmi.modalias: dmi:bvnGoogle:bvrGoogle:bd01/01/2011:svnGoogle:pnGoogleComputeEngine:pvr:rvnGoogle:rnGoogleComputeEngine:rvr:cvnGoogle:ct1:cvr:
dmi.product.name: Google Compute Engine
dmi.sys.vendor: Google |
|
2018-05-14 08:47:09 |
KJ Tsanaktsidis |
attachment added |
|
CurrentDmesg.txt https://bugs.launchpad.net/bugs/1771075/+attachment/5139132/+files/CurrentDmesg.txt |
|
2018-05-14 08:47:10 |
KJ Tsanaktsidis |
attachment added |
|
HookError_generic.txt https://bugs.launchpad.net/bugs/1771075/+attachment/5139133/+files/HookError_generic.txt |
|
2018-05-14 08:47:12 |
KJ Tsanaktsidis |
attachment added |
|
Lspci.txt https://bugs.launchpad.net/bugs/1771075/+attachment/5139134/+files/Lspci.txt |
|
2018-05-14 08:47:13 |
KJ Tsanaktsidis |
attachment added |
|
ProcCpuinfoMinimal.txt https://bugs.launchpad.net/bugs/1771075/+attachment/5139135/+files/ProcCpuinfoMinimal.txt |
|
2018-05-14 08:47:14 |
KJ Tsanaktsidis |
attachment added |
|
ProcInterrupts.txt https://bugs.launchpad.net/bugs/1771075/+attachment/5139136/+files/ProcInterrupts.txt |
|
2018-05-14 08:47:15 |
KJ Tsanaktsidis |
attachment added |
|
ProcModules.txt https://bugs.launchpad.net/bugs/1771075/+attachment/5139137/+files/ProcModules.txt |
|
2018-05-14 08:47:16 |
KJ Tsanaktsidis |
attachment added |
|
UdevDb.txt https://bugs.launchpad.net/bugs/1771075/+attachment/5139138/+files/UdevDb.txt |
|
2018-05-14 08:48:55 |
KJ Tsanaktsidis |
linux (Ubuntu): status |
Incomplete |
Confirmed |
|
2018-05-14 18:28:08 |
Joseph Salisbury |
nominated for series |
|
Ubuntu Artful |
|
2018-05-14 18:28:08 |
Joseph Salisbury |
bug task added |
|
linux (Ubuntu Artful) |
|
2018-05-14 18:28:15 |
Joseph Salisbury |
linux (Ubuntu Artful): status |
New |
Triaged |
|
2018-05-14 18:28:20 |
Joseph Salisbury |
linux (Ubuntu Artful): importance |
Undecided |
Medium |
|
2018-05-14 18:28:22 |
Joseph Salisbury |
linux (Ubuntu): importance |
Undecided |
Medium |
|
2018-05-14 18:32:51 |
Joseph Salisbury |
linux (Ubuntu Artful): assignee |
|
Joseph Salisbury (jsalisbury) |
|
2018-05-14 18:32:56 |
Joseph Salisbury |
linux (Ubuntu Artful): status |
Triaged |
In Progress |
|
2018-05-14 18:32:59 |
Joseph Salisbury |
linux (Ubuntu): status |
Confirmed |
In Progress |
|
2018-05-14 18:33:04 |
Joseph Salisbury |
nominated for series |
|
Ubuntu Cosmic |
|
2018-05-14 18:33:04 |
Joseph Salisbury |
bug task added |
|
linux (Ubuntu Cosmic) |
|
2018-05-14 18:33:04 |
Joseph Salisbury |
nominated for series |
|
Ubuntu Bionic |
|
2018-05-14 18:33:04 |
Joseph Salisbury |
bug task added |
|
linux (Ubuntu Bionic) |
|
2018-05-14 18:33:10 |
Joseph Salisbury |
linux (Ubuntu Bionic): status |
New |
In Progress |
|
2018-05-14 18:33:12 |
Joseph Salisbury |
linux (Ubuntu Bionic): importance |
Undecided |
Medium |
|
2018-05-14 18:33:15 |
Joseph Salisbury |
linux (Ubuntu Bionic): assignee |
|
Joseph Salisbury (jsalisbury) |
|
2018-05-14 18:33:18 |
Joseph Salisbury |
linux (Ubuntu Cosmic): assignee |
|
Joseph Salisbury (jsalisbury) |
|
2018-05-23 18:19:54 |
Joseph Salisbury |
linux (Ubuntu Cosmic): status |
In Progress |
Fix Committed |
|
2018-05-23 18:20:00 |
Joseph Salisbury |
linux (Ubuntu Bionic): status |
In Progress |
Fix Committed |
|
2018-05-23 18:25:28 |
Joseph Salisbury |
linux (Ubuntu Artful): status |
In Progress |
Fix Committed |
|
2018-05-23 18:25:32 |
Joseph Salisbury |
linux (Ubuntu Cosmic): status |
Fix Committed |
In Progress |
|
2018-05-23 18:29:49 |
Joseph Salisbury |
bug task deleted |
linux (Ubuntu Artful) |
|
|
2018-05-23 18:29:54 |
Joseph Salisbury |
bug task deleted |
linux (Ubuntu Bionic) |
|
|
2018-05-24 11:29:26 |
Joseph Salisbury |
bug task added |
|
linux-gcp (Ubuntu) |
|
2018-05-24 11:29:42 |
Joseph Salisbury |
bug task deleted |
linux-gcp (Ubuntu Cosmic) |
|
|
2018-05-24 11:29:48 |
Joseph Salisbury |
linux-gcp (Ubuntu): importance |
Undecided |
Medium |
|
2018-05-24 11:29:52 |
Joseph Salisbury |
linux-gcp (Ubuntu): status |
New |
In Progress |
|
2018-05-24 11:29:56 |
Joseph Salisbury |
linux-gcp (Ubuntu): assignee |
|
Joseph Salisbury (jsalisbury) |
|
2019-01-23 01:10:04 |
Joseph Salisbury |
linux (Ubuntu Cosmic): status |
In Progress |
Confirmed |
|
2019-01-23 01:10:07 |
Joseph Salisbury |
linux (Ubuntu): status |
In Progress |
Confirmed |
|
2019-01-23 01:10:09 |
Joseph Salisbury |
linux (Ubuntu): assignee |
Joseph Salisbury (jsalisbury) |
|
|
2019-01-23 01:10:11 |
Joseph Salisbury |
linux (Ubuntu Cosmic): assignee |
Joseph Salisbury (jsalisbury) |
|
|
2019-01-23 01:10:13 |
Joseph Salisbury |
linux-gcp (Ubuntu): assignee |
Joseph Salisbury (jsalisbury) |
|
|
2019-01-23 01:10:15 |
Joseph Salisbury |
linux-gcp (Ubuntu): status |
In Progress |
Confirmed |
|
2019-07-24 21:09:17 |
Brad Figg |
tags |
amd64 apport-bug apport-collected uec-images xenial |
amd64 apport-bug apport-collected cscc uec-images xenial |
|
2019-07-30 08:22:37 |
Po-Hsu Lin |
linux-gcp (Ubuntu): status |
Confirmed |
Fix Released |
|
2019-07-30 08:22:41 |
Po-Hsu Lin |
linux (Ubuntu): status |
Confirmed |
Fix Released |
|
2019-07-30 08:22:44 |
Po-Hsu Lin |
bug task deleted |
linux (Ubuntu Cosmic) |
|
|