Random freezes on 4.15.0-60

Bug #1842726 reported by Antz
This bug affects 6 people
Affects: linux (Ubuntu)
Status: Fix Released
Importance: Undecided
Assigned to: Unassigned

Bug Description

lsb_release -rd:
`
Description: Ubuntu 18.04.3 LTS
Release: 18.04
`

Hi,

I get random freezes on 4.15.0-60. Neither */var/log/kern.log* nor */var/crash* contains anything useful. When it happens, it is as if the machine went into a NOP loop: everything freezes (including the mouse pointer) and it responds to nothing until a power cut or reset.

I wasn't able to find any steps to reproduce it; it seems completely random.

Kernel 4.15.0-58 works fine.

ProblemType: Bug
DistroRelease: Ubuntu 18.04
Package: linux-image-4.15.0-60-generic 4.15.0-60.67
ProcVersionSignature: Ubuntu 4.15.0-60.67-generic 4.15.18
Uname: Linux 4.15.0-60-generic x86_64
ApportVersion: 2.20.9-0ubuntu7.7
Architecture: amd64
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC1: andi 3522 F.... pulseaudio
 /dev/snd/controlC0: andi 3522 F.... pulseaudio
CurrentDesktop: ubuntu:GNOME
Date: Wed Sep 4 19:36:13 2019
HibernationDevice: RESUME=/dev/mapper/vgad--ssd-swap
InstallationDate: Installed on 2018-09-16 (352 days ago)
InstallationMedia: Ubuntu-Server 18.04.1 LTS "Bionic Beaver" - Release amd64 (20180725)
MachineType: System manufacturer System Product Name
ProcEnviron:
 TERM=xterm-256color
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=<set>
 LANG=de_DE.UTF-8
 SHELL=/bin/bash
ProcFB: 0 amdgpudrmfb
ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-4.15.0-60-generic root=/dev/mapper/vgad--ssd-root ro
RelatedPackageVersions:
 linux-restricted-modules-4.15.0-60-generic N/A
 linux-backports-modules-4.15.0-60-generic N/A
 linux-firmware 1.173.9
RfKill:

SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 04/19/2018
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: 0601
dmi.board.asset.tag: Default string
dmi.board.name: ROG CROSSHAIR VII HERO
dmi.board.vendor: ASUSTeK COMPUTER INC.
dmi.board.version: Rev 1.xx
dmi.chassis.asset.tag: Default string
dmi.chassis.type: 3
dmi.chassis.vendor: Default string
dmi.chassis.version: Default string
dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvr0601:bd04/19/2018:svnSystemmanufacturer:pnSystemProductName:pvrSystemVersion:rvnASUSTeKCOMPUTERINC.:rnROGCROSSHAIRVIIHERO:rvrRev1.xx:cvnDefaultstring:ct3:cvrDefaultstring:
dmi.product.family: To be filled by O.E.M.
dmi.product.name: System Product Name
dmi.product.version: System Version
dmi.sys.vendor: System manufacturer

Revision history for this message
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Status changed to Confirmed

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed
Revision history for this message
Kai-Heng Feng (kaihengfeng) wrote :

Can you please attach the output of `journalctl -b -1 -k` after rebooting from a freeze?
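
For reference, a minimal way to capture that after the machine has been reset (the output file name is just an example):

`
# kernel messages (-k) from the previous boot (-b -1), saved for attaching
journalctl -b -1 -k > journalctl-prev-boot-kernel.txt
`

Note that `-b -1` only has data if the journal is persistent; if `journalctl --list-boots` shows no earlier boots, creating /var/log/journal (sudo mkdir -p /var/log/journal) and rebooting should enable that.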

Revision history for this message
Antz (theantz) wrote :

Sure, see attachment.

I also noticed that it seems to require some system load for the freeze to happen.

Revision history for this message
Jean-Daniel Dupas (xooloo) wrote :

I'm also hitting this, and since I'm running it in a VM I managed to get a crash dump and got this line in GDB:

kernel BUG at /build/linux-5mCauq/linux-4.15.0/net/ipv4/ip_output.c:636
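
In case it helps others, a rough sketch of how such a dump can be taken from a libvirt/QEMU guest and opened on the host (guest name, file paths, and the exact vmlinux version are examples; the matching linux-image dbgsym package needs to be installed):

`
# dump the guest's memory in ELF format from the host
virsh dump guest1 /var/tmp/guest1.core --memory-only --format elf

# open the dump with the crash utility against the matching debug vmlinux
crash /usr/lib/debug/boot/vmlinux-4.15.0-60-generic /var/tmp/guest1.core
`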

Revision history for this message
Jean-Daniel Dupas (xooloo) wrote :

Looks like this is a duplicate of https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1842447

Just for the record: here is the kernel stack trace:

crash> bt
PID: 0 TASK: ffffffff82413480 CPU: 0 COMMAND: "swapper/0"
 #0 [ffff88807fc037d8] die at ffffffff81031d32
 #1 [ffff88807fc03808] do_trap at ffffffff8102dbf1
 #2 [ffff88807fc03850] do_error_trap at ffffffff8102e1b6
 #3 [ffff88807fc03910] do_invalid_op at ffffffff8102e670
 #4 [ffff88807fc03920] invalid_op at ffffffff81a00edb
    [exception RIP: ip_do_fragment+1154]
    RIP: ffffffff818ba2d2 RSP: ffff88807fc039d8 RFLAGS: 00010202
    RAX: 0000000000000001 RBX: ffff888077027900 RCX: ffffffff8184cdf0
    RDX: 0000000000000034 RSI: 00000000000005c8 RDI: ffff888078d8cb00
    RBP: ffff88807fc03a40 R8: ffff888078cf3600 R9: 00000000000005dc
    R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000014
    R13: ffff888078d8c800 R14: 0000000000000678 R15: ffff888078cf364e
    ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
 #5 [ffff88807fc03a48] ip_fragment.constprop.45 at ffffffff818ba6b3
 #6 [ffff88807fc03a60] ip_finish_output at ffffffff818ba872
 #7 [ffff88807fc03aa0] ip_output at ffffffff818bbd50
 #8 [ffff88807fc03b00] ip_forward_finish at ffffffff818b7951
 #9 [ffff88807fc03b28] ip_forward at ffffffff818b7cf6
#10 [ffff88807fc03b98] ip_rcv_finish at ffffffff818b57f9
#11 [ffff88807fc03bd0] ip_rcv at ffffffff818b61e6
#12 [ffff88807fc03c38] __netif_receive_skb_core at ffffffff81867f82
#13 [ffff88807fc03cd8] __netif_receive_skb at ffffffff81868708
#14 [ffff88807fc03d08] netif_receive_skb_internal at ffffffff8186aa95
#15 [ffff88807fc03d38] napi_gro_receive at ffffffff8186b815
#16 [ffff88807fc03d60] receive_buf at ffffffffc0034f92 [virtio_net]
#17 [ffff88807fc03e48] virtnet_poll at ffffffffc00361e2 [virtio_net]
#18 [ffff88807fc03ec0] net_rx_action at ffffffff8186aef0
#19 [ffff88807fc03f40] __softirqentry_text_start at ffffffff81c000e4
#20 [ffff88807fc03fa8] irq_exit at ffffffff81096ae5
#21 [ffff88807fc03fb8] do_IRQ at ffffffff81a02736
--- <IRQ stack> ---
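
The exception RIP above (ip_do_fragment+1154) can be resolved back to that ip_output.c:636 line using the debug vmlinux, for example (the path assumes the dbgsym package for -60 is installed):

`
# map the faulting symbol+offset to a source file and line
gdb -batch -ex 'info line *(ip_do_fragment+1154)' /usr/lib/debug/boot/vmlinux-4.15.0-60-generic
# inside crash, "dis -l ip_do_fragment+1154" shows the same information
`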

Revision history for this message
Antz (theantz) wrote :

Could be. I currently don't have any Docker stuff running, but I have lots of bridges, network namespaces, and NAT going on in some of them. Increasing the load there seems to increase the likelihood of running into the bug.

Revision history for this message
Jean-Daniel Dupas (xooloo) wrote :

I don't have any Docker either, but I'm also using this machine as a NAT gateway under very high load, and it reliably crashes in less than two seconds after taking over the master role and starting to handle traffic.

Note that I'm using a pretty basic configuration (NAT only, no bridge, no fancy network stuff, …)
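
For context, a minimal sketch of that kind of setup is just forwarding plus masquerading (the interface name is an example):

`
# enable IPv4 forwarding and masquerade traffic leaving the uplink
sysctl -w net.ipv4.ip_forward=1
iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE
`

That alone is enough to push forwarded packets through ip_forward()/ip_fragment(), which is where the trace above dies.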

Revision history for this message
Roddie Hasan (roddie) wrote :

I am having this same issue and I do run Docker Compose, but not with any name-servers set explicitly. I just downgraded to 4.15.0-58 to see if that stabilizes things.
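
In case anyone else wants to drop back too, a sketch of what that looks like on bionic (package names follow the bionic layout; adjust to the flavour actually in use):

`
# make sure the known-good kernel is still installed, then pick it under
# "Advanced options for Ubuntu" in GRUB on the next boot
sudo apt install linux-image-4.15.0-58-generic linux-modules-extra-4.15.0-58-generic
# optionally keep the meta packages from pulling -60 back in
sudo apt-mark hold linux-image-generic linux-generic
`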

Revision history for this message
Kai-Heng Feng (kaihengfeng) wrote :

Can you please try -62 in bionic-proposed?
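
For anyone else testing, a rough way to pull that kernel from bionic-proposed (mirror URL and exact version are examples based on the -62 mentioned above; this enables the whole proposed pocket, so install the kernel packages explicitly rather than dist-upgrading):

`
# enable the proposed pocket
echo 'deb http://archive.ubuntu.com/ubuntu bionic-proposed main restricted universe multiverse' | sudo tee /etc/apt/sources.list.d/bionic-proposed.list
sudo apt update
# install only the proposed kernel, then reboot into it
sudo apt install linux-image-4.15.0-62-generic linux-modules-extra-4.15.0-62-generic
`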

Revision history for this message
Antz (theantz) wrote :

Seems to run fine for me, thx for the fix!

Antz (theantz)
Changed in linux (Ubuntu):
status: Confirmed → Fix Released