System freezes on specific kernel

Bug #1883681 reported by Anand Sakthivel
This bug affects 3 people
Affects: linux (Ubuntu)
Status: Confirmed
Importance: Undecided
Assigned to: Unassigned
Milestone: (none listed)

Bug Description

The servers run fine for some time and then suddenly freeze. They do not even respond to ping requests. We have to reboot the servers to bring them back online, but they do not stay stable and stop responding again.
---
AlsaDevices:
 total 0
 crw-rw---- 1 root audio 116, 1 Jun 15 17:52 seq
 crw-rw---- 1 root audio 116, 33 Jun 15 17:52 timer
AplayDevices: Error: [Errno 2] No such file or directory
ApportVersion: 2.20.1-0ubuntu2.23
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
DistroRelease: Ubuntu 16.04
HibernationDevice: RESUME=UUID=fe587596-cd68-4bbf-ae67-0e501fd6159c
IwConfig:
 lo no wireless extensions.

 ens160 no wireless extensions.
Lsusb: Error: command ['lsusb'] failed with exit code 1:
MachineType: VMware, Inc. VMware Virtual Platform
Package: linux (not installed)
PciMultimedia:

ProcEnviron:
 TERM=screen
 PATH=(custom, no user)
 LANG=en_US
 SHELL=/bin/bash
ProcFB: 0 svgadrmfb
ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-4.4.0-184-generic root=/dev/mapper/cdns01--vg-root ro crashkernel=384M-:128M
ProcVersionSignature: Ubuntu 4.4.0-184.214-generic 4.4.223
RelatedPackageVersions:
 linux-restricted-modules-4.4.0-184-generic N/A
 linux-backports-modules-4.4.0-184-generic N/A
 linux-firmware 1.157.23
RfKill: Error: [Errno 2] No such file or directory
Tags: xenial
Uname: Linux 4.4.0-184-generic x86_64
UpgradeStatus: Upgraded to xenial on 2017-05-15 (1127 days ago)
UserGroups:

_MarkForUpload: True
dmi.bios.date: 09/17/2015
dmi.bios.vendor: Phoenix Technologies LTD
dmi.bios.version: 6.00
dmi.board.name: 440BX Desktop Reference Platform
dmi.board.vendor: Intel Corporation
dmi.board.version: None
dmi.chassis.asset.tag: No Asset Tag
dmi.chassis.type: 1
dmi.chassis.vendor: No Enclosure
dmi.chassis.version: N/A
dmi.modalias: dmi:bvnPhoenixTechnologiesLTD:bvr6.00:bd09/17/2015:svnVMware,Inc.:pnVMwareVirtualPlatform:pvrNone:rvnIntelCorporation:rn440BXDesktopReferencePlatform:rvrNone:cvnNoEnclosure:ct1:cvrN/A:
dmi.product.name: VMware Virtual Platform
dmi.product.version: None
dmi.sys.vendor: VMware, Inc.

Anand Sakthivel (asakth22) wrote :
Paul White (paulw2u)
affects: ubuntu → linux (Ubuntu)
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1883681

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Anand Sakthivel (asakth22) wrote : CRDA.txt

apport information

tags: added: apport-collected xenial
description: updated
Anand Sakthivel (asakth22) wrote : CurrentDmesg.txt

apport information

Anand Sakthivel (asakth22) wrote : Lspci.txt

apport information

Anand Sakthivel (asakth22) wrote : ProcCpuinfo.txt

apport information

Anand Sakthivel (asakth22) wrote : ProcCpuinfoMinimal.txt

apport information

Anand Sakthivel (asakth22) wrote : ProcInterrupts.txt

apport information

Anand Sakthivel (asakth22) wrote : ProcModules.txt

apport information

Anand Sakthivel (asakth22) wrote : UdevDb.txt

apport information

Anand Sakthivel (asakth22) wrote : WifiSyslog.txt

apport information

Paul White (paulw2u) wrote :

Changing to confirmed as requested in comment #2.

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Mauricio Faria de Oliveira (mfo) wrote :

Hi Anand,

Thanks for reporting this bug.

Could you please try the kernel version in xenial-proposed? [1]
(version: 4.4.0-185.215)

It has a patch for what seems to be this problem, according
to the stack trace seen in apport's kernel crash dump below.

The patch is: 'net: handle no dst on skb in icmp6_send'

[1] https://wiki.ubuntu.com/Testing/EnableProposed

cheers,
Mauricio

...

The stacktrace from apport's 'kernel crash dump' attachment
(linux-image-4.4.0-184-generic-202006151751.crash):

$ apport-unpack linux-image-4.4.0-184-generic-202006151751.crash k/
$ ls k
Architecture Date DistroRelease Package ProblemType Uname VmCoreDmesg
$ cat k/VmCoreDmesg
...
[ 13.702003] nf_conntrack version 0.5.0 (16384 buckets, 65536 max)
[ 962.936170] BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
[ 962.936250] IP: [<ffffffff818288ab>] icmp6_send+0x1fb/0x970
[ 962.936296] PGD 0
[ 962.936314] Oops: 0000 [#1] SMP
[ 962.936341] Modules linked in: xt_tcpudp nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack iptable_filter ip_tables x_tables vmw_vsock_vmci_transport vsock coretemp ppdev vmw_balloon input_leds joydev serio_raw shpchp vmw_vmci i2c_piix4 8250_fintek parport_pc mac_hid lp parport autofs4 xfs libcrc32c vmwgfx psmouse ttm drm_kms_helper syscopyarea sysfillrect mptspi sysimgblt mptscsih fb_sys_fops mptbase drm vmxnet3 scsi_transport_spi pata_acpi floppy fjes
[ 962.936723] CPU: 3 PID: 0 Comm: swapper/3 Not tainted 4.4.0-184-generic #214-Ubuntu
[ 962.936775] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 09/17/2015
[ 962.936844] task: ffff88013a562700 ti: ffff88013a570000 task.ti: ffff88013a570000
[ 962.936893] RIP: 0010:[<ffffffff818288ab>] [<ffffffff818288ab>] icmp6_send+0x1fb/0x970
[ 962.936950] RSP: 0018:ffff88013fd83d00 EFLAGS: 00010246
[ 962.936986] RAX: 0000000000000000 RBX: ffff880139f88a00 RCX: 0000000000000020
[ 962.937032] RDX: 0000000000000001 RSI: 0000000000000200 RDI: ffff8800b8448fd6
[ 962.937079] RBP: ffff88013fd83e20 R08: 0000000000000000 R09: ffff8800b8448fe6
[ 962.937126] R10: 0000000000000080 R11: 0000000000000000 R12: ffff8800b8448fce
[ 962.937172] R13: ffffffff81efb6c0 R14: 0000000000000001 R15: 0000000000000003
[ 962.937219] FS: 0000000000000000(0000) GS:ffff88013fd80000(0000) knlGS:0000000000000000
[ 962.937272] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 962.937310] CR2: 0000000000000018 CR3: 00000000ba602000 CR4: 0000000000000670
[ 962.937429] Stack:
[ 962.937448] 0000000000000000 0000000000000000 000000e032fd4a6a ffff88010026d82b
[ 962.937505] ffffffff810baaea ffff88013fd963b0 0000000000000000 ffff8800b8448fd6
[ 962.937565] ffff880100000001 0000000000000000 ffff8800b8448fe6 ffffffff810c077a
[ 962.937623] Call Trace:
[ 962.937642] <IRQ>
[ 962.937664] [<ffffffff810baaea>] ? select_idle_sibling+0x2a/0x120
[ 962.937708] [<ffffffff810c077a>] ? enqueue_task_fair+0xaa/0x8b0
[ 962.937753] [<ffffffff81038119>] ? sched_clock+0x9/0x10
[ 962.937790] [<ffffffff810b8c8f>] ? sched_clock_cpu+0x8f/0xa0
[ 962.937832] [<ffffffff810b2524>] ? check_preempt_curr+0x54/0x90
[ 962.939091] [<ffffffff81868280>]...
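
As a rough illustration of how the faulting function can be read out of a trace like the one above, the sketch below pulls the symbol name from the first "IP:" line of the dmesg text. The helper name oops_symbol is hypothetical, and the pattern only assumes the "IP: [<address>] symbol+0xoffset/0xsize" layout shown in this particular trace; other kernel versions print this line differently.

```shell
#!/bin/sh
# Hypothetical helper (not part of apport or the kernel tools).
# Reads dmesg text on stdin and prints the symbol named on the first
# "IP: [<address>] symbol+0xoffset/0xsize" line, as seen in the trace above.
oops_symbol() {
    # The "] IP: [<" prefix skips the later "RIP: 0010:[<...>]" line,
    # which repeats the same symbol in a different layout.
    sed -n 's/.*] IP: \[<[0-9a-f]*>] \([A-Za-z0-9_.]*\)+0x.*/\1/p' | head -n 1
}
```

Assuming the crash file was unpacked into k/ as shown earlier, it could be used as "oops_symbol < k/VmCoreDmesg"; for the IP line quoted above this yields icmp6_send, the function named in the patch title.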


Anand Sakthivel (asakth22) wrote :

Hi,

Is the patch "net: handle no dst on skb in icmp6_send" now available in the released kernel version "4.4.0.186.192"?

linux-generic/xenial-updates,xenial-security 4.4.0.186.192 amd64 [upgradable from: 4.4.0.179.187]

Br,
Anand

Mauricio Faria de Oliveira (mfo) wrote :

Hi Anand,

Yes, that patch has been available since 4.4.0-185.215.

Please note that this version is for the linux-image-4.4.0-185-generic package.

You mentioned version numbers for the linux-generic meta package (which pulls in the linux-image-<version>-generic package as a dependency).

Either way, the first number after the base version (-185, -186, etc.) matches between both packages.

If you're on -186, the fix should be included, since it has been present since -185.

cheers,
Mauricio
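
The version comparison described in the comment above can be sketched as a small shell check. This is only an illustration under stated assumptions: the helper names kernel_abi and has_fix are hypothetical, the threshold 185 is taken from the comment, and the parsing assumes the three version layouts that appear in this report (4.4.0-185.215 for the image package, 4.4.0.186.192 for the meta package, and 4.4.0-184-generic from uname). It is meaningful only within the 4.4.0 series.

```shell
#!/bin/sh
# First ABI number stated in this thread to carry the patch
# 'net: handle no dst on skb in icmp6_send' (4.4.0-185.215).
FIX_ABI=185

# Hypothetical helper: print the ABI number, i.e. the first number after
# the three-part base version, whether it is separated by '-' or '.'.
kernel_abi() {
    printf '%s\n' "$1" |
        sed -n 's/^[0-9][0-9]*\.[0-9][0-9]*\.[0-9][0-9]*[-.]\([0-9][0-9]*\).*/\1/p'
}

# Hypothetical helper: succeed if the given version string is at or past
# the ABI number that introduced the fix; fail if older or unparseable.
has_fix() {
    abi=$(kernel_abi "$1")
    [ -n "$abi" ] && [ "$abi" -ge "$FIX_ABI" ]
}
```

For example, "has_fix "$(uname -r)" && echo 'fix expected'" checks the running kernel, while "has_fix 4.4.0.186.192" checks the meta package version quoted earlier in the thread.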
