Kernel Oops, possibly VMware ballooning related

Bug #1641403 reported by Pehr Söderman
This bug affects 3 people
Affects         Status     Importance  Assigned to  Milestone
linux (Ubuntu)  Confirmed  High        Unassigned
Xenial          Confirmed  High        Unassigned

Bug Description

Dear Maintainers

I have encountered a kernel oops that causes the VM to hang hard. This is an Ubuntu VM running on an ESXi 5.5 host. The hang recurs every few days. There is some memory pressure on the host, which explains the ballooning.

ProblemType: Bug
DistroRelease: Ubuntu 16.04
Package: linux-image-4.4.0-43-generic 4.4.0-43.63
ProcVersionSignature: Ubuntu 4.4.0-43.63-generic 4.4.21
Uname: Linux 4.4.0-43-generic i686
AlsaDevices:
 total 0
 crw-rw---- 1 root audio 116, 1 Nov 13 13:46 seq
 crw-rw---- 1 root audio 116, 33 Nov 13 13:46 timer
AplayDevices: Error: [Errno 2] No such file or directory: 'aplay'
ApportVersion: 2.20.1-0ubuntu2.1
Architecture: i386
ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord'
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
Date: Sun Nov 13 14:13:05 2016
HibernationDevice: RESUME=UUID=ce916614-f122-49a8-a4ae-3d4ac5c515d9
InstallationDate: Installed on 2011-10-30 (1841 days ago)
InstallationMedia: Ubuntu-Server 11.04 "Natty Narwhal" - Release i386 (20110426)
IwConfig:
 ens32 no wireless extensions.

 lo no wireless extensions.
Lsusb: Error: command ['lsusb'] failed with exit code 1:
MachineType: VMware, Inc. VMware Virtual Platform
PciMultimedia:

ProcFB: 0 svgadrmfb
ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-4.4.0-43-generic root=/dev/mapper/dev2-root ro quiet crashkernel=384M-:128M crashkernel=384M-:128M crashkernel=384M-:128M crashkernel=384M-:128M crashkernel=384M-:128M
RelatedPackageVersions:
 linux-restricted-modules-4.4.0-43-generic N/A
 linux-backports-modules-4.4.0-43-generic N/A
 linux-firmware 1.157.4
RfKill: Error: [Errno 2] No such file or directory: 'rfkill'
SourcePackage: linux
UpgradeStatus: Upgraded to xenial on 2016-08-30 (75 days ago)
dmi.bios.date: 09/17/2015
dmi.bios.vendor: Phoenix Technologies LTD
dmi.bios.version: 6.00
dmi.board.name: 440BX Desktop Reference Platform
dmi.board.vendor: Intel Corporation
dmi.board.version: None
dmi.chassis.asset.tag: No Asset Tag
dmi.chassis.type: 1
dmi.chassis.vendor: No Enclosure
dmi.chassis.version: N/A
dmi.modalias: dmi:bvnPhoenixTechnologiesLTD:bvr6.00:bd09/17/2015:svnVMware,Inc.:pnVMwareVirtualPlatform:pvrNone:rvnIntelCorporation:rn440BXDesktopReferencePlatform:rvrNone:cvnNoEnclosure:ct1:cvrN/A:
dmi.product.name: VMware Virtual Platform
dmi.product.version: None
dmi.sys.vendor: VMware, Inc.

Pehr Söderman (pehrs-7) wrote :

Here is the complete oops:

[134346.139427] BUG: unable to handle kernel NULL pointer dereference at 00000104
[134346.139570] IP: [<f8c71e63>] vmballoon_work+0x283/0x767 [vmw_balloon]
[134346.139715] *pdpt = 0000000035347001 *pde = 0000000000000000
[134346.139804] Oops: 0002 [#1] SMP
[134346.139857] Modules linked in: vmw_vsock_vmci_transport vsock binfmt_misc ppdev vmw_balloon joydev input_leds serio_raw 8250_fintek shpchp i2c_piix4 parport_pc vmw_vmci mac_hid quota_v2 quota_tree ib_iser rdma_cm iw_cm ib_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi lp parport autofs4 btrfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear vmwgfx psmouse ttm pcnet32 mptspi drm_kms_helper mptscsih syscopyarea sysfillrect mptbase sysimgblt fb_sys_fops mii drm scsi_transport_spi pata_acpi floppy fjes
[134346.140778] CPU: 0 PID: 13294 Comm: kworker/0:1 Not tainted 4.4.0-43-generic #63-Ubuntu
[134346.140879] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 09/17/2015
[134346.141047] Workqueue: events_freezable vmballoon_work [vmw_balloon]
[134346.141146] task: f1ee9180 ti: ed3be000 task.ti: ed3be000
[134346.141239] EIP: 0060:[<f8c71e63>] EFLAGS: 00010246 CPU: 0
[134346.141315] EIP is at vmballoon_work+0x283/0x767 [vmw_balloon]
[134346.141497] EAX: f76e4a3c EBX: 00000100 ECX: f76f8000 EDX: 000001bb
[134346.141713] ESI: 000001bb EDI: f8c74410 EBP: ed3bfefc ESP: ed3bfeb8
[134346.141769] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[134346.141818] CR0: 8005003b CR2: 00000104 CR3: 35a57d00 CR4: 000006f0
[134346.141938] Stack:
[134346.141962] f8c74330 ed3bfed4 00000286 000000fa f2826d00 ed3bfee4 00000286 f8c74330
[134346.142050] 00000000 00000001 f8c74300 f8c74300 00000100 000000ec f8c74410 f1694c00
[134346.142153] f5dbb8c0 ed3bff34 c1087751 aa510bc3 00007a2f aa4e1e10 00007a2f f5dbbdc0
[134346.142240] Call Trace:
[134346.142272] [<c1087751>] process_one_work+0x121/0x3f0
[134346.142321] [<c1087a57>] worker_thread+0x37/0x490
[134346.142444] [<c1087a20>] ? process_one_work+0x3f0/0x3f0
[134346.142496] [<c108cf06>] kthread+0xa6/0xc0
[134346.142547] [<c17b3309>] ret_from_kernel_thread+0x21/0x38
[134346.142600] [<c108ce60>] ? kthread_create_on_node+0x170/0x170
[134346.142653] Code: ff ff ff 85 c0 74 1c 8b 5c 24 24 8b 97 1c ff ff ff 05 00 02 00 00 0f af de 29 da 39 c2 0f 82 95 01 00 00 8b 59 14 8b 41 18 89 f2 <89> 43 04 89 18 8d 5e 01 c7 41 14 00 01 00 00 c7 41 18 00 02 00
[134346.148533] EIP: [<f8c71e63>] vmballoon_work+0x283/0x767 [vmw_balloon] SS:ESP 0068:ed3bfeb8
[134346.150469] CR2: 0000000000000104
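
One possible reading of the register dump, offered as a sketch rather than a confirmed diagnosis: the fault address 00000104 is EBX (00000100) plus 4, and the faulting bytes `<89> 43 04` decode to `mov %eax,0x4(%ebx)`. The value 0x100 matches `LIST_POISON1` from include/linux/poison.h when `POISON_POINTER_DELTA` is 0, which is assumed here for 32-bit x86. That pattern would be consistent with a `list_del()` on an entry that had already been removed. The helper names below are hypothetical; the input lines are copied from the oops above with timestamps dropped.

```python
import re

# Lines copied from the oops above (timestamps removed).
OOPS = """\
BUG: unable to handle kernel NULL pointer dereference at 00000104
EAX: f76e4a3c EBX: 00000100 ECX: f76f8000 EDX: 000001bb
ESI: 000001bb EDI: f8c74410 EBP: ed3bfefc ESP: ed3bfeb8
"""

# Assumption: POISON_POINTER_DELTA == 0, so the list poison values
# are the bare constants from include/linux/poison.h.
LIST_POISON = {0x100: "LIST_POISON1", 0x200: "LIST_POISON2"}


def fault_address(text):
    """Return the faulting address from the 'BUG:' line, or None."""
    m = re.search(
        r"(?:NULL pointer dereference|paging request) at ([0-9a-f]+)", text
    )
    return int(m.group(1), 16) if m else None


def registers(text):
    """Map 32-bit register names (EAX, EBX, ...) to their integer values."""
    pairs = re.findall(r"\b(E[A-Z]{2}): ([0-9a-f]{8})\b", text)
    return {name: int(value, 16) for name, value in pairs}


def near_registers(text, max_offset=0x40):
    """List registers whose value lies just below the fault address.

    Each hit is (register, offset, poison_name_or_None).
    """
    addr = fault_address(text)
    hits = []
    for name, value in registers(text).items():
        offset = addr - value
        if 0 <= offset <= max_offset:
            hits.append((name, offset, LIST_POISON.get(value)))
    return hits


if __name__ == "__main__":
    for name, offset, poison in near_registers(OOPS):
        note = f" ({poison})" if poison else ""
        print(f"fault = {name} + {offset:#x}{note}")
```

The same check can be run against other oops reports by pasting their `BUG:` and register lines into `OOPS`; a hit on a poison value is only a hint about where to look in the driver.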

Brad Figg (brad-figg) wrote : Status changed to Confirmed

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed
Joseph Salisbury (jsalisbury) wrote :

Would it be possible for you to test the latest upstream stable kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v4.4 stable kernel[0].

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

If you are unable to test the mainline kernel, for example it will not boot, please add the tag: 'kernel-unable-to-test-upstream'.
Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.4.31/

Changed in linux (Ubuntu):
importance: Undecided → High
Changed in linux (Ubuntu Xenial):
status: New → Confirmed
importance: Undecided → High
tags: added: kernel-da-key
Giovanni Panozzo (giox069) wrote :

Same problem here: I have an Ubuntu 16.04.6 LTS VM, 32-bit, running under VMware ESXi 6.5.
The VM freezes at random, roughly once a week. With kdump enabled, the machine crashes and leaves crash dumps in /var/crash.

I could try the mainline 4.4.31 kernel as suggested in the previous post, but that post is almost 3 years old. Which mainline kernel would you suggest trying now for Ubuntu 16.04.6 LTS 32-bit?

-----
Crash of 201907080446
-----------------------------
[331600.730775] BUG: unable to handle kernel paging request at f7c02000
[331600.730999] IP: [<c1178e6b>] set_pageblock_migratetype+0x1b/0x60
[331600.731207] *pdpt = 0000000001c93001 *pde = 0000000000000000
[331600.731374] Oops: 0000 [#1] SMP
[331600.731551] Modules linked in: vmw_vsock_vmci_transport vsock ppdev joydev input_leds vmw_balloon serio_raw nfit 8250_fintek shpchp vmw_vmci i2c_piix4 parport_pc parport mac_hid autofs4 btrfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear vmwgfx ttm drm_kms_helper syscopyarea psmouse sysfillrect sysimgblt fb_sys_fops drm pcnet32 vmw_pvscsi mii pata_acpi fjes
[331600.732702] CPU: 1 PID: 12401 Comm: kworker/1:0 Not tainted 4.4.0-79-generic #100-Ubuntu
[331600.732909] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 09/19/2018
[331600.733254] Workqueue: events_freezable vmballoon_work [vmw_balloon]
[331600.733424] task: f5b5e900 ti: efc66000 task.ti: efc66000
[331600.733593] EIP: 0060:[<c1178e6b>] EFLAGS: 00010046 CPU: 1
[331600.733754] EIP is at set_pageblock_migratetype+0x1b/0x60
[331600.733914] EAX: f7c02000 EBX: c1b7c808 ECX: 00000000 EDX: 00000000
[331600.734097] ESI: f7c07000 EDI: 00007e30 EBP: efc67d84 ESP: efc67d78
[331600.734265] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[331600.734425] CR0: 8005003b CR2: f7c02000 CR3: 350b2620 CR4: 000006f0
[331600.734691] Stack:
[331600.734838] 00000002 00000007 c1b7c808 efc67dcc c1179543 00000000 efc67dbb 00000000
[331600.735036] 00000001 0000000a c1b7c580 00000000 f72f2000 f72f2014 00000009 c1b7c808
[331600.735224] 01000002 e6b9af96 efc67e7c 00000141 00000000 efc67e44 c11799fe 00000002
[331600.735422] Call Trace:
[331600.735564] [<c1179543>] __rmqueue.isra.91+0x433/0x4c0
[331600.735749] [<c11799fe>] get_page_from_freelist+0x42e/0x860
[331600.735960] [<c10a3417>] ? dequeue_entity+0x3a7/0xf50
[331600.736164] [<c10a07e3>] ? set_next_entity+0xe3/0xcd0
[331600.736330] [<c117a9d4>] __alloc_pages_nodemask+0x114/0x280
[331600.736535] [<f8d5210e>] vmballoon_work+0x52e/0x767 [vmw_balloon]
[331600.736706] [<c10877c1>] process_one_work+0x121/0x3f0
[331600.736898] [<c1087ac7>] worker_thread+0x37/0x490
[331600.738698] [<c108cf83>] kthread+0xb3/0xd0
[331600.739900] [<c1087a90>] ? process_one_work+0x3f0/0x3f0
[331600.741157] [<c17c3e09>] ret_from_kernel_thread+0x21/0x38
[331600.742361] [<c108ced0>] ? kthread_create_on_node+0x170/0x170
[331600.743563] Code: 09 d1 f0 0f b1 0e 39 c3 75 ee 5b 5e 5f 5d c3 90 55 89 e5 53 83 ec 08 66 66 66 66 90 8b 0d 64 b4 b9 c1 85 c9 74 05 83 fa 02 7e 3d <8b> 08 89 c3 c7 44 24 04 07 00 00 00 c7 04 24 02 00 00 00 c1 e9
[331600.747346] EIP: [<c1178e6b>] set_pageblo...
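
Both traces in this report run on the `events_freezable` workqueue and pass through `vmballoon_work`, even though the faulting functions differ. A minimal sketch for pulling module-owned frames out of a call trace, so that several crash dumps can be compared quickly, might look like the following. The function names are hypothetical and the sample lines are a subset of the trace above with timestamps dropped. Frames marked with `?` are addresses found on the stack that the unwinder could not confirm, so the sketch treats them as unreliable.

```python
import re

# A subset of the call trace above (timestamps removed).
TRACE = """\
[<c1179543>] __rmqueue.isra.91+0x433/0x4c0
[<c11799fe>] get_page_from_freelist+0x42e/0x860
[<c10a3417>] ? dequeue_entity+0x3a7/0xf50
[<c117a9d4>] __alloc_pages_nodemask+0x114/0x280
[<f8d5210e>] vmballoon_work+0x52e/0x767 [vmw_balloon]
[<c10877c1>] process_one_work+0x121/0x3f0
"""

# address, optional '? ' marker, symbol+offset/size, optional [module]
FRAME = re.compile(
    r"\[<[0-9a-f]+>\] (\? )?([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+(?: \[(\w+)\])?"
)


def frames(text):
    """Parse call-trace lines into dicts with func, module, reliable."""
    parsed = []
    for m in FRAME.finditer(text):
        unreliable, func, module = m.groups()
        parsed.append(
            {"func": func, "module": module, "reliable": unreliable is None}
        )
    return parsed


def module_frames(text):
    """Return (function, module) pairs for reliable frames in loadable modules."""
    return [
        (f["func"], f["module"])
        for f in frames(text)
        if f["module"] and f["reliable"]
    ]


if __name__ == "__main__":
    for func, module in module_frames(TRACE):
        print(f"{func} [{module}]")
```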


Brad Figg (brad-figg)
tags: added: cscc