Ubuntu 18.04.1 - [s390x] Kernel panic while stressing network bonding

Bug #1797367 reported by bugproxy on 2018-10-11
16
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Ubuntu on IBM z Systems
Critical
Canonical Kernel Team
linux (Ubuntu)
High
Skipper Bug Screeners
Bionic
High
Unassigned

Bug Description

== Comment: #0 - Athira Rajeev
---Problem Description---
While running a series of stress tests for network bonding on UBUNTU 18.04.1 with kernel 4.15.0-36.39, kernel panic is observed.
There are two instance of panic experienced with the same test procedures one of which indicates to be a kernel BUG.

Contact Information = Athira Rajeev <email address hidden>, Waiki Wright < <email address hidden> >

---uname output---
#39-Ubuntu SMP Mon Sep 24 16:13:24 UTC 2018 4.15.0-36.39

Machine Type = This issue is observed on z13 system
 ---Debugger---
A debugger was configured,

---Steps to Reproduce---
This happens while running stress tests for network bonding. kernel memory exposure attempt is detected and the BUG() is called from the code snippet: mm/usercopy.c:72
dump was configured and crash dump is available.
Results of few crash commands like bt, log are added in Attachment

Relevant part of dmesg pointing to kernel BUG

<<>>
[14746.977364] kernel BUG at /build/linux-PABIrW/linux-4.15.0/mm/usercopy.c:72!
[14746.977377] illegal operation: 0001 ilc:1 [#1] SMP
[14746.977378] Modules linked in: macsec vsock_diag vsock sctp_diag sctp dccp_diag dccp tcp_diag udp_diag raw_diag inet_diag unix_diag af_packet_diag netlink_diag bonding binfmt_misc qeth_l3 8021q garp mrp stp llc xt_tcpudp qeth_l2 nf_conntrack_ipv6 nf_defrag_ipv6 scsi_dh_rdac scsi_dh_emc scsi_dh_alua s390_trng ghash_s390 prng sha512_s390 sha256_s390 sha1_s390 sha_common chsc_sch eadm_sch nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack libcrc32c crc32_vx_s390 qeth ccwgroup ip6table_filter ip6_tables vfio_ccw vfio_mdev mdev vfio_iommu_type1 vfio iptable_filter sch_fq_codel ip_tables x_tables aes_s390 des_s390 des_generic dm_crypt dm_service_time dm_multipath zfcp scsi_transport_fc qdio dasd_eckd_mod dasd_mod btrfs xor zstd_compress raid6_pq zlib_deflate
[14746.977401] CPU: 1 PID: 20905 Comm: dump2tar Tainted: G OE 4.15.0-36-generic #39-Ubuntu
[14746.977403] Hardware name: IBM 3906 M02 757 (LPAR)
[14746.977404] Krnl PSW : 000000000f2d230d 000000006abe14d5 (__check_object_size+0x15a/0x1e0)
[14746.977408] R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3
[14746.977410] Krnl GPRS: 0000000000000002 0000000000e95334 0000000000000064 00000001e6518828
[14746.977412] 000000000037cc8e 0000000000000000 0000000000a9577c 0000000000000000
[14746.977413] 000000000000647b 00000001d8c120a8 0000000000000001 0000000000008088
[14746.977433] 00000001d8c0a020 000000000090da38 000000000037cc8e 000000016fdfbcd0
[14746.977440] Krnl Code: 000000000037cc82: c0200038ef69 larl %r2,a9ab54
                          000000000037cc88: c0e5fff32838 brasl %r14,1e1cf8
                         #000000000037cc8e: a7f40001 brc 15,37cc90
                         >000000000037cc92: e330d0080004 lg %r3,8(%r13)
                          000000000037cc98: e320d0000004 lg %r2,0(%r13)
                          000000000037cc9e: ecc2001a4065 clgrj %r12,%r2,4,37ccd2
                          000000000037cca4: b9040013 lgr %r1,%r3
                          000000000037cca8: ec31ff868064 cgrj %r3,%r1,8,37cbb4
[14746.977458] Call Trace:
[14746.977460] ([<000000000037cc8e>] __check_object_size+0x156/0x1e0)
[14746.977462] [<000000000010ac40>] debug_output+0x150/0x2f8
[14746.977464] [<00000000004e02c0>] full_proxy_read+0x80/0xe0
[14746.977466] [<0000000000382592>] vfs_read+0x8a/0x150
[14746.977467] [<0000000000382b2e>] SyS_read+0x66/0xe0
[14746.977469] [<00000000008e3c94>] system_call+0xd8/0x2c8
[14746.977470] Last Breaking-Event-Address:
[14746.977472] [<000000000037cc8e>] __check_object_size+0x156/0x1e0
[14746.977473]
<<>>

Adding one more occurrence of panic_on_oops below which appears to correlate to above .

Stack trace output:
Available traces added below

Oops output:
 [ 2140.467261] 8021q: adding VLAN 0 to HW filter on device bond0
[ 2140.467979] IPv6: ADDRCONF(NETDEV_UP): bond0: link is not ready
[ 2140.471609] IPv6: ADDRCONF(NETDEV_UP): bond0: link is not ready
[ 2140.471610] 8021q: adding VLAN 0 to HW filter on device bond0
[ 2140.472797] IPv6: ADDRCONF(NETDEV_UP): bond0: link is not ready
[ 2143.278986] Unable to handle kernel pointer dereference in virtual kernel address space
[ 2143.278991] Failing address: 7379732f6b657000 TEID: 7379732f6b657803
[ 2143.278993] Fault in home space mode while using kernel ASCE.
[ 2143.278996] AS:0000000000ea0007 R3:0000000000000024
[ 2143.279052] Oops: 0038 ilc:3 [#1] SMP
[ 2143.279055] Modules linked in: bonding 8021q garp mrp stp llc qeth_l3 binfmt_misc macsec vsock_diag vsock sctp_diag sctp dccp_diag dccp tcp_diag udp_diag raw_diag inet_diag unix_diag af_packet_diag netlink_diag xt_tcpudp qeth_l2 nf_conntrack_ipv6 nf_defrag_ipv6 scsi_dh_rdac scsi_dh_emc scsi_dh_alua nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack libcrc32c crc32_vx_s390 ghash_s390 prng sha512_s390 sha256_s390 sha1_s390 sha_common chsc_sch eadm_sch ip6table_filter ip6_tables qeth ccwgroup vfio_ccw vfio_mdev mdev vfio_iommu_type1 vfio iptable_filter sch_fq_codel ip_tables x_tables aes_s390 des_s390 des_generic dm_crypt dm_service_time dm_multipath zfcp scsi_transport_fc qdio dasd_eckd_mod dasd_mod btrfs xor zstd_compress raid6_pq zlib_deflate
[ 2143.279099] CPU: 16 PID: 172270 Comm: dump2tar Tainted: G OE 4.15.0-36-generic #39-Ubuntu
[ 2143.279100] Hardware name: IBM 2964 NC9 7A5 (LPAR)
[ 2143.279102] Krnl PSW : 00000000d3630b5f 00000000af8614fc (debug_output+0x188/0x2f8)
[ 2143.279108] R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3
[ 2143.279110] Krnl GPRS: 0000000000010000 ffffffff000002d8 7379732f6b65726e 00000001db91a020
[ 2143.279112] 0000000000000000 0000000000ea4ac8 00000001db91a020 00000000000009d2
[ 2143.279135] 0000000000000fe5 00000001000ff9ed 00000000000009d2 00000000000009d2
[ 2143.279137] 00000001db91a000 00000001db91a020 000000000010ac54 00000001d16cbd30
[ 2143.279146] Krnl Code: 000000000010ac68: 5810c010 l %r1,16(%r12)
                          000000000010ac6c: ec180063ff7e cij %r1,-1,8,10ad32
                         #000000000010ac72: e320c8280004 lg %r2,2088(%r12)
                         >000000000010ac78: e33020300002 ltg %r3,48(%r2)
                          000000000010ac7e: a784008f brc 8,10ad9c
                          000000000010ac82: 5a102028 a %r1,40(%r2)
                          000000000010ac86: 5010c010 st %r1,16(%r12)
                          000000000010ac8a: a7391000 lghi %r3,4096
[ 2143.279167] Call Trace:
[ 2143.279169] ([<000000000010ac40>] debug_output+0x150/0x2f8)
[ 2143.279172] [<00000000004e02c4>] full_proxy_read+0x84/0xe0
[ 2143.279175] [<0000000000382592>] vfs_read+0x8a/0x150
[ 2143.279177] [<0000000000382b2e>] SyS_read+0x66/0xe0
[ 2143.279180] [<00000000008e3c98>] system_call+0xdc/0x2c8
[ 2143.279182] Last Breaking-Event-Address:
[ 2143.279184] [<00000000008e7614>] __s390_indirect_jump_r14+0x0/0xc
[ 2143.279185]
[ 2143.279187] Kernel panic - not syncing: Fatal exception: panic_on_oops

System Dump Location:
 kdump was configured and crash dump is available. since crash dump is huge to be added as bugzilla attachment, results of few crash commands like bt, log will be added in Attachment

== Comment: #5 - Athira Rajeev
Hi,

since crash dump was huge to be added as bugzilla attachment, results of few crash commands like bt, log were added in the Attachment. Please let me know if required where to upload the dump files.

Thanks
Athira

Default Comment by Bridge

tags: added: architecture-s39064 bugnameltc-172244 severity-high targetmilestone-inin---
Changed in ubuntu:
assignee: nobody → Skipper Bug Screeners (skipper-screen-team)
affects: ubuntu → linux (Ubuntu)
Changed in ubuntu-z-systems:
importance: Undecided → High
assignee: nobody → Canonical Kernel Team (canonical-kernel-team)
Changed in linux (Ubuntu):
importance: Undecided → Critical
tags: added: kernel-key
Changed in linux (Ubuntu):
importance: Critical → High
Changed in linux (Ubuntu Bionic):
importance: Undecided → High
status: New → Triaged
Changed in linux (Ubuntu):
status: New → Triaged
Stefan Bader (smb) wrote :

Has this been observed only with this kernel version (or not tried), Or in other words did any 4.15 kernel not cause these issues.

Also we currently have the next scheduled kernel being in testing (-proposed). It contains a bunch of accumulated stable updates, so it would be good to see whether it still fails or maybe solves the issue already (https://launchpad.net/ubuntu/+source/linux/4.15.0-38.41).

Changed in ubuntu-z-systems:
status: New → Triaged
Andy Whitcroft (apw) wrote :

The stack trace looks to indicate a debugfs read triggering an out of bounds read against a kernel object backed by the slab. That is that the read is offset into the object such that the attempted read would have traversed onto the following object exposing kernel memory for an unrelated item. Do we know what the userspace application triggering this is actually doing when the fault triggers?

------- Comment From <email address hidden> 2018-10-12 05:58 EDT-------
Thanks for the update. Please find the response below.

1. The issue is reproducible on below versions of kernel:

4.15.0-36.39
4.15.0-35.38
4.15.0-33.36
4.15.0-20.21

Also the panic is reproducible on z14 machine.

We will re run with kernel 4.15.0-38.41 and update the results here

2. With respect to what userspace application triggering happens during the panic, this issue occurs during stress tests on network bonding and at a high level , the set of operations are as below:

# Create bond for which slaves are already configured
# Try to create bond for a slave with wrong port number. ( Failure test case scenario )
# Delete the created bond

The test infrastructure uses REST requests and we have collected the coredump files through nmcli commands along with other operations. As mentioned already in this bugzilla, since crash dump was huge to be added as bugzilla attachment, results of few crash commands like bt, log only were added in the Attachment.

One other observation is, at the time of the crash observed that the gzip tool is running with high cpu consumption reported through top.

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
191999 root 20 0 2424 1424 1152 R 100.0 0.0 0:37.86 gzip

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
68704 root 20 0 2424 1424 1152 R 93.8 0.0 0:41.97 gzip

Will respond back with testing results on 4.15.0-38.41 and a reduced set of exact test scenario. Meanwhile, please let me know where to upload the dump files and sosreport to help in further analysis

Frank Heimes (frank-heimes) wrote :

You may use apport (apport-cli / apport-bug)
and upload any data to this launchpad bug.
https://help.ubuntu.com/lts/serverguide/reporting-bugs.html.en

Alternatively you can do an (anonymous) ftp upload of very huge files here: archive.admin.canonical.com

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2018-10-12 07:28 EDT-------
(In reply to comment #10)
> You may use apport (apport-cli / apport-bug)
> and upload any data to this launchpad bug.
> https://help.ubuntu.com/lts/serverguide/reporting-bugs.html.en
>
> Alternatively you can do an (anonymous) ftp upload of very huge files here:
> archive.admin.canonical.com

Thanks, I have uploaded two dump files under directory /1797367 in archive.admin.canonical.com, Please let me know if its accessible

P0000M96-TESTREST-zSWTestAppliance-2018-10-08T06_50_42-AZIZ0104E-39181eaf-7093-455c-975e-f454936e1c92.bin
P0000S27-STEVE-zSWTestAppliance-2018-10-03T23_01_58-AZIZ0104E-1763fc60-2b01-415b-abd7-07c7502b3045.bin

Frank Heimes (frank-heimes) wrote :

I opened a RT ticket to retrieve the files.
They are now accessible (Canonical internal) via my home (fheimes) at 'mombin':

94293893 Oct 12 11:22 P0000M96-TESTREST-zSWTestAppliance-2018-10-08T06_50_42-AZIZ0104E-39181eaf-7093-455c-975e-f454936e1c92.bin
129272177 Oct 12 11:26 P0000S27-STEVE-zSWTestAppliance-2018-10-03T23_01_58-AZIZ0104E-1763fc60-2b01-415b-abd7-07c7502b3045.bin

Eberhard Pasch (epasch-u) wrote :

Any update? We do have more dumps if needed.

Frank Heimes (frank-heimes) wrote :

We would like to know the results from kernel 4.15.0-38 from proposed,
since that comes with several patches on top.
If dumps from that kernel are available,
please upload to the ftp server (#5) and I make them internally available, too.

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2018-10-17 05:32 EDT-------
Hi,

The issue still exists and we can recreate the kernel panic with kernel 4.15.0-38 proposed.
I have uploaded the dump for the crash on 4.15.0-38 proposed under the directory /1797367 in archive.admin.canonical.com
This is the new dump file: P0000S27-JOHN-zSWTestAppliance-2018-10-16T15_07_56-AZIZ0104E-8196323d-3fcf-44d8-9b1a-bd1e6a694bad.bin

Frank Heimes (frank-heimes) wrote :

Thanks for testing on 4.15.0-38 from proposed.
Requested the retrieval of the dump file ...

Frank Heimes (frank-heimes) wrote :

The dump using 4.15.0-38 from proposed is now accessible (Canonical internal) via my home (fheimes) at 'mombin':
$ ls P0000S27-JOHN-*.bin
P0000S27-JOHN-zSWTestAppliance-2018-10-16T15_07_56-AZIZ0104E-8196323d-3fcf-44d8-9b1a-bd1e6a694bad.bin

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2018-10-18 08:27 EDT-------
Increased priority for getting thsi fixed very urgent. Ship issue....Many thx in advance.....

tags: added: severity-critical targetmilestone-inin1804
removed: severity-high targetmilestone-inin---
Changed in ubuntu-z-systems:
importance: High → Critical

------- Comment on attachment From <email address hidden> 2018-10-18 09:05 EDT-------

With the following bash script alone we are able to recreate the panic within minutes.
It basically performs a series of network device enable/disable commands in parallel with collecting debug data using /sbin/dbginfo.sh from the s390tools package.
Replace the network device id "0.0.1d40" in the script to a device that exists on system running the OS.

Seems the problem happens when /sysfs files are being altered while dbginfo is used to collect data from sysfs at the same time.

Stefan Bader (smb) wrote :

So with the script we are able to re-create the panic on one of our systems. Which is good because I was not yet able to extract anything from the uploaded dump (using makedumpfile to convert from kdump_flat format seems to fail finding an end marker).

Generally from the data seen so far all point to a race between disabling a qeth device and accessing debugfs. Since tear down of network device structures is handled asynchronously there is a chance that this might be racing with the access to debug data. This is something that is generally known to upstream but did not receive that much attention either. Access to debugfs should be rather limited to, as its name indicates, debug use.

In that light, the question is whether this really should be treated release critical. Trying to get this race free might turn into a bigger task and at the same time should not be something any customer should run into during normal operation.

Changed in linux (Ubuntu Bionic):
status: Triaged → In Progress
Changed in ubuntu-z-systems:
status: Triaged → Confirmed
Stefan Bader (smb) wrote :

With more time spent on this, it feels like some question to be passed along to my former colleagues. Unfortunately the dump does not contain any data that is simple to track because the initial BUG() trigger already ended by the time the final panic is done.

However debug_output() is rather unique for s390x and related to the fast dmesg like debug feature. Toggling the ccw device will attach/detach it to/from the qeth driver which registers/deregisters s390dbf views. De-registration is the problematic part. While that call is taking a mutex and then decrements a ref-count to the debug info area and that in theory should be protected by debug_open() incrementing that reference. But then there is also a snapshot copy copy included in the file info structure and I am not really sure how that exactly relates to the original structure and whether debug_output() would need additional ref-counting.

Alternatively I also notice that qeth_core_exit() does not exactly undoes qeth_core_init(). It destroys the workqueue first while that was the first thing to create and the error path of qeth_core_init() would destroy the workqueue as the last thing, after having unregistered the debug views. Whether that really matters in this context, I am not sure.

Changed in linux (Ubuntu):
status: Triaged → In Progress
Changed in ubuntu-z-systems:
status: Confirmed → In Progress
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers