Ubuntu 18.04.1 - [s390x] Kernel panic while stressing network bonding

Bug #1797367 reported by bugproxy
18
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Ubuntu on IBM z Systems
Fix Released
Critical
Canonical Kernel Team
linux (Ubuntu)
Fix Released
Critical
Skipper Bug Screeners
Bionic
Fix Released
Critical
Unassigned
Cosmic
Fix Released
Critical
Unassigned

Bug Description

== SRU Justification ==

While running a series of stress tests for network on a bond device on Ubuntu 18.04.1 with kernel 4.15.0-36.39,
kernel panic is observed (btw. also on non-bond devices).
This looks like a race between disabling a qeth device and accessing debugfs.
This is critical and leads repeatedly to a crash (sooner or later).

== Fix ==

e19e5be8b4ca ("s390/qeth: sanitize strings in debug messages")

pre-reqs:
750b162 ("s390/qeth: reduce hard-coded access to ccw channels")
d857e11 ("s390/qeth: remove outdated portname debug msg")
9d0a58f ("s390/qeth: avoid using is_multicast_ether_addr_64bits on (u8 *)[6]")
8174aa8 ("s390/qeth: consolidate qeth MAC address helpers")
4641b02 ("s390/qeth: don't keep track of MAC address's cast type")

== Regression Potential ==

Low, because:

- limited to s390x
- and again limited to qeth driver
- patches a problem identified during testing
- fix was tested by IBM before submitted

== Test Case ==

run:
   #!/bin/bash
   var=0
   while :
   do
        var=$((var + 1))
        echo "DBG count is $var"
        mkdir /tmp/DBGINFO
        dbginfo.sh -d /tmp/DBGINFO
        rm -rf /tmp/DBGINFO*
        echo "chzdev now is $var"
        chzdev -e <qeth device>
        chzdev -d <qeth device>
   done
and in avg. in less than 20 cycles a crash happens (usually < 10).

__________

== Comment: #0 - Athira Rajeev
---Problem Description---
While running a series of stress tests for network bonding on UBUNTU 18.04.1 with kernel 4.15.0-36.39, kernel panic is observed.
There are two instance of panic experienced with the same test procedures one of which indicates to be a kernel BUG.

Contact Information = Athira Rajeev <email address hidden>, Waiki Wright < <email address hidden> >

---uname output---
#39-Ubuntu SMP Mon Sep 24 16:13:24 UTC 2018 4.15.0-36.39

Machine Type = This issue is observed on z13 system
 ---Debugger---
A debugger was configured,

---Steps to Reproduce---
This happens while running stress tests for network bonding. kernel memory exposure attempt is detected and the BUG() is called from the code snippet: mm/usercopy.c:72
dump was configured and crash dump is available.
Results of few crash commands like bt, log are added in Attachment

Relevant part of dmesg pointing to kernel BUG

<<>>
[14746.977364] kernel BUG at /build/linux-PABIrW/linux-4.15.0/mm/usercopy.c:72!
[14746.977377] illegal operation: 0001 ilc:1 [#1] SMP
[14746.977378] Modules linked in: macsec vsock_diag vsock sctp_diag sctp dccp_diag dccp tcp_diag udp_diag raw_diag inet_diag unix_diag af_packet_diag netlink_diag bonding binfmt_misc qeth_l3 8021q garp mrp stp llc xt_tcpudp qeth_l2 nf_conntrack_ipv6 nf_defrag_ipv6 scsi_dh_rdac scsi_dh_emc scsi_dh_alua s390_trng ghash_s390 prng sha512_s390 sha256_s390 sha1_s390 sha_common chsc_sch eadm_sch nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack libcrc32c crc32_vx_s390 qeth ccwgroup ip6table_filter ip6_tables vfio_ccw vfio_mdev mdev vfio_iommu_type1 vfio iptable_filter sch_fq_codel ip_tables x_tables aes_s390 des_s390 des_generic dm_crypt dm_service_time dm_multipath zfcp scsi_transport_fc qdio dasd_eckd_mod dasd_mod btrfs xor zstd_compress raid6_pq zlib_deflate
[14746.977401] CPU: 1 PID: 20905 Comm: dump2tar Tainted: G OE 4.15.0-36-generic #39-Ubuntu
[14746.977403] Hardware name: IBM 3906 M02 757 (LPAR)
[14746.977404] Krnl PSW : 000000000f2d230d 000000006abe14d5 (__check_object_size+0x15a/0x1e0)
[14746.977408] R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3
[14746.977410] Krnl GPRS: 0000000000000002 0000000000e95334 0000000000000064 00000001e6518828
[14746.977412] 000000000037cc8e 0000000000000000 0000000000a9577c 0000000000000000
[14746.977413] 000000000000647b 00000001d8c120a8 0000000000000001 0000000000008088
[14746.977433] 00000001d8c0a020 000000000090da38 000000000037cc8e 000000016fdfbcd0
[14746.977440] Krnl Code: 000000000037cc82: c0200038ef69 larl %r2,a9ab54
                          000000000037cc88: c0e5fff32838 brasl %r14,1e1cf8
                         #000000000037cc8e: a7f40001 brc 15,37cc90
                         >000000000037cc92: e330d0080004 lg %r3,8(%r13)
                          000000000037cc98: e320d0000004 lg %r2,0(%r13)
                          000000000037cc9e: ecc2001a4065 clgrj %r12,%r2,4,37ccd2
                          000000000037cca4: b9040013 lgr %r1,%r3
                          000000000037cca8: ec31ff868064 cgrj %r3,%r1,8,37cbb4
[14746.977458] Call Trace:
[14746.977460] ([<000000000037cc8e>] __check_object_size+0x156/0x1e0)
[14746.977462] [<000000000010ac40>] debug_output+0x150/0x2f8
[14746.977464] [<00000000004e02c0>] full_proxy_read+0x80/0xe0
[14746.977466] [<0000000000382592>] vfs_read+0x8a/0x150
[14746.977467] [<0000000000382b2e>] SyS_read+0x66/0xe0
[14746.977469] [<00000000008e3c94>] system_call+0xd8/0x2c8
[14746.977470] Last Breaking-Event-Address:
[14746.977472] [<000000000037cc8e>] __check_object_size+0x156/0x1e0
[14746.977473]
<<>>

Adding one more occurrence of panic_on_oops below which appears to correlate to above .

Stack trace output:
Available traces added below

Oops output:
 [ 2140.467261] 8021q: adding VLAN 0 to HW filter on device bond0
[ 2140.467979] IPv6: ADDRCONF(NETDEV_UP): bond0: link is not ready
[ 2140.471609] IPv6: ADDRCONF(NETDEV_UP): bond0: link is not ready
[ 2140.471610] 8021q: adding VLAN 0 to HW filter on device bond0
[ 2140.472797] IPv6: ADDRCONF(NETDEV_UP): bond0: link is not ready
[ 2143.278986] Unable to handle kernel pointer dereference in virtual kernel address space
[ 2143.278991] Failing address: 7379732f6b657000 TEID: 7379732f6b657803
[ 2143.278993] Fault in home space mode while using kernel ASCE.
[ 2143.278996] AS:0000000000ea0007 R3:0000000000000024
[ 2143.279052] Oops: 0038 ilc:3 [#1] SMP
[ 2143.279055] Modules linked in: bonding 8021q garp mrp stp llc qeth_l3 binfmt_misc macsec vsock_diag vsock sctp_diag sctp dccp_diag dccp tcp_diag udp_diag raw_diag inet_diag unix_diag af_packet_diag netlink_diag xt_tcpudp qeth_l2 nf_conntrack_ipv6 nf_defrag_ipv6 scsi_dh_rdac scsi_dh_emc scsi_dh_alua nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack libcrc32c crc32_vx_s390 ghash_s390 prng sha512_s390 sha256_s390 sha1_s390 sha_common chsc_sch eadm_sch ip6table_filter ip6_tables qeth ccwgroup vfio_ccw vfio_mdev mdev vfio_iommu_type1 vfio iptable_filter sch_fq_codel ip_tables x_tables aes_s390 des_s390 des_generic dm_crypt dm_service_time dm_multipath zfcp scsi_transport_fc qdio dasd_eckd_mod dasd_mod btrfs xor zstd_compress raid6_pq zlib_deflate
[ 2143.279099] CPU: 16 PID: 172270 Comm: dump2tar Tainted: G OE 4.15.0-36-generic #39-Ubuntu
[ 2143.279100] Hardware name: IBM 2964 NC9 7A5 (LPAR)
[ 2143.279102] Krnl PSW : 00000000d3630b5f 00000000af8614fc (debug_output+0x188/0x2f8)
[ 2143.279108] R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3
[ 2143.279110] Krnl GPRS: 0000000000010000 ffffffff000002d8 7379732f6b65726e 00000001db91a020
[ 2143.279112] 0000000000000000 0000000000ea4ac8 00000001db91a020 00000000000009d2
[ 2143.279135] 0000000000000fe5 00000001000ff9ed 00000000000009d2 00000000000009d2
[ 2143.279137] 00000001db91a000 00000001db91a020 000000000010ac54 00000001d16cbd30
[ 2143.279146] Krnl Code: 000000000010ac68: 5810c010 l %r1,16(%r12)
                          000000000010ac6c: ec180063ff7e cij %r1,-1,8,10ad32
                         #000000000010ac72: e320c8280004 lg %r2,2088(%r12)
                         >000000000010ac78: e33020300002 ltg %r3,48(%r2)
                          000000000010ac7e: a784008f brc 8,10ad9c
                          000000000010ac82: 5a102028 a %r1,40(%r2)
                          000000000010ac86: 5010c010 st %r1,16(%r12)
                          000000000010ac8a: a7391000 lghi %r3,4096
[ 2143.279167] Call Trace:
[ 2143.279169] ([<000000000010ac40>] debug_output+0x150/0x2f8)
[ 2143.279172] [<00000000004e02c4>] full_proxy_read+0x84/0xe0
[ 2143.279175] [<0000000000382592>] vfs_read+0x8a/0x150
[ 2143.279177] [<0000000000382b2e>] SyS_read+0x66/0xe0
[ 2143.279180] [<00000000008e3c98>] system_call+0xdc/0x2c8
[ 2143.279182] Last Breaking-Event-Address:
[ 2143.279184] [<00000000008e7614>] __s390_indirect_jump_r14+0x0/0xc
[ 2143.279185]
[ 2143.279187] Kernel panic - not syncing: Fatal exception: panic_on_oops

System Dump Location:
 kdump was configured and crash dump is available. since crash dump is huge to be added as bugzilla attachment, results of few crash commands like bt, log will be added in Attachment

== Comment: #5 - Athira Rajeev
Hi,

since crash dump was huge to be added as bugzilla attachment, results of few crash commands like bt, log were added in the Attachment. Please let me know if required where to upload the dump files.

Thanks
Athira

Revision history for this message
bugproxy (bugproxy) wrote : crash dump analysis logs

Default Comment by Bridge

tags: added: architecture-s39064 bugnameltc-172244 severity-high targetmilestone-inin---
Changed in ubuntu:
assignee: nobody → Skipper Bug Screeners (skipper-screen-team)
affects: ubuntu → linux (Ubuntu)
Frank Heimes (fheimes)
Changed in ubuntu-z-systems:
importance: Undecided → High
assignee: nobody → Canonical Kernel Team (canonical-kernel-team)
Changed in linux (Ubuntu):
importance: Undecided → Critical
tags: added: kernel-key
Changed in linux (Ubuntu):
importance: Critical → High
Changed in linux (Ubuntu Bionic):
importance: Undecided → High
status: New → Triaged
Changed in linux (Ubuntu):
status: New → Triaged
Revision history for this message
Stefan Bader (smb) wrote :

Has this been observed only with this kernel version (or not tried), Or in other words did any 4.15 kernel not cause these issues.

Also we currently have the next scheduled kernel being in testing (-proposed). It contains a bunch of accumulated stable updates, so it would be good to see whether it still fails or maybe solves the issue already (https://launchpad.net/ubuntu/+source/linux/4.15.0-38.41).

Frank Heimes (fheimes)
Changed in ubuntu-z-systems:
status: New → Triaged
Revision history for this message
Andy Whitcroft (apw) wrote :

The stack trace looks to indicate a debugfs read triggering an out of bounds read against a kernel object backed by the slab. That is that the read is offset into the object such that the attempted read would have traversed onto the following object exposing kernel memory for an unrelated item. Do we know what the userspace application triggering this is actually doing when the fault triggers?

Revision history for this message
bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2018-10-12 05:58 EDT-------
Thanks for the update. Please find the response below.

1. The issue is reproducible on below versions of kernel:

4.15.0-36.39
4.15.0-35.38
4.15.0-33.36
4.15.0-20.21

Also the panic is reproducible on z14 machine.

We will re run with kernel 4.15.0-38.41 and update the results here

2. With respect to what userspace application triggering happens during the panic, this issue occurs during stress tests on network bonding and at a high level , the set of operations are as below:

# Create bond for which slaves are already configured
# Try to create bond for a slave with wrong port number. ( Failure test case scenario )
# Delete the created bond

The test infrastructure uses REST requests and we have collected the coredump files through nmcli commands along with other operations. As mentioned already in this bugzilla, since crash dump was huge to be added as bugzilla attachment, results of few crash commands like bt, log only were added in the Attachment.

One other observation is, at the time of the crash observed that the gzip tool is running with high cpu consumption reported through top.

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
191999 root 20 0 2424 1424 1152 R 100.0 0.0 0:37.86 gzip

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
68704 root 20 0 2424 1424 1152 R 93.8 0.0 0:41.97 gzip

Will respond back with testing results on 4.15.0-38.41 and a reduced set of exact test scenario. Meanwhile, please let me know where to upload the dump files and sosreport to help in further analysis

Revision history for this message
Frank Heimes (fheimes) wrote :

You may use apport (apport-cli / apport-bug)
and upload any data to this launchpad bug.
https://help.ubuntu.com/lts/serverguide/reporting-bugs.html.en

Alternatively you can do an (anonymous) ftp upload of very huge files here: archive.admin.canonical.com

Revision history for this message
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2018-10-12 07:28 EDT-------
(In reply to comment #10)
> You may use apport (apport-cli / apport-bug)
> and upload any data to this launchpad bug.
> https://help.ubuntu.com/lts/serverguide/reporting-bugs.html.en
>
> Alternatively you can do an (anonymous) ftp upload of very huge files here:
> archive.admin.canonical.com

Thanks, I have uploaded two dump files under directory /1797367 in archive.admin.canonical.com, Please let me know if its accessible

P0000M96-TESTREST-zSWTestAppliance-2018-10-08T06_50_42-AZIZ0104E-39181eaf-7093-455c-975e-f454936e1c92.bin
P0000S27-STEVE-zSWTestAppliance-2018-10-03T23_01_58-AZIZ0104E-1763fc60-2b01-415b-abd7-07c7502b3045.bin

Revision history for this message
Frank Heimes (fheimes) wrote :

I opened a RT ticket to retrieve the files.
They are now accessible (Canonical internal) via my home (fheimes) at 'mombin':

94293893 Oct 12 11:22 P0000M96-TESTREST-zSWTestAppliance-2018-10-08T06_50_42-AZIZ0104E-39181eaf-7093-455c-975e-f454936e1c92.bin
129272177 Oct 12 11:26 P0000S27-STEVE-zSWTestAppliance-2018-10-03T23_01_58-AZIZ0104E-1763fc60-2b01-415b-abd7-07c7502b3045.bin

Revision history for this message
Eberhard Pasch (epasch-u) wrote :

Any update? We do have more dumps if needed.

Revision history for this message
Frank Heimes (fheimes) wrote :

We would like to know the results from kernel 4.15.0-38 from proposed,
since that comes with several patches on top.
If dumps from that kernel are available,
please upload to the ftp server (#5) and I make them internally available, too.

Revision history for this message
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2018-10-17 05:32 EDT-------
Hi,

The issue still exists and we can recreate the kernel panic with kernel 4.15.0-38 proposed.
I have uploaded the dump for the crash on 4.15.0-38 proposed under the directory /1797367 in archive.admin.canonical.com
This is the new dump file: P0000S27-JOHN-zSWTestAppliance-2018-10-16T15_07_56-AZIZ0104E-8196323d-3fcf-44d8-9b1a-bd1e6a694bad.bin

Revision history for this message
Frank Heimes (fheimes) wrote :

Thanks for testing on 4.15.0-38 from proposed.
Requested the retrieval of the dump file ...

Revision history for this message
Frank Heimes (fheimes) wrote :

The dump using 4.15.0-38 from proposed is now accessible (Canonical internal) via my home (fheimes) at 'mombin':
$ ls P0000S27-JOHN-*.bin
P0000S27-JOHN-zSWTestAppliance-2018-10-16T15_07_56-AZIZ0104E-8196323d-3fcf-44d8-9b1a-bd1e6a694bad.bin

Revision history for this message
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2018-10-18 08:27 EDT-------
Increased priority for getting thsi fixed very urgent. Ship issue....Many thx in advance.....

tags: added: severity-critical targetmilestone-inin1804
removed: severity-high targetmilestone-inin---
Frank Heimes (fheimes)
Changed in ubuntu-z-systems:
importance: High → Critical
Revision history for this message
bugproxy (bugproxy) wrote : bash script to recreate the panic

------- Comment on attachment From <email address hidden> 2018-10-18 09:05 EDT-------

With the following bash script alone we are able to recreate the panic within minutes.
It basically performs a series of network device enable/disable commands in parallel with collecting debug data using /sbin/dbginfo.sh from the s390tools package.
Replace the network device id "0.0.1d40" in the script to a device that exists on system running the OS.

Seems the problem happens when /sysfs files are being altered while dbginfo is used to collect data from sysfs at the same time.

Revision history for this message
Stefan Bader (smb) wrote :

So with the script we are able to re-create the panic on one of our systems. Which is good because I was not yet able to extract anything from the uploaded dump (using makedumpfile to convert from kdump_flat format seems to fail finding an end marker).

Generally from the data seen so far all point to a race between disabling a qeth device and accessing debugfs. Since tear down of network device structures is handled asynchronously there is a chance that this might be racing with the access to debug data. This is something that is generally known to upstream but did not receive that much attention either. Access to debugfs should be rather limited to, as its name indicates, debug use.

In that light, the question is whether this really should be treated release critical. Trying to get this race free might turn into a bigger task and at the same time should not be something any customer should run into during normal operation.

Frank Heimes (fheimes)
Changed in linux (Ubuntu Bionic):
status: Triaged → In Progress
Changed in ubuntu-z-systems:
status: Triaged → Confirmed
Revision history for this message
Stefan Bader (smb) wrote :

With more time spent on this, it feels like some question to be passed along to my former colleagues. Unfortunately the dump does not contain any data that is simple to track because the initial BUG() trigger already ended by the time the final panic is done.

However debug_output() is rather unique for s390x and related to the fast dmesg like debug feature. Toggling the ccw device will attach/detach it to/from the qeth driver which registers/deregisters s390dbf views. De-registration is the problematic part. While that call is taking a mutex and then decrements a ref-count to the debug info area and that in theory should be protected by debug_open() incrementing that reference. But then there is also a snapshot copy copy included in the file info structure and I am not really sure how that exactly relates to the original structure and whether debug_output() would need additional ref-counting.

Alternatively I also notice that qeth_core_exit() does not exactly undoes qeth_core_init(). It destroys the workqueue first while that was the first thing to create and the error path of qeth_core_init() would destroy the workqueue as the last thing, after having unregistered the debug views. Whether that really matters in this context, I am not sure.

Frank Heimes (fheimes)
Changed in linux (Ubuntu):
status: Triaged → In Progress
Changed in ubuntu-z-systems:
status: Confirmed → In Progress
Revision history for this message
bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2018-10-24 05:58 EDT-------
The dump using 4.15.0-38 from proposed is now also provided on box in folder
https://ibm.ent.box.com/folder/56333504148 along with its md5.
P0000S27-JOHN-zSWTestAppliance-2018-10-16T15_07_56-AZIZ0104E-8196323d-3fcf-44d8-9b1a-bd1e6a694bad

Please verify whether this dump is usable for debugging.

Thanks

Revision history for this message
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2018-10-24 06:09 EDT-------
(In reply to comment #22)
> So with the script we are able to re-create the panic on one of our systems.
> Which is good because I was not yet able to extract anything from the
> uploaded dump (using makedumpfile to convert from kdump_flat format seems to
> fail finding an end marker).
>
> Generally from the data seen so far all point to a race between disabling a
> qeth device and accessing debugfs. Since tear down of network device
> structures is handled asynchronously there is a chance that this might be
> racing with the access to debug data. This is something that is generally
> known to upstream but did not receive that much attention either. Access to
> debugfs should be rather limited to, as its name indicates, debug use.
>
> In that light, the question is whether this really should be treated release
> critical. Trying to get this race free might turn into a bigger task and at
> the same time should not be something any customer should run into during
> normal operation.

The debugfs, despite its name, is mounted by default in every linux image at /sys/kernel/debug. The officially supported dbginfo.sh script from s390tools collects data from this debugfs filesystem, therefore we do not agree that it is for debug purposes only. We are using dbginfo for collecting diagnostic data on any kind of userspace or kernel errors concurrently on a running image.

Revision history for this message
Frank Heimes (fheimes) wrote :

The dump from the 'IBM Box' file share has a different md5 hash (and a slightly different size) - compared to the one from the ftp server.
But this time I could successfully 'rearrange' (-R) the dump file.
Hence I replaced the the old file (P0000S27-JOHN-zSWTestAppliance-2018-10-16T15_07_56-AZIZ0104E-8196323d-3fcf-44d8-9b1a-bd1e6a694bad.bin) on mombin for internal access (and double checked the hash again).

Revision history for this message
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2018-10-29 09:16 EDT-------
IBM is investigation this problem. A potential fix seems to be available, but need to tested first and upstream accepted ahead of distribution to Canonical.

Revision history for this message
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2018-11-07 07:40 EDT-------
Should be resolved by mainline commit e19e5be8b4ca ("s390/qeth: sanitize strings in debug messages"). To cut down on the conflicts when applying against 18.04/master-next, I would suggest to also consider the following cherry-picks:

750b162598ec s390/qeth: reduce hard-coded access to ccw channels
d857e11193a2 s390/qeth: remove outdated portname debug msg
9ecd196f9106 s390/qeth: avoid using is_multicast_ether_addr_64bits on (u8 *)[6]
8174aa8aceef s390/qeth: consolidate qeth MAC address helpers
4641b027f7c3 s390/qeth: don't keep track of MAC address's cast type

Frank Heimes (fheimes)
description: updated
Revision history for this message
Frank Heimes (fheimes) wrote :

SRU submitted:
[SRU][Cosmic][Bionic][PATCH 0/6] Fixes for LP1797367
https://lists.ubuntu.com/archives/kernel-team/2018-November/096556.html

Stefan Bader (smb)
Changed in linux (Ubuntu Cosmic):
importance: Undecided → High
status: New → In Progress
Stefan Bader (smb)
Changed in linux (Ubuntu Bionic):
status: In Progress → Fix Committed
Changed in linux (Ubuntu Cosmic):
status: In Progress → Fix Committed
Frank Heimes (fheimes)
Changed in linux (Ubuntu):
status: In Progress → Fix Committed
Changed in ubuntu-z-systems:
status: In Progress → Fix Committed
Revision history for this message
Brad Figg (brad-figg) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-cosmic' to 'verification-done-cosmic'. If the problem still exists, change the tag 'verification-needed-cosmic' to 'verification-failed-cosmic'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-cosmic
Revision history for this message
Brad Figg (brad-figg) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-bionic' to 'verification-done-bionic'. If the problem still exists, change the tag 'verification-needed-bionic' to 'verification-failed-bionic'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-bionic
Changed in linux (Ubuntu):
importance: High → Critical
Changed in linux (Ubuntu Bionic):
importance: High → Critical
Changed in linux (Ubuntu Cosmic):
importance: High → Critical
Revision history for this message
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2018-11-16 07:45 EDT-------
Fix has been successfully verified by the SSC team.
With kernel 4.15.0-40.43 from proposed our test procedure no longer raises a kernel panic or fails in any other way.

Thanks much for fixing

tags: added: verification-done-bionic
removed: verification-needed-bionic
tags: added: kernel-da-key
removed: kernel-key
Frank Heimes (fheimes)
tags: added: verification-done-cosmic
removed: verification-needed-cosmic
Revision history for this message
Launchpad Janitor (janitor) wrote :
Download full text (39.7 KiB)

This bug was fixed in the package linux - 4.18.0-12.13

---------------
linux (4.18.0-12.13) cosmic; urgency=medium

  * linux: 4.18.0-12.13 -proposed tracker (LP: #1802743)

  * [FEAT] Guest-dedicated Crypto Adapters (LP: #1787405)
    - s390/zcrypt: Add ZAPQ inline function.
    - s390/zcrypt: Review inline assembler constraints.
    - s390/zcrypt: Integrate ap_asm.h into include/asm/ap.h.
    - s390/zcrypt: fix ap_instructions_available() returncodes
    - KVM: s390: vsie: simulate VCPU SIE entry/exit
    - KVM: s390: introduce and use KVM_REQ_VSIE_RESTART
    - KVM: s390: refactor crypto initialization
    - s390: vfio-ap: base implementation of VFIO AP device driver
    - s390: vfio-ap: register matrix device with VFIO mdev framework
    - s390: vfio-ap: sysfs interfaces to configure adapters
    - s390: vfio-ap: sysfs interfaces to configure domains
    - s390: vfio-ap: sysfs interfaces to configure control domains
    - s390: vfio-ap: sysfs interface to view matrix mdev matrix
    - KVM: s390: interface to clear CRYCB masks
    - s390: vfio-ap: implement mediated device open callback
    - s390: vfio-ap: implement VFIO_DEVICE_GET_INFO ioctl
    - s390: vfio-ap: zeroize the AP queues
    - s390: vfio-ap: implement VFIO_DEVICE_RESET ioctl
    - KVM: s390: Clear Crypto Control Block when using vSIE
    - KVM: s390: vsie: Do the CRYCB validation first
    - KVM: s390: vsie: Make use of CRYCB FORMAT2 clear
    - KVM: s390: vsie: Allow CRYCB FORMAT-2
    - KVM: s390: vsie: allow CRYCB FORMAT-1
    - KVM: s390: vsie: allow CRYCB FORMAT-0
    - KVM: s390: vsie: allow guest FORMAT-0 CRYCB on host FORMAT-1
    - KVM: s390: vsie: allow guest FORMAT-1 CRYCB on host FORMAT-2
    - KVM: s390: vsie: allow guest FORMAT-0 CRYCB on host FORMAT-2
    - KVM: s390: device attrs to enable/disable AP interpretation
    - KVM: s390: CPU model support for AP virtualization
    - s390: doc: detailed specifications for AP virtualization
    - KVM: s390: fix locking for crypto setting error path
    - KVM: s390: Tracing APCB changes
    - s390: vfio-ap: setup APCB mask using KVM dedicated function
    - [Config:] Enable CONFIG_S390_AP_IOMMU and set CONFIG_VFIO_AP to module.

  * Bypass of mount visibility through userns + mount propagation (LP: #1789161)
    - mount: Retest MNT_LOCKED in do_umount
    - mount: Don't allow copying MNT_UNBINDABLE|MNT_LOCKED mounts

  * CVE-2018-18955: nested user namespaces with more than five extents
    incorrectly grant privileges over inode (LP: #1801924) // CVE-2018-18955
    - userns: also map extents in the reverse map to kernel IDs

  * kdump fail due to an IRQ storm (LP: #1797990)
    - SAUCE: x86/PCI: Export find_cap() to be used in early PCI code
    - SAUCE: x86/quirks: Add parameter to clear MSIs early on boot
    - SAUCE: x86/quirks: Scan all busses for early PCI quirks

  * crash in ENA driver on removing an interface (LP: #1802341)
    - SAUCE: net: ena: fix crash during ena_remove()

  * Ubuntu 18.04.1 - [s390x] Kernel panic while stressing network bonding
    (LP: #1797367)
    - s390/qeth: reduce hard-coded access to ccw channels
    - s390/qeth: sanitize strings in debug messages

  * Add checksum offload and T...

Changed in linux (Ubuntu Cosmic):
status: Fix Committed → Fix Released
Revision history for this message
Launchpad Janitor (janitor) wrote :
Download full text (3.1 KiB)

This bug was fixed in the package linux - 4.15.0-42.45

---------------
linux (4.15.0-42.45) bionic; urgency=medium

  * linux: 4.15.0-42.45 -proposed tracker (LP: #1803592)

  * [FEAT] Guest-dedicated Crypto Adapters (LP: #1787405)
    - KVM: s390: reset crypto attributes for all vcpus
    - KVM: s390: vsie: simulate VCPU SIE entry/exit
    - KVM: s390: introduce and use KVM_REQ_VSIE_RESTART
    - KVM: s390: refactor crypto initialization
    - s390: vfio-ap: base implementation of VFIO AP device driver
    - s390: vfio-ap: register matrix device with VFIO mdev framework
    - s390: vfio-ap: sysfs interfaces to configure adapters
    - s390: vfio-ap: sysfs interfaces to configure domains
    - s390: vfio-ap: sysfs interfaces to configure control domains
    - s390: vfio-ap: sysfs interface to view matrix mdev matrix
    - KVM: s390: interface to clear CRYCB masks
    - s390: vfio-ap: implement mediated device open callback
    - s390: vfio-ap: implement VFIO_DEVICE_GET_INFO ioctl
    - s390: vfio-ap: zeroize the AP queues
    - s390: vfio-ap: implement VFIO_DEVICE_RESET ioctl
    - KVM: s390: Clear Crypto Control Block when using vSIE
    - KVM: s390: vsie: Do the CRYCB validation first
    - KVM: s390: vsie: Make use of CRYCB FORMAT2 clear
    - KVM: s390: vsie: Allow CRYCB FORMAT-2
    - KVM: s390: vsie: allow CRYCB FORMAT-1
    - KVM: s390: vsie: allow CRYCB FORMAT-0
    - KVM: s390: vsie: allow guest FORMAT-0 CRYCB on host FORMAT-1
    - KVM: s390: vsie: allow guest FORMAT-1 CRYCB on host FORMAT-2
    - KVM: s390: vsie: allow guest FORMAT-0 CRYCB on host FORMAT-2
    - KVM: s390: device attrs to enable/disable AP interpretation
    - KVM: s390: CPU model support for AP virtualization
    - s390: doc: detailed specifications for AP virtualization
    - KVM: s390: fix locking for crypto setting error path
    - KVM: s390: Tracing APCB changes
    - s390: vfio-ap: setup APCB mask using KVM dedicated function
    - s390/zcrypt: Add ZAPQ inline function.
    - s390/zcrypt: Review inline assembler constraints.
    - s390/zcrypt: Integrate ap_asm.h into include/asm/ap.h.
    - s390/zcrypt: fix ap_instructions_available() returncodes
    - s390/zcrypt: remove VLA usage from the AP bus
    - s390/zcrypt: Remove deprecated ioctls.
    - s390/zcrypt: Remove deprecated zcrypt proc interface.
    - s390/zcrypt: Support up to 256 crypto adapters.
    - [Config:] Enable CONFIG_S390_AP_IOMMU and set CONFIG_VFIO_AP to module.

  * Bypass of mount visibility through userns + mount propagation (LP: #1789161)
    - mount: Retest MNT_LOCKED in do_umount
    - mount: Don't allow copying MNT_UNBINDABLE|MNT_LOCKED mounts

  * CVE-2018-18955: nested user namespaces with more than five extents
    incorrectly grant privileges over inode (LP: #1801924) // CVE-2018-18955
    - userns: also map extents in the reverse map to kernel IDs

  * kdump fail due to an IRQ storm (LP: #1797990)
    - SAUCE: x86/PCI: Export find_cap() to be used in early PCI code
    - SAUCE: x86/quirks: Add parameter to clear MSIs early on boot
    - SAUCE: x86/quirks: Scan all busses for early PCI quirks

 -- Thadeu Lima de Souza Cascardo <email address hidden> Thu, 15 Nov 2018 17:01:46 ...

Read more...

Changed in linux (Ubuntu Bionic):
status: Fix Committed → Fix Released
Revision history for this message
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2018-12-03 11:05 EDT-------
IBM bugzilla status-> closed, Fix Released with Bionic, Cosmic

Revision history for this message
Frank Heimes (fheimes) wrote :

Since it's also in disco master-next, I set the overall ticket to Fix Released.

Changed in linux (Ubuntu):
status: Fix Committed → Fix Released
Changed in ubuntu-z-systems:
status: Fix Committed → Fix Released
Brad Figg (brad-figg)
tags: added: cscc
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.