Novalink (mkvterm command failure)

Bug #1892546 reported by bugproxy on 2020-08-21
This bug affects 1 person
Affects                             Importance   Assigned to
The Ubuntu-power-systems project    Critical     Patricia Domingues
linux (Ubuntu)                      Undecided    Canonical Kernel Team
  Xenial                            Undecided    Canonical Kernel Team
  Bionic                            Undecided    Canonical Kernel Team
  Focal                             Undecided    Canonical Kernel Team
  Groovy                            Undecided    Canonical Kernel Team

Bug Description

SRU Justification:

[Impact]
* drmgr command is failing while trying to open console for IBMi on PowerVM LPAR

[Fix]
* 63ffcbdad738 "tty: hvcs: Don't NULL tty->driver_data until hvcs_cleanup()"
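
The commit title points at an ordering problem: a pointer is cleared before the cleanup step that still needs it. The following is a toy model in Python, not the kernel C code. It assumes that cleanup releases a reference it finds through driver_data; the classes Port and Tty and all function names are hypothetical stand-ins chosen for illustration.

```python
class Port:
    """Hypothetical stand-in for a reference-counted port object."""

    def __init__(self):
        self.refs = 1  # reference held by the device itself

    def get(self):
        self.refs += 1

    def put(self):
        self.refs -= 1


class Tty:
    """Hypothetical stand-in for a tty whose driver_data points at a port."""

    def __init__(self):
        self.driver_data = None


def open_tty(tty, port):
    port.get()               # the open tty takes its own reference
    tty.driver_data = port


def close_buggy(tty):
    tty.driver_data = None   # cleared too early: cleanup can no longer find the port


def close_fixed(tty):
    pass                     # leave driver_data in place until cleanup


def cleanup(tty):
    port = tty.driver_data
    if port is not None:
        port.put()           # drop the reference taken at open
    tty.driver_data = None   # only now is it safe to clear the pointer


def run(close):
    """Open, close and clean up once; return the remaining reference count."""
    port, tty = Port(), Tty()
    open_tty(tty, port)
    close(tty)
    cleanup(tty)
    return port.refs
```

In this model, run(close_buggy) leaves one reference more than run(close_fixed), so under the stated assumption the port could never be fully released after the early clear.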

[Test Case]
* Create a VM from PowerVC on a Novalink managed host.
* Unmanage and manage the VM, this problem gets reproduced.
* grep the logs for "drmgr"
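
The last step of the test case can be sketched in Python as follows. This is a minimal sketch only: the bug does not name the log file or the exact failure text, so the FAILURE pattern is an assumption that would need adjusting to the real log format.

```python
import re

# Assumed failure markers; the bug report does not give the exact log text.
FAILURE = re.compile(r"fail|error", re.IGNORECASE)


def drmgr_lines(lines):
    """Return every log line that mentions drmgr, without the trailing newline."""
    return [line.rstrip("\n") for line in lines if "drmgr" in line]


def drmgr_failures(lines):
    """Return the drmgr lines that also match the assumed failure pattern."""
    return [line for line in drmgr_lines(lines) if FAILURE.search(line)]
```

Both functions accept any iterable of lines, so an open log file object can be passed in directly.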

[Regression Potential]
The regression potential can be considered very low, since:
* the change is limited to the IBM Hypervisor Virtual Console Server.
* patched kernels were shared with IBM and successfully tested there.
* in the worst case, if the cleanup is broken, a newly opened connection might break.

[Other]
* The patch has been accepted upstream.

bugproxy (bugproxy) on 2020-08-21
tags: added: architecture-ppc64le bugnameltc-187036 severity-critical targetmilestone-inin16046
Changed in ubuntu:
assignee: nobody → Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage)
affects: ubuntu → linux (Ubuntu)
Frank Heimes (fheimes) wrote :

It looks like the patch just got submitted upstream (and landed in a staging tree), but we usually need to wait for its upstream acceptance, at least into Torvalds' tree or linux-next (to be sure that the patch is stable, reviewed and community accepted).

Am I reading it right that this happens on PowerVM DLPARs, controlled by PowerVC?

Changed in ubuntu-power-systems:
importance: Undecided → Critical
assignee: nobody → Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage)
status: New → Triaged
Changed in linux (Ubuntu):
assignee: Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage) → Canonical Kernel Team (canonical-kernel-team)
Frank Heimes (fheimes) wrote :

I saw that the tag 'targetmilestone-inin16046' is set, which points to 16.04 as affected release. Is this the only affected Ubuntu release?

------- Comment From <email address hidden> 2020-08-24 12:34 EDT-------
(In reply to comment #24)
> It looks like the patch just got submitted to upstream (and landed in a
> staging tree); but we usually need to wait for its upstream acceptance - at
> least to Torvalds tree or linux-next (to be sure that the patch is stable,
> reviewed and community accepted).

Understood, but is it possible to get a one-off/test kernel in the interim? We have a cloud customer for whom this problem is causing intermittent outages.

>
> Am I reading it right that this happens on PowerVM DLPARs, controlled by
> PowerVC?

Correct, but the problem isn't limited to that scenario. It can be recreated on any PowerVM LPAR, but hvcs is generally not used outside of PowerVC setups and as such that is where this got exposed.

------- Comment From <email address hidden> 2020-08-24 12:37 EDT-------
(In reply to comment #25)
> I saw that the tag 'targetmilestone-inin16046' is set, which points to 16.04
> as affected release. Is this the only affected Ubuntu release?

This is the LTS that PowerVC uses, but as mentioned previously, this issue affects all PowerVM LPARs should someone choose to use hvcs, so the fix is desirable in every other release that supports PowerVM.

Andrew Cloke (andrew-cloke) wrote :

Is this issue associated with the Xenial GA (4.4) kernel or the Xenial HWE (4.15) kernel?

Changed in ubuntu-power-systems:
assignee: Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage) → Patricia Domingues (patriciasd)
status: Triaged → In Progress
Patricia Domingues (patriciasd) wrote :

Here is the patch applied to the Xenial GA kernel:
https://people.canonical.com/~patriciasd/kernel-lp1892546/kernel_custom_xenial_ga/

I also added some notes about the system where I tested/installed it, and how:
https://people.canonical.com/~patriciasd/kernel-lp1892546/notes

Let me know if you have any questions.

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2020-08-26 21:26 EDT-------
(In reply to comment #28)
> Is this issue associated with the Xenial GA (4.4) kernel or the Xenial HWE
> (4.15) kernel?

They are experiencing the issue with the 4.4 kernel, but the fix is also applicable to 4.15.

Andrew Cloke (andrew-cloke) wrote :

Moving to "Incomplete" while waiting for the test results from the test kernel referenced in comment #5.

Changed in ubuntu-power-systems:
status: In Progress → Incomplete
Changed in linux (Ubuntu):
status: New → Incomplete
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2020-09-01 19:12 EDT-------
(In reply to comment #31)
> Moving to "Incomplete" while waiting for the test results from the test
> kernel referenced in comment #5.

The kernel was tested in two different staging environments and the issue is no longer observed.

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2020-09-10 12:36 EDT-------
Looks like the patch is not accepted into Greg KH's tty/next tree.

This is a note to let you know that I've just added the patch titled

tty: hvcs: Don't NULL tty->driver_data until hvcs_cleanup()

to my tty git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty.git
in the tty-next branch.

The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)

The patch will also be merged in the next major kernel release
during the merge window.

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2020-09-11 14:07 EDT-------
(In reply to comment #33)
> Looks like the patch is not accepted into Greg KH's tty/next tree.
>

TYPO - should read "patch is ***NOW*** accepted into Greg KH's tty/next tree."

Frank Heimes (fheimes) on 2020-09-14
Changed in linux (Ubuntu):
status: Incomplete → Triaged
Changed in ubuntu-power-systems:
status: Incomplete → Triaged
Frank Heimes (fheimes) on 2020-09-14
Changed in linux (Ubuntu Groovy):
status: Triaged → New

Ok. I see the patch has been accepted upstream (commit `63ffcbdad738e3d1c857027789a2273df3337624`).
We believe this needs to be SRUed to Groovy, Focal, Bionic and Xenial. Do you agree?

description: updated
Frank Heimes (fheimes) on 2020-09-16
Changed in linux (Ubuntu Groovy):
status: New → In Progress
Changed in linux (Ubuntu Focal):
status: New → In Progress
Changed in linux (Ubuntu Bionic):
status: New → In Progress
Changed in linux (Ubuntu Xenial):
status: New → In Progress
Changed in ubuntu-power-systems:
status: Triaged → In Progress
Changed in linux (Ubuntu Focal):
assignee: nobody → Canonical Kernel Team (canonical-kernel-team)
Changed in linux (Ubuntu Bionic):
assignee: nobody → Canonical Kernel Team (canonical-kernel-team)
Changed in linux (Ubuntu Xenial):
assignee: nobody → Canonical Kernel Team (canonical-kernel-team)
Changed in linux (Ubuntu Xenial):
status: In Progress → Fix Committed
Changed in linux (Ubuntu Bionic):
status: In Progress → Fix Committed
Changed in linux (Ubuntu Focal):
status: In Progress → Fix Committed
Frank Heimes (fheimes) on 2020-09-18
Changed in ubuntu-power-systems:
status: In Progress → Fix Committed

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-focal' to 'verification-done-focal'. If the problem still exists, change the tag 'verification-needed-focal' to 'verification-failed-focal'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-focal

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-bionic' to 'verification-done-bionic'. If the problem still exists, change the tag 'verification-needed-bionic' to 'verification-failed-bionic'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-bionic

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-xenial' to 'verification-done-xenial'. If the problem still exists, change the tag 'verification-needed-xenial' to 'verification-failed-xenial'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-xenial
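
The tag handling requested in the three verification notices can be sketched as a small Python helper. It only models the tag swap on a plain list of strings and does not talk to Launchpad; the function name mark_verified and its passed parameter are hypothetical.

```python
def mark_verified(tags, series, passed=True):
    """Swap verification-needed-<series> for the matching done/failed tag."""
    needed = f"verification-needed-{series}"
    if needed not in tags:
        raise ValueError(f"{needed} is not set on this bug")
    outcome = "done" if passed else "failed"
    updated = (set(tags) - {needed}) | {f"verification-{outcome}-{series}"}
    return sorted(updated)
```

Calling it once per series keeps each release's verification state independent, which matches how the bug tracks focal, bionic and xenial separately.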
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2020-09-29 14:04 EDT-------
Verified on 16.04. Still waiting on Focal and Bionic.

tags: added: verification-done-xenial
removed: verification-needed-xenial
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2020-09-30 13:13 EDT-------
I loaded the kernel on both focal and groovy, and connecting/disconnecting from the console seems to work. Is there anything I can do to validate it is working? Thanks.

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2020-09-30 15:03 EDT-------
> I loaded the kernel on both focal and groovy and connecting/disconnecting
> from the console seems to work. Is there anything I can do to validate it is
> working ? thanks

Thierry, thanks. Did you check it on Bionic? (comment #15)

Andrew Cloke (andrew-cloke) wrote :

Updated tags to mark focal as verification-done.

tags: added: verification-done-focal
removed: verification-needed-focal
Frank Heimes (fheimes) wrote :

Commit "tty: hvcs: Don't NULL tty->driver_data until hvcs_cleanup()" 9ceefca582e1 is tagged with Ubuntu-5.8.0-20.21 (and later), and the current kernel in groovy is:
linux-generic | 5.8.0.20.25 | groovy | ppc64el
hence this patch landed in groovy, so changing the groovy entry to Fix Released.
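
The reasoning in this comment (the archive's kernel is at or past the tagged version, so it carries the fix) can be sketched as a version comparison in Python. It assumes that the first four numeric fields of the linux-generic version name the same kernel ABI as the first four fields of the kernel tag; the helper names are hypothetical, and the check is deliberately coarse because it ignores the final upload number.

```python
import re


def abi_tuple(version):
    """Return the first four numeric fields, e.g. (5, 8, 0, 20)."""
    fields = [int(n) for n in re.findall(r"\d+", version)]
    if len(fields) < 4:
        raise ValueError(f"unexpected version string: {version!r}")
    return tuple(fields[:4])


def includes_fix(meta_version, fixed_in):
    """True if the meta package points at an ABI at or after the fixed one."""
    return abi_tuple(meta_version) >= abi_tuple(fixed_in)
```

For the values quoted in the comment, includes_fix("5.8.0.20.25", "Ubuntu-5.8.0-20.21") compares (5, 8, 0, 20) with (5, 8, 0, 20).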

Changed in linux (Ubuntu Groovy):
status: In Progress → Fix Released
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2020-10-06 11:40 EDT-------
Barry verified on 18.04 and everything looks good.

tags: added: verification-done-bionic
removed: verification-needed-bionic
Frank Heimes (fheimes) wrote :

Thx for the verification!

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 5.4.0-51.56

---------------
linux (5.4.0-51.56) focal; urgency=medium

  * Packaging resync (LP: #1786013)
    - update dkms package versions

linux (5.4.0-50.55) focal; urgency=medium

  * CVE-2020-16119
    - SAUCE: dccp: avoid double free of ccid on child socket

  * CVE-2020-16120
    - Revert "UBUNTU: SAUCE: overlayfs: ensure mounter privileges when reading
      directories"
    - ovl: pass correct flags for opening real directory
    - ovl: switch to mounter creds in readdir
    - ovl: verify permissions in ovl_path_open()
    - ovl: call secutiry hook in ovl_real_ioctl()
    - ovl: check permission to open real file

linux (5.4.0-49.53) focal; urgency=medium

  * focal/linux: 5.4.0-49.53 -proposed tracker (LP: #1896007)

  * Comet Lake PCH-H RAID not support on Ubuntu20.04 (LP: #1892288)
    - ahci: Add Intel Comet Lake PCH-H PCI ID

  * Novalink (mkvterm command failure) (LP: #1892546)
    - tty: hvcs: Don't NULL tty->driver_data until hvcs_cleanup()

  * Oops and hang when starting LVM snapshots on 5.4.0-47 (LP: #1894780)
    - SAUCE: Revert "mm: memcg/slab: fix memory leak at non-root kmem_cache
      destroy"

  * Intel x710 LOMs do not work on Focal (LP: #1893956)
    - i40e: Fix LED blinking flow for X710T*L devices
    - i40e: enable X710 support

  * Add/Backport EPYC-v3 and EPYC-Rome CPU model (LP: #1887490)
    - kvm: svm: Update svm_xsaves_supported

  * Fix non-working NVMe after S3 (LP: #1895718)
    - SAUCE: PCI: Enable ACS quirk on CML root port

  * Focal update: v5.4.65 upstream stable release (LP: #1895881)
    - ipv4: Silence suspicious RCU usage warning
    - ipv6: Fix sysctl max for fib_multipath_hash_policy
    - netlabel: fix problems with mapping removal
    - net: usb: dm9601: Add USB ID of Keenetic Plus DSL
    - sctp: not disable bh in the whole sctp_get_port_local()
    - taprio: Fix using wrong queues in gate mask
    - tipc: fix shutdown() of connectionless socket
    - net: disable netpoll on fresh napis
    - Linux 5.4.65

  * Focal update: v5.4.64 upstream stable release (LP: #1895880)
    - HID: quirks: Always poll three more Lenovo PixArt mice
    - drm/msm/dpu: Fix scale params in plane validation
    - tty: serial: qcom_geni_serial: Drop __init from qcom_geni_console_setup
    - drm/msm: add shutdown support for display platform_driver
    - hwmon: (applesmc) check status earlier.
    - nvmet: Disable keep-alive timer when kato is cleared to 0h
    - drm/msm: enable vblank during atomic commits
    - habanalabs: validate FW file size
    - habanalabs: check correct vmalloc return code
    - drm/msm/a6xx: fix gmu start on newer firmware
    - ceph: don't allow setlease on cephfs
    - drm/omap: fix incorrect lock state
    - cpuidle: Fixup IRQ state
    - nbd: restore default timeout when setting it to zero
    - s390: don't trace preemption in percpu macros
    - drm/amd/display: Reject overlay plane configurations in multi-display
      scenarios
    - drivers: gpu: amd: Initialize amdgpu_dm_backlight_caps object to 0 in
      amdgpu_dm_update_backlight_caps
    - drm/amd/display: Retry AUX write when fail occurs
    - drm/amd/display: Fix memleak in amdg...

Changed in linux (Ubuntu Focal):
status: Fix Committed → Fix Released
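
A changelog like the one above can be searched for the upload that first carried this bug's patch. The Python sketch below assumes stanza headers of the form "linux (VERSION) SERIES; urgency=..." as seen in the excerpt; the function name versions_fixing is hypothetical.

```python
import re

# Matches stanza headers such as "linux (5.4.0-49.53) focal; urgency=medium".
STANZA = re.compile(r"^linux \(([^)]+)\) (\w+);", re.MULTILINE)


def versions_fixing(changelog, bug):
    """Return (version, series) for each stanza that cites (LP: #<bug>)."""
    hits = []
    matches = list(STANZA.finditer(changelog))
    for i, m in enumerate(matches):
        # A stanza body runs until the next header or the end of the text.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(changelog)
        body = changelog[m.end():end]
        if re.search(rf"\(LP: #{bug}\)", body):
            hits.append((m.group(1), m.group(2)))
    return hits
```

Because the Janitor text is truncated, the search only covers the stanzas that are actually present in the excerpt.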
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.15.0-121.123

---------------
linux (4.15.0-121.123) bionic; urgency=medium

  * Packaging resync (LP: #1786013)
    - update dkms package versions

linux (4.15.0-120.122) bionic; urgency=medium

  * CVE-2020-16119
    - SAUCE: dccp: avoid double free of ccid on child socket

  * CVE-2020-16120
    - Revert "UBUNTU: SAUCE: overlayfs: ensure mounter privileges when reading
      directories"
    - ovl: pass correct flags for opening real directory
    - ovl: switch to mounter creds in readdir
    - ovl: verify permissions in ovl_path_open()

linux (4.15.0-119.120) bionic; urgency=medium

  * bionic/linux: 4.15.0-119.120 -proposed tracker (LP: #1896040)

  * gtp: unable to associate contextes to interfaces (LP: #1894605)
    - gtp: add GTPA_LINK info to msg sent to userspace

  * uvcvideo: add mapping for HEVC payloads (LP: #1895803)
    - media: videodev2.h: Add v4l2 definition for HEVC
    - SAUCE: media: uvcvideo: Add mapping for HEVC payloads

  * Novalink (mkvterm command failure) (LP: #1892546)
    - tty: hvcs: Don't NULL tty->driver_data until hvcs_cleanup()

  * rtnetlink.sh in net from ubuntu_kernel_selftests is returning 1 for a
    skipped test (LP: #1895258)
    - selftests: net: return Kselftest Skip code for skipped tests

  * Bionic update: upstream stable patchset 2020-09-16 (LP: #1895873)
    - net: Fix potential wrong skb->protocol in skb_vlan_untag()
    - tipc: fix uninit skb->data in tipc_nl_compat_dumpit()
    - ipvlan: fix device features
    - gre6: Fix reception with IP6_TNL_F_RCV_DSCP_COPY
    - ALSA: pci: delete repeated words in comments
    - ASoC: tegra: Fix reference count leaks.
    - mfd: intel-lpss: Add Intel Emmitsburg PCH PCI IDs
    - arm64: dts: qcom: msm8916: Pull down PDM GPIOs during sleep
    - powerpc/xive: Ignore kmemleak false positives
    - media: pci: ttpci: av7110: fix possible buffer overflow caused by bad DMA
      value in debiirq()
    - blktrace: ensure our debugfs dir exists
    - scsi: target: tcmu: Fix crash on ARM during cmd completion
    - iommu/iova: Don't BUG on invalid PFNs
    - drm/amdkfd: Fix reference count leaks.
    - drm/radeon: fix multiple reference count leak
    - drm/amdgpu: fix ref count leak in amdgpu_driver_open_kms
    - drm/amd/display: fix ref count leak in amdgpu_drm_ioctl
    - drm/amdgpu: fix ref count leak in amdgpu_display_crtc_set_config
    - drm/amdgpu/display: fix ref count leak when pm_runtime_get_sync fails
    - scsi: lpfc: Fix shost refcount mismatch when deleting vport
    - selftests/powerpc: Purge extra count_pmc() calls of ebb selftests
    - omapfb: fix multiple reference count leaks due to pm_runtime_get_sync
    - PCI: Fix pci_create_slot() reference count leak
    - rtlwifi: rtl8192cu: Prevent leaking urb
    - mips/vdso: Fix resource leaks in genvdso.c
    - cec-api: prevent leaking memory through hole in structure
    - f2fs: fix use-after-free issue
    - drm/nouveau/drm/noveau: fix reference count leak in nouveau_fbcon_open
    - drm/nouveau: Fix reference count leak in nouveau_connector_detect
    - locking/lockdep: Fix overflow in presentation of average lock-time
    - scsi: iscsi: Do not put h...

Changed in linux (Ubuntu Bionic):
status: Fix Committed → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.4.0-193.224

---------------
linux (4.4.0-193.224) xenial; urgency=medium

  * CVE-2020-16119
    - SAUCE: dccp: avoid double free of ccid on child socket

linux (4.4.0-192.222) xenial; urgency=medium

  * xenial/linux: 4.4.0-192.222 -proposed tracker (LP: #1897734)

  * mwifiex stops working after kernel upgrade (LP: #1897299)
    - mwifiex: Increase AES key storage size to 256 bits

  * xenial 4.4.0-191-generic in -proposed has a regression (LP: #1896725)
    - Revert "XEN uses irqdesc::irq_data_common::handler_data to store a per
      interrupt XEN data pointer which contains XEN specific information."

linux (4.4.0-191.221) xenial; urgency=medium

  * xenial/linux: 4.4.0-191.221 -proposed tracker (LP: #1896067)

  * Novalink (mkvterm command failure) (LP: #1892546)
    - tty: hvcs: Don't NULL tty->driver_data until hvcs_cleanup()

  * Xenial update: v4.4.236 upstream stable release (LP: #1895891)
    - HID: core: Correctly handle ReportSize being zero
    - HID: core: Sanitize event code and type when mapping input
    - perf record/stat: Explicitly call out event modifiers in the documentation
    - mm, page_alloc: remove unnecessary variable from free_pcppages_bulk
    - hwmon: (applesmc) check status earlier.
    - ceph: don't allow setlease on cephfs
    - s390: don't trace preemption in percpu macros
    - xen/xenbus: Fix granting of vmalloc'd memory
    - dmaengine: of-dma: Fix of_dma_router_xlate's of_dma_xlate handling
    - batman-adv: Avoid uninitialized chaddr when handling DHCP
    - batman-adv: bla: use netif_rx_ni when not in interrupt context
    - dmaengine: at_hdmac: check return value of of_find_device_by_node() in
      at_dma_xlate()
    - netfilter: nf_tables: incorrect enum nft_list_attributes definition
    - netfilter: nf_tables: fix destination register zeroing
    - dmaengine: pl330: Fix burst length if burst size is smaller than bus width
    - bnxt_en: Check for zero dir entries in NVRAM.
    - fix regression in "epoll: Keep a reference on files added to the check list"
    - tg3: Fix soft lockup when tg3_reset_task() fails.
    - iommu/vt-d: Serialize IOMMU GCMD register modifications
    - thermal: ti-soc-thermal: Fix bogus thermal shutdowns for omap4430
    - include/linux/log2.h: add missing () around n in roundup_pow_of_two()
    - btrfs: drop path before adding new uuid tree entry
    - btrfs: Remove redundant extent_buffer_get in get_old_root
    - btrfs: Remove extraneous extent_buffer_get from tree_mod_log_rewind
    - btrfs: set the lockdep class for log tree extent buffers
    - uaccess: Add non-pagefault user-space read functions
    - uaccess: Add non-pagefault user-space write function
    - btrfs: fix potential deadlock in the search ioctl
    - net: qmi_wwan: MDM9x30 specific power management
    - net: qmi_wwan: support "raw IP" mode
    - net: qmi_wwan: should hold RTNL while changing netdev type
    - net: qmi_wwan: ignore bogus CDC Union descriptors
    - Add Dell Wireless 5809e Gobi 4G HSPA+ Mobile Broadband Card (rev3) to
      qmi_wwan
    - qmi_wwan: Added support for Gemalto's Cinterion PHxx WWAN interface
    - qmi_wwan: add support for Quec...

Changed in linux (Ubuntu Xenial):
status: Fix Committed → Fix Released
Frank Heimes (fheimes) on 2020-10-14
Changed in ubuntu-power-systems:
status: Fix Committed → Fix Released