Ubuntu 14.10 freezes when using smt-enabled=off as a kernel argument

Bug #1402141 reported by bugproxy
This bug affects 1 person
Affects         Status        Importance  Assigned to    Milestone
linux (Ubuntu)  Fix Released  High        Unassigned
Utopic          Fix Released  Medium      Chris J Arges

Bug Description

SRU Justification:

Impact: Booting a POWER8 machine with smt-enabled=off causes the system to hang at "Freeing initrd memory". Note that this only affects kernels with powernv split-core support.
Fix: commit d70a54e2d08510a99b1f10eceeae6f2f7086e226 upstream
Testcase: Boot with smt-enabled=off on a POWER8 machine

--

== Comment: #0 - Paulo Flabiano Smorigo <email address hidden> - 2014-11-18 12:28:42 ==
Using Ubuntu as the host, if you add smt-enabled=off as kernel argument, the system will boot until the "Freeing initrd memory" line:
...
[ 1.371729] vgaarb: loaded
[ 1.372989] SCSI subsystem initialized
[ 1.373977] libata version 3.00 loaded.
[ 1.374158] usbcore: registered new interface driver usbfs
[ 1.374246] usbcore: registered new interface driver hub
[ 1.374382] usbcore: registered new device driver usb
[ 1.374505] pps_core: LinuxPPS API ver. 1 registered
[ 1.374563] pps_core: Software ver. 5.3.6 - Copyright 2005-2007 Rodolfo Giometti <email address hidden>
[ 1.374671] PTP clock support registered
[ 1.377135] NetLabel: Initializing
[ 1.377218] NetLabel: domain hash size = 128
[ 1.377328] NetLabel: protocols = UNLABELED CIPSOv4
[ 1.377472] NetLabel: unlabeled traffic allowed by default
[ 1.377983] Switched to clocksource timebase
[ 1.395029] AppArmor: AppArmor Filesystem Enabled
[ 1.402044] NET: Registered protocol family 2
[ 1.403795] TCP established hash table entries: 524288 (order: 6, 4194304 bytes)
[ 1.408343] TCP bind hash table entries: 65536 (order: 4, 1048576 bytes)
[ 1.409301] TCP: Hash tables configured (established 524288 bind 65536)
[ 1.409490] TCP: reno registered
[ 1.409645] UDP hash table entries: 65536 (order: 5, 2097152 bytes)
[ 1.411943] UDP-Lite hash table entries: 65536 (order: 5, 2097152 bytes)
[ 1.415409] NET: Registered protocol family 1
[ 1.415753] PCI: CLS 128 bytes, default 128
[ 1.415962] Trying to unpack rootfs image as initramfs...
[ 2.250464] Freeing initrd memory: 21952K (c000000003820000 - c000000004d90000)

Machine Type = Power 8 (S822L)

== Comment: #1 - Thadeu Lima De Souza Cascardo <email address hidden> - 2014-11-18 13:42:37 ==
What is the firmware version?

Cascardo.

== Comment: #2 - Paulo Flabiano Smorigo <email address hidden> - 2014-11-19 07:13:35 ==
Currently it is FW810.02 (SV810_061). I will update it today.

Smorigo.

== Comment: #3 - Paulo Flabiano Smorigo <email address hidden> - 2014-11-19 12:47:24 ==
Updated to FW810.20 (SV810_101). Nothing changed.

== Comment: #4 - Greg Kurz <email address hidden> - 2014-11-20 05:47:45 ==
I reproduced it on an S824 with the FW810.20 (TV810_101) firmware, running 14.04.2 "alpha" (kernel 3.16.0-25). The issue doesn't show up with kernel 3.13.0-39. I will try mainline and bisect.

== Comment: #5 - Greg Kurz <email address hidden> - 2014-11-20 13:31:03 ==
FYI issue is upstream.

== Comment: #6 - Breno Henrique Leitao <email address hidden> - 2014-11-24 11:23:04 ==
(In reply to comment #5)
> FYI issue is upstream.

Greg, are you working to solve this issue?

== Comment: #7 - Greg Kurz <email address hidden> - 2014-11-24 12:08:33 ==
(In reply to comment #6)
> (In reply to comment #5)
> > FYI issue is upstream.
>
> Greg, are you working to solve this issue?

Yes I am.

== Comment: #8 - Greg Kurz <email address hidden> - 2014-12-01 04:56:07 ==
The hang occurs because all running threads are looping in the split core code:

static void wait_for_sync_step(int step)
{
	int i, cpu = smp_processor_id();

	for (i = cpu + 1; i < cpu + threads_per_core; i++)
		while (per_cpu(split_state, i).step < step)
			barrier();
	...
}

The problem is that the split core code needs all possible threads to participate... if the kernel is booted with smt-enabled set to something different than the maximum value, some threads are missing and this ruins the sync.
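
To make the failure mode concrete, here is a minimal user-space sketch in C. It is not the kernel code: it models the per-thread sync state as a plain array (`step_of`, a hypothetical stand-in for the per-CPU `split_state`) and reports whether the wait loop quoted above would fall through. The function name `sync_step_can_complete` is an assumption made for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * User-space model (not kernel code) of the wait in wait_for_sync_step().
 * step_of[i] is the sync step that hardware thread i has published.
 * A thread that was never started keeps the value 0 forever.
 *
 * Returns true if the waiting thread "cpu" would get past the loop,
 * false if it would spin on a sibling that has not reached "step".
 */
bool sync_step_can_complete(const int *step_of, int cpu,
                            int threads_per_core, int step)
{
	assert(step_of != NULL && cpu >= 0 && threads_per_core > 0);

	/* Same loop bounds as the kernel excerpt: every sibling of cpu. */
	for (int i = cpu + 1; i < cpu + threads_per_core; i++) {
		if (step_of[i] < step)
			return false;	/* the kernel would busy-wait here */
	}
	return true;
}
```

Assuming a core with 8 hardware threads of which only thread 0 was started (the smt-enabled=off case), calling `sync_step_can_complete(step_of, 0, 8, 1)` with `step_of = {1, 0, 0, 0, 0, 0, 0, 0}` returns false. That corresponds to thread 0 spinning forever on thread 1, which matches the hang described here.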

== Comment: #9 - Greg Kurz <email address hidden> - 2014-12-01 05:24:28 ==
The current implementation for smt-enabled= is a hack: it simply leaves hw threads looping where they happen to be (firmware probably)... This isn't acceptable in a production environment.

An "acceptable" fix would be to start all threads anyway and offline the ones that need to be to honour the requested SMT mode AFTER subcores init. This requires a non-trivial patch.

Since changing SMT mode from userspace when the system is booted is really straightforward, Michael Ellerman suggests we simply drop that smt-enabled= feature.

Smorigo,

Why were you using smt-enabled=? Is there a reason not to do it after the system has booted, with ppc64_cpu --smt or by writing directly to /sys/devices/system/cpu/cpu*/online?
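
As a rough illustration of the sysfs route mentioned above, the following C sketch offlines the secondary threads of each core by writing "0" to the per-CPU online file. It is only a sketch under stated assumptions: the helper names (`is_secondary_thread`, `cpu_online_path`, `set_cpu_online`, `smt_off`) are hypothetical, logical CPU numbers are assumed to be contiguous within a core so that the first thread of each core is a multiple of `threads_per_core`, and the caller is assumed to run as root.

```c
#include <assert.h>
#include <stdio.h>

/* Assumption: threads of one core are numbered contiguously, so the
 * primary thread of each core is a multiple of threads_per_core. */
int is_secondary_thread(int cpu, int threads_per_core)
{
	assert(cpu >= 0 && threads_per_core > 0);
	return cpu % threads_per_core != 0;
}

/* Build the sysfs path named in the thread; returns snprintf()'s result. */
int cpu_online_path(char *buf, size_t len, int cpu)
{
	return snprintf(buf, len, "/sys/devices/system/cpu/cpu%d/online", cpu);
}

/* Write "1" or "0" to the online file. Returns 0 on success, -1 on error
 * (for example when not running as root or the CPU does not exist). */
int set_cpu_online(int cpu, int online)
{
	char path[64];
	FILE *f;
	int rc;

	cpu_online_path(path, sizeof(path), cpu);
	f = fopen(path, "w");
	if (!f)
		return -1;
	rc = (fputs(online ? "1" : "0", f) == EOF) ? -1 : 0;
	if (fclose(f) == EOF)
		rc = -1;
	return rc;
}

/* Offline every secondary thread among nr_cpus logical CPUs.
 * Returns the number of CPUs successfully taken offline. */
int smt_off(int nr_cpus, int threads_per_core)
{
	int done = 0;

	for (int cpu = 0; cpu < nr_cpus; cpu++) {
		if (is_secondary_thread(cpu, threads_per_core) &&
		    set_cpu_online(cpu, 0) == 0)
			done++;
	}
	return done;
}
```

Calling `smt_off(nr_cpus, 8)` from a small boot-time helper would, under these assumptions, leave one thread per core online. Because this runs after the kernel has brought up all threads, it does not interfere with the subcore initialisation discussed in this thread.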

== Comment: #10 - Paulo Flabiano Smorigo <email address hidden> - 2014-12-01 06:23:34 ==
I used smt-enabled= because for me it was the easiest way to disable it: just add this parameter to GRUB_CMDLINE_LINUX and done. :)

I'll check whether dropping it would cause any problem.

== Comment: #11 - Paulo Flabiano Smorigo <email address hidden> - 2014-12-01 08:30:55 ==
Greg, are you saying we should drop it for good? Maybe we can add that as a feature request for next year. Btw, I'm ok with dropping it for now.

== Comment: #12 - Greg Kurz <email address hidden> - 2014-12-01 09:30:00 ==
(In reply to comment #11)
> Greg, are you saying we should drop it for good? Maybe we can add that as a
> feature request for next year. Btw, I'm ok with dropping it for now.

Yes, drop it for good as suggested by Michael Ellerman...

<mpe> groug: that smt-enabled stuff is just a hack. It leaves the cpu executing wherever it happens to be, possibly in firmware, possibly busy looping somewhere, it's really no good for use in production
<mpe> the only way we could make it usable I think is to have the cpu come up, and then we offline it
<mpe> but I'm really inclined to say that should just be done in userspace
<groug> mpe, yeah... I had thought of something similar (starting and then offlining) but I agree it should be handled from userspace
<mpe> I'll talk to benh and anton about it tomorrow, but I think we just rip it out

The point is that it is already extremely easy to change SMT mode from an init script and you get the same result... compared to the hassle of doing it in the kernel without breaking things. Not even worth a feature request I would say.

== Comment: #13 - Greg Kurz <email address hidden> - 2014-12-12 08:50:25 ==
I've sent a patch:

powerpc/powernv: force all CPUs to be bootable

http://patchwork.ozlabs.org/patch/420440/

bugproxy (bugproxy) wrote : Full kernel log

Default Comment by Bridge

tags: added: architecture-ppc64le bugnameltc-119051 severity-medium targetmilestone-inin---
bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2014-12-13 05:30 EDT-------
Hi Canonical,

The request is to pick up the above patch. Thx.

penalvch (penalvch)
affects: ubuntu → linux (Ubuntu)
Changed in linux (Ubuntu):
importance: Undecided → High
status: New → Triaged
Chris J Arges (arges) wrote :

Now in Linus' tree:
git describe --contains d70a54e2d08510a99b1f10eceeae6f2f7086e226
v3.19-rc1~22^2

Chris J Arges (arges)
description: updated
Chris J Arges (arges) wrote :

We should also check if this affects 3.13.
Vivid is still tracking upstream so we should get these patches in automatically.

Changed in linux (Ubuntu Utopic):
assignee: nobody → Chris J Arges (arges)
status: New → In Progress
importance: Undecided → Critical
importance: Critical → Medium
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2015-01-06 18:39 EDT-------
(In reply to comment #20)
> We should also check if this affects 3.13.

It doesn't because powernv split-core isn't available in 3.13.

Chris J Arges (arges)
description: updated
Chris J Arges (arges)
Changed in linux (Ubuntu):
status: Triaged → In Progress
Brad Figg (brad-figg)
Changed in linux (Ubuntu Utopic):
status: In Progress → Fix Committed
Chris J Arges (arges)
Changed in linux (Ubuntu):
status: In Progress → Fix Committed
Brad Figg (brad-figg) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-utopic' to 'verification-done-utopic'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-utopic
Chris J Arges (arges) wrote :

Booted with smt-enabled=off; all CPUs booted anyway (which is what the patch does).

tags: added: verification-done-utopic
removed: verification-needed-utopic
Breno Leitão (breno-leitao) wrote :

Thanks Chris, we also verified this internally using the 3.16.0-30 kernel in -proposed, and it worked successfully.

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2015-01-29 19:25 EDT-------
Thanks, closing.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 3.16.0-30.40

---------------
linux (3.16.0-30.40) utopic; urgency=low

  [ Seth Forshee ]

  * Release Tracking Bug
    - LP: #1409890

  [ Andy Whitcroft ]

  * Revert "SAUCE: scsi: hyper-v storsvc switch up to SPC-3"
  * [Packaging] uploadnum should be the remainder of the version
    - LP: #1407755

  [ K. Y. Srinivasan ]

  * SAUCE: storvsc: force SPC-3 compliance on win8 and win8 r2 hosts
    - LP: #1406867

  [ Upstream Kernel Changes ]

  * Revert "xhci: clear root port wake on bits if controller isn't wake-up
    capable"
    - LP: #1408697
  * KVM: PPC: BOOK3S: HV: CMA: Reserve cma region only in hypervisor mode
    - LP: #1400209
  * powerpc/powernv: Ignore smt-enabled on Power8 and later
    - LP: #1402141
  * net/mlx4_en: Add VXLAN ndo calls to the PF net device ops too
    - LP: #1407760
  * net/mlx4_core: Enable CQE/EQE stride support
    - LP: #1400127
  * net/mlx4_core: Cache line EQE size support
    - LP: #1400127
  * net/mlx4_en: Add mlx4_en_get_cqe helper
    - LP: #1400127
  * net/mlx4_core: Introduce mlx4_get_module_info for cable module info
    reading
    - LP: #1400127
  * ethtool, net/mlx4_en: Cable info, get_module_info/eeprom ethtool
    support
    - LP: #1400127
  * net/mlx4_core: Introduce ACCESS_REG CMD and eth_prot_ctrl dev cap
    - LP: #1400127
  * net/mlx4_core: Add ethernet backplane autoneg device capability
    - LP: #1400127
  * ethtool, net/mlx4_en: Add 100M, 20G, 56G speeds ethtool reporting
    support
    - LP: #1400127
  * net/mlx4_en: Use PTYS register to query ethtool settings
    - LP: #1400127
  * net/mlx4_en: Use PTYS register to set ethtool settings (Speed)
    - LP: #1400127
  * net/mlx4_en: Add support for setting rxvlan offload OFF/ON
    - LP: #1400127
  * net/mlx4_en: Add ethtool support for [rx|tx]vlan offload set to OFF/ON
    - LP: #1400127
  * net/mlx4_core: Prevent VF from changing port configuration
    - LP: #1400127
  * net/mlx4_en: mlx4_en_set_settings() always fails when autoneg is set
    - LP: #1400127
  * sparc64: Fix constraints on swab helpers.
    - LP: #1408697
  * inetdevice: fixed signed integer overflow
    - LP: #1408697
  * ipv4: Fix incorrect error code when adding an unreachable route
    - LP: #1408697
  * ieee802154: fix error handling in ieee802154fake_probe()
    - LP: #1408697
  * qmi_wwan: Add support for HP lt4112 LTE/HSPA+ Gobi 4G Modem
    - LP: #1408697
  * bonding: fix curr_active_slave/carrier with loadbalance arp monitoring
    - LP: #1408697
  * pptp: fix stack info leak in pptp_getname()
    - LP: #1408697
  * ipx: fix locking regression in ipx_sendmsg and ipx_recvmsg
    - LP: #1408697
  * net/mlx4_en: Advertize encapsulation offloads features only when VXLAN
    tunnel is set
    - LP: #1408697
  * target: Don't call TFO->write_pending if data_length == 0
    - LP: #1408697
  * vhost-scsi: Take configfs group dependency during
    VHOST_SCSI_SET_ENDPOINT
    - LP: #1408697
  * srp-target: Retry when QP creation fails with ENOMEM
    - LP: #1408697
  * ASoC: fsi: remove unsupported PAUSE flag
    - LP: #1408697
  * ASoC: rsnd: remove unsupported PAUSE flag
    - LP: #1408697
  * ib_isert: Add max_send_sge...

Changed in linux (Ubuntu Utopic):
status: Fix Committed → Fix Released
bugproxy (bugproxy)
tags: added: targetmilestone-inin1410
removed: targetmilestone-inin--- verification-done-utopic
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2015-03-20 15:44 EDT-------
I could verify that the issue is fixed in the 3.16.0-30.40 linux package.

Chris J Arges (arges) wrote :

Fixed in Vivid since it was rebased to v3.19.

Changed in linux (Ubuntu):
status: Fix Committed → Fix Released