arm64: 'reboot' doesn't work, needs to pull the plug

Bug #1696436 reported by Paolo Pisati
This bug affects 1 person
Affects                 Status         Importance   Assigned to   Milestone
linux-raspi2 (Ubuntu)   New            Undecided    Unassigned
  Yakkety               Fix Released   Medium       Unassigned

Bug Description

[Impact]
The 'reboot' command doesn't work in the arm64 variant of the raspi2 kernel: after 'reboot' is issued, the board runs through the reboot process to its end and prints '[ 451.761674] reboot: Restarting system', but then it sits there forever. The only way to reboot the board is to physically pull the plug. Only Yakkety is affected.

After some investigation I found what's going on: the reboot process differs significantly between the armhf and arm64 variants of the raspi2 kernel, even though both use the same hardware mechanism.

On armhf, the board code (arch/arm/mach-bcm2709/bcm2709.c), which is executed early during boot, registers bcm2709_restart() as the restart callback in the board data structure. Whenever we execute 'reboot', that function is called: it directly initializes the watchdog hardware with a very short timeout, kicks it, and then sits there waiting for the timer to expire (and the watchdog to reboot the board).

For arm64, on the other hand, there's no board code (board code is a relic of the pre-DT period, and arm64 went DT-only from the get-go), so there is no board structure containing custom per-board functions (init callback, post-init callback, restart callback, etc.). Instead, the arm64 reboot code (arch/arm64/kernel/process.c::machine_restart()) invokes the generic reboot code (kernel/reboot.c::do_kernel_restart()), which in turn walks restart_handler_list (kernel/reboot.c::restart_handler_list) and invokes every reboot handler registered on it. In other words, it relies on another piece of code, one able to reset the board, to register its reboot function on that handler list.

On the Raspberry Pi board, the hardware capable of rebooting the board is the hardware watchdog, whose driver (among other things) registers a reboot handler once it attaches. On Y/arm64, however, the kernel driver for that hardware was built as a module and doesn't autoload on boot, so restart_handler_list is empty when 'reboot' is invoked.

[Fix]
Build the watchdog driver into the kernel (CONFIG_BCM2835_WDT=y).

[Test case]
Execute 'reboot' on Y/arm64 and watch the board hang at 'Restarting system'. Then install a kernel with the watchdog driver built in and execute 'reboot' again.

Stefan Bader (smb)
Changed in linux-raspi2 (Ubuntu Yakkety):
importance: Undecided → Medium
status: New → Fix Committed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux-raspi2 - 4.8.0-1043.47

---------------
linux-raspi2 (4.8.0-1043.47) yakkety; urgency=low

  * linux-raspi2: 4.8.0-1043.47 -proposed tracker (LP: #1701020)

  * arm64: 'reboot' doesn't work, needs to pull the plug (LP: #1696436)
    - [Config] BCM2835_WDT=y
    - abi: remove bcm2835_wdt from the modules list

  [ Ubuntu: 4.8.0-59.64 ]

  * linux: 4.8.0-59.64 -proposed tracker (LP: #1701019)
  * KILLER1435-S[0489:e0a2] BT cannot search BT 4.0 device (LP: #1699651)
    - Bluetooth: btusb: Add support for 0489:e0a2 QCA_ROME device
  * CVE-2017-7895
    - nfsd4: minor NFSv2/v3 write decoding cleanup
    - nfsd: stricter decoding of write-like NFSv2/v3 ops
  * CVE-2017-5551
    - tmpfs: clear S_ISGID when setting posix ACLs
  * CVE-2017-9605
    - drm/vmwgfx: Make sure backup_handle is always valid
  * CVE-2017-1000380
    - ALSA: timer: Fix race between read and ioctl
    - ALSA: timer: Fix missing queue indices reset at SNDRV_TIMER_IOCTL_SELECT
  * CVE-2017-9150
    - bpf: don't let ldimm64 leak map addresses on unprivileged
  * CVE-2017-5576
    - drm/vc4: Fix an integer overflow in temporary allocation layout.
  * Processes in "D" state due to zap_pid_ns_processes kernel call with Ubuntu +
    Docker (LP: #1698264)
    - pid_ns: Sleep in TASK_INTERRUPTIBLE in zap_pid_ns_processes
  * CVE-2016-9755
    - netfilter: ipv6: nf_defrag: drop mangled skb on ream error
  * CVE-2017-7346
    - drm/vmwgfx: limit the number of mip levels in vmw_gb_surface_define_ioctl()
  * CVE-2017-8924
    - USB: serial: io_ti: fix information leak in completion handler
  * CVE-2017-8925
    - USB: serial: omninet: fix reference leaks at open
  * CVE-2017-9074
    - ipv6: Check ip6_find_1stfragopt() return value properly.
  * CVE-2014-9900
    - net: Zeroing the structure ethtool_wolinfo in ethtool_get_wol()
  * OpenPower: Some multipaths temporarily have only a single path
    (LP: #1696445)
    - scsi: ses: don't get power status of SES device slot on probe

 -- Thadeu Lima de Souza Cascardo <email address hidden> Thu, 29 Jun 2017 16:38:19 -0300

Changed in linux-raspi2 (Ubuntu Yakkety):
status: Fix Committed → Fix Released