[highbank] hvc0 getty causes random hangs

Bug #1171582 reported by dann frazier
This bug affects 2 people
Affects                   Status         Importance   Assigned to     Milestone
finish-install (Ubuntu)   Fix Released   High         dann frazier
  Raring                  Fix Released   High         dann frazier
  Saucy                   Fix Released   High         dann frazier
linux (Ubuntu)            Fix Released   High         Paolo Pisati
  Raring                  Won't Fix      High         Paolo Pisati
  Saucy                   Won't Fix      High         Paolo Pisati

Bug Description

After doing a fresh install of raring, I've observed several issues across multiple systems, and they all seem to be correlated with the hvc0 upstart job (hvc0 getty) running.

One symptom I've seen on both machines is that the network goes unresponsive for several seconds before recovering. If I run a ping, I see a pattern of 84-85s of responses, followed by about 41s of failures.

The khvcd process consumes a lot of CPU - often 100% of a processor.

Processes hang at random and appear to be unrecoverable. I ran an apt-get install that never emitted any output and stayed hung until I rebooted the system.

When I try to kill the hvc0 upstart job (sudo stop hvc0 - or a kill -9), the process does not die.

On one of the two hosts, I observed the console getty (ttyAMA0) starting successfully but never providing a login prompt on the console. Restarting this process didn't solve the problem.

Disabling the hvc0 upstart job and rebooting reliably makes these problems go away.
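
The report does not say how the job was disabled. One possible way, sketched here under the assumption that the installed upstart honours override files (upstart 1.3 or later), is to drop a "manual" stanza next to the job definition. The helper name disable_upstart_job is hypothetical, and the directory argument exists only so the helper can be tried in a scratch directory.

```shell
#!/bin/sh
# Sketch: disable an upstart job without deleting its .conf file.
# Assumption: upstart >= 1.3, which reads "<job>.override" files.
# disable_upstart_job is a hypothetical helper, not part of upstart.
disable_upstart_job() {
    job="$1"
    initdir="${2:-/etc/init}"   # overridable so the helper can be tried in a scratch dir
    # "manual" makes upstart ignore the job's "start on" condition
    echo manual > "$initdir/$job.override"
}

# Example (as root), followed by a reboot:
#   disable_upstart_job hvc0
```

Removing the override file re-enables the job, which makes this easier to undo than deleting /etc/init/hvc0.conf.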

dann frazier (dannf) wrote :

[ 0.000000] Linux version 3.8.0-19-generic (buildd@alnasl) (gcc version 4.7.3 (Ubuntu/Linaro 4.7.3-1ubuntu1) ) #29-Ubuntu SMP Wed Apr 17 18:22:15 UTC 2013 (Ubuntu 3.8.0-19.29-generic 3.8.8)

Chris Van Hoof (vanhoof)
Changed in linux (Ubuntu):
importance: Undecided → High
Brad Figg (brad-figg) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 1171582

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Paolo Pisati (p-pisati)
Changed in linux (Ubuntu):
assignee: nobody → Paolo Pisati (p-pisati)
Paolo Pisati (p-pisati)
tags: added: highbank
tags: added: raring
dann frazier (dannf) wrote :

The /etc/init/hvc0.conf file is created by the installer (finish-install, 90console). The code that adds it is in the following conditional:

elif [ -e /sys/bus/xen ] && [ -e /dev/hvc0 ]; then

Until raring, we didn't have Xen or hvc_console support on armhf, so this code branch was never taken.

We're only seeing this now because our testing of highbank raring has all been based on quantal->raring upgrades, where the installer code does not run. Once we had a working installer (yesterday), we began hitting it reliably.

Here's a change to finish-install that avoids creating an upstart job for the hvc0 getty on armhf/generic systems. Perhaps this can serve as a workaround until the kernel can be fixed?

I've investigated kernel cmdline options we could pass to serve as a documented workaround, but came up empty-handed. I found no option to disable Xen at runtime (avoiding the instantiation of /sys/bus/xen), and no option to disable the hvc_console driver, which is statically linked.
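
For illustration, the kind of guard described above could look roughly like the sketch below. This is not the attached debdiff: want_hvc0_getty is a hypothetical helper, the subarchitecture string is assumed to have the "armhf/generic" form used in this report, and the path arguments exist only so the logic can be exercised outside the installer. The armhf/generic* pattern follows the later comment in this thread that extends the workaround to armhf/generic-lpae.

```shell
#!/bin/sh
# Sketch only: NOT the actual finish-install change.
# want_hvc0_getty is a hypothetical helper. It succeeds (returns 0) when an
# hvc0 getty job should be created, and fails otherwise.
want_hvc0_getty() {
    subarch="$1"                    # e.g. "armhf/generic" (assumed format)
    xen_bus="${2:-/sys/bus/xen}"
    hvc_dev="${3:-/dev/hvc0}"
    case "$subarch" in
        armhf/generic*) return 1 ;; # skip: the hvc0 getty is tied to the hangs here
    esac
    # Same test as the original 90console conditional
    [ -e "$xen_bus" ] && [ -e "$hvc_dev" ]
}

# Example:
#   if want_hvc0_getty "armhf/generic"; then echo "create hvc0.conf"; fi
```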

Chris Van Hoof (vanhoof) wrote :

Verified this from raring-proposed using finish-install version 2.42ubuntu1 on a highbank node:
 * 90console did execute per installation logs
 * no existence of /etc/init/hvc0.conf on the installed system
 * no odd hangs, sporadic network access, or getty processes pegging one of the cores
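
The second check in the list above is easy to script. A minimal sketch, assuming the job file lives at /etc/init/hvc0.conf as stated earlier in this report (check_hvc0_absent is a hypothetical helper, and the optional root prefix is only there so it can be pointed at a mounted or scratch filesystem):

```shell
#!/bin/sh
# Sketch: confirm that the installer did not create the hvc0 upstart job.
# check_hvc0_absent is a hypothetical helper, not an existing tool.
check_hvc0_absent() {
    root="${1:-}"                   # optional prefix, e.g. /target
    if [ -e "$root/etc/init/hvc0.conf" ]; then
        echo "FAIL: hvc0.conf present"
        return 1
    fi
    echo "OK: no hvc0.conf"
}

# Example:
#   check_hvc0_absent          # check the running system
#   check_hvc0_absent /target  # check a mounted install target
```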

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
tags: added: verification-done
Chris Van Hoof (vanhoof)
Changed in finish-install (Ubuntu):
status: New → Confirmed
importance: Undecided → High
assignee: nobody → dann frazier (dannf)
Changed in linux (Ubuntu Raring):
status: Confirmed → Triaged
status: Triaged → In Progress
milestone: none → raring-updates
Ubuntu Foundations Team Bug Bot (crichton) wrote :

The attachment "workaround for this issue in finish-install .udeb" seems to be a debdiff. The ubuntu-sponsors team has been subscribed to the bug report so that they can review and hopefully sponsor the debdiff. If the attachment isn't a patch, please remove the "patch" flag from the attachment, remove the "patch" tag, and, if you are a member of ~ubuntu-sponsors, unsubscribe the team.

[This is an automated message performed by a Launchpad user owned by ~brian-murray, for any issue please contact him.]

tags: added: patch
Adam Conrad (adconrad)
Changed in finish-install (Ubuntu Raring):
status: Confirmed → Fix Released
Rob Herring (r-herring) wrote :

This fix seems to be applied to highbank, but not to midway, which is having the same soft lockup problems.

Adam Conrad (adconrad) wrote :

Applied the same fix to armhf/generic-lpae (actually, armhf/generic*) as armhf/generic, doing an end-to-end installer test to make sure this works around the lockups we were seeing.

Adam Conrad (adconrad) wrote :

Installer and stress-test on midway passed, considering this closed (again).

Andy Whitcroft (apw)
Changed in linux (Ubuntu):
milestone: raring-updates → saucy-updates
Joseph Salisbury (jsalisbury) wrote : Closing unsupported series nomination.

This bug was nominated against a series that is no longer supported, i.e. saucy. The bug task representing the saucy nomination is being closed as Won't Fix.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu Saucy):
status: In Progress → Won't Fix
Joseph Salisbury (jsalisbury) wrote :

This bug was nominated against a series that is no longer supported, i.e. raring. The bug task representing the raring nomination is being closed as Won't Fix.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu Raring):
status: In Progress → Won't Fix
Changed in linux (Ubuntu):
status: In Progress → Fix Released