ISST-KVM:Ubuntu14.04: guest console not accessible but ssh and ping works fine

Bug #1391919 reported by bugproxy
This bug affects 1 person
Affects                  Status        Importance  Assigned to  Milestone
finish-install (Ubuntu)  Confirmed     Undecided   Adam Conrad
  Trusty                 Fix Released  Undecided   Adam Conrad

Bug Description

[ SRU Justification ]
There seem to be any number of ways libvirt/qemu can give us a console on a port that is "non-standard" by PowerVM/OpenFirmware standards, and this leads to a system with no getty on hvc0 because finish-install isn't particularly bright about dealing with consoles appearing at device-tree nodes it wasn't expecting.

This SRU just sidesteps all of that and, if we're running in a KVM-emulated pSeries machine, we unconditionally set up an upstart job for an hvc0 console, rather than trying to do fancy detection.
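
A minimal sketch of the kind of check this implies, not the actual finish-install code. It assumes that a KVM-emulated pSeries guest reports "pSeries (emulated by qemu)" in /proc/cpuinfo; the function names is_qemu_pseries and forced_console are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the given cpuinfo file describes a
# qemu-emulated pSeries machine. The match string is an assumption about
# what such guests report in /proc/cpuinfo.
is_qemu_pseries() {
    cpuinfo="${1:-/proc/cpuinfo}"
    [ -r "$cpuinfo" ] && grep -q 'pSeries (emulated by qemu)' "$cpuinfo"
}

# Print the console that should get a getty job unconditionally,
# or "none" when the normal detection logic should be used instead.
forced_console() {
    if is_qemu_pseries "$1"; then
        echo hvc0
    else
        echo none
    fi
}

forced_console /proc/cpuinfo
```

The point of the design is that the decision depends only on the machine type, not on where the console happens to sit in the device tree.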

[ Test Case ]
Run a netinst with apt-setup/proposed=true to pull in the new finish-install and verify that it always creates hvc0 on a qemu machine, regardless of the location of the console.

[ Regression Potential ]
Basically zero, as I opted for the safer route of just cargo-culting the console-creation code into a new if block, rather than trying to alter the logic to reduce code duplication and risk subtly breaking the flow for cases where hvc* might exist, despite not being a qemu PPC machine.

[ Original Report ]
-- Problem Description --
I installed a guest with Ubuntu 14.04 using VGA over VNC via kimchi. All setup was done: I could ping the guest, ssh to it, and access it through kimchi. But to access the guest using "virsh console", I did the following:

1. Removed video and graphics line from xml
2. Destroyed guest
3. Started guest using "virsh start --console <guestname>"

After this the guest hangs, and the boot output on the console is as follows:

 * Starting SystemD login management service [ OK ]
 * Starting load fallback graphics devices [fail]
 * Starting configure network device security [ OK ]
 * Starting system logging daemon [ OK ]
 * Starting save udev log and update rules [ OK ]
 * Stopping rpcsec_gss daemon [ OK ]
 * Stopping save udev log and update rules [ OK ]
 * Starting set console font [ OK ]
 * Starting NFSv4 id <-> name mapper [ OK ]
 * Stopping set console font [ OK ]
 * Starting userspace bootsplash [ OK ]
 * Stopping userspace bootsplash [ OK ]
 * Starting Send an event to indicate plymouth is up [ OK ]
 * Stopping Send an event to indicate plymouth is up [ OK ]
 * Starting configure virtual network devices [ OK ]
 * Starting NFSv4 id <-> name mapper [ OK ]
 * Starting configure network device security [ OK ]
 * Starting configure network device [ OK ]
 * Starting Mount network filesystems [ OK ]
 * Starting Upstart job to start rpcbind on boot only [ OK ]
 * Starting Failsafe Boot Delay [ OK ]
 * Stopping Upstart job to start rpcbind on boot only [ OK ]
 * Stopping Failsafe Boot Delay [ OK ]
 * Starting System V initialisation compatibility [ OK ]
 * Stopping Mount network filesystems [ OK ]
 * Starting configure network device [ OK ]
 * Starting Mount network filesystems [ OK ]
 * Stopping Mount network filesystems [ OK ]
Skipping profile in /etc/apparmor.d/disable: usr.sbin.rsyslogd
 * Starting RPC portmapper replacement [ OK ]
 * Starting NSM status monitor [ OK ]
 * Starting AppArmor profiles [ OK ]
 * Starting Bridge file events into upstart [ OK ]
 * Starting Bridge socket events into upstart [ OK ]
 * Stopping System V initialisation compatibility [ OK ]
 * Starting System V runlevel compatibility [ OK ]
 * Starting save kernel messages [ OK ]
 * Starting internet superserver inetd [ OK ]
 * Stopping save kernel messages [ OK ]
 * Restoring resolver state... [ OK ]
 * Starting CPU interrupts balancing daemon [ OK ]
 * Stopping System V runlevel compatibility [ OK ]

I can ping and ssh to guest when machine is in hung state.

It seems it is not giving console output due to the error below:

Starting load fallback graphics devices [fail]

The problem occurs when the video and graphics lines are deleted: the guest console cannot be accessed, but ssh to the guest still works.

Logging into your system:

root@ubu14mdbsvr1:~# cd /etc/init
root@ubu14mdbsvr1:/etc/init# cat hvc0.conf
cat: hvc0.conf: No such file or directory

On my Ubuntu guest, I have the following in hvc0.conf:

# hvc0 - getty
#
# This service maintains a getty on hvc0 from the point the system is
# started until it is shut down again.

start on stopped rc RUNLEVEL=[2345] and (
            not-container or
            container CONTAINER=lxc or
            container CONTAINER=lxc-libvirt)

stop on runlevel [!2345]

respawn
exec /sbin/getty -L hvc0 9600 vt100

I expect that if you create that file and issue a `start hvc0`, you will be able to log in. I believe the Ubuntu installer, when using graphics mode, assumes you will stay in graphics mode, and you as the user are responsible for setting up the hvc0 login.

I created the "hvc0.conf" file in /etc/init, issued `start hvc0`, and I could access the console of the guest. It worked.
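
The workaround above could be scripted roughly as follows. This is a sketch only: write_hvc0_job is a hypothetical helper name, and the job text is copied from the hvc0.conf quoted earlier in this report.

```shell
#!/bin/sh
# Write the hvc0 upstart job quoted in this report into the given
# directory (default /etc/init). write_hvc0_job is a hypothetical name.
write_hvc0_job() {
    initdir="${1:-/etc/init}"
    cat > "$initdir/hvc0.conf" <<'EOF'
# hvc0 - getty
#
# This service maintains a getty on hvc0 from the point the system is
# started until it is shut down again.

start on stopped rc RUNLEVEL=[2345] and (
            not-container or
            container CONTAINER=lxc or
            container CONTAINER=lxc-libvirt)

stop on runlevel [!2345]

respawn
exec /sbin/getty -L hvc0 9600 vt100
EOF
}

# As root on an affected 14.04 guest:
#   write_hvc0_job /etc/init && start hvc0
```

The function is only defined here, not called, so running the snippet does not touch /etc/init by accident.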

Problem persists in 14.10. The general sense is that, from a customer perspective, hvc0 should always be enabled on PowerKVM guests, regardless of the installation method (graphical or not).

bugproxy (bugproxy) wrote : XML file before and after removing graphics and video lines

Default Comment by Bridge

tags: added: architecture-ppc64 bugnameltc-116562 severity-high targetmilestone-inin---
Luciano Chavez (lnx1138)
affects: ubuntu → debian-installer (Ubuntu)
bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2015-02-02 09:11 EDT-------
Any updates on this bug?

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2015-02-21 18:49 EDT-------
*** Bug 121875 has been marked as a duplicate of this bug. ***

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2015-02-23 21:26 EDT-------
Canonical, do you agree with our assessment of the issue?

Thanks,
Nish

Changed in debian-installer (Ubuntu):
status: New → Incomplete
status: Incomplete → Confirmed
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2015-05-02 05:57 EDT-------
*** Bug 124322 has been marked as a duplicate of this bug. ***

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2015-05-05 01:22 EDT-------
(In reply to comment #15)
> (In reply to comment #14)
> > (In reply to comment #13)
> > > (In reply to comment #12)
> > > > (In reply to comment #11)
>
> > Hrm, maybe it's the login you need to look for. On my ubuntu guest:
> >
> > root@ubuntu:~# ps aux | grep login
> > root 1762 0.0 0.3 8704 7744 hvc0 Ss 15:03 0:00 /bin/login
> > --
> >
> > Is there a login process running on your guest?
>
> I checked for a login process in my guest, and it's running:
>
> root@ubu14mdbsvr1:~# ps aux | grep login
> root 483 0.0 0.0 5120 3008 pts/2 S+ 00:57 0:00 grep
> --color=auto login
> root 1655 0.0 0.0 7040 5696 ? Ss Sep30 0:00
> /lib/systemd/systemd-logind
> root 5360 0.0 0.0 8576 7616 tty1 Ss Sep30 0:00 /bin/login
> --
> root 31455 0.0 0.0 6464 4416 ? S 00:57 0:00 /bin/bash
> /testcases/tcp/ltp/tcp_cmds/rlogin/rlogin01

Just FYI on this particular point: when everything's working you will have a "getty" running on hvc0 *before* you log in and a "login" process running on it *after* you log in.
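
A small sketch of that check, assuming a procps-style `ps` that accepts `-eo tty=,comm=`. The helper name console_state is hypothetical. It reads "TTY COMMAND" pairs on stdin and reports whether the console has a getty, a login, or neither (the symptom in this bug).

```shell
#!/bin/sh
# Classify a console from "TTY COMMAND" lines on stdin, as produced by
# `ps -eo tty=,comm=`. Prints "login", "getty", or "none".
# console_state is a hypothetical helper name.
console_state() {
    awk -v tty="${1:-hvc0}" '
        $1 == tty && $2 == "login" { found = "login" }
        $1 == tty && $2 == "getty" && found == "" { found = "getty" }
        END { print (found == "" ? "none" : found) }
    '
}

# Check the live system (errors from ps are ignored).
ps -eo tty=,comm= 2>/dev/null | console_state hvc0
```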

tags: added: targetmilestone-inin14043
removed: targetmilestone-inin---
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2015-05-05 04:55 EDT-------
Apparently this is still a problem for 14.04 on PowerKVM. I just looked at a bug that was dup of this (LTC bug 124322). It was a 14.04.2 with 3.13.0-51 kernel, but I think this affects all 14.04 LTS. Adding the "hvc0.conf" as described above fixed it.

This doesn't seem to be a problem for 15.04. GA level 15.04 on the same PowerKVM guest worked fine. There are some differences between 14.04 and 15.04; the latter uses systemd to manage the gettys, while 14.04 uses initscripts (upstart)? But the requirement is the same: we must have a getty running on hvc0.

This also doesn't seem to be a problem for 14.04 in the PowerVM environment, and I'm not sure why. Both appear to use hvc0 for the text console, but on PowerVM /etc/init/hvc0.conf is getting created and on PowerKVM it is not.

(It's also not a problem for 14.04 on x86 KVM, which uses a device other than /dev/hvc0...)

I noticed that most of the other, similar files are owned by upstart:

$ dpkg-query -L upstart | egrep "tty|console"
/etc/init/console.conf
/etc/init/tty1.conf
/etc/init/tty2.conf
/etc/init/tty3.conf
/etc/init/tty4.conf
/etc/init/tty5.conf
/etc/init/tty6.conf

But I could not find the owner of hvc0.conf, and by the timestamp it is clear that it is created at install time:

$ dpkg-query -S /etc/init/hvc0.conf
dpkg-query: no path found matching pattern /etc/init/hvc0.conf

$ ll /etc/init/hvc0.conf
-rw-r--r-- 1 root root 353 May 5 00:01 /etc/init/hvc0.conf

Can we get someone from Canonical to explain how this file is created, and why we might be getting it in PowerVM but not PowerKVM?

I noticed in both PowerVM and PowerKVM, /proc/consoles lists hvc0...

$ cat /proc/consoles
hvc0 -W- (EC ) 229:0

$ cat /proc/devices | grep 229
229 hvc

So maybe this can be used as an indicator of when hvc0.conf should be created?
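
A rough sketch of that idea, not what the installer actually does. The function names list_consoles and has_hvc_console are hypothetical; the only assumption about the file format is that the console name is the first field on each line, as in the output above.

```shell
#!/bin/sh
# List console device names (first field) from a /proc/consoles-style file.
list_consoles() {
    awk 'NF { print $1 }' "${1:-/proc/consoles}"
}

# Succeed if any registered console is an hvc device (hvc0, hvc1, ...).
has_hvc_console() {
    list_consoles "$1" | grep -q '^hvc[0-9][0-9]*$'
}

if has_hvc_console /proc/consoles 2>/dev/null; then
    echo "hvc console registered: a getty job is needed"
else
    echo "no hvc console registered"
fi
```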

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2015-05-05 05:04 EDT-------
A few more thoughts: The situation may be more complicated on PowerKVM due to there being many ways to create the console device in the domain file. What I mean is: it's possible there is some domain configuration that makes 14.04 work as-is, and I just haven't seen it. The bug I just worked on had a configuration like this, with no "video" or "graphics" stanza:

<serial type='pty'>
<source path='/dev/pts/4'/>
<target type='isa-serial' port='0'/>
<alias name='serial0'/>
<address type='spapr-vio' reg='0x30001000'/>
</serial>
<console type='pty' tty='/dev/pts/4'>
<source path='/dev/pts/4'/>
<target type='serial' port='0'/>
<alias name='serial0'/>
<address type='spapr-vio' reg='0x30001000'/>
</console>

Honestly, I'm not even clear how the console gets mapped to /dev/hvc0 on PowerKVM. On x86 KVM a similar-looking config maps the console to /dev/ttyS0. But in all cases, the device can be gleaned from /proc/consoles.

And of course, I'm considering only what is seen over the text console accessed by "virsh console". There are additional ttys mapped to the video console, and that's something that's also a bit different between x86 and Power.

Anyway, without going into more detail that may not be relevant: I'd like to understand how hvc0.conf is created, why it's not being created in some cases, and if ensuring it is created might just be the simple fix.

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2015-05-05 14:30 EDT-------
*** Bug 124322 has been marked as a duplicate of this bug. ***

Adam Conrad (adconrad) wrote :

 <address type='spapr-vio' reg='0x30001000'/>

Wasn't that a bug/misfeature in an older version of kimchi to use that oddly non-standard address? Not that it isn't also a bug in d-i that we should look more aggressively for consoles to set up, and I'll assign this to myself to see if maybe we can do something slightly less silly for 14.04.3.

Changed in debian-installer (Ubuntu):
assignee: nobody → Adam Conrad (adconrad)
Steve Langasek (vorlon) wrote :

> I'd like to understand how hvc0.conf is created, why it's not
> being created in some cases,

The installer uses various means to detect if a serial console is in use and creates the upstart job on the fly.

> and if ensuring it is created might just be the simple fix.

Yes that is the correct fix.

Adam Conrad (adconrad) wrote :

The slightly more long-winded analysis I tossed in an email a few minutes ago:

So, there are two (maybe three) issues in play with this specific bug.

1) The console wasn't in use as "The Console" when we installed, which
   we (probably incorrectly, in this case) key off of to decide what
   console to set up.

2) The console he set up was using an old kimchi default that used a
   non-standard address we don't look for. Newer versions of kimchi
   are meant to use a more standard address to work around that, but
   it's still an Ubuntu bug that we blatantly ignore random consoles.

3) He may not have had that console configured at install time at all,
   and we only probe for and set up console jobs in the installer,
   we make no attempt to do so at boot time. So, if new consoles were
   to show up on second or fifth boot, tough luck, no console for you.

The first two bugs are ones we can and should fix in trusty. The third
is probably a wart we need to live with, but can probably fix a bit
better with systemd in 15.04/15.10/16.04.

bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2015-05-19 04:44 EDT-------
(In reply to comment #48)
> > I'd like to understand how hvc0.conf is created, why it's not
> > being created in some cases,
>
> The installer uses various means to detect if a serial console is in use and
> creates the upstart job on the fly.
>
As I suggested above, maybe checking /proc/consoles, if it's not already doing that?

> > and if ensuring it is created might just be the simple fix.
>
> Yes that is the correct fix.
>
Yeah, if we think there's no harm in always adding the file, let's just do that and be done with it, :-)

> The slightly more long-winded analysis I tossed in an email a few minutes
> ago:
>
> So, there are two (maybe three) issues in play with this specific bug.
>
FYI, I just worked on a bug that was a dup of this and...

> 1) The console wasn't in use as "The Console" when we installed, which
> we (probably incorrectly, in this case) key off of to decide what
> console to set up.
>
In my case it was the console used for install: a text-based install using "virsh console", and it was recognized as hvc0.

> 2) The console he set up was using an old kimchi default that used a
> non-standard address we don't look for. Newer versions of kimchi
> are meant to use a more standard address to work around that, but
> it's still an Ubuntu bug that we blatantly ignore random consoles.
>
In my case no kimchi, just generic PowerKVM and virsh.

> 3) He may not have had that console configured at install time at all,
> and we only probe for and set up console jobs in the installer,
> we make no attempt to do so at boot time. So, if new consoles were
> to show up on second or fifth boot, tough luck, no console for you.
>
In my case console was present (and used) at install.

> The first two bugs are ones we can and should fix in trusty. The third
> is probably a wart we need to live with, but can probably fix a bit
> better with systemd in 15.04/15.10/16.04.

Actually it works in 15.04. We only need a fix for 14.04 LTS, and then only on PowerKVM. It works on PowerVM.

Not sure about 14.10, but I'm guessing it's broken there too if that is pre-systemd.

Adam Conrad (adconrad) wrote :

If it works on 15.04, but not 14.04, it's by accident. As in, the installer isn't making this work, it's systemd that's spawning a console anyway. Your XML file shows the same non-standard console setup that we previously complained about with kimchi. Maybe this is a libvirt issue that it's picking bizarre port addresses for the serial, but we definitely need to relax how we look at those.

However, the initial bug report was for installing via VGA (so, no text console) and then rebooting. Fixing that case in trusty is a bit tougher, but we might be able to just detect the machine type and decide those should always have an hvc0 regardless, or similar. I'll think about it.

The reason this always works with PowerVM is the combination of (a) all installs are text-mode, so we can always look up the root console(s) during install, and they'll always exist, and (b) PowerVM gives us a consistent address for hvc0/hvc1 and we therefore detect them correctly every time. qemu/kvm allowing people to define these things willy-nilly is marginally unhelpful.

Anyhow, the above paragraph isn't excuses, so much as an explanation for why things work the way they do now. We assumed the world of SLOF-under-qemu would look just like OF-under-PowerVM (and, 99% of the time, this assumption is true), and that is biting us a bit. When I get back from vacation later this week, I'll dream up some solid ways to deal with this new world order.

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2015-05-22 18:10 EDT-------
(In reply to comment #48)
> 2) The console he set up was using an old kimchi default that used a
> non-standard address we don't look for. Newer versions of kimchi
> are meant to use a more standard address to work around that, but
> it's still an Ubuntu bug that we blatantly ignore random consoles.

I don't think this has changed in newer versions of Kimchi, and from libvirt documentation, Kimchi is just following libvirt's convention:

https://libvirt.org/formatdomain.html#elementsAddress
type='spapr-vio'
On PowerPC pseries guests, devices can be assigned to the SPAPR-VIO bus. It has a flat 64-bit address space; by convention, devices are generally assigned at a non-zero multiple of 0x1000, but other addresses are valid and permitted by libvirt. Each address has the following additional attribute: reg (the hex value address of the starting register). Since 0.9.9.
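
As a quick arithmetic check of the convention quoted above (a sketch; follows_vio_convention is a hypothetical name), both addresses mentioned in this report, 0x30000000 and 0x30001000, are non-zero multiples of 0x1000:

```shell
#!/bin/sh
# Succeed if the address is a non-zero multiple of 0x1000, which is the
# convention described in the libvirt text quoted above.
# follows_vio_convention is a hypothetical helper name.
follows_vio_convention() {
    addr=$(( $1 ))
    [ "$addr" -ne 0 ] && [ $(( addr % 0x1000 )) -eq 0 ]
}

for reg in 0x30000000 0x30001000; do
    if follows_vio_convention "$reg"; then
        echo "$reg follows the convention"
    else
        echo "$reg does not follow the convention"
    fi
done
```

So by libvirt's own description neither address is unusual, which suggests the mismatch is in what the installer expects rather than in what libvirt or Kimchi generates.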

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2015-05-28 16:21 EDT-------
(In reply to comment #54)
> If it works on 15.04, but not 14.04, it's by accident. As in, the installer
> isn't making this work, it's systemd that's spawning a console anyway. Your
> XML file shows the same non-standard console setup that we previously
> complained about with kimchi. Maybe this is a libvirt issue that it's
> picking bizarre port addresses for the serial, but we definitely need to
> relax how we look at those.
>
So, can someone provide an example of the "standard" console setup in the XML, that will make this work?

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2015-05-28 17:54 EDT-------
(In reply to comment #57)
> (In reply to comment #54)
> > If it works on 15.04, but not 14.04, it's by accident. As in, the installer
> > isn't making this work, it's systemd that's spawning a console anyway. Your
> > XML file shows the same non-standard console setup that we previously
> > complained about with kimchi. Maybe this is a libvirt issue that it's
> > picking bizarre port addresses for the serial, but we definitely need to
> > relax how we look at those.
> >
> So, can someone provide an example of the "standard" console setup in the
> XML, that will make this work?

I suppose that if you use 0x30000000 as the address for the serial console that should work on Ubuntu, as explained before on LTC bug 110860 comment 12.
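
To see which address a guest's console currently uses, something like the following might help. It is a line-based sketch, not a real XML parser; console_reg and check_console_reg are hypothetical helper names, and 0x30000000 is only the address suggested in the comment above, not a confirmed requirement.

```shell
#!/bin/sh
# Print the reg value found inside the first <console> element of a
# libvirt domain XML read from stdin. Line-based matching only; a real
# tool should use an XML parser.
console_reg() {
    awk '
        /<console[ >]/ { inside = 1 }
        inside && match($0, /reg=.0x[0-9a-fA-F]+/) {
            # Skip the 5 characters of "reg=" plus the opening quote.
            print substr($0, RSTART + 5, RLENGTH - 5)
            exit
        }
        /<\/console>/ { inside = 0 }
    '
}

# Compare the console address with the one suggested in this report.
check_console_reg() {
    reg=$(console_reg)
    if [ "$reg" = "0x30000000" ]; then
        echo "console at $reg (address suggested in this report)"
    else
        echo "console at ${reg:-unknown} (differs from 0x30000000)"
    fi
}

# Usage on the host (guest name is a placeholder):
#   virsh dumpxml <guestname> | check_console_reg
```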

bugproxy (bugproxy)
tags: added: architecture-ppc64le
removed: architecture-ppc64
Steve Langasek (vorlon)
Changed in debian-installer (Ubuntu):
milestone: none → ubuntu-14.04.3
Changed in debian-installer (Ubuntu Trusty):
milestone: none → ubuntu-14.04.3
Changed in debian-installer (Ubuntu):
milestone: ubuntu-14.04.3 → none
Changed in debian-installer (Ubuntu Trusty):
assignee: nobody → Adam Conrad (adconrad)
Adam Conrad (adconrad)
affects: debian-installer (Ubuntu) → finish-install (Ubuntu)
Adam Conrad (adconrad)
description: updated
Steve Langasek (vorlon) wrote : Please test proposed package

Hello bugproxy, or anyone else affected,

Accepted finish-install into trusty-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/finish-install/2.46ubuntu4 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in finish-install (Ubuntu Trusty):
status: New → Fix Committed
tags: added: verification-needed
Adam Conrad (adconrad) wrote :

Verified that the above SRU creates an upstart job for hvc0 regardless of the VM settings in play.

tags: added: verification-done
removed: verification-needed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package finish-install - 2.46ubuntu4

---------------
finish-install (2.46ubuntu4) trusty; urgency=medium

  * Always set up hvc0 on KVM-emulated pSeries guests (LP: #1391919)

 -- Adam Conrad <email address hidden> Mon, 03 Aug 2015 15:51:01 -0600

Changed in finish-install (Ubuntu Trusty):
status: Fix Committed → Fix Released
Adam Conrad (adconrad) wrote : Update Released

The verification of the Stable Release Update for finish-install has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.
