ISST-KVM:Ubuntu14.04:LE guest failed to boot on its first reboot after installation

Bug #1347967 reported by bugproxy
Affects                  Status       Importance  Assigned to  Milestone
finish-install (Ubuntu)  In Progress  Undecided   Adam Conrad
Trusty                   New          Undecided   Unassigned

Bug Description

== Comment: #0 - Gopesh Kumar Chaudhary <email address hidden> - 2014-05-26 06:20:02 ==
DEFECT : PowerKVM
 TYPE OF DEFECT:Guest

Defect Description :

I created an LE guest using the XML below and installed it from the Ubuntu ISO.
Installation was successful, but on reboot it got stuck at the output below.

I tried both qcow2 and raw disk images and hit the same issue.

#46-Ubuntu SMP TSkipping profile in /etc/apparmor.d/disable: usr.sbin.rsyslogd
 * Starting AppArmor profiles [ OK ]
 * Restoring resolver state... [ OK ]
[ 3.932048] sda2: WRITE SAME failed. Manually zeroing.

Guest xml
------------
[root@kimkvm qemu]# virsh dumpxml flyg1
<domain type='kvm' id='37'>
  <name>flyg1</name>
  <uuid>96904896-61bb-4a3c-bc23-e728d208bae7</uuid>
  <memory unit='KiB'>5242880</memory>
  <currentMemory unit='KiB'>5242880</currentMemory>
  <vcpu placement='static'>6</vcpu>
  <resource>
    <partition>/machine</partition>
  </resource>
  <os>
    <type arch='ppc64' machine='pseries'>hvm</type>
    <boot dev='hd'/>
    <boot dev='cdrom'/>
  </os>
  <features>
    <acpi/>
    <apic/>
    <pae/>
  </features>
  <clock offset='utc'/>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <devices>
    <emulator>/usr/bin/qemu-kvm</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw' cache='none'/>
      <source file='/var/lib/libvirt/images/flyg1.raw'/>
      <target dev='sda' bus='scsi'/>
      <alias name='scsi0-0-0-0'/>
      <address type='drive' controller='0' bus='0' target='0' unit='0'/>
    </disk>
    <disk type='file' device='cdrom'>
      <driver name='qemu' type='raw'/>
      <source file='/var/lib/libvirt/images/ubuntu-14.04-server-ppc64el.iso'/>
      <target dev='sdc' bus='scsi' tray='open'/>
      <readonly/>
      <alias name='scsi0-0-0-2'/>
      <address type='drive' controller='0' bus='0' target='0' unit='2'/>
    </disk>
    <controller type='usb' index='0'>
      <alias name='usb0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x0'/>
    </controller>
    <controller type='pci' index='0' model='pci-root'>
      <alias name='pci.0'/>
    </controller>
    <controller type='scsi' index='0'>
      <alias name='scsi0'/>
      <address type='spapr-vio' reg='0x2000'/>
    </controller>
    <interface type='bridge'>
      <mac address='52:54:00:9a:f6:04'/>
      <source bridge='mig'/>
      <target dev='vnet3'/>
      <model type='virtio'/>
      <alias name='net0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </interface>
    <serial type='pty'>
      <source path='/dev/pts/7'/>
      <target type='isa-serial' port='0'/>
      <alias name='serial0'/>
      <address type='spapr-vio' reg='0x30001000'/>
    </serial>
    <console type='pty' tty='/dev/pts/7'>
      <source path='/dev/pts/7'/>
      <target type='serial' port='0'/>
      <alias name='serial0'/>
      <address type='spapr-vio' reg='0x30001000'/>
    </console>
    <memballoon model='none'>
      <alias name='balloon0'/>
    </memballoon>
  </devices>
  <seclabel type='dynamic' model='selinux' relabel='yes'>
    <label>system_u:system_r:svirt_t:s0:c97,c855</label>
    <imagelabel>system_u:object_r:svirt_image_t:s0:c97,c855</imagelabel>
  </seclabel>
</domain>

Output from guest console
--------------------------------

[root@kimkvm ~]# virsh start --console flyg1
Domain flyg1 started
Connected to domain flyg1
Escape character is ^]

SLOF **********************************************************************
QEMU Starting
 Build Date = May 15 2014 10:46:03
 FW Version = mockbuild@ release 20140429
 Press "s" to enter Open Firmware.

Populating /vdevice methods
Populating /vdevice/v-scsi@2000
       SCSI: Looking for devices
          8002000000000000 CD-ROM : "QEMU QEMU CD-ROM 1.6."
          8000000000000000 DISK : "QEMU QEMU HARDDISK 1.6."
Populating /vdevice/vty@30001000
Populating /vdevice/nvram@71000000
Populating /pci@800000020000000
 Adapters on 0800000020000000
                     00 0800 (D) : 106b 003f serial bus [ usb-ohci ]
                     00 1000 (D) : 1af4 1000 virtio [ net ]
No NVRAM common partition, re-initializing...
Scanning USB
  OHCI: initializing
Using default console: /vdevice/vty@30001000

  Welcome to Open Firmware

  Copyright (c) 2004, 2011 IBM Corporation All rights reserved.
  This program and the accompanying materials are made available
  under the terms of the BSD License available at
  http://www.opensource.org/licenses/bsd-license.php

Trying to load: from: /vdevice/v-scsi@2000/disk@8000000000000000 ...
E3404: Not a bootable device!
Trying to load: from: /vdevice/v-scsi@2000/disk@8002000000000000 ... Successfully loaded
* finddevice /memory grub workaround *

                         GNU GRUB version 2.02~beta2-9

 +----------------------------------------------------------------------------+
 |*Install |
 | Rescue mode |
 | |
 | |
 | |
 | |
 | |
 | |
 | |
 | |
 | |
 | |
 +----------------------------------------------------------------------------+

      Use the ^ and v keys to select which entry is highlighted.
      Press enter to boot the selected OS, `e' to edit the commands
      before booting or `c' for a command-line.
* finddevice /memory grub workaround *
* finddevice /memory grub workaround *
OF stdout device is: /vdevice/vty@30001000
Preparing to boot Linux version 3.13.0-24-generic (buildd@fisher04) (gcc version 4.8.2 (Ubuntu 4.8.2-19ubuntu1) ) #46-Ubuntu SMP Thu Apr 10 19:09:21 UTC 2014 (Ubuntu 3.13.0-24.46-generic 3.13.9)
Detected machine type: 0000000000000101
Max number of cores passed to firmware: 2048 (NR_CPUS = 2048)
Calling ibm,client-architecture-support... done
command line: BOOT_IMAGE=/install/vmlinux tasks=standard pkgsel/language-pack-patterns= pkgsel/install-language-support=false -- quiet
memory layout at init:
  memory_limit : 0000000000000000 (16 MB aligned)
  alloc_bottom : 0000000003df0000
  alloc_top : 0000000030000000
  alloc_top_hi : 0000000140000000
  rmo_top : 0000000030000000
  ram_top : 0000000140000000
instantiating rtas at 0x000000002fff0000... done
prom_hold_cpus: skipped
copying OF device tree...
Building dt strings...
Building dt structure...
Device tree strings 0x0000000003e00000 -> 0x0000000003e008ef
Device tree struct 0x0000000003e10000 -> 0x0000000003e20000
Calling quiesce...
returning from prom_init
 -> smp_release_cpus()
spinning_secondaries = 5
 <- smp_release_cpus()
 <- setup_system()
CF000012
CF000015ch
Linux ppc64
#46-Ubuntu SMP TStarting system log daemon: syslogd, klogd.

  ??????????????????????? Finishing the installation ????????????????????????
  ? ?
  ? 96% ?
  ? ?
The system is going down NOW!ystem... ?
Sent SIGTERM to all processes ?
Sent SIGKILL to all processes????????????????????????????????????????????????
Requesting system reboot
[ 1666.600970] reboot: Restarting system

SLOF **********************************************************************
QEMU Starting
 Build Date = May 15 2014 10:46:03
 FW Version = mockbuild@ release 20140429
 Press "s" to enter Open Firmware.

Populating /vdevice methods
Populating /vdevice/v-scsi@2000
       SCSI: Looking for devices
          8002000000000000 CD-ROM : "QEMU QEMU CD-ROM 1.6."
          8000000000000000 DISK : "QEMU QEMU HARDDISK 1.6."
Populating /vdevice/vty@30001000
Populating /vdevice/nvram@71000000
Populating /pci@800000020000000
 Adapters on 0800000020000000
                     00 0800 (D) : 106b 003f serial bus [ usb-ohci ]
                     00 1000 (D) : 1af4 1000 virtio [ net ]
Scanning USB
  OHCI: initializing
Using default console: /vdevice/vty@30001000

  Welcome to Open Firmware

  Copyright (c) 2004, 2011 IBM Corporation All rights reserved.
  This program and the accompanying materials are made available
  under the terms of the BSD License available at
  http://www.opensource.org/licenses/bsd-license.php

Trying to load: from: /vdevice/v-scsi@2000/disk@8000000000000000 ... Successfully loaded
* finddevice /memory grub workaround *
error: no suitable video mode found.
error: failure writing sector 0x2056388 to `ieee1275/disk'.
* finddevice /memory grub workaround *
* finddevice /memory grub workaround *

Press any key to continue...
OF stdout device is: /vdevice/vty@30001000
Preparing to boot Linux version 3.13.0-24-generic (buildd@fisher04) (gcc version 4.8.2 (Ubuntu 4.8.2-19ubuntu1) ) #46-Ubuntu SMP Thu Apr 10 19:09:21 UTC 2014 (Ubuntu 3.13.0-24.46-generic 3.13.9)
Detected machine type: 0000000000000101
Max number of cores passed to firmware: 2048 (NR_CPUS = 2048)
Calling ibm,client-architecture-support... done
command line: BOOT_IMAGE=/boot/vmlinux-3.13.0-24-generic root=UUID=8072c1bf-cf6e-4a7c-8b8d-ba6145f50c1c ro splash quiet vt.handoff=7
memory layout at init:
  memory_limit : 0000000000000000 (16 MB aligned)
  alloc_bottom : 00000000047f0000
  alloc_top : 0000000030000000
  alloc_top_hi : 0000000140000000
  rmo_top : 0000000030000000
  ram_top : 0000000140000000
instantiating rtas at 0x000000002fff0000... done
prom_hold_cpus: skipped
copying OF device tree...
Building dt strings...
Building dt structure...
Device tree strings 0x0000000004800000 -> 0x00000000048008ef
Device tree struct 0x0000000004810000 -> 0x0000000004820000
Calling quiesce...
returning from prom_init
 -> smp_release_cpus()
spinning_secondaries = 5
 <- smp_release_cpus()
 <- setup_system()
CF000012
CF000015ch
Linux ppc64
#46-Ubuntu SMP TSkipping profile in /etc/apparmor.d/disable: usr.sbin.rsyslogd
 * Starting AppArmor profiles [ OK ]
 * Restoring resolver state... [ OK ]
[ 3.932048] sda2: WRITE SAME failed. Manually zeroing.

=========

-----------------------------------------------------------------------------
                             TESTING INFORMATION
-----------------------------------------------------------------------------

SYSTEM INFORMATION
------------------
  HOST NAME or NETWORK ADDRESS: kimkvm.blah.ibm.com

   FSP NAME and FSP ip fsp-kim.blah.ibm.com

  KVM BUILD LEVEL: - Frobisher 16
  Sapphire FIRMWARE LEVEL: - 1419H

== Comment: #1 - Gopesh Kumar Chaudhary <email address hidden> - 2014-05-26 10:54:03 ==
fyi..
I did an installation of a BE guest (SLES) named flyg2 on the same host using an ISO; it was successful and the guest came up fine on its first reboot after installation.

== Comment: #2 - Leonardo Augusto Guimaraes Garcia <email address hidden> - 2014-05-26 14:52:01 ==
I am definitely no kernel specialist, but from what I read on the internet and in the kernel source code, it seems that this "WRITE SAME" error should not hang the system. It is more related to an optimization than anything else:

----------------------------------------

... <snip> ... WRITE SAME error now believed to be unrelated.

== Comment: #4 - Anton Blanchard <email address hidden> - 2014-05-26 21:15:31 ==
The write same issue is a red herring, it is most likely a QEMU limitation. The kernel works around it just fine.

== Comment: #5 - Samuel Mendoza-Jonas <email address hidden> - 2014-05-27 23:15:08 ==
It seems the only difference between successfully installing and not is the following bit of XML:
<serial type='pty'>
      <source path='/dev/pts/7'/>
      <target type='isa-serial' port='0'/>
      <alias name='serial0'/>
      <address type='spapr-vio' reg='0x30001000'/>
    </serial>
    <console type='pty' tty='/dev/pts/7'>
      <source path='/dev/pts/7'/>
      <target type='serial' port='0'/>
      <alias name='serial0'/>
      <address type='spapr-vio' reg='0x30001000'/>
    </console>

Changing the section to this from a more basic VM results in a successful boot:
 <serial type='pty'>
      <target port='0'/>
      <address type='spapr-vio' reg='0x30000000'/>
    </serial>
    <console type='pty'>
      <target type='serial' port='0'/>
      <address type='spapr-vio' reg='0x30000000'/>
  </console>

I'll dig into this more to find exactly what's breaking. Unless there's some reason why the first XML example is preferred, I'll drop the priority of this bug.
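
The comparison above comes down to the `reg` attribute on the spapr-vio address of the serial and console devices. A minimal Python sketch for pulling those values out of a `virsh dumpxml` dump follows. The function name `vty_addresses` and the trimmed sample XML are illustrative assumptions, not part of libvirt or Kimchi.

```python
import xml.etree.ElementTree as ET

# Trimmed sample based on the failing guest's XML from this report.
SAMPLE_XML = """
<domain type='kvm'>
  <devices>
    <serial type='pty'>
      <target type='isa-serial' port='0'/>
      <address type='spapr-vio' reg='0x30001000'/>
    </serial>
    <console type='pty'>
      <target type='serial' port='0'/>
      <address type='spapr-vio' reg='0x30001000'/>
    </console>
  </devices>
</domain>
"""


def vty_addresses(domain_xml):
    """Return (device tag, reg value) pairs for spapr-vio serial/console devices."""
    root = ET.fromstring(domain_xml)
    found = []
    for tag in ("serial", "console"):
        for dev in root.findall(f"./devices/{tag}"):
            # Only consider devices that carry a spapr-vio address element.
            addr = dev.find("address[@type='spapr-vio']")
            if addr is not None:
                found.append((tag, int(addr.get("reg"), 16)))
    return found


if __name__ == "__main__":
    for tag, reg in vty_addresses(SAMPLE_XML):
        print(f"{tag}: reg=0x{reg:08x}")
```

Running the same function on the XML of a guest that boots with a working console would make the difference in `reg` easy to spot.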

== Comment: #6 - BHARATA BHASKER RAO <email address hidden> - 2014-05-28 02:53:52 ==
(In reply to comment #5)
> I'll dig into this more to find exactly what's breaking. Unless there's some
> reason why the first XML example is preferred, I'll drop the priority of
> this bug.

Reported earlier on LP 1296631 (reg 30001000 vs 30000000)
Specifically look at comment #12 (https://bugs.launchpad.net/tasty-taco/+bug/1296631)

From what I have seen, the system isn't really hung, but the console isn't coming up. If you have networking configured, you can ssh to the guest and see that it is up.

== Comment: #8 - BHARATA BHASKER RAO <email address hidden> - 2014-05-28 06:53:03 ==
Verified with the VNC console that the system is up, so this is not a hang.

I understand from Gopesh that this XML was generated by Kimchi. Why Kimchi is generating 30001000 instead of the regular 30000000 could be investigated separately.

Why 30001000 isn't working, and whether it is a QEMU issue or a guest issue, should be investigated, but I am deferring this for now.

== Comment: #9 - Juan G. Rivera-Rivas <email address hidden> - 2014-05-28 07:52:39 ==
We need a documentation bug in the same fashion that Frank has been doing it ( clone this bug and assign to documentation ).

Then I think we need to understand:
1) What is the difference between 30001000 and 30000000.
2) Why doesn't Ubuntu "like" the "30001000" value? (it works with other distros)
3) Why Kimchi uses 30001000?

== Comment: #10 - LEI LI <email address hidden> - 2014-05-29 03:15:10 ==
(In reply to comment #9)
> We need a documentation bug in the same fashion that Frank has been doing it
> ( clone this bug and assign to documentation ).
>
> Then I think we need to understand:
> 1) What is the difference between 30001000 and 30000000.

From what I know, there is no difference or limitation between these two values as far as libvirt is concerned; it is just the address assigned to the device. Although 30000000 is defined specifically for VIO_ADDR_SERIAL, both work fine as a serial address.

> 2) Why doesn't Ubuntu "like" the "30001000" value? (it works with
> other distros)

Yes, it seems so; this needs some investigation.

> 3) Why Kimchi uses 30001000?

== Comment: #12 - Alexey Kardashevskiy <email address hidden> - 2014-07-21 00:23:11 ==
I recently hit this issue. This is happening because nothing is listening on hvc0 after the installation
completed. To fix this, I booted with vga, logged in and created a copy of the tty1.conf file like this:

sed -e s/tty1/hvc0/g /etc/init/tty1.conf > /etc/init/hvc0.conf

Now spapr-vty works with either address - 30000000, 30001000, 71000000 and so on.

I consider it an Ubuntu bug - the installer wants to see vty at 0x30000000 only for some reason.
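
For readers unfamiliar with the workaround: the `sed` command above copies the Upstart job that runs a getty on tty1 and rewrites every `tty1` to `hvc0`, so a login prompt is also started on the hypervisor console. The same substitution can be sketched in Python as below. The sample job text is an illustrative stand-in (the real `/etc/init/tty1.conf` on a given system may differ), and `make_hvc0_conf` is a hypothetical helper name.

```python
# Illustrative stand-in for /etc/init/tty1.conf; the real file may differ.
SAMPLE_TTY1_CONF = """\
# tty1 - getty
start on stopped rc RUNLEVEL=[2345]
stop on runlevel [!2345]
respawn
exec /sbin/getty -8 38400 tty1
"""


def make_hvc0_conf(tty1_conf):
    """Mirror `sed -e s/tty1/hvc0/g`: replace every 'tty1' with 'hvc0'."""
    return tty1_conf.replace("tty1", "hvc0")


if __name__ == "__main__":
    # Print the generated job text; writing it to /etc/init/hvc0.conf
    # is left to the reader and needs root on the guest.
    print(make_hvc0_conf(SAMPLE_TTY1_CONF), end="")
```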

== Comment: #13 - LEI LI <email address hidden> - 2014-07-21 04:02:07 ==
(In reply to comment #12)
> I recently hit this issue. This is happening because nothing is listening on
> hvc0 after installation
> completed. To fix this, I booted with vga, logged in and created a copy of
> tty1.conf file like this:
>
> sed -e s/tty1/hvc0/g /etc/init/tty1.conf > /etc/init/hvc0.conf
>
> Now spapr-vty works with either address - 30000000, 30001000, 71000000 and
> so on.
>
> I consider it an Ubuntu bug - the installer wants to see vty at 0x30000000
> only for some reason.

Hi Alexey,

I saw your reply in PowerKVM mailing list. So it uses hvc to connect to getty?

Then should we ask Ubuntu to handle this, for example by creating a new inittab entry for hvc? I am not sure of the process for dealing with distribution issues.

Thanks,

Lei

bugproxy (bugproxy)
tags: added: architecture-ppc64 bugnameltc-110860 severity-high targetmilestone-inin1404
Ubuntu Foundations Team Bug Bot (crichton) wrote :

Thank you for taking the time to report this bug and helping to make Ubuntu better. It seems that your bug report is not filed about a specific source package though, rather it is just filed against Ubuntu in general. It is important that bug reports be filed about source packages so that people interested in the package can find the bugs about it. You can find some hints about determining what package your bug might be about at https://wiki.ubuntu.com/Bugs/FindRightPackage. You might also ask for help in the #ubuntu-bugs irc channel on Freenode.

To change the source package that this bug is filed about visit https://bugs.launchpad.net/ubuntu/+bug/1347967/+editstatus and add the package name in the text box next to the word Package.

[This is an automated message. I apologize if it reached you inappropriately; please just reply to this message indicating so.]

tags: added: bot-comment
Adam Conrad (adconrad) wrote :

06:58 < infinity> pfsmorigo: Kay, so a Kimchi bug (and yes, also an Ubuntu bug)
06:59 < infinity> pfsmorigo: For the record, the default in SLOF appears to be 71000000 (which we handle fine), and we also handle 30000000, but not 30001000.
06:59 < pfsmorigo> infinity: ok, got it
06:59 < infinity> pfsmorigo: Going forward, I plan to try to rewrite this to detect things a bit more sanely but, for now, it might be nice if Kimchi wasn't doing silly non-standard things.
07:00 < infinity> pfsmorigo: If there's already an install base of systems that will be trying 30001000 though, we can certainly fix the hardcoding in the installer to also look there for now.
07:00 < infinity> pfsmorigo: Not for the point release, but for a later update.
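
The behaviour described in the IRC log can be summarised as a small lookup. This is only a sketch of what the log states (71000000 and 30000000 are handled, 30001000 is not). The set and the function name `console_expected` are assumptions for illustration and are not taken from the installer's source.

```python
# Addresses taken from the IRC log above; nothing here is read from
# the installer itself.
HANDLED_BY_INSTALLER = {0x71000000, 0x30000000}


def console_expected(reg):
    """Return True if the log says the installer handles a vty at this address."""
    return reg in HANDLED_BY_INSTALLER


if __name__ == "__main__":
    for reg in (0x71000000, 0x30000000, 0x30001000):
        state = "handled" if console_expected(reg) else "not handled"
        print(f"0x{reg:08x}: {state}")
```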

affects: ubuntu → finish-install (Ubuntu)
Changed in finish-install (Ubuntu):
assignee: nobody → Adam Conrad (adconrad)
status: New → In Progress
bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2014-07-24 17:23 EDT-------
*** Bug 113493 has been marked as a duplicate of this bug. ***

tags: added: architecture-ppc64le
removed: architecture-ppc64
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2014-09-02 15:54 EDT-------
*** Bug 115231 has been marked as a duplicate of this bug. ***

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2014-09-24 22:12 EDT-------
There have been changes in Kimchi in the latest PowerKVM 2.1.1 builds that fix this bug, so we can probably safely close it (unless anyone objects or finds something I'm missing).

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2014-09-25 01:52 EDT-------
(In reply to comment #24)
> There have been changes in Kimchi in the latest PowerKVM 2.1.1 builds that
> fix this bug, so we can probably safely close it (unless anyone objects
> or finds something I'm missing).

This does not seem to be a generic solution, as a KVM host is not managed only by Kimchi. What about other management software, or a user who specifies that value through virsh? Those cases would still have no solution, so I think this should be fixed.

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2014-10-15 14:27 EDT-------
*** This bug has been marked as a duplicate of bug 111697 ***
