Server installations on VMs fail to reboot after the installations

Bug #1100386 reported by Para Siva
This bug affects 1 person
Affects                  Status        Importance  Assigned to     Milestone
linux (Ubuntu)           Won't Fix     High        Andy Whitcroft
linux (Ubuntu Raring)    Won't Fix     High        Andy Whitcroft
systemd (Ubuntu)         Fix Released  High        Andy Whitcroft
systemd (Ubuntu Raring)  Invalid       Undecided   Unassigned
udev (Ubuntu)            Invalid       Undecided   Unassigned
udev (Ubuntu Raring)     Won't Fix     High        Andy Whitcroft

Bug Description

Raring and saucy server installations fail to reboot normally after installation on VMs. This occurs with both amd64 and i386 images, whether the VM is installed using libvirt/virt-manager or using VirtualBox.

This appears to be a regression that started with kernel Ubuntu 3.7.0-6-generic; earlier versions do not have this issue.

On i386 installations, booting via recovery mode causes "Kernel panic - not syncing: Attempted to kill init! exit code 0x00000600", as shown in the attached image.

The latest amd64 (20130121) installations with the virtual-host package selection also produce a kernel panic when booting via recovery mode, with the same message as above.

Standard booting causes a similar hang to the i386 case (please see the attached video).

This issue does not occur on hardware installations.

Steps to reproduce:

A) Manual steps:

1. Install raring server on a VM with no packages selected, leaving the default answers for all questions. The host of the VM is irrelevant; this has been observed with VMs installed on raring, quantal and precise 64-bit hosts. (A sample VM creation command is sketched after these steps.)
2. Reboot after the GRUB installation is complete.

B) Automated steps:
1. Do a preseed installation of raring server with the attached preseed file (virtual-host.preseed) and the virtual-host.run file using utah. (I used utah for the automated installation; a how-to is given in http://utah.readthedocs.org/en/latest/introduction.html#how-to-start-running-tests)
2. Reboot the machine.

In detail:
1. Use the attached .preseed file (attachment 15) and .run file (attachment 16) to execute the following command (please provide the absolute path to the files and the iso):
        sudo -i -u utah run_utah_tests.py -i /path/to/iso -p /path/to/preseed /path/to/.run -n -x /etc/utah/bridged-network-vm.xml
2. Reboot the VM after the installation.
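
For reference, a minimal sketch of creating the test VM for the manual steps, assuming a libvirt/KVM host; the guest name, memory and disk size are illustrative and not taken from the original report:

    # create a KVM guest and boot it from the server ISO
    virt-install --name raring-server --ram 1024 --vcpus 1 \
        --disk path=/var/lib/libvirt/images/raring-server.img,size=8 \
        --cdrom /path/to/iso --graphics vnc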

Revision history for this message
Para Siva (psivaa) wrote :
Revision history for this message
Brad Figg (brad-figg) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 1100386

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
tags: added: raring
Revision history for this message
Para Siva (psivaa) wrote : Re: Raring server installations on KVM fail to reboot after the installations

The hang on amd64 installations is shown here. The same type of hang can be seen during i386 installation reboots when recovery mode is not selected. (When recovery mode is used in i386 installations, the kernel panic given above occurs.)

Revision history for this message
Joseph Salisbury (jsalisbury) wrote :

Do you also get a panic when booting i386 normally, not in recovery mode?

Changed in linux (Ubuntu):
importance: Undecided → High
tags: added: kernel-key
Revision history for this message
Joseph Salisbury (jsalisbury) wrote :

Also, did this start happening on a recent daily image? Was there a prior image that did not exhibit this bug?

Revision history for this message
Para Siva (psivaa) wrote :

The contents of /var/log from an amd64 installation are attached herewith. I could not collect those logs for i386, as I was never able to log into an i386 installation.

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Revision history for this message
Para Siva (psivaa) wrote :

I do not see the panic when booting i386 normally. When i386 is booted normally, the behaviour is the same as in the attached video: it just hangs the same way amd64 does when booted normally.

I cannot say whether this is a new issue. The automatic smoke tests passed until yesterday, and since we could not run the automated tests reliably today, I had to do manual installations.

Revision history for this message
Joseph Salisbury (jsalisbury) wrote :

Looking at the video, it seems like the VM is trying to perform a filesystem check. Did you try cancelling the filesystem check or waiting to see if it finishes?

Revision history for this message
Para Siva (psivaa) wrote :

I tried, but there is no response; no key press did anything once it reached the screen shown in the video.

Para Siva (psivaa)
summary: - Raring server installations on KVM fail to reboot after the
+ Raring server installations on VMs fail to reboot after the
installations
Revision history for this message
Para Siva (psivaa) wrote : Re: Raring server installations on VMs fail to reboot after the installations

I tried the 20130117 i386 image on VirtualBox and the kernel panic occurred when booting in recovery mode; please see the attached screenshot.
On VirtualBox, normal booting succeeds most of the time, as opposed to almost never on virt-manager.
The contents of /var/log from the i386 server installation on VirtualBox are also attached below.

Revision history for this message
Para Siva (psivaa) wrote :

Contents of /var/log from an i386 server installation on VirtualBox.

Para Siva (psivaa)
description: updated
Revision history for this message
Para Siva (psivaa) wrote :

This occurred during a virtual-host preseeded installation of the amd64 raring server image. The VM hung when booting normally and threw a kernel panic when booting via recovery mode.

https://jenkins.qa.ubuntu.com/view/Raring/view/Smoke%20Testing/job/raring-server-amd64-smoke-virtual-host/60/
is the affected job.

description: updated
Para Siva (psivaa)
description: updated
Revision history for this message
Para Siva (psivaa) wrote :

The preseed file

description: updated
Revision history for this message
Para Siva (psivaa) wrote :

Utah runlist

description: updated
Revision history for this message
James Hunt (jamesodhunt) wrote :

This is bug 1096531 - looks like a standard job is behaving slightly differently under raring and exposing the issue.

tags: removed: kernel-key
Revision history for this message
Stefan Bader (smb) wrote :

We think this may actually be modeset related. Is it possible to preseed the installation (or change it on the failing installed VMs) to have nomodeset on the grub command line?
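
For illustration, a preseed can append kernel options to the installed system's command line via the standard debian-installer key; a minimal sketch, not taken from the attached preseed file:

    # append nomodeset to the installed system's kernel command line
    d-i debian-installer/add-kernel-opts string nomodeset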

Revision history for this message
Para Siva (psivaa) wrote :

So I installed an i386 server with the nomodeset option selected, and normal reboots work fine (although the recovery mode path, which I tried out of curiosity, still leads to the kernel panic). The host is a 64-bit quantal machine running KVM and virt-manager.

When I edited the grub command line on this installation to remove nomodeset, the standard reboot hangs.

The converse also conforms to the above pattern, i.e. an installation without nomodeset hangs on reboot, but when I edited the grub command line to include nomodeset, the VM boots fine.
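
For reference, a persistent way to toggle nomodeset on an installed system, as a minimal sketch using the standard GRUB tooling; the quiet/splash values are the usual Ubuntu defaults rather than anything confirmed from these VMs:

    # in /etc/default/grub, add (or remove) nomodeset:
    GRUB_CMDLINE_LINUX_DEFAULT="quiet splash nomodeset"
    # then regenerate grub.cfg and reboot:
    sudo update-grub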

Revision history for this message
Para Siva (psivaa) wrote :

The hang started on images with kernel version Ubuntu 3.7.0-6-generic; it can be reproduced on the raring server i386 images of 20121213 and later.

The hang cannot be seen with images that contain kernel version 3.7.0-5-generic and earlier. I tested raring server i386 images of 20121212 and earlier but could not reproduce the issue.

Para Siva (psivaa)
description: updated
Stefan Bader (smb)
Changed in linux (Ubuntu):
assignee: nobody → Andy Whitcroft (apw)
Revision history for this message
Andy Whitcroft (apw) wrote :

I have managed to reproduce the apparent hangs; the recovery mode issues I have not. If they still exist, they should be filed as a separate bug.

For the apparent hangs, I have managed to confirm that they are not hangs at all. What has happened is that we have lost the console completely: the kernel attempted to switch framebuffer devices and failed to do so; it successfully removed efifb but failed to initialise cirrusfb, so there is now nothing to display console output. If you know the IP address of the image, however, it is pingable, and with openssh installed it is possible to log in. Errors from dmesg are shown below:

  [ 2.701082] fb: conflicting fb hw usage cirrusdrmfb vs EFI VGA - removing generic driver
  [ 2.704007] Console: switching to colour dummy device 80x25
  [ 2.717086] [drm:cirrus_vram_init] *ERROR* can't reserve VRAM
  [ 2.717093] cirrus 0000:00:02.0: Fatal error during GPU init: -6

Now, this is something we have seen before. We are using efifb (a generic driver) but want to use a device-specific driver to get 3D support. If plymouth opens the framebuffer before we switch over, we get into a hole where we cannot completely remove the old driver, and as the two share the same VRAM we cannot initialise the new one.

The correct solution would be to make the kernel able to force the driver plymouth has open to close and to allow the new one to start. We would then also need to fix plymouth to cope with the framebuffer closing harshly on it and to reconnect.

What we have done in the past (for vesafb) was to delay loading vesafb until after the better driver had a chance to take and use the device, falling back to vesafb only when the better driver did not appear. We cannot do quite the same for efifb, as it has to be built in, but we can prevent efifb from being identified as a primary framebuffer. This means we will normally not start the plymouth splash until after we have had a chance to detect the cirrus driver. If there is no alternative, however, we will use efifb from the normal fallback path as used for vesafb. We have confirmed that vesafb will not load in this case, as efifb has already claimed the device. Patch to follow.
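
For illustration, such a demotion can be expressed as a udev rule that stops tagging the generic EFI framebuffer as the primary display device. The following is a hypothetical sketch in the spirit of Ubuntu's 78-graphics-card.rules; the PRIMARY_DEVICE_FOR_DISPLAY variable and exact match keys are assumptions, and the shipped rule may differ:

    # tag DRM devices as the primary display so the splash waits for them
    ACTION=="add", SUBSYSTEM=="drm", KERNEL=="card0", ENV{PRIMARY_DEVICE_FOR_DISPLAY}="1"
    # do not treat the generic EFI framebuffer ("EFI VGA" in dmesg) as primary
    ACTION=="add", SUBSYSTEM=="graphics", KERNEL=="fb0", ATTR{name}!="EFI VGA", ENV{PRIMARY_DEVICE_FOR_DISPLAY}="1"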

Revision history for this message
Jamie Strandboge (jdstrand) wrote :

FYI, this is still a problem in the 13.04 release. I installed 13.04 amd64 server and could not get a login prompt. Adding 'nomodeset' works fine. I imagine I could also use the 'vmvga' driver instead of 'cirrus'.
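
For reference, switching the emulated video adaptor away from cirrus under libvirt is done in the domain XML; a sketch, with availability of the 'vmvga' model depending on the QEMU build:

    <!-- in the <devices> section of the libvirt domain XML -->
    <video>
      <model type='vmvga'/>
    </video>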

Brad Figg (brad-figg)
tags: added: kernel-stable-key
Para Siva (psivaa)
summary: - Raring server installations on VMs fail to reboot after the
- installations
+ Server installations on VMs fail to reboot after the installations
description: updated
Revision history for this message
Andy Whitcroft (apw) wrote :

Although this is fundamentally a kernel issue, the current kernel infrastructure would really only be able to abort anyone using the 'being replaced' framebuffer when we switch from efifb to a DRM framebuffer. This change would likely be extensive and slow to get through upstream. Plymouth would also have to be modified to handle being aborted and to reconnect to the replacement transparently. In the short term we can avoid this issue the same way we avoided it for vesafb, by demoting the driver to a secondary display. This means we only use efifb at all if no DRM driver appears, neatly avoiding the issue.

As we have now (early saucy) moved udev over to the systemd sources, I have proposed this against both udev and systemd: udev for raring and systemd for saucy. I will attach patches once they are tested.

Changed in linux (Ubuntu Raring):
status: New → Confirmed
importance: Undecided → High
assignee: nobody → Andy Whitcroft (apw)
Changed in systemd (Ubuntu Raring):
status: New → Invalid
Changed in udev (Ubuntu):
status: New → Invalid
Changed in systemd (Ubuntu):
status: New → In Progress
Changed in udev (Ubuntu Raring):
status: New → In Progress
Changed in systemd (Ubuntu):
importance: Undecided → High
Changed in udev (Ubuntu Raring):
importance: Undecided → High
assignee: nobody → Andy Whitcroft (apw)
Changed in systemd (Ubuntu):
assignee: nobody → Andy Whitcroft (apw)
Martin Pitt (pitti)
summary: - Server installations on VMs fail to reboot after the installations
+ [udev] Server installations on VMs fail to reboot after the
+ installations
summary: - [udev] Server installations on VMs fail to reboot after the
- installations
+ Server installations on VMs fail to reboot after the installations
Revision history for this message
Andy Whitcroft (apw) wrote :

systemd patch for saucy.

Martin Pitt (pitti)
Changed in systemd (Ubuntu):
status: In Progress → Fix Committed
Revision history for this message
Andy Whitcroft (apw) wrote :

Marking the kernel tasks Won't Fix, as this is a very big effort on the kernel side and we are going to avoid the issue in udev rules.

Changed in linux (Ubuntu Raring):
status: Confirmed → Won't Fix
Changed in linux (Ubuntu):
status: Confirmed → Won't Fix
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package systemd - 202-0ubuntu7

---------------
systemd (202-0ubuntu7) saucy; urgency=low

  [ Martin Pitt ]
  * debian/*: Replace remaining "udevadm info --run" invocations with
    /run/udev/. (LP: #1182788)
  * Add 0020-persistent-storage-rule-mmc-partname.patch: Create disk/by-name
    links for mmcblk partitions if they have a PARTNAME property. Patch by
    Ricardo Salveti de Araujo, taken from udev 175-0ubuntu29.

  [ Andy Whitcroft ]
  * debian/extra/rules/78-graphics-card.rules -- demote efifb to a secondary
    display adaptor as in the majority of cases this will be replaced by
    a DRM driver. (LP: #1100386)
 -- Martin Pitt <email address hidden> Wed, 22 May 2013 12:09:59 +0200

Changed in systemd (Ubuntu):
status: Fix Committed → Fix Released
Revision history for this message
Rolf Leggewie (r0lf) wrote :

raring has reached end of life and is no longer receiving any updates. Marking the raring task for this ticket as "Won't Fix".

Changed in udev (Ubuntu Raring):
status: In Progress → Won't Fix