Installation stalls randomly until a key is pressed

Bug #217815 reported by Nick Barcet
This bug affects 2 people
Affects: kvm (Ubuntu)
Status: Fix Released
Importance: High
Assigned to: Unassigned
Declined for Karmic by Thierry Carrez
Declined for Lucid by Thierry Carrez

Bug Description

There are several reports, both in KVM and on real hardware, of a problem where the installation process will stall, and no further progress is made until a key is pressed. This is believed to be a kernel issue, which is resolved by initiating some interrupt processing activity.

Tags: iso-testing

Revision history for this message
Nick Barcet (nijaba) wrote :

Please find enclosed the content of /var/log/installer

Revision history for this message
Nick Barcet (nijaba) wrote :

Weirdly enough, retrying this, I accidentally clicked inside the window: moving the mouse in the guest seems to unstick the process for a while (I had to repeat this 5 times to get to the end of the erase process)!

Revision history for this message
Colin Watson (cjwatson) wrote :

Err, the installer itself doesn't do anything with the mouse, so if clicking does something then this sounds like a KVM problem ...

Revision history for this message
QuentinHartman (qhartman) wrote :

Using the Hardy Alternate 64-bit disk (both the Beta release and recent nightlies) I see this same behavior on my system. I can get things to pick up again by pressing keys. I haven't tried mouse clicks, but hitting any key on the keyboard (I usually hit ALT or ENTER) seems to unstick things. However, I've not been able to get it to completion. Perhaps I wasn't patient enough.

I've collected some more information in another bug I posted before finding this. It is bug #217849

Revision history for this message
Colin Watson (cjwatson) wrote :

http://iso.qa.ubuntu.com/qatracker/result/1453/195 reproduces on real hardware. I must confess I have very little idea of what's going on here!

Revision history for this message
Colin Watson (cjwatson) wrote :

BTW, note that you can cancel the erase process ...

Revision history for this message
QuentinHartman (qhartman) wrote : Re: [Bug 217815] Re: When selecting LVM Crypt, erase never ends

On Tue, Apr 15, 2008 at 11:37 AM, Colin Watson <email address hidden>
wrote:

> http://iso.qa.ubuntu.com/qatracker/result/1453/195 reproduces on real
> hardware. I must confess I have very little idea of what's going on
> here!
>

It is mysterious to me as well, although I'm not entirely positive that it
should be assigned to partman-crypto, I think there may be something else
going on. As I sit here right now, the installation is stalled out again on
"Loading Additional Components" at 56%, "Configuring base-installer". All
drive activity has ceased for a couple minutes here. And now... Hitting
"LEFT SHIFT" gets the process moving along again, virtually instantly the
progress bar fills, drive activity picks up and we move along. This is
_before_ I make any partitioning choices, and I believe that whatever is
causing this problem is also the root of the encryption / disk wipe problem.
It seems that it is just more pronounced there. I'd also wager that other
people seeing this touch a key on their keyboard to see if the machine seems
locked up, and since it picks up again, they just shrug it off.

I really have no idea where to go to dig further into this. It almost seems
kernel level... I'm going to burn an earlier alpha and see if it does the
same thing.

--
-Regards-

-Quentin Hartman-

Revision history for this message
Stéphane Graber (stgraber) wrote : Re: When selecting LVM Crypt, erase never ends

Reproduced on KVM but not on real HW. User input unstuck the process (moving the mouse or typing)
Looks like something is waiting to get entropy to continue writing to disk ...

If it's confirmed to be related to a lack of entropy, displaying a message asking the user to type or move the mouse may be a valid workaround.

Revision history for this message
QuentinHartman (qhartman) wrote : Re: [Bug 217815] Re: When selecting LVM Crypt, erase never ends

On Tue, Apr 15, 2008 at 11:57 AM, Quentin Hartman <email address hidden>
wrote:

>
> I really have no idea where to go to dig further into this. It almost
> seems kernel level... I'm going to burn an earlier alpha and see if it does
> the same thing.
>

Hmmm.... well I can't seem to find any mirrors that have any of the alpha
builds on them. If you know where I could find one, please drop a line!

--
-Regards-

-Quentin Hartman-

Revision history for this message
QuentinHartman (qhartman) wrote : Re: When selecting LVM Crypt, erase never ends

@Stéphane: I really don't think that is a valid workaround; it just takes too long. Further, the problem exists even when not doing an encrypted install. See my comment above and in bug #217849 for some times and an explanation of why I believe "waiting for entropy" is not truly the root of the problem.

description: updated
Revision history for this message
Nick Barcet (nijaba) wrote :

Reproduced with 20080416 iso

description: updated
Revision history for this message
Jamie Strandboge (jdstrand) wrote :

I too reproduced this with the server iso from 20080416.

Revision history for this message
QuentinHartman (qhartman) wrote : Re: [Bug 217815] Re: When selecting LVM Crypt, erase never ends

On Thu, Apr 17, 2008 at 6:55 AM, Jamie Strandboge <email address hidden> wrote:

> I too reproduced this with the server iso from 20080416.
>

I'm currently using an installation from the nightly on the 14th, and I
noticed something for the first time. I am occasionally getting the
"stalling" behavior on the installed system. The bootsplash progress bar
(which never worked on 64-bit on this machine before! yay!) stalled out and
when I hit a key drive activity instantly resumed. And then it did it again
while Gnome was loading. I was looking at the empty tan post-GDM screen, and
all my drive activity stopped. Hitting the spacebar got it going again...

This seems bad....

--
-Regards-

-Quentin Hartman-

Revision history for this message
Colin Watson (cjwatson) wrote : Re: When selecting LVM Crypt, erase never ends

Ben indicated on the phone that this is probably to do with lost interrupts, and that keyboard and mouse actions tickle the kernel into action; apparently we've had some similar issues before. Reassigning over to linux.

Revision history for this message
Rick Clark (dendrobates) wrote :

Has anyone reproduced this exact bug on real hardware? If so, please post your hardware specs.

I am not yet convinced that #217849 is the same issue.

I can reproduce this easily in kvm; however, I have tried multiple times on multiple real systems and have not reproduced it.

Revision history for this message
QuentinHartman (qhartman) wrote :

@Rick

I can very consistently recreate it on my Sager laptop. It is a Core 2 Duo T9300, 4GB RAM, NVidia 8600M GT video, 80GB Fujitsu HDD. More details can be seen at http://www.sagernotebook.com/product_customed.php?pid=46090 on the specifications tab. I unfortunately do not have access to any other 64-bit hardware that I can test this on.

As far as whether or not bug #217849 is the same issue, it very well may not be, but it certainly seems likely. Both Nick and I were able to make the process move along by "tickling" it with keyboard or mouse input. In Nick's case above, even though the mouse is not used at all by the installer, if KVM treats it as a USB device, moving it over the window would likely generate USB traffic that wakes up the system.

The hiccups I've seen on my fully installed (though not lvm-crypt encrypted this time) system exhibit the same behavior that I saw during the install.

Revision history for this message
Jamie Strandboge (jdstrand) wrote :

Though not real hardware, I can somewhat reliably get random hangs that require a keyboard event by creating an x86_64 kvm virtual machine with 4000MB non-preallocated disk, 256MB ram and 2 vcpus. The installer continues after a keyboard event, and this is not specific to lvm+encrypt. I see it often before the partitioner comes up.

Revision history for this message
Jamie Strandboge (jdstrand) wrote :

I should also mention that I am using the server iso from http://iso.qa.ubuntu.com/qatracker/info/1523

Revision history for this message
Jamie Strandboge (jdstrand) wrote :

Also, I used 'Linux/Ubuntu Hardy' for the variant.

Revision history for this message
Rick Clark (dendrobates) wrote :

Since this has been confirmed to affect both kvm and real hardware, I am marking this critical.

I believe this could affect the ability to do an unattended install of Ubuntu Server.

While this has been reproduced once on real hardware, it has proved extremely elusive. We are still unsure how common this could be, since we are testing on a small subset of server hardware.

Changed in linux:
importance: Undecided → Critical
milestone: none → ubuntu-8.04
status: New → Confirmed
Revision history for this message
Matt Zimmerman (mdz) wrote :

I just tested an encrypted configuration on real hardware using the kubuntu alternate amd64 ISO (20080417.1), and can only confirm some of the symptoms:

1. It was impossible to cancel the erasure while it was in progress. The Cancel button did not have focus, and there was no response when pressing tab.

2. Erasing took a very long time! There was no indication of progress, but with the size of the disk I was using, 1% is a lot, so I probably aborted before it would have displayed any progress.

I did not wait for it to finish, but I was able to switch to a text console with alt+f2 and confirm that blockdev-wipe was running, and there was plenty of disk activity, so I saw no indication that there was a hang in my case.

Revision history for this message
Soren Hansen (soren) wrote :

I've been poking around in the kvm console when it's hanging, and the RIP register is always at 0xffffffff8020b3b9, which AFAICS according to System.map is default_idle+0x29. I don't know if this helps a whole lot, but I figured I'd mention it.
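
For anyone who wants to repeat that lookup, here is a minimal Python sketch of resolving an address against System.map-style text. The function names and the sample excerpt are hypothetical: only the default_idle start address is derived from the RIP value and offset quoted above (0xffffffff8020b3b9 - 0x29), and the neighbouring entries are invented for illustration.

```python
import bisect


def load_symbols(system_map_text):
    """Parse System.map-style lines ("<hex address> <type> <name>") into a sorted list."""
    symbols = []
    for line in system_map_text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        address, _symbol_type, name = parts
        symbols.append((int(address, 16), name))
    symbols.sort()
    return symbols


def resolve(symbols, address):
    """Return (name, offset) for the last symbol at or below address, or None."""
    addresses = [base for base, _name in symbols]
    index = bisect.bisect_right(addresses, address) - 1
    if index < 0:
        return None
    base, name = symbols[index]
    return name, address - base


# Hypothetical excerpt: only the default_idle line follows from the thread;
# the entries around it are made up.
SAMPLE_MAP = """\
ffffffff8020b300 t hypothetical_prev
ffffffff8020b390 T default_idle
ffffffff8020b3f0 T hypothetical_next
"""

name, offset = resolve(load_symbols(SAMPLE_MAP), 0xFFFFFFFF8020B3B9)
print(f"{name}+{offset:#x}")
```

With a real System.map, the text would be read from the file that matches the running guest kernel rather than from the sample string.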

Revision history for this message
QuentinHartman (qhartman) wrote : Re: [Bug 217815] Re: When selecting LVM Crypt, erase never ends

On Thu, Apr 17, 2008 at 3:43 PM, Matt Zimmerman <email address hidden> wrote:

> 1. It was impossible to cancel the erasure while it was in progress.
> The Cancel button did not have focus, and there was no response when
> pressing tab.

I haven't been able to get to the Cancel button on the Ubuntu Alternate
installer CD's I've been using either. Seems like a different bug?

--
-Regards-

-Quentin Hartman-

Revision history for this message
Jamie Strandboge (jdstrand) wrote : Re: When selecting LVM Crypt, erase never ends

I have tried these many times (without lvm+encrypt) and see the hang until keyboard event on most installs (~80%) if I don't use preseeding. If I use full preseeding so that I don't have to enter anything after specifying the kernel boot options with F6, then it hangs less frequently (25% of installs hang). This is all with the kvm setup I mentioned above.

Revision history for this message
Matt Zimmerman (mdz) wrote : Re: [Bug 217815] Re: When selecting LVM Crypt, erase never ends

On Fri, Apr 18, 2008 at 01:32:48AM -0000, Jamie Strandboge wrote:
> I have tried these many times (without lvm+encrypt) and see the hang
> until keyboard event on most installs (~80%) if I don't use preseeding.
> If I use full preseeding so that I don't have to enter anything after
> specifying the kernel boot options with F6, then it hangs less
> frequently (25% of installs hang). This is all with the kvm setup I
> mentioned above.

At which milestone was this first noticed? Is it believed to be a
regression from 7.10?

--
 - mdz

Revision history for this message
Matt Zimmerman (mdz) wrote :

Jamie, where does it stall for you, if not during the erase stage?

My tests have all been on a Core 2 Duo T7500 and I haven't been able to reproduce the problem at all despite several test installations using alternate install CDs.

Is this specific to the server kernel?

description: updated
Revision history for this message
Jamie Strandboge (jdstrand) wrote :

Regarding 7.10 regression, I haven't tried gutsy installs with my vm, but I will and will post back.

As for where it stalls, it is random places. I've seen it stall in 'Scanning for CD-ROM', different packages, pretty much anywhere between setting up the keyboard and the partitioner. I did see the hang in the partitioner as well.

Revision history for this message
QuentinHartman (qhartman) wrote :

On Fri, Apr 18, 2008 at 7:09 AM, Matt Zimmerman <email address hidden> wrote:

> On Fri, Apr 18, 2008 at 01:32:48AM -0000, Jamie Strandboge wrote:
> > I have tried these many times (without lvm+encrypt) and see the hang
> > until keyboard event on most installs (~80%) if I don't use preseeding.
> > If I use full preseeding so that I don't have to enter anything after
> > specifying the kernel boot options with F6, then it hangs less
> > frequently (25% of installs hang). This is all with the kvm setup I
> > mentioned above.
>
> At which milestone was this first noticed? Is it believed to be a
> regression from 7.10?
>

I believe so. I was running 7.10 on this machine previously and upgraded
when Hardy went Beta. I didn't notice this behavior before then. I even did
another 7.10 installation to confirm, and it did not have this behavior. I
tried to find some earlier alpha ISO's, but have not had any luck with that
yet.
--
-Regards-

-Quentin Hartman-

Revision history for this message
Jamie Strandboge (jdstrand) wrote :

For people trying Gutsy or earlier CDs in kvm, you have to update the iso. See:
https://wiki.ubuntu.com/KvmVirtManagerEtc#head-80f0e334a1de270252f62ac69183da2ed58edf3f

Revision history for this message
Jamie Strandboge (jdstrand) wrote :

This does appear to be a regression from 7.10, as I couldn't trigger the bug in the 7.10 installer. I will try several more times and report back if any 7.10 installs hang.

Revision history for this message
Dustin Kirkland  (kirkland) wrote :

Stéphane Graber's comment about entropy got me thinking...

Is there anywhere in the installation that depends on entropy (/dev/random or /dev/urandom)?

Something like openssh-server would, of course, for key generation. But that's not a default package, and the installation/configuration thereof comes long after the point at which most of you are triggering this bug.

So thinking much earlier in the installer, anything come to mind? Particularly something that might have been added since Gutsy as we now believe this is a regression?

:-Dustin

Revision history for this message
QuentinHartman (qhartman) wrote : Re: [Bug 217815] Re: Installation stalls randomly until a key is pressed

On Fri, Apr 18, 2008 at 10:28 AM, Dustin Kirkland <email address hidden>
wrote:

>
> So thinking much earlier in the installer, anything come to mind?
> Particularly something that might have been added since Gutsy as we now
> believe this is a regression?
>
>
The only thing I can think of that should be relying on entropy would be the
lvm-crypt and disk-wiping stuff that goes along with it, and this problem
gets triggered even when those aren't in use.

--
-Regards-

-Quentin Hartman-

Revision history for this message
Matt Zimmerman (mdz) wrote : Re: [Bug 217815] Re: Installation stalls randomly until a key is pressed

Current thinking is that there are probably two separate issues here:

 1. A problem specific to the KVM virtual environment
 2. A problem specific to a particular hardware type.

As such, I've un-duped bug 217849.

--
 - mdz

Revision history for this message
Rick Clark (dendrobates) wrote :

I think we can consider kvm a hardware type for these purposes.

So I think we have found bugs on two hardware types that present the same way, but may, or may not have the same cause.

We could have two hardware platforms that themselves have identical bugs.

Revision history for this message
Matt Zimmerman (mdz) wrote :

On Fri, Apr 18, 2008 at 06:24:50PM -0000, Rick Clark wrote:
> I think we can consider kvm a hardware type for these purposes.

Only if the bug is in the guest kernel. This could be an issue with KVM
itself, or with the host side.

--
 - mdz

Revision history for this message
Jamie Strandboge (jdstrand) wrote :

I finished installing 6 more gutsy installs (for a total of 7), with no preseeding. I used the same setup as above, except I chose lvm+encrypt and there were no hangs. There appears to be an installer regression on (at least) kvm hardware.

I should note all tests are with the server iso.

Revision history for this message
Nicolas Valcarcel (nvalcarcel) wrote :

I have had the same problem using KVM: tty1 stops on "Retrieving cpio", but when I went to tty4 it was installing all the packages, so it wasn't hung, it just appeared to be.

Revision history for this message
Nicolas Valcarcel (nvalcarcel) wrote :

I forgot: I'm using the alternate CD.
On the 2nd try it hung; there was no movement on tty1 or tty4 even when I pressed a key.

Revision history for this message
Nicolas Valcarcel (nvalcarcel) wrote :

Update: it suddenly started working again about 5 minutes later, while I was writing the last comment.

Revision history for this message
Jamie Strandboge (jdstrand) wrote :

To see if this was a problem with the hardware kvm is presenting, I recompiled kvm 1:28-4ubuntu2 (from gutsy) on hardy and used the following:

$ qemu-img create -f qcow2 a1.img 4G
$ kvm -M pc -m 128 -smp 1 -boot d -cdrom /srv/vms/isos/hardy/hardy-server-amd64.iso -usb -vnc 127.0.0.1:1 -hda /home/jamie/a1.img

(and connected with 'xtightvncviewer localhost:1')

I used lvm+encrypt and after 4 times did not get any hangs. kvm 1:28-4ubuntu2 uses qemu 0.9.0 while kvm in hardy uses 0.9.1. I will continue to do installs and will report back if I get a hang with this version of kvm.

Revision history for this message
Ante Karamatić (ivoks) wrote :

During my kvm/qemu testing, even before Gutsy was released, I had a similar problem: the guest operating system (both Windows and Linux) would stall randomly when I switched workspaces in GNOME, shifted focus from application to application, etc. As soon as I moved focus back to the qemu/kvm window, everything would continue. Sometimes even that wouldn't help.

But, I never had those problems on my workstation, only on my laptop. Workstation has Intel E6400 CPU, while laptop is occupied with Intel T2400. True, I didn't do so many tests on workstation, as I did on laptop...

At the moment, I'm doing encrypted setup of server's release candidate, with hardy's qemu+kqemu, and I didn't have problems (I'm at 15% of wiping partition).

Revision history for this message
Colin Ian King (colin-king) wrote :

A couple of points worth mulling over to corner this one..

1. Is the "hanging" a function of memory size? For example, the buffer cache could fill up after a certain amount of time, and then pdflush kicks in very aggressively, writing back dirty data as fast as it can. In that case, one may see the system resume after a (long) period of time once pdflush is complete. It may be worth tweaking the /proc/sys/vm/dirty* options to see if this trips the problem more frequently. Also, it may be worth checking whether installs in small-memory kvm virtual machines show this problem earlier than in large-memory ones, to see if it is a buffer cache write-back CPU starvation issue.

2. From my understanding, dd is being used to zero the encrypted disk. Does changing the dd block size to a very large size trip this bug earlier, as one gets more throughput than with the normal (small) default block size?

3. Is the hanging resumed by an external interrupt (keyboard, mouse) because the scheduler boosts interactive sessions temporarily, thereby interfering (in a positive way) with the balancing of the encryption kernel-space thread, dd and pdflush?
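
Before tweaking the dirty* options from point 1, it helps to record their current values. A minimal Python sketch, assuming a Linux system that exposes them under /proc/sys/vm; the function name read_dirty_settings is hypothetical, and the set of files it finds depends on the kernel in use.

```python
import glob
import os


def read_dirty_settings(vm_dir="/proc/sys/vm"):
    """Return {name: value} for every dirty* tunable found under vm_dir."""
    settings = {}
    for path in sorted(glob.glob(os.path.join(vm_dir, "dirty*"))):
        try:
            with open(path) as handle:
                settings[os.path.basename(path)] = handle.read().strip()
        except OSError:
            continue  # unreadable entry: skip it rather than abort
    return settings


if __name__ == "__main__":
    for name, value in read_dirty_settings().items():
        print(f"{name} = {value}")
```

Comparing the output before and after a change makes it easier to tell which setting, if any, alters how often the stall appears.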

Revision history for this message
Matt Zimmerman (mdz) wrote :

On Mon, Apr 21, 2008 at 10:21:46AM -0000, Colin King wrote:
> A couple of points worth mulling over to corner this one..
>
> 1. Is the "hanging" a function of memory size - for example, the buffer
> cache filling up after a certain amount of time and then pdflush kicking
> in very aggressively, writing back dirty data as fast as it can.
> In which case, one may see the system resume again after a (long) period
> of time once pdflush is complete. It may be worth tweaking with the
> /proc/sys/vm/dirty* options to see if this trips the problem more
> frequently. Also, it may be worth seeing if doing the install with small
> memory kvm virtual machines shows this problem earlier than large memory
> virtual machines just to see if it is a buffer cache write back cpu
> starvation issue.
>
> 2. From my understanding dd is being used to zero the encrypted disk.
> Does changing the dd block size to a very large size trip this bug
> earlier as one gets more throughput than the normal (small) block size
> default?

It's a custom program, actually, blockdev-wipe (in the partman-crypto
package).

> 3. Is the hanging resumed by an external interrupt (keyboard, mouse)
> because the scheduler boosts interactive sessions temporarily, thereby
> interfering (in a positive way) with the balancing of the encryption
> kernel-space thread, dd and pdflush?

Note also that there are reports of similar behaviour in other parts of the
installer, which aren't necessarily as I/O intensive.

--
 - mdz

Revision history for this message
Jamie Strandboge (jdstrand) wrote :

Up to date hardy with kvm 1:62+dfsg-0ubuntu6 still hangs. Note that this time I tried it with the same command as I did in my last comment (eg, kvm/xtightvncviewer and not virt-manager).

Last few times it stopped in 'Erasing data on SCSI1...'. Getting back to Soren's comment on default_idle-- whenever the install hangs, CPU utilization on the host by the guest goes to '0'. Normally during the installer it is around 100% on one of the host CPUs (when not waiting for a response from the user).

Also note that if not using virt-manager and gtkvnc (ie you use xtightvncviewer or possibly other vnc clients), it seems all you have to do to get it unhung is drag the mouse over the window (the window does not need to be the active window either). This must be a difference between vnc implementations. So when testing this, be *very* careful about mouse movements over the installer window.

Revision history for this message
Dave Morley (davmor2) wrote :

There is no fault on the test I just ran using the image 20080421 on hardware. If possible, could the original reporter please test and see if he still has the issue with the new CD?

Revision history for this message
QuentinHartman (qhartman) wrote : Re: [Bug 217815] Re: Installation stalls randomly until a key is pressed

On Mon, Apr 21, 2008 at 10:13 AM, davmor2 <email address hidden> wrote:

> There is no fault on the test I just ran using the image 20080421 on
> hardware. If possible could the original reporter please test and see if
> he still has the issue with the new cd?
>

I am downloading that ISO now, I will run it on my problem machine and
report back tonight.

--
-Regards-

-Quentin Hartman-

Revision history for this message
Nick Barcet (nijaba) wrote :

I do not have a hang in the disk erase with the 20080421 image, as disk erase has been removed, but I just got a stall in "Installing the base system" within KVM... Pressing a key made the process move forward, as previously.

Revision history for this message
Chuck Short (zulcss) wrote :

I do not have this problem either with the 20080421 image.

Revision history for this message
Derek Kaye (kayedj) wrote :

Just a thought - is it trying to load a screensaver / blank screen, but not actually changing the screen - therefore the moving of the mouse / pressing a key cancels the failing attempt and continues the installation?

Revision history for this message
Colin Watson (cjwatson) wrote :

Derek: There's no screensaver in d-i. Screen blanking is a distant possibility, though. The default screen blanking interval set by the kernel is 10 minutes, and d-i doesn't change this; if the failure happens after 10 minutes (or nearby, to allow for possible clock skew inside kvm) then that would provide some circumstantial support for this.

Revision history for this message
Henrik Nilsen Omma (henrik) wrote :

Some more data points:

* I've tested extensively with the i386 version now (though I haven't seen any reports of it there). I ran 3 alt i386 installs with LVM+crypto in virtualbox at the same time on a 4-core box without getting any hangs.

* I ran both the i386 and amd64 versions in qemu without the kvm HW acceleration, with no hangs.

* I ran two 64 bit installs /w LVM+crypto at the same time, one allocated 384MB RAM and one allocated 2GB, both with an 8GB disk. They both seemed to stop at the same time and very frequently, every 3-4% of progress. I killed the 2GB RAM version and let the smaller one continue -- and it would stop less often every 8-10% perhaps. Running the smaller RAM version again by itself, it stalls at about every 20% (as was usual before)

Revision history for this message
Henrik Nilsen Omma (henrik) wrote :

next data point: install to a ram disk

I first installed to a 2gb disk image and got hangs as usual. I then created a ram disk with 'mount -t ramfs none ram/ -o maxsize=3000000' and copied the 2gb image file to it. When I re-ran the installation to the store-in-ram version it completed without hangs.

The physical box has 8GB RAM, and there was no disk caching.

Revision history for this message
Henrik Nilsen Omma (henrik) wrote :

... actually on a second run it got stuck in the package install phase, and stgraber reports it getting stuck in his ramdisk installs too.

Revision history for this message
Stéphane Graber (stgraber) wrote :

Looking at the various syslog files we have, it seems we always get messages similar to these:

Apr 21 23:02:11 kernel: [ 1072.770546] Clocksource tsc unstable (delta = 182801418122 ns)
Apr 21 23:02:11 kernel: [ 1072.774264] Time: acpi_pm clocksource has been installed.

These usually appear at the same time as the installer gets stuck.
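
To check other syslog files for the same message, something like the following Python sketch could be used. The regular expression is an assumption based only on the two lines quoted above, the helper name is hypothetical, and the sample text is those two lines verbatim.

```python
import re

# Pattern inferred from the quoted kernel message; other kernels may word it differently.
UNSTABLE = re.compile(r"Clocksource (\w+) unstable \(delta = (-?\d+) ns\)")


def find_unstable_clocksources(log_text):
    """Return (clocksource name, delta in seconds) for each matching log line."""
    return [
        (match.group(1), int(match.group(2)) / 1e9)
        for match in UNSTABLE.finditer(log_text)
    ]


SAMPLE = """\
Apr 21 23:02:11 kernel: [ 1072.770546] Clocksource tsc unstable (delta = 182801418122 ns)
Apr 21 23:02:11 kernel: [ 1072.774264] Time: acpi_pm clocksource has been installed.
"""

for name, seconds in find_unstable_clocksources(SAMPLE):
    print(f"{name}: delta of about {seconds:.1f} s")
```

For the quoted message the reported delta works out to roughly 183 seconds, which is large next to the 1072-second uptime shown in the same line.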

Revision history for this message
Soren Hansen (soren) wrote :

Lars just did an install with "notsc" on the kernel command line, which did trigger this bug.

Revision history for this message
Soren Hansen (soren) wrote :

Er... What I meant to say was, that with "notsc", Lars did an install that did *NOT* trigger this bug.

Revision history for this message
Soren Hansen (soren) wrote :

If further testing reveals that disabling the tsc actually helps, I've prepared a kernel patch that will fix it for us:

     http://kernel.ubuntu.com/git?p=soren/ubuntu-hardy.git;a=commit;h=c472835957c7bce42c56aa2c57caa2e80a5bdb1a

Mind you, the code is completely untested (it's getting rather late here), but it looks plausible.

Revision history for this message
QuentinHartman (qhartman) wrote :

On Mon, Apr 21, 2008 at 7:16 PM, Soren Hansen <email address hidden> wrote:

> If further testing reveals that disabling the tsc actually helps, I've
> prepared a kernel patch that will fix it for us:
>
> http://kernel.ubuntu.com/git?p=soren/ubuntu-hardy.git;a=commit;h=c472835957c7bce42c56aa2c57caa2e80a5bdb1a
>
> Mind you, the code is completely untested (it's getting rather late
> here), but it looks plausible.
>

Well I tested this on my machine here and I still see the problem, though
with notsc it seems less frequent. I'm concerned that may be wishful
thinking on my part. I have noticed a new behavior where the install
red-screen fails during the "selecting and installing packages" bit towards
the end of the install. I get a stall like before, but this time when I hit
a key to continue, rather than carrying on, it drops me out to the red
screen. Even if I go back a few steps in the install process, it keeps
giving me a red screen "An installation step has failed" or the like. I need
to do some more testing and see if I can get it to reliably correlate to
something I'm changing, but as you pointed out, it's getting late here
too...

--
-Regards-

-Quentin Hartman-

Revision history for this message
Matt Zimmerman (mdz) wrote : Re: [Bug 217815] Re: Installation stalls randomly until a key is pressed

On Mon, Apr 21, 2008 at 09:19:23PM -0000, Colin Watson wrote:
> Derek: There's no screensaver in d-i. Screen blanking is a distant
> possibility, though. The default screen blanking interval set by the
> kernel is 10 minutes, and d-i doesn't change this; if the failure
> happens after 10 minutes (or nearby, to allow for possible clock skew
> inside kvm) then that would provide some circumstantial support for
> this.

If that were the case, I'd expect the progress bar to jump if it was paused
for a few minutes, though no one has mentioned that.

--
 - mdz

Revision history for this message
Colin Ian King (colin-king) wrote :

It may be worth booting with clocksource set to one of the alternative clocksources available on your machine. One can see which clocksources are available using:

cat /sys/devices/system/clocksource/clocksource0/available_clocksource

and then boot with clocksource=xxx (where xxx is one of the clocksources).

I cannot test this as I don't have the ability to run kvm on my hardware setup here. Can somebody try this and check whether it helps?
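
As a convenience for testers, the same information can be turned into the list of boot parameters to try. This is a minimal Python sketch, assuming the standard sysfs files available_clocksource and current_clocksource; the function names are hypothetical.

```python
import os

SYSFS_DIR = "/sys/devices/system/clocksource/clocksource0"


def read_clocksources(sysfs_dir=SYSFS_DIR):
    """Return (available, current) clocksources read from sysfs_dir."""
    with open(os.path.join(sysfs_dir, "available_clocksource")) as handle:
        available = handle.read().split()
    with open(os.path.join(sysfs_dir, "current_clocksource")) as handle:
        current = handle.read().strip()
    return available, current


def boot_parameters(available, current):
    """List one clocksource=<name> boot parameter per alternative to the current source."""
    return [f"clocksource={name}" for name in available if name != current]


if __name__ == "__main__" and os.path.isdir(SYSFS_DIR):
    available, current = read_clocksources()
    print(f"current: {current}")
    for parameter in boot_parameters(available, current):
        print(f"try booting with: {parameter}")
```

Each printed parameter would be appended to the kernel command line for one test install, so the results can be compared source by source.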

Revision history for this message
QuentinHartman (qhartman) wrote : Re: [Bug 217815] Re: Installation stalls randomly until a key is pressed

On Tue, Apr 22, 2008 at 4:26 AM, Colin King <email address hidden> wrote:

> It may be worth booting with clocksource set to one of the alternative
> clocksources available on your machine. One can see which clocksources
> are available using:
>
> cat /sys/devices/system/clocksource/clocksource0/available_clocksource
>
> and then boot with clocksource=xxx (where xxx is one of the
> clocksources).
>
> I cannot test this as I don't have the ability to run kvm on my hardware
> setup here. Can somebody try this and check whether it helps?
>

I'll try using different clock sources tonight.

--
-Regards-

-Quentin Hartman-

Revision history for this message
Henrik Nilsen Omma (henrik) wrote :

Changing package to kvm as it seems we've isolated the offending patch now.

Changed in linux:
status: Confirmed → In Progress
Revision history for this message
QuentinHartman (qhartman) wrote :

I get this problem on real hardware, _not_ KVM. In my comment above I list the model and specs for reference. I also reference that in bug #217849.

Revision history for this message
Henrik Nilsen Omma (henrik) wrote :

Setting to High and moving to 8.04.1.

"<slangasek> heno: yes, if it's in kvm we should mark it as 8.04.1 now"

Changed in kvm:
importance: Critical → High
milestone: ubuntu-8.04 → ubuntu-8.04.1
Rick Clark (dendrobates)
Changed in kvm:
milestone: ubuntu-8.04.1 → ubuntu-8.04
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package kvm - 1:62+dfsg-0ubuntu7

---------------
kvm (1:62+dfsg-0ubuntu7) hardy; urgency=low

  * Revert sighandler.patch and virtio_hang.patch. Fixes hang in installer and
    likely other places, too. (LP: #217815)

 -- Soren Hansen <email address hidden> Tue, 22 Apr 2008 19:09:33 +0200

Changed in kvm:
status: In Progress → Fix Released
Revision history for this message
QuentinHartman (qhartman) wrote :

@Colin King:

On my machine I have hpet, acpi_pm, jiffies, and tsc as available clock sources. hpet seems to be the default. Here are my results using the daily from the 21st (I think...)

hpet: stalls
acpi_pm: stalls (perhaps worse than hpet)
jiffies: Works! If this does not eliminate the stalls, it at least makes them so short they are easily confused with normal processing
tsc: stalls really badly, worse than hpet or acpi_pm

I'm going to test jiffies a couple more times, but the one run-through worked great! Remember that this is on real hardware, not KVM!

Revision history for this message
Matt Zimmerman (mdz) wrote : Re: [Bug 217815] Re: Installation stalls randomly until a key is pressed

On Wed, Apr 23, 2008 at 04:25:43AM -0000, QuentinHartman wrote:
> @Colin King:
>
> On my machine I have hpet, acpi_pm, jiffies, and tsc as available clock
> sources. hpet seems to be the default. Here are my results using the
> daily from the 21st (I think...)
>
> hpet: stalls
> acpi_pm: stalls (perhaps worse than hpet)
> jiffies: Works! If this does not eliminate the stalls, it at least makes them so short they are easily confused with normal processing
> tsc: stalls really badly, worse than hpet or acpi_pm
>
> I'm going to test jiffies a couple more times, but the one run-through
> worked great! Remember that this is on real hardware, not KVM!

The problem originally described in this bug report has been isolated, and
found to be specific to KVM, thus yours is not the same bug.

Please move discussion of your problem back to the bug report you filed
(#217849) to avoid confusion.

--
 - mdz

Revision history for this message
Thierry Carrez (ttx) wrote :

Declined nominations for karmic/lucid since that was fixed a long time ago... Please open a new bug if you experience the same symptoms on karmic/lucid.
