Disk driver problems on Toshiba NB305 Netbook

Bug #601986 reported by jim_charlton
This bug affects 7 people
Affects: linux (Ubuntu)
Status: Won't Fix
Importance: Undecided
Assigned to: Unassigned
Milestone: (none listed)

Bug Description

Running
Description: Ubuntu 10.04 LTS
Release: 10.04
Linux jim-nb 2.6.32-24-generic #38~pre201007021000-Ubuntu SMP Sun Jul 4 06:49:54 UTC 2010 i686 GNU/Linux
But problem seen on all kernels tried.

The boot process hangs for ~7 minutes, then the machine starts. Booting with init=/bin/bash and running "hdparm -t /dev/sda5" (sda5 is my Ubuntu filesystem) gives ~500 kB/sec. Completing the upstart process brings the disk up to ~40 MB/sec.

When "bootchart" is installed the machine boots in 23 seconds! Resume from hibernate works. Investigation has shown that a compiled C program with an infinite loop, when run in the background from init-top in the initramfs, will also cure the slow boot. The C program saturates one of the two logical CPUs of the Atom N450 (~100% CPU1). It appears that bootchart works its magic by simply saturating one CPU. After boot, running the infinite loop program brings disk performance (as reported by hdparm) up from 40 MB/sec to 70 MB/sec.
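
The C program itself is not attached to this report. As a rough stand-in for anyone who wants to repeat the CPU-saturation test after boot, here is a minimal Python sketch of the same idea. The function name `busy_loop` and the optional time limit are assumptions added for illustration; the program described above simply ran until killed.

```python
import time


def busy_loop(seconds=None):
    """Spin in a tight loop so that one CPU stays busy.

    seconds=None spins until the process is killed, like the program
    described in the report. A number stops after roughly that many
    seconds (the time limit is an illustrative addition).
    """
    deadline = None if seconds is None else time.monotonic() + seconds
    iterations = 0
    while deadline is None or time.monotonic() < deadline:
        iterations += 1  # no sleep and no I/O, so this CPU never goes idle
    return iterations
```

Calling `busy_loop(60)` in one terminal while running "hdparm -t /dev/sda5" in another would be one way to compare disk throughput with and without a saturated CPU.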

Adding highres=off nohz=off to the kernel command line cures the slow disk problem, but the machine runs hot and battery life is shorter. Adding nohpet to the kernel command line also cures the slow boot but not the resume from suspend.

I suspect that the problem lies with ahci.ko or libata. Changing the BIOS disk parameter from ahci to compatibility also cures the problem... but screws up the existing Windows partition.

Tried the linux-image-2.6.35-020635rc3-generic_2.6.35-020635rc3_i386.deb kernel. It acts the same.

---
AlsaVersion: Advanced Linux Sound Architecture Driver Version 1.0.21.
AplayDevices:
 **** List of PLAYBACK Hardware Devices ****
 card 0: Intel [HDA Intel], device 0: ALC272 Analog [ALC272 Analog]
   Subdevices: 1/1
   Subdevice #0: subdevice #0
Architecture: i386
ArecordDevices:
 **** List of CAPTURE Hardware Devices ****
 card 0: Intel [HDA Intel], device 0: ALC272 Analog [ALC272 Analog]
   Subdevices: 1/1
   Subdevice #0: subdevice #0
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC0: jim 1338 F.... pulseaudio
CRDA: Error: [Errno 2] No such file or directory
Card0.Amixer.info:
 Card hw:0 'Intel'/'HDA Intel at 0x40a00000 irq 22'
   Mixer name : 'Realtek ALC272'
   Components : 'HDA:10ec0272,1179ff30,00100001'
   Controls : 18
   Simple ctrls : 11
DistroRelease: Ubuntu 10.04
HibernationDevice: RESUME=UUID=04ecea7b-3e40-41af-aa7e-4d67cbf54a52
InstallationMedia: Ubuntu-Netbook 10.04 "Lucid Lynx" - Release i386 (20100429.4)
MachineType: TOSHIBA TOSHIBA NB305
Package: linux (not installed)
ProcCmdLine: BOOT_IMAGE=/boot/vmlinuz-2.6.32-24-generic root=/dev/sda5 ro
ProcEnviron:
 PATH=(custom, no user)
 LANG=en_CA.utf8
 SHELL=/bin/bash
ProcVersionSignature: Ubuntu 2.6.32-24.38~pre201007021000-generic 2.6.32.15+drm33.5
Regression: No
RelatedPackageVersions: linux-firmware 1.34.1
Reproducible: Yes
RfKill:
 0: phy0: Wireless LAN
  Soft blocked: no
  Hard blocked: no
Tags: ubuntu-une lucid filesystem needs-upstream-testing
Uname: Linux 2.6.32-24-generic i686
UnreportableReason: This is not a genuine Ubuntu package
UserGroups:

dmi.bios.date: 03/16/2010
dmi.bios.vendor: TOSHIBA
dmi.bios.version: V1.40
dmi.board.name: NPVAA
dmi.board.vendor: TOSHIBA
dmi.board.version: 1.00
dmi.chassis.asset.tag: *
dmi.chassis.type: 10
dmi.chassis.vendor: TOSHIBA
dmi.chassis.version: N/A
dmi.modalias: dmi:bvnTOSHIBA:bvrV1.40:bd03/16/2010:svnTOSHIBA:pnTOSHIBANB305:pvrPLL3AC-01E014:rvnTOSHIBA:rnNPVAA:rvr1.00:cvnTOSHIBA:ct10:cvrN/A:
dmi.product.name: TOSHIBA NB305
dmi.product.version: PLL3AC-01E014
dmi.sys.vendor: TOSHIBA

jim_charlton (charltn)
description: updated
Revision history for this message
kurt belgrave (trinikrono) wrote :

Thank you for taking the time to report this bug and helping to make Ubuntu better. This bug did not have a package associated with it, which is important for ensuring that it gets looked at by the proper developers. You can learn more about finding the right package at https://wiki.ubuntu.com/Bugs/FindRightPackage . I have classified this bug as a bug in Linux (Ubuntu linux kernel).

affects: ubuntu → linux (Ubuntu)
Revision history for this message
Jeremy Foshee (jeremyfoshee) wrote :

Hi jim_charlton,

Please be sure to confirm this issue exists with the latest development release of Ubuntu. ISO CD images are available from http://cdimage.ubuntu.com/daily/current/ . If the issue remains, please run the following command from a Terminal (Applications->Accessories->Terminal). It will automatically gather and attach updated debug information to this report.

apport-collect -p linux 601986

Also, if you could test the latest upstream kernel available that would be great. It will allow additional upstream developers to examine the issue. Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Once you've tested the upstream kernel, please remove the 'needs-upstream-testing' tag. This can be done by clicking on the yellow pencil icon next to the tag located at the bottom of the bug description and deleting the 'needs-upstream-testing' text. Please let us know your results.

Thanks in advance.

    [This is an automated message. Apologies if it has reached you inappropriately; please just reply to this message indicating so.]

tags: added: needs-kernel-logs
tags: added: needs-upstream-testing
tags: added: kj-triage
Changed in linux (Ubuntu):
status: New → Incomplete
Revision history for this message
jim_charlton (charltn) wrote :

apport information, attached as: AlsaDevices.txt, BootDmesg.txt, Card0.Amixer.values.txt, Card0.Codecs.codec.0.txt, CurrentDmesg.txt, IwConfig.txt, Lspci.txt, Lsusb.txt, PciMultimedia.txt, ProcCpuinfo.txt, ProcInterrupts.txt, ProcModules.txt, UdevDb.txt, UdevLog.txt, WifiSyslog.txt

tags: added: apport-collected
description: updated
Revision history for this message
jim_charlton (charltn) wrote :

At the time that I filed the bug report I tried all of the upstream kernels as well as the development kernel
linux-image-2.6.35-020635rc3-generic_2.6.35-020635rc3_i386.deb
The problem was the same on all kernels tried.

When I get a moment I will try the most recent maverick kernel again.

Note that this problem is not observable when booting from a usb device (there is no CD/DVD on the Toshiba NB305) as the problem is with the disk driver for the hard disk drive... not the driver for the usb.

Not sure how to check off "needs-upstream-testing".

Revision history for this message
jim_charlton (charltn) wrote :

checked off "needs-upstream-testing"

tags: removed: needs-upstream-testing
Revision history for this message
jim_charlton (charltn) wrote :

Problem the same on kernel linux-image-2.6.35-020635rc6-generic_2.6.35-020635rc6_i386.deb from http://kernel.ubuntu.com/~kernel-ppa/mainline/v2.6.35-rc6-maverick/.

The disk driver problem seems to be limited to the Toshiba NB305 with the Intel Atom N450 processor (a single core with Hyper-Threading, which Linux sees as two CPUs). The disk is slow until one CPU is heavily loaded. I tried some tests on a different quad-core machine, and saturating one of the four cores (100% use of one CPU) makes no difference to disk performance.

Revision history for this message
Jeremy Foshee (jeremyfoshee) wrote :

This bug report was marked as Incomplete and has not had any updated comments for quite some time. As a result this bug is being closed. Please reopen if this is still an issue in the current Ubuntu development release http://cdimage.ubuntu.com/daily-live/current/ . Also, please be sure to provide any requested information that may have been missing. To reopen the bug, click on the current status under the Status column and change the status back to "New". Thanks.

[This is an automated message. Apologies if it has reached you inappropriately; please just reply to this message indicating so.]

tags: added: kj-expired
Changed in linux (Ubuntu):
status: Incomplete → Expired
Changed in linux (Ubuntu):
status: Expired → Confirmed
Revision history for this message
Kyle Altendorf (sda) wrote :

Although I have not explored every aspect that Jim has described, I believe I may be observing the same (or a related) issue, albeit in Kubuntu Meerkat RC. My NB305 has been slow while installing, booting, running, and shutting down. The interesting part? For the first time I think that my impatience and pressing keys (generally shift) really is having an effect on the system. This even applies to the initial boot message (if I don't press any keys):

Gave up waiting for root device. Common problems:
- Boot args (cat /proc/cmdline)
- Check rootdelay= (did the system wait long enough?)
- Check root= (did the system wait for right device?)
- Missing modules (cat /proc/modules; ls /dev)
ALERT! /dev/disk/by-uuid/<yada-yada> does not exist. Dropping to shell!

If I hold shift down immediately after Grub then this doesn't come up. During the remainder of the boot process I can practically turn the HD LED on and off by pressing and releasing the shift key. I suspect that the shift key is having a similar effect on the kernel as Jim's infinite loop program. I am now booted again after reinstalling and the mouse seems to be having a similar effect while in X, when the mouse works that is.

Back on the install topic, when selecting the timezone even the current time display would not update unless I pressed or held a key. Also, I am using a USB to IDE adapter with a normal 'internal' DVDROM drive. Perhaps this uses a different driver than the CDROM that Jim used? Just trying to guess at why I have trouble with the CDROM but Jim doesn't.

I installed bootchart but it had no apparent effect. I am now trying the hdparm speed test. If I hold down shift throughout the test I get 128MB in 3.02 seconds (42.36 MB/sec). If I only press shift a couple times to get the test started... hdparm just hangs after printing "Timing buffered disk reads:" until I press shift again. The result this time was 10MB in 108.42 seconds (94.45 kB/sec).
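
To put the two hdparm results side by side, the arithmetic can be redone from the quoted sizes and times. This is only a sketch: the function names are made up for illustration, and the 1 MB = 1024 kB conversion is an assumption (it is the one that reproduces the quoted kB/sec figure). Small differences from hdparm's own numbers would come from rounding of the displayed time.

```python
def throughput_mb_per_s(megabytes, seconds):
    """Average read rate in MB/sec."""
    return megabytes / seconds


def throughput_kb_per_s(megabytes, seconds):
    """Average read rate in kB/sec, assuming 1 MB = 1024 kB."""
    return megabytes * 1024 / seconds


# Figures quoted in the comment above.
fast = throughput_mb_per_s(128, 3.02)    # shift held down for the whole test
slow = throughput_kb_per_s(10, 108.42)   # shift pressed only to get the test going
ratio = fast * 1024 / slow               # both rates expressed in kB/sec
print(f"{fast:.2f} MB/sec vs {slow:.2f} kB/sec (about {ratio:.0f} times slower)")
```

The gap between the two runs is several hundred-fold, which is far more than ordinary run-to-run variation in a disk benchmark.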

My experiences certainly don't quite mirror Jim's, but (obviously) I think they seem to be related. Jim, does holding down shift do anything for you when not using any of your other 'solutions'?

Revision history for this message
jim_charlton (charltn) wrote :

Very interesting observations Kyle.
First of all, I used a bootable USB memory stick (8 Gigs) with the Ubuntu installation disk image on it to install Ubuntu. Since then, I update using the internet. I am unsure about how this would differ from using a USB to IDE adapter to install.

The interesting part is your observation about the SHIFT key. I am currently running Ubuntu Lucid 2.6.32-26-generic (10.04.1 LTS) with bootchart installed. With bootchart installed, the boot process is fast. I use hibernate instead of suspend (as I have described elsewhere) and I find that hibernation works fine and that reloading the resume image takes less than 15 seconds.

If I remove bootchart, then the boot process runs for 6 seconds, then hangs for about 25 seconds, gives a few lines of output and then hangs for many minutes, typically 5 to 7 minutes before finally booting. Resuming from hibernation (loading the disk image) shows the same long delays.

If I simply hold down the LEFT SHIFT key after selecting the OS in the grub menu and hitting enter, there seemed to be some effect on the boot time. But even better was to depress the LEFT SHIFT after about 6 seconds had passed (or if you had it depressed from the beginning, release and press and hold again). This seemed to bring the boot time down to a normal boot time. Rapidly tapping the left shift key also worked. Tapping the LEFT SHIFT key during the resume from hibernation also seemed to work the same magic! :-) Pretty amazing.

I had understood that holding the LEFT SHIFT key during boot was used to bring up the GRUB menu which may be necessary if you are booting a single OS where (I am told) the menu does not display. But I did not think that that key did anything else.

I tried the same thing with the "a" key and the ALT key. The "a" key did not give the same effect. The ALT key seemed to work but the system seemed to be very confused after it booted and I could not get it to shut down properly and had to simply shut it off.

I wondered if tapping or holding a key down simply loaded one processor down and worked like my infinite loop program... and perhaps the way bootchart has an effect. But if I check CPU use when tapping or holding the SHIFT key, I don't see much difference in the CPU load.

So, I would say that you have made a very interesting observation Kyle, but I am in the dark about what it means.

The bottom line seems to be that tapping and intermittently holding the left SHIFT key during boot brings the boot process down to a normal time. Also works for resume from hibernation.

I tried the latest Maverick kernel (linux-image-2.6.36-020636rc7-generic_2.6.36-020636rc7.201010070908_i386.deb) by just installing the image and booting to it. I did not upgrade the entire machine. Without bootchart, the machine hung during boot, but tapping the left SHIFT key made it boot up pretty quickly (tried only once).

Revision history for this message
Kyle Altendorf (sda) wrote :

Thanks for the reply and I'm glad it's not just me. I usually just figure I'm going crazy when things like pressing keys and moving the mouse seem to make a difference in speed. :]

I think the most direct clue is the effect of the nohz kernel option which I think controls the dynticks feature:

http://kernelnewbies.org/Linux_2_6_21#head-8547911895fda9cdff32a94771c8f5706d66bba0

So, since dynticks is on by default (aka, when we have this problem) that means that the scheduler is not necessarily being ticked. I am guessing that both your infinite loop program and the pressing of keys are compensating for a bug in dynticks related to this hardware (or a bug in the hardware I suppose) by forcing the scheduler to run. A key repeat rate of, say, 30Hz wouldn't be that far off the classic 100Hz continuous tick.

Revision history for this message
jim_charlton (charltn) wrote :

Interesting. Let's see if I have this straight? The current kernels are using dynamic ticks. nohz=off turns off dynamic ticks so we get regular ticks as in the old OSs and the scheduler will regularly get asked to check to see what jobs should be run and prevent one process from hogging the CPUs. I guess highres=off must just reduce the rate that the scheduler is poked to check for jobs waiting. I am not sure how this fits with nohpet which also will cure the boot problem but not the resume from hibernate hang. I guess nohpet means no high precision event timer so maybe this supports your ideas of this being a timer problem. Indeed, I see another mention of this (see message 5 on http://www.linuxquestions.org/questions/fedora-installation-39/fc-10-my-installation-hangs-after-the-first-white-progress-bar-696298/) on a completely different machine where the user says that tapping the left SHIFT key will help bypass the problems with the interrupt bug.... and that adding nohpet to the kernel line will help cure his problem.

I am not sure why changing the BIOS disk parameter from ahci to compatibility cures the problem as well (although this screws up access to your windows partition if you still have it).

What we do know is that we don't want to use nohz=off highres=off as it makes the machine run hot, screws up pulse audio and reduces the time the machine will run on the battery. I continue to use bootchart as it still seems to boot faster than tapping the left SHIFT key. But both will work.

We really need to get a developer that understands all of this interested in this problem.

Revision history for this message
jim_charlton (charltn) wrote :

Inspired by the timer discussion, I revisited hpet. I tried hpet=force and clocksource=hpet. Both failed. So I tried nohpet again. This gives about a 30 second boot. Not as fast as when bootchart is installed, but not bad. If one puts nohpet in /etc/default/grub in the line
GRUB_CMDLINE_LINUX="nohpet"
and then runs update-grub and reboots, the parameter is used persistently for each boot.
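
For anyone who already has other options in GRUB_CMDLINE_LINUX, the edit described above can be sketched as a small text transformation. The helper name `add_kernel_param` is hypothetical, and the sketch assumes the variable is written on one line with double quotes, as in the example above. It only transforms a string; writing the result back to /etc/default/grub and running update-grub are still separate steps.

```python
import re


def add_kernel_param(grub_text, param):
    """Return grub_text with param appended to GRUB_CMDLINE_LINUX.

    The text is returned unchanged if the parameter is already there.
    If the variable is missing entirely, a new line is appended.
    """
    # The '=' right after the name keeps GRUB_CMDLINE_LINUX_DEFAULT from matching.
    pattern = re.compile(r'^GRUB_CMDLINE_LINUX="(.*)"[ \t]*$', re.MULTILINE)
    match = pattern.search(grub_text)
    if match is None:
        sep = "" if (not grub_text or grub_text.endswith("\n")) else "\n"
        return f'{grub_text}{sep}GRUB_CMDLINE_LINUX="{param}"\n'
    params = match.group(1).split()
    if param in params:
        return grub_text
    params.append(param)
    joined = " ".join(params)
    new_line = f'GRUB_CMDLINE_LINUX="{joined}"'
    return grub_text[:match.start()] + new_line + grub_text[match.end():]
```

Existing options are kept and the new one is added at the end, so applying the helper twice with the same parameter leaves the file text unchanged.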
I then tried hibernate and resume from hibernate. This also works well, and resume from hibernate takes less than 15 seconds (just like with bootchart installed).

So this is also a way to cure the slow boot problem. I am not sure what it does to the highres timers. I assume that the system is still running tickless in highres. Not sure how to check that. Pulse audio seems to be working OK which makes me think that the system is running in highres.
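
One possible way to check is to look at /proc/timer_list, which on kernels of this era lists per-CPU `.hres_active` and `.nohz_mode` fields, and at the sysfs clocksource entry. The sketch below assumes that layout (it may differ on other kernel versions, and reading /proc/timer_list may need root); the function names are made up for illustration.

```python
import re
from pathlib import Path

CLOCKSOURCE = Path("/sys/devices/system/clocksource/clocksource0/current_clocksource")


def timer_modes(timer_list_text):
    """Collect the distinct hres_active and nohz_mode values found in the text.

    A field that does not appear maps to an empty set.
    """
    modes = {"hres_active": set(), "nohz_mode": set()}
    pattern = r"\.(hres_active|nohz_mode)\s*:\s*(\d+)"
    for name, value in re.findall(pattern, timer_list_text):
        modes[name].add(value)
    return modes


def current_clocksource():
    """Name of the active clocksource, or None if sysfs cannot be read."""
    try:
        return CLOCKSOURCE.read_text().strip()
    except OSError:
        return None
```

Passing the contents of /proc/timer_list to `timer_modes` should show whether high-resolution timers are active (hres_active of 1) and whether dynticks is in use (a non-zero nohz_mode is generally taken to mean tickless operation), while `current_clocksource()` would show which clocksource the kernel settled on after booting with nohpet.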

It does not cure the poor disk performance. hdparm -t /dev/sda5 still gives me only about 50 MB/sec. This increases to about 70 MB/sec if one launches a program that saturates one of the CPUs (like an infinite loop). So there is still a problem.

After resume from hibernate, X still crashes randomly which is annoying. I have been meaning to look into that as well. X does not crash randomly after a normal boot.

Revision history for this message
jim_charlton (charltn) wrote :

Bad news. I upgraded the Toshiba NB305 to Maverick (2.6.35-22-generic, 10.10). I tried booting with nohpet on the kernel boot line and it hung during boot. This is contrary to what I observed with Lucid as described above. Tapping the left SHIFT key does cause the boot to resume though. So that does work as Kyle has described. The machine will not hibernate; it freezes halfway through hibernation and has to be hard reset. Booting without nohpet also hangs during boot, but tapping the left SHIFT key causes boot to resume. For both of the boot attempts described above, the music that plays after login stutters and cuts in and out. Hibernate does not work after booting with or without nohpet and tapping the shift key to get it to boot. Installing bootchart does not cure the boot problem like it did in Lucid. After installing bootchart, the machine boots into initramfs with or without tapping the left shift key.

Bottom line... Maverick is worse than Lucid!!! At least in Lucid I could get reliable booting and hibernate/resume. With Maverick, the only way to get it to boot is to tap the left SHIFT key.... and hibernate/resume does not work (at least for me). Sigh...

Revision history for this message
Cameron Matheson (cameron-matheson) wrote :

Try adding intel_idle.max_cstate=0 to your kernel command line. This has gotten me back to a situation that is mostly equivalent to Lucid (I have to hold down the shift key when booting into X too, or else Unity never starts up and I have to kill Xorg and try again).

Revision history for this message
Kyle Altendorf (sda) wrote :

I did try nohpet and saw no effect. nohz=off does work though, no key presses required. I haven't done much with hibernate or suspend. intel_idle.max_cstate=0 resulted in the same "Gave up waiting for root device" error I had posted above. Now that you're on Maverick, Jim, have you seen that error? Or is that still just me?

Revision history for this message
jim_charlton (charltn) wrote :

If I boot maverick with or without bootchart installed the system hangs on booting. I did not wait forever but after a minute tapped the left shift key twice at a 10 second interval. The system gave the message "Gave up waiting for root device", a couple of other lines and then dropped to the initramfs prompt. I also note that the system won't shutdown after the "reboot" command until I tap the left SHIFT key.

Adding intel_idle.max_cstate=0 to the kernel command line does nothing for me. I only tried it once, but it seems to boot slower, still requires the left SHIFT key tap, and won't go into hibernation. It hangs when I close the lid (the lid button is configured to hibernate rather than suspend). I have to reset to recover from this.

Revision history for this message
jim_charlton (charltn) wrote :

I tried clocksource=acpi_pm and clocksource=hpet and both failed to get the machine to boot.

I changed the disk parameter from "ahci" to "compatibility" in the BIOS and the machine boots fine, and will also hibernate and resume from hibernate. Not only that, but "hdparm -t /dev/sda5" gives me 70 MB/sec.

When I tried using "compatibility" mode before it totally screwed up the Windows partition when I tried to boot Windows. But I really don't care about Windows so I am going to stick with "compatibility" for now and see how it goes.

Revision history for this message
jim_charlton (charltn) wrote :

I really do think that the problem is with the AHCI driver in Ubuntu. I presume that using "compatibility" mode drops one back to IDE driver mode, and I see (ps -ef) that ata_sff and ata_aux are loaded and running. I see mention of ata_piix in dmesg but not as a running process after the machine boots. I have seen other threads where people say that using ata_piix kills disk performance. I don't know enough about disk drivers on Ubuntu to really say much... but overall... the problem on the NB305 sure seems to be how the AHCI driver and the timer interact. Turning off AHCI brings the system back to normal ... although I have not tested disk performance (or how it affects CPU load) extensively.

Revision history for this message
jim_charlton (charltn) wrote :

OK. OK. It seems I spoke too soon. Using "compatibility" mode, resume from hibernate doesn't really work without intervention. To get it to restart you have to wait until the disk image is reloaded and the screen goes black and then tap the left shift key a few times.... just like before. Oh well.... at least the machine boots without intervention.... and it will resume from hibernation... with a bit of prodding!

Revision history for this message
jim_charlton (charltn) wrote :

Three weeks after upgrading to Ubuntu Maverick (10.10, 2.6.35-22-generic) I still can't find a configuration where the Toshiba NB305 Netbook will boot and run reliably with either the "compatibility" or "AHCI" disk setting in the BIOS.

I have resorted to booting the machine on the 2.6.32-26-generic kernel with the disk "compatibility" mode... despite the fact that I have all of the maverick packages loaded. Strangely enough... this seems to work just fine. The machine boots and runs without any hesitation. It even hibernates and resumes OK as long as I still boot on the older kernel.

Revision history for this message
Brandon Low (lostlogic) wrote :

Jim: I'm doing pretty well with nohz=off nohpet highres=off.

hdparm -tT /dev/sda
/dev/sda:
 Timing cached reads: 1554 MB in 2.00 seconds = 777.62 MB/sec
 Timing buffered disk reads: 570 MB in 3.01 seconds = 189.63 MB/sec

Suspend to RAM works properly, but suspend to disk does not, but that's good enough for me.

Still would like some kind of real resolution to the underlying bug.

Revision history for this message
kurt belgrave (trinikrono) wrote :
Revision history for this message
Seth Forshee (sforshee) wrote :

I've been poking at this machine quite a bit over the past few weeks. I'll lay out what I know about this issue, which isn't a complete explanation.

There are various command-line parameters that seem to 'fix' this issue along with the overall performance problems seen with this machine. What all these workarounds ultimately seem to be doing is keeping the CPU busy enough that it seldom or never gets into idle states deeper than C1. You can force this behavior without affecting any other features (tickless, etc.) by passing intel_idle.max_cstate=1 on the kernel command-line. I've yet to see the disk driver or performance issues running with this option.

This points to a problem with deep C-states, which in turn points to an issue somewhere outside of the kernel. Other machines based on this chipset don't exhibit these problems, and loading newer microcode to the CPU doesn't help, so the most likely explanation is that the problem lies with the BIOS. This is something only Toshiba will be able to fix, so all we're left with are the various workarounds.

We're already planning to add some text to the natty release notes recommending to add "highres=off nohz=off" to the kernel command line in response to another BIOS-related issue (see bug 640100), which also effectively works around the issues here. Unfortunately that's probably the best we'll be able to do.
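
Since several different workarounds have been mentioned in this report, a quick way to see which of them a running system was actually booted with is to parse the kernel command line. The sketch below is only an illustration: the function names are hypothetical, and the naive whitespace split does not handle quoted values that contain spaces.

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into {name: value}; bare flags map to None."""
    params = {}
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


def active_workarounds(cmdline):
    """List the workarounds discussed in this report that appear in cmdline."""
    params = parse_cmdline(cmdline)
    found = []
    cstate = params.get("intel_idle.max_cstate")
    if cstate in ("0", "1"):
        found.append(f"intel_idle.max_cstate={cstate}")
    if params.get("nohz") == "off":
        found.append("nohz=off")
    if params.get("highres") == "off":
        found.append("highres=off")
    if "nohpet" in params:
        found.append("nohpet")
    return found
```

Feeding it the contents of /proc/cmdline, for example `active_workarounds(open("/proc/cmdline").read())`, returns an empty list for the ProcCmdLine shown in the bug description, which carries none of these options.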

Revision history for this message
kurt belgrave (trinikrono) wrote :

Seth, what should we do with this bug report? Should it be set to Won't Fix, since we are unable to fix the issue as it is a manufacturer issue, or should it be upstreamed to the kernel bug tracker?

Revision history for this message
marcus aurelius (adbiz) wrote : RE: [Bug 601986] Re: Disk driver problems on Toshiba NB305Netbook

a user on the ubuntuforum found a workaround that eliminated the long boot time. the suggestion was to turn sata compatibility off. it seems to work.

http://ubuntuforums.org/showthread.php?t=1497461&highlight=toshiba+nb305

i have not tested the suggestion on how to make suspend and hibernate work. i could never get it to work, so i set the computer to shut down, which seems to work.

Revision history for this message
Seth Forshee (sforshee) wrote :

@kurt: I just forgot to change the status, thanks for pointing it out. I've already corresponded with upstream kernel devs about the C-state issues, who agreed that it looks like a BIOS problem, so I don't see any benefit to filing a bug upstream.

@curtis: I suspect that workaround really just masks the underlying problem. It's easy to show that sata compatibility _can_ work correctly by using intel_idle.max_cstate=1. Perhaps there's some bad interaction between sata mode and deep C-states that leads to the performance issues. It still seems like the best solution is to pass nohz=off highres=off though since it works around the suspend-to-RAM issues.

Changed in linux (Ubuntu):
status: Confirmed → Won't Fix