2.6.27-2 kernel on intrepid: disk is mounted in RO after several hours

Bug #267089 reported by Lionel Porcheron on 2008-09-06
This bug affects 5 people
Affects                                  Status        Importance  Assigned to      Milestone
Linux                                    Fix Released  High
Mactel Support                           Fix Released  High        Unassigned
linux (Ubuntu)                           Fix Released  High        Leann Ogasawara
linux-backports-modules-2.6.27 (Ubuntu)  Fix Released  Undecided   Unassigned

Bug Description

I have just migrated my MacBook Pro 3 to Intrepid. Running the latest kernel, 2.6.27-2, in Intrepid results in the disk being remounted read-only after a few hours of use (about two, in my experience).
The disk works fine with other kernels.

Changed in linux:
assignee: nobody → ubuntu-kernel-team
importance: Undecided → High
status: New → Triaged
Colin Ian King (colin-king) wrote :

Hi Lionel,

A couple of things:

1. Is this bug repeatable? When it occurs, can you attach the output of the "mount" command?
2. The "DMA: Out of SW-IOMMU space for 4224 bytes at device 0000:0b:00.0" message in the dmesg log is interesting too. Was this occurring on previous kernels?

Colin.
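
The two checks requested above can be sketched as small shell helpers. This is only an illustration: the function names root_is_readonly and count_iommu_errors are made up here, and the first one assumes the usual /proc/mounts layout (device, mount point, filesystem type, options).

```shell
# Succeed if "/" is mounted read-only according to a mounts table
# (defaults to /proc/mounts). If "/" is listed more than once
# (e.g. rootfs and the real device), the last entry wins.
root_is_readonly() {
    awk '$2 == "/" { opts = $4 }
         END { exit (opts ~ /(^|,)ro(,|$)/) ? 0 : 1 }' "${1:-/proc/mounts}"
}

# Count SW-IOMMU exhaustion messages in kernel log text read from stdin.
count_iommu_errors() {
    grep -c 'Out of SW-IOMMU space' || true
}
```

Typical use would be "root_is_readonly && echo 'root is read-only'" and "dmesg | count_iommu_errors".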

Hi Colin,

1. Sadly yes, this bug is repeatable. I reproduced it 4-5 times. I'm now running the 2.6.26-5 kernel in Intrepid and it works fine. I will go back to 2.6.27 and give you the output of the mount command (I did not notice anything myself when checking the output of mount, but did not copy it).
2. The "DMA: Out of SW-IOMMU space" message does not occur on the 2.6.26 or 2.6.24 kernels.

Sorry for the delay. I tested with 2.6.27-3-generic kernel and I still have the issue. I attached the result of the mount command when / is in read-only.

I tested/am running with kernel 2.6.27-7.12 and still have this issue. I believe Juergen diagnosed it as a leak in the ath9k driver?

http://kerneltrap.org/mailarchive/linux-kernel/2008/8/4/2815804

Toon Verwaest (tverwaes) wrote :

I started having exactly the same problem yesterday when updating to Intrepid. What I also did at the same time was switch from madwifi-ng to ath9k in the 2.6.27.4 kernel that I had built myself. Then I checked, and 2.6.27.3 and 2.6.27 have the same problem too. It even came down to my disks having problems (I guess because of crashing DMA and whatnot), and it was very repeatable: every time I rebooted I got the same problem after using my computer for 5 minutes or less. Now I have booted 2.6.27.4 without ath9k but with ath_pci (madwifi) and everything works fine again.

Now I would mostly like to switch away from madwifi-ng as soon as possible since it's pretty buggy itself too. The network drops out (stalls, I have to manually disconnect and reconnect) after a while.

I'm seeing the "DMA: Out of SW-IOMMU space for 4224 bytes at device 0000:0b:00.0" message too, and have also reverted to the ath_pci (madwifi) driver.

This is on a MacBook Pro version 3.1 without any peripherals attached. I did increase the memory to 4 GB, but other than that it is all stock hardware.

Markus (makkus) wrote :

I've got the same problem, also on a Macbook Pro.

Everything works fine while using the network via eth0. Once I unplug the cable and use wireless, after about 5 to 10 minutes I get the "DMA: Out of SW-IOMMU space for..." message....

Using 64bit Intrepid & 4 GB of RAM.

Ricky Campbell (cyberdork33) wrote :

Is this affecting any hardware besides a MacBookPro3,1 ?

(To determine your mac version, run "sudo dmidecode| grep Product")
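
When reporting the model, only the first "Product Name" line (the system model) is usually needed; the second is the board identifier. A small sketch for extracting it, where the helper name mac_model is hypothetical:

```shell
# Print the first "Product Name:" value found in dmidecode output on stdin.
mac_model() {
    sed -n 's/^[[:space:]]*Product Name:[[:space:]]*//p' | head -n 1
}
```

It could be used as "sudo dmidecode | mac_model".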

Changed in mactel-support:
importance: Undecided → High
status: New → Confirmed
blindmouse (arun-elenta) wrote :

I can confirm this on another MacBookPro3,1 - but it happens regardless of being connected over ethernet or not.

Anthony Batchelor (toeknee) wrote :

sudo dmidecode| grep Product
Product Name: MacBookPro3,1
Product Name: Mac-F4238BC8

Same here.

Toon Verwaest (tverwaes) wrote :

same here.
Product Name: MacBookPro3,1
Product Name: Mac-F4238BC8

Dennis Dirdjaja (dcd-ditsch) wrote :

I never got this problem with 2 GB RAM, but as soon as I upgraded to 4 GB, I immediately got this error after approx. 5 minutes connected to my Wifi network.

dennis@silversurfer:~$ sudo dmidecode| grep Product
 Product Name: MacBookPro3,1
 Product Name: Mac-F4238BC8
dennis@silversurfer:~$ uname -a
Linux silversurfer 2.6.27-7-generic #1 SMP Tue Nov 4 19:33:06 UTC 2008 x86_64 GNU/Linux

Anthony Batchelor (toeknee) wrote :

I also have 4GB of RAM.

Toon Verwaest (tverwaes) wrote :

Me too... 4gb.

Luis R. Rodriguez (mcgrof) wrote :

It seems a lot of you are running into this, so we're going to look into it on the ath9k side. Unfortunately we don't have the hardware to reproduce it, but we will try to get some and zero in on it. Let us see whether it is ath9k related, and if so we'll fix it.

Luis R. Rodriguez (mcgrof) wrote :

BTW what type of APs do you guys have? 11n or 11G?

Luis R. Rodriguez (mcgrof) wrote :

Also please provide the output of

lsmod | grep -i ata

Luis R. Rodriguez (mcgrof) wrote :

OK I was not able to reproduce this on a x86_64 box with > 7 GB of memory on 2.6.27.5-101.fc10.x86_64.

Do you guys have something like this on lspci:

Intel Corporation Mobile PM965/GM965/GL960 Memory Controller Hub

It seems the Intel 965 memory controller does IOMMU in software, so I wonder whether the software doing that is 100% correct. Not sure, just something someone pointed out to me.

You can try these patches: the first one just clarifies the type of memory we want to deal with, and the second one makes it consistent.

http://www.kernel.org/pub/linux/kernel/people/mcgrof/patches/ath9k/2008-11-14/

Luis R. Rodriguez (mcgrof) wrote :

Fedora guy ran into this as well:

https://bugzilla.redhat.com/show_bug.cgi?id=471329

Perhaps we should move this to kernel.org...

Markus (makkus) wrote :

I use an 11g ap.

$ lsmod | grep -i ata
pata_acpi 13568 0
ata_piix 29444 2
ata_generic 14212 0
libata 200160 3 pata_acpi,ata_piix,ata_generic
scsi_mod 183160 5 sbp2,sr_mod,sd_mod,sg,libata
dock 18464 1 libata

$ lspci |grep "Controller Hub"
00:00.0 Host bridge: Intel Corporation Mobile PM965/GM965/GL960 Memory Controller

I don't know whether this is related, probably not, but if I copy large files from/to a USB disk, after a minute or so I get a very high system load and the desktop gets unresponsive and even freezes from time to time. I had that on Hardy as well, though...

Markus (makkus) wrote :

Ups. The lspci output above should of course be:

$ lspci |grep "Controller Hub"
00:00.0 Host bridge: Intel Corporation Mobile PM965/GM965/GL960 Memory Controller Hub (rev 03)

I am using 802.11g (base station does not do 802.11a or 802.11n, and I disabled 802.11b).

The various ATA-named modules loaded:

shane@shane-macbook-pro:~$ lsmod | grep -i ata
ata_piix 29444 2
ata_generic 14212 0
pata_acpi 13568 0
libata 200160 3 ata_piix,ata_generic,pata_acpi
scsi_mod 183160 6 usb_storage,sbp2,sd_mod,sr_mod,sg,libata
dock 18464 1 libata

Memory controller:

shane@shane-macbook-pro:~$ lspci | grep Memory
00:00.0 Host bridge: Intel Corporation Mobile PM965/GM965/GL960 Memory Controller Hub (rev 03)

I'm not sure when I'll have time to try the patches, but I'll do it as soon as I have spare time (perhaps Monday or Tuesday).

Dennis Dirdjaja (dcd-ditsch) wrote :

I am using a Time Capsule, i.e. 802.11n. I cannot provide the output as I am not in Ubuntu at the moment, but I am pretty sure they are as stated in the comments above.

Dennis Dirdjaja (dcd-ditsch) wrote :

dennis@silversurfer:~$ lsmod | grep -i ata
pata_acpi 13568 0
ata_piix 29444 2
ata_generic 14212 0
libata 200160 3 pata_acpi,ata_piix,ata_generic
scsi_mod 183160 5 sbp2,sd_mod,sr_mod,sg,libata
dock 18464 1 libata

dennis@silversurfer:~$ lspci |grep "Controller Hub"
00:00.0 Host bridge: Intel Corporation Mobile PM965/GM965/GL960 Memory Controller Hub (rev 03)

Luis R. Rodriguez (mcgrof) wrote :

OK, so there are a few bug reports on different sites. Please stop posting to this one and refer to the following one from now on:

http://bugzilla.kernel.org/show_bug.cgi?id=11811

If a fix comes up it will be propagated. For now try booting with mem=3G
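
To confirm that the workaround took effect after a reboot, the running kernel's command line can be inspected. The sketch below is illustrative only: the helper name boot_mem_limit is made up, and where the mem=3G option gets added depends on the boot loader (on a stock Intrepid install this is presumably the kernel line in /boot/grub/menu.lst, which is an assumption about the setup).

```shell
# Print the value of the mem= boot parameter from a kernel command line
# file (defaults to /proc/cmdline); prints nothing if it is not set.
boot_mem_limit() {
    tr ' ' '\n' < "${1:-/proc/cmdline}" | sed -n 's/^mem=//p'
}
```

Running "boot_mem_limit" with no arguments should print 3G once the option is active.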

Luis R. Rodriguez (mcgrof) wrote :

Please subscribe to the kernel.org bugzilla bug instead and please comment on the questions posted there:

http://bugzilla.kernel.org/show_bug.cgi?id=11811

Changed in linux:
status: Unknown → In Progress
Changed in linux:
status: In Progress → Fix Released

Two of the three patches mentioned in the upstream bug report are already in the upstream kernel and will be making their way into Jaunty:

ogasawara@yoji:~/linux-2.6$ git log ca0c7e5101fd4f37fed8e851709f08580b92fbb3
commit ca0c7e5101fd4f37fed8e851709f08580b92fbb3
Author: Luis R. Rodriguez <email address hidden>
Date: Thu Nov 20 17:15:12 2008 -0800

    ath9k: Fix SW-IOMMU bounce buffer starvation

ogasawara@yoji:~/linux-2.6$ git log b4b6cda2298b0c9a0af902312184b775b8867c65
commit b4b6cda2298b0c9a0af902312184b775b8867c65
Author: Luis R. Rodriguez <email address hidden>
Date: Thu Nov 20 17:15:13 2008 -0800

    ath9k: correct expected max RX buffer size

The third patch is in the wireless-testing tree but I imagine it will find its way into Jaunty as well:

ogasawara@yoji:~/wireless-testing$ git log c7bd0826b3080cf3d328793adc5cb2b585277bb4
commit c7bd0826b3080cf3d328793adc5cb2b585277bb4
Author: Luis R. Rodriguez <email address hidden>
Date: Fri Nov 21 17:41:33 2008 -0800

    ath9k: Handle -ENOMEM on RX gracefully

All three of these patches are currently in the ubuntu-intrepid-lbm git tree (i.e. linux-backports-modules-2.6.27), since it was recently updated to the wireless-testing master-2008-11-24:

ogasawara@yoji:~/ubuntu-intrepid-lbm$ git log
commit 4239ade1bf58a0353bf749a69ad4c1264125af31
Author: Tim Gardner <email address hidden>
Date: Tue Nov 25 08:37:57 2008 -0700

    UBUNTU: Update to wireless-testing master-2008-11-24

    Signed-off-by: Tim Gardner <email address hidden>

The version of linux-backports-modules-2.6.27 which will contain these patches should eventually make its way into intrepid-proposed for testing and then become available as an update.

Thanks.

Changed in linux-backports-modules-2.6.27:
status: New → Fix Committed
Changed in linux:
status: Triaged → Fix Committed
John Pugh (jpugh) wrote :

Do we plan on submitting an SRU for this patch? This does cause corruption and is a blocker for upgrade/install on a number of systems.

Hi John,

The Intrepid kernel is being kept up to date with the upstream stable patch sets. So 2/3 of these patches will go in as an SRU when the 2.6.27.8 stable patch set is applied (a member of the kernel team is still reviewing the entire patch set). I'll try and update this report with the SRU bug number once it's opened. Additionally, linux-backports-modules-intrepid will eventually contain all three of these patches. Thanks.

Changed in linux:
assignee: ubuntu-kernel-team → leannogasawara
Changed in mactel-support:
status: Confirmed → In Progress

On Tue, Dec 23, 2008 at 11:23 AM, Leann Ogasawara
<email address hidden> wrote:
> ** Tags removed: pet-bug

What does pet-bug mean? This has been long fixed as of 2.6.27.8. If
intrepid has at least this kernel then this can be closed.

  Luis

Hi Luis,

Sorry for the extra noise with the tagging. I've since removed the tag. Bug 308761 is the SRU (Stable Release Update) to incorporate the 2.6.27.8 and 2.6.27.9 patch sets into Intrepid. The kernel with these patches is currently in intrepid-proposed for anyone wanting to test - https://wiki.ubuntu.com/Testing/EnableProposed . Thanks.
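
Before testing, it can help to check whether intrepid-proposed is already enabled in the APT sources. The following is a rough sketch, not the official procedure (the wiki page linked above remains authoritative); the helper name proposed_enabled and the default path are assumptions for illustration.

```shell
# Succeed if the given sources.list has an uncommented "deb" entry for
# <release>-proposed. Arguments: release name, then the file to inspect
# (defaults to /etc/apt/sources.list).
proposed_enabled() {
    grep -Eq "^[[:space:]]*deb[[:space:]].*[[:space:]]$1-proposed([[:space:]]|\$)" "${2:-/etc/apt/sources.list}"
}
```

For example, "proposed_enabled intrepid || echo 'intrepid-proposed is not enabled'".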

deckatron75 (jsdecker) wrote :

I am in this boat.

I have a MacBookPro3,1 with 4038228 kB of ram that consistently dropped "/" into read-only under ubuntu 8.10.
I just installed ubuntu 8.10 today.

I have applied the intrepid-proposed patches and am now running 2.6.27-11-generic x86_64.

It seems stable after 60 minutes (longest run yet).

I have run the System->Administration->Hardware Testing application.

If there is anything I can do to help verify please let me know

jsdecker at comcast dot net

Thanks!

deckatron75 wrote:
> I am in this boat.
>
> I have a MacBookPro3,1 with 4038228 kB of ram that consistently dropped "/" into read-only under ubuntu 8.10.
> I just installed ubuntu 8.10 today.
>
> I have applied the intrepid-proposed patches and am now running
> 2.6.27-11-generic x86_64.
>
> It seems stable after 60 minutes (longest run yet).
>
> I have run the System->Administration->Hardware Testing application.
>
> If there is anything I can do to help verify please let me know
>
> jsdecker at comcast dot net
>
> Thanks!
>

I have been running 2.6.27-11 now for several weeks with no apparent
problems. I do have problems with ath9k, but that is another bug.

I am not sure if the wireless driver patch that Luis entered has been
added yet?

John Pugh (jpugh) wrote :

Sorry I replied and quoted the entire message above....

I do not think this is completely fixed.

This bug https://bugs.edge.launchpad.net/ubuntu/intrepid/+source/linux/+bug/278190 manifests itself after some time running with no other processes. I can re-create on my Macbook Pro v3,1 with 2.6.27-11

This kernel fixes the original symptoms, but still has ath9k problems.

On Thu, Jan 15, 2009 at 5:43 AM, John Pugh <email address hidden> wrote:
> Sorry I replied and quoted the entire message above....
>
> I do not think this is completely fixed.
>
> This bug
> https://bugs.edge.launchpad.net/ubuntu/intrepid/+source/linux/+bug/278190
> manifests itself after some time running with no other processes. I can
> re-create on my Macbook Pro v3,1 with 2.6.27-11
>
> This kernel fixes the original symptoms, but still has ath9k problems.

And how is that relevant to this bug report? BTW you can always do:

sudo apt-get install linux-backports-modules-intrepid

To get the bleeding edge ath9k.

  Luis

Luis R. Rodriguez wrote:
> sudo apt-get install linux-backports-modules-intrepid
>
> To get the bleeding edge ath9k.

I am using backports and hence bleeding edge ath9k.
It appears this may well be related to NM or wpa_supplicant. It was my
non-developer mind thinking that the original problem was causing it,
but I am obviously incorrect in that line of thought. My troub

Moving this from Fix Committed to Fix Released.

Changed in linux:
status: Fix Committed → Fix Released
Changed in linux-backports-modules-2.6.27:
status: Fix Committed → Fix Released
Changed in mactel-support:
status: In Progress → Fix Released
Changed in linux:
importance: Unknown → High