Hardy -- Suspend RUINS FILESYSTEM

Bug #203537 reported by ichudov on 2008-03-18

Affects	Status	Importance	Assigned to
Fedora	Won't Fix	Medium	redhat-bugs #451404
linux (Ubuntu)	Fix Released	High	Unassigned
Hardy	Won't Fix	High	Jim Lieb

Bug Description

I am running Ubuntu Hardy Alpha 6. With latest updates.

Running /usr/sbin/pm-suspend appears to suspend my computer normally. When I power it back up, it does not resume. When I reboot, it says GRUB ERROR 17.

I booted from a Live CD and could not mount the root partition as it no longer contained ext3.

I was able to run fsck -y -t ext3 /dev/sda and fix the partition. However, a few files were corrupted.

:root:~ ###parted /dev/sda print

Disk /dev/sda: 250GB
Sector size (logical/physical): 512B/512B
Partition Table: msdos

Number  Start   End    Size    Type      File system  Flags
 1      32.3kB  240GB  240GB   primary   ext3         boot
 2      240GB   250GB  10.2GB  extended
 5      240GB   250GB  10.2GB  logical   linux-swap

Information: Don't forget to update /etc/fstab, if necessary.

~/misc/life/wallpapers==>lspci
00:00.0 RAM memory: nVidia Corporation C51 Host Bridge (rev a2)
00:00.1 RAM memory: nVidia Corporation C51 Memory Controller 0 (rev a2)
00:00.2 RAM memory: nVidia Corporation C51 Memory Controller 1 (rev a2)
00:00.3 RAM memory: nVidia Corporation C51 Memory Controller 5 (rev a2)
00:00.4 RAM memory: nVidia Corporation C51 Memory Controller 4 (rev a2)
00:00.5 RAM memory: nVidia Corporation C51 Host Bridge (rev a2)
00:00.6 RAM memory: nVidia Corporation C51 Memory Controller 3 (rev a2)
00:00.7 RAM memory: nVidia Corporation C51 Memory Controller 2 (rev a2)
00:02.0 PCI bridge: nVidia Corporation C51 PCI Express Bridge (rev a1)
00:03.0 PCI bridge: nVidia Corporation C51 PCI Express Bridge (rev a1)
00:04.0 PCI bridge: nVidia Corporation C51 PCI Express Bridge (rev a1)
00:05.0 VGA compatible controller: nVidia Corporation MCP51 PCI-X GeForce Go 6100 (rev a2)
00:09.0 RAM memory: nVidia Corporation MCP51 Host Bridge (rev a2)
00:0a.0 ISA bridge: nVidia Corporation MCP51 LPC Bridge (rev a3)
00:0a.1 SMBus: nVidia Corporation MCP51 SMBus (rev a3)
00:0a.3 Co-processor: nVidia Corporation MCP51 PMU (rev a3)
00:0b.0 USB Controller: nVidia Corporation MCP51 USB Controller (rev a3)
00:0b.1 USB Controller: nVidia Corporation MCP51 USB Controller (rev a3)
00:0d.0 IDE interface: nVidia Corporation MCP51 IDE (rev f1)
00:0e.0 IDE interface: nVidia Corporation MCP51 Serial ATA Controller (rev f1)
00:10.0 PCI bridge: nVidia Corporation MCP51 PCI Bridge (rev a2)
00:10.1 Audio device: nVidia Corporation MCP51 High Definition Audio (rev a2)
00:14.0 Bridge: nVidia Corporation MCP51 Ethernet Controller (rev a3)
00:18.0 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration
00:18.1 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map
00:18.2 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller
00:18.3 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control
04:05.0 Ethernet controller: Atheros Communications, Inc. AR2413 802.11bg NIC (rev 01)
04:06.0 CardBus bridge: Texas Instruments PCIxx12 Cardbus Controller
04:06.2 Mass storage controller: Texas Instruments 5-in-1 Multimedia Card Reader (SD/MMC/MS/MS PRO/xD)
05:00.0 USB Controller: NEC Corporation USB (rev 44)
05:00.1 USB Controller: NEC Corporation USB 2.0 (rev 05)

ichudov (igor-chudov) wrote on 2008-03-18:

Acer Aspire 9300 laptop.

I meant fsck -y -t ext3 /dev/sda1.

Nothing in /var/log/messages.

I disabled all /usr/sbin/pm-* for now by renaming them.

root:~ ###free
             total       used       free     shared    buffers     cached
Mem:       3788848     858336    2930512          0      18636     344224
-/+ buffers/cache:     495476    3293372
Swap:      9936160          0    9936160

:root:~ ###cat /etc/fstab
# /etc/fstab: static file system information.
#
# <file system> <mount point> <type> <options> <dump> <pass>
proc /proc proc defaults 0 0
# /dev/sda1
UUID=0e95e649-79be-40ef-a5f5-58ab2939e1a1 / ext3 errors=remount-ro 0 1
# /dev/sda5
UUID=6b6ca593-1835-42e0-99ea-d85a2fc0615f none swap sw 0 0
/dev/hdc /media/cdrom0 udf,iso9660 user,noauto,exec,utf8 0 0

igor.chudov.com:/home/ichudov/tmp/incoming/torrent /torrents nfs ro,auto 0 0
igor.chudov.com:/home/ichudov/photos /photos nfs ro,auto 0 0
igor.chudov.com:/films/entertainment /entertainment nfs ro,auto 0 0
besm6:root:~ ###

Leann Ogasawara (leannogasawara) on 2008-03-18

Changed in linux:
assignee:	nobody → ubuntu-kernel-team
importance:	Undecided → High
status:	New → Triaged

ichudov (igor-chudov) wrote on 2008-03-19:

My system recovery and update script (11.7 KiB, text/x-sh)

I have my laptop set up to be completely restorable and recoverable with just a few shell scripts.
They install packages, check out my projects from CVS, set up Apache, and so on.

So this laptop can be restored from a fresh reinstall quickly and easily.

If this could help, I am willing to volunteer to test this bug and try various things. I don't care if it breaks my laptop often; I can live with that.

MrAuer (mr-auer) wrote on 2008-04-05:

I filed a separate bug report for this, since I'm running 64-bit Hardy. The first description here matches almost exactly what happened to me yesterday. I am running Hardy Alpha 5, updated to the latest packages yesterday. I tried suspending, then woke it up: keyboard and mouse didn't work, not even Ctrl-Alt-SysRq, so I was forced to do a hard reset. After this I got GRUB error 17, and ALL my partitions were corrupted: root, home, storage. I booted with a live CD and fsck'ed the partitions. My home folder has now vanished into tens of numbered files in lost+found, and I also lost some stuff from root. I can only log in as root in safe mode now. I haven't yet been able to get to my logs there; I may have to reinstall, worst case. If you have ideas about which logs would be useful, I'll try to get them. The machine is an AMD 5000+ X2 with 4 GB RAM, an Nvidia chipset on an Asus motherboard, and an Nvidia GT 8600 display, running kernel 2.6.24-14-rt, with all packages up to date as of yesterday.

MrAuer (mr-auer) wrote on 2008-04-05:

My mobo is Asus M2N - Nvidia chipset.

Hendy Irawan (ceefour) wrote on 2008-04-25:

Probably related to bug:
https://bugs.launchpad.net/ubuntu/+source/acpid/+bug/198125

Does this bug still exist on release Hardy?

If so this is *VERY* dangerous people!!!!!!!!!!!!

quincy robinson (arqueware) wrote on 2008-04-30:

I confirm this bug on a fresh install of 8.04 x86_64. Twice yesterday I installed Hardy from the text-based install disk.

The first error was Grub Error 2, but at the time, I did not realize what had caused it.

The latest one is Grub Error 17. Linux had already booted fresh after the initial installation, but after suspending it would not resume, and then came the Error 17. It was a fresh install, so no real harm done, apart from the wasted time.

trogdor (boggsj) wrote on 2008-04-30:

I can also confirm this, on a fresh install of 8.04 amd64. Suspending trashes the filesystem. I reinstalled twice and it happened both times. This is pretty serious. I didn't lose any data, since my home directory is on a different disk that wasn't mounted, but this is a big problem.

I'm using an Everex StepNote XT5000T. 4GB RAM, 256MB GeForce Go 7600, 100GB and 250GB hard drives. I tried this with both the "free" and "restricted" Nvidia driver. Filesystem trashed both times. I'd love to attach log files, but there are none. Any advice on how I can provide more useful information and maybe get this fixed?

trogdor (boggsj) wrote on 2008-04-30:

Oh, I forgot to mention I also get Grub Error 17 after powering back up.

quincy robinson (arqueware) wrote on 2008-04-30:

This also occurs in Gutsy 7.10 x86_64.

Paul Novotny (paul-novotny) wrote on 2008-05-01:

#10

This happens to me too. fsck turned everything into unreadable numbered files in lost+found

I cannot believe this is a known bug that wiped out my hard drive. A two-month-old known bug in the supposedly rock-solid Hardy Heron release. This is absolutely the worst bug possible! "On some computers, if you push the big shiny orange Suspend button, you lose all your data". Are you serious?

Paul Novotny (paul-novotny) wrote on 2008-05-01: Re: [Bug 203537] Re: Hardy -- pm-suspend RUINS PARTITION TABLE

#11

FYI, here is a description of how to remove the Suspend and Hibernate
buttons. I wasn't comfortable having them around where I could
accidentally hit them.

http://jeremy.visser.name/2007/02/08/how-to-disable-suspend-and-hibernate-for-all-users-in-ubuntu/
Pay attention to comment #3, since the flag location has been moved.

I also disabled suspend and hibernate in '/etc/default/acpi-support'.
I don't know what that changed, but I did it anyway.

On Thu, 2008-05-01 at 08:38 -0500, Igor Chudov wrote:
> Yep, I agree totally. In fact I recently wanted to check if the
> suspend is working already, but it seems like I should wait.
>
> i

ichudov (igor-chudov) wrote on 2008-05-01: Re: Hardy -- pm-suspend RUINS PARTITION TABLE

#12

Would anyone, please, look at this bug?

My laptop does not contain anything critical and I can test it whenever you want. I can always rebuild it with a few scripts.

trogdor (boggsj) wrote on 2008-05-01:

#13

I'm also willing to do testing. My home directory is mounted on a separate disk, which I can umount for testing purposes. I'm willing (even eager) to trash the root filesystem as many times as necessary to get suspend/resume working. I can live without suspend, but it's nice to have. I got used to it in OS X on my iBook and in Vista, on the very laptop where it's broken in Ubuntu.

What Paul wrote above is spot-on. This is a very serious bug. If you accidentally click the wrong button when trying to power down or log out, you'll lose all your data. I'm sure that would be a show-stopper for almost anyone.

trogdor (boggsj) wrote on 2008-05-02:

#14

Maybe this should be renamed "RUINS FILESYSTEM", since this bug destroys all the data on the disk, rather than just the partition table? Maybe it would get more attention....

quincy robinson (arqueware) wrote on 2008-05-02:

#15

My rig was a XP64 - x86_64 dual boot. I was able to restore my XP64 partition with SuperGrub, but the linux partition table entry was trashed. Is there a way to back up the partition table to experiment? I don't think the data is destroyed, just "lost."

quincy robinson (arqueware) wrote on 2008-05-02:

#16

Nix that.

(just dd the first 512K of the device)
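A minimal sketch of that dd backup, assuming an msdos partition table like the one in the parted output of the bug description. The function names are made up for illustration. On an msdos disk the four primary entries and the boot signature sit in bytes 446 to 511 of the first 512-byte sector, so one sector is enough for them; logical partitions such as sda5 are described elsewhere on the disk and are not covered by this.

```shell
#!/bin/sh
# backup_first_sector and restore_partition_table are hypothetical helper
# names, not existing tools.

backup_first_sector() {
    # $1 = source device or image (e.g. /dev/sda), $2 = backup file
    dd if="$1" of="$2" bs=512 count=1 2>/dev/null
}

restore_partition_table() {
    # $1 = backup file, $2 = target device or image
    # Copy back only bytes 446..511 (4 x 16-byte entries + 2-byte signature),
    # leaving the boot code in bytes 0..445 untouched.
    # conv=notrunc keeps dd from truncating a regular-file target.
    dd if="$1" of="$2" bs=1 skip=446 seek=446 count=66 conv=notrunc 2>/dev/null
}

# Example (as root; keep the backup on a different disk):
#   backup_first_sector /dev/sda /media/usb/sda-first-sector.bin
```

Note that the later comments point to filesystem corruption rather than just a lost partition table entry, so a backup like this would not by itself protect the data.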

exactt (giesbert) wrote on 2008-06-03:

#17

I just wanted to see whether suspend-to-RAM works. I really didn't expect my filesystem to vanish. This is really critical! Could it be related to Nvidia? I am/was running the latest Hardy amd64 with the nvidia-glx-new drivers installed. I ended up with GRUB error 17 as well...

exactt (giesbert) wrote on 2008-06-03:

#18

This bug left my system partition as well as my data partition as a bunch of files and directories in lost+found. This is a disaster!
For completeness, and hopefully a fast bug fix, I want to mention that I was using the 2.6.24-38(?, the latest from proposed) kernel and that the system suspended and woke up again, but it wasn't working correctly. The screensaver showed up (I have the random picture one) but I couldn't exit it. I could switch to a console using Ctrl+Alt+F1 but couldn't log in: whatever username I entered, it asked again for the username. Then (I think) I shut down using the power button. Afterwards GRUB error 17 was reported.

going to cry the whole night long now :-(

Paul Novotny (paul-novotny) wrote on 2008-06-03:

#19

Exactt, sorry to hear you have joined the club. Just for completeness, I also experienced the bug with amd64 and nvidia-glx-new.

Oliver (lobohacks) wrote on 2008-06-10:

#20

Hello,
Same here on my AMD 780G chipset system.
I just bought a Gigabyte GA-MA78GM-S2H board with an AMD X2 4850e CPU and MDT 2x2048-800-CL-KIT AMD Edition memory.

It happens on a fully updated backports & proposed system and on a freshly set up system too.

I was able to capture (i.e. write down on paper, old school ;) ) something from tty1 just after resume.

CODE
[117.239935] EXT3-fs error (device dm-1) : ext3_memblock : <6> attempt to access beyond end of device
[117.239992] Aborting journal on dev dm-1
[117.243010] Remounting filesystem read-only
[117.247559] EXT3-fs error (device dm-1) : ext3_free_blocks : Freeing blocks in system zone - Block = 1507582, count = 1
[117.247671] EXT3-fs error (device dm-1) In ext3_free_blocks_sb : Journal has aborted
\CODE

Please fix this one, i want my ubuntu back!

Regards oliver

arturj (arturj-freenet) wrote on 2008-06-17:

#21

Hallo,

same here on my AMD 650G chipset Mobo.

I was running a 200GB SATA1 HD with no problems (nvidia-glx-new graphics driver). After moving (fresh install of Hardy) to a 500GB SATA2 HD with 32MB of cache, my filesystem gets totally corrupted by suspend-to-RAM. Resuming first gives me some errors:

kbd 00:0a: activation failed
ata1.00: revalidation failed
a couple of USB errors follow

After rebooting the machine the linux partition is not accessible giving me:

"GRUB loading, please wait....
Error 17"

The problem first arose with the new hard disk, so there seems to be some difference between them.

arturj (arturj-freenet) wrote on 2008-06-18:

#22

APPENDIX:

Hello again,

some further information regarding my hardware:
Mainboard Asus M2A-VM with ATI 690G and ATI SB600 Chipsets. SB600 provides the SATA interface
AMD64x2 CPU
2x2GB of RAM

My old system had not only a 200GB SATA disk but also only 2x512MB of RAM (and everything was OK under Hardy). After inserting 2x2GB of RAM, the nvidia driver did not function. A BIOS update solved the problem.

Also: my new RAM and the new disk seem to be OK, as Windows XP (SP2) is stable and even suspend-resume works flawlessly!

So only the RAM and disk have changed. Some people suggest using the "noapic acpi=off" kernel options to solve very similar problems. Is there a bug or compatibility issue with the SD_MOD driver?

exactt (giesbert) wrote on 2008-06-18:

#23

just wanted to add that i also have 2x2GB and a 500GB HD connected via SATA to a SB600 south-bridge.

arturj (arturj-freenet) wrote on 2008-06-18:

#24

Hello again,

in some other forums I found this:

"> iommu problem? Try it with mem=3G.

YES! :-) How did you know?

Even mem=4G did the trick."

Problem: even with the "mem=4GB" option, the kernel only sees 3GB of usable RAM. The IOMMU seems to cause problems.

dmesg W/O mem= option:
[17180642.907074] Checking aperture...
[17180642.907074] Node 0: aperture @ 4000000 size 32 MB
[17180642.907074] Aperture pointing to e820 RAM. Ignoring.
[17180642.907074] No AGP bridge found
[17180642.907074] Your BIOS doesn't leave a aperture memory hole
[17180642.907074] Please enable the IOMMU option in the BIOS setup
[17180642.907074] This costs you 64 MB of RAM
[17180642.907074] Mapping aperture over 65536 KB of RAM @ 4000000
[17180642.907074] PM: Registered nosave memory: 0000000004000000 - 0000000008000000
[17180642.907074] Memory: 3760952k/4980736k available (2489k kernel code, 169668k reserved, 1371k data, 344k init)
.....
[17180643.205903] PCI-DMA: Disabling AGP.
[17180643.205903] PCI-DMA: aperture base @ 4000000 size 65536 KB
[17180643.205903] PCI-DMA: using GART IOMMU.
[17180643.205903] PCI-DMA: Reserving 64MB of IOMMU area in the AGP aperture

dmesg WITH mem=4GB option:

[17180642.907074] Checking aperture...
[17180642.907074] Node 0: aperture @ 4000000 size 32 MB
[17180642.907074] Aperture pointing to e820 RAM. Ignoring.
[17180642.907074] No AGP bridge found
[17180642.907074] Memory: 3052376k/3144576k available (2489k kernel code, 91812k reserved, 1371k data, 344k init)
*** and thats all regarding MMU and AGP ***

Oliver (lobohacks) wrote on 2008-06-18:

#25

Hello,
I can confirm this "fix".
Adding mem=3G to boot options "fixes" the bug.
Now I want my 1GB of missing Ram back!! ;)

Regards oliver

arturj (arturj-freenet) wrote on 2008-06-18:

#26

Hello,

reading in different forums, I get more and more the idea that the IOMMU implementation in the GART driver is to blame.

To get back all your memory you can try the kernel option "iommu=soft", which forces a software-emulated IOMMU to take over instead of the AMD64 hardware IOMMU. Several people reported that this also fixes the problem without limiting the RAM. Btw., this is the default if you are using an Intel IA64 because this CPU has no hardware IOMMU.
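Whether a running kernel was actually booted with the option can be read from /proc/cmdline. A minimal sketch, where the function name is made up for illustration; it only matches the option as a whole word, so a combined form such as a comma-separated iommu= list would need a looser match:

```shell
#!/bin/sh
# cmdline_has_option is a hypothetical helper, not part of any package.
cmdline_has_option() {
    # $1 = option to look for, $2 = kernel command line text
    # Pad both ends with spaces so that only whole words match.
    case " $2 " in
        *" $1 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Usage on a running system:
#   cmdline_has_option iommu=soft "$(cat /proc/cmdline)" && echo "iommu=soft is active"
```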

trogdor (boggsj) wrote on 2008-06-19:

#27

YES! arturj, THANK YOU.

I can confirm that the kernel option "iommu=soft" fixes this problem. I can now suspend/hibernate/resume just fine. All 4GB of my RAM seem to be available. I'm using 64-bit Hardy on an Everex XT5000T with 2 hard drives, 100GB and 250GB, no problem. Now that this is working again, Ubuntu seems complete :-)

Revision history for this message

trogdor (boggsj) wrote on 2008-06-19:

#28

Some more details:
Prior to enabling the "iommu=soft" kernel option, this is what dmesg shows in regards to IOMMU:

/var/log$ zgrep IOMMU dmesg.1.gz
[ 22.839979] Please enable the IOMMU option in the BIOS setup
[ 23.667358] PCI-DMA: using GART IOMMU.
[ 23.667365] PCI-DMA: Reserving 64MB of IOMMU area in the AGP aperture

After enabling the kernel option:
/var/log$ grep IOMMU dmesg

No more IOMMU messages.
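The same before/after check can be wrapped in a small test on the log text. This is only a sketch: the function name is invented, and the string it looks for is the "PCI-DMA: using GART IOMMU." line quoted above.

```shell
#!/bin/sh
# uses_gart_iommu is a hypothetical helper name.
uses_gart_iommu() {
    # Reads kernel log text on stdin; succeeds if the GART IOMMU line is present.
    grep -q 'PCI-DMA: using GART IOMMU'
}

# Usage:
#   dmesg | uses_gart_iommu && echo "hardware GART IOMMU in use"
#   zcat /var/log/dmesg.1.gz | uses_gart_iommu   # rotated log, as in the zgrep above
```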

ichudov (igor-chudov) wrote on 2008-06-19:

#29

I am the original submitter of this bug. For months, I avoided hibernating on this laptop.

After seeing messages about IOMMU=soft, I verified that I also have messages about IOMMU in my dmesg.

I then added iommu=soft to kopt setting in /boot/grub/menu.lst and reran update-grub.

After this I rebooted and verified that there are no more messages about IOMMU in dmesg.

Then I hibernated.

THAT WORKED!

Then I suspended to RAM. That did NOT work; however, my disk was not corrupted. If I can hibernate, but not suspend, I will be happy.
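The menu.lst edit described in this comment can be sketched as below. It assumes the GRUB legacy layout used by Hardy, where the default kernel options live on a line starting with "# kopt=" (the leading "#" is intentional; update-grub reads that line). The function name is made up for illustration.

```shell
#!/bin/sh
# add_kopt is a hypothetical helper name.
add_kopt() {
    # $1 = kernel option to append, $2 = path to menu.lst
    # Do nothing if the option is already on the kopt line.
    if grep -q "^# kopt=.*$1" "$2"; then
        return 0
    fi
    # Append the option to the "# kopt=" line; the original file is kept
    # next to it with a .bak suffix.
    sed -i.bak "s|^# kopt=\(.*\)\$|# kopt=\1 $1|" "$2"
}

# Usage (as root), followed by regenerating the boot entries:
#   add_kopt iommu=soft /boot/grub/menu.lst
#   update-grub
```

After the next reboot, /proc/cmdline should list iommu=soft.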

Fernando Ipar (fipar) wrote on 2008-06-19:

#30

I can confirm that the 'iommu=soft' option fixes this. I'm running hardy on a Vostro 1000 with 4gb of RAM.

Thanks arturj!! You made my ubuntu usable again (I had to tune2fs -c 0 my filesystems due to the frequent boots so hibernate is really welcome).

Oliver (lobohacks) wrote on 2008-06-19:

#31

I can confirm that too: iommu=soft fixes this bug.

Paul Novotny (paul-novotny) wrote on 2008-06-19:

#32

Let me preface this by saying, I am no expert, I just did some googling, and I am most likely getting the info wrong.

BUT, from what I found, it looks like the kernel thinks that the IOMMU is set up wrong in the BIOS, so it makes some changes at startup. I think that is what the "using GART IOMMU" and "Reserving 64MB of IOMMU area in the AGP aperture" messages are saying. The problem comes in when we suspend. During resume this 'fixing' doesn't occur again for some reason, and there is a mismatch between the computer state before suspend and after resume. Thus, "RUINS FILESYSTEM".

I wonder if the "iommu=soft" is a case of throwing the baby out with the bath water? I don't know what the hardware IOMMU does exactly, but we probably want to keep it around since AMD took the time to implement on their chips.

Now, I THINK, this aperture is fixing things for AGP cards, which most of us probably don't have. So long story short, I looked through the iommu settings and there is "iommu=noagp". I have also seen recommendations of "iommu=noaperture" and "iommu=memaper=3" (or "iommu=memaper=1" or "iommu=memaper=2").

I can't go through losing my data again, so is there someone willing to try these out as alternatives to "iommu=soft"? I give no guarantees.

arturj (arturj-freenet) wrote on 2008-06-19:

#33

Hello,

besides iommu=soft, which I proposed and tested successfully, I had also already tested:

- iommu=noagp NO success
- iommu=noaperture SUCCESS (seems similar to =soft according to dmesg output)
- iommu=memaper=1,2,3 NO success

Still, I agree that this should be considered a "workaround", as AMD's hardware implementation of IOMMU should give some performance boost.

trogdor (boggsj) wrote on 2008-06-20:

#34

Interesting... with iommu=soft, suspend/hibernate/resume works, but it seems to confuse the graphics card driver. My laptop has a Nvidia GeForce Go 7600 (256MB), and before suspending, adjusting brightness with nvclock works just fine. After resuming, the display is at maximum brightness and nvclock doesn't work anymore.

$ nvclock -S 40
Error!
Smartdimmer is only supported on certain laptops using a Geforce 6200/7x00Go. If you want support on your laptop contact the author.

This is small potatoes, though. Getting suspend/resume working again is huge :-)

pawe (peterawe) wrote on 2008-06-24:

#35

Just want to add that the hibernate->ruined filesystem bug is also a problem on my

CPU AMD 64 X2 Dual Core Processor 5200+
MB M2A-VM HDMI
GPU ATI RADEON HD3870 (restricted drivers)

Ubuntu 8.04 (hardy) (64bit)
2.6.24-19-generic

everything up-to-date via update-manager as of 2008-06-24

Daniel Silberschmidt (dansilber) wrote on 2008-07-03:

#36

Same issue here, also with amd64 and nvidia-glx-new running hardy.

After hibernating, system locked and then grub error 17... my first partition became unmountable

Fortunately I was very lucky and was able to recover the partition with

e2fsck /dev/sda1

without even a file in lost+found....

Hope a fix for this will be soon available!

Leann Ogasawara (leannogasawara) wrote on 2008-08-28:

#37

The Ubuntu Kernel Team is planning to move to the 2.6.27 kernel for the upcoming Intrepid Ibex 8.10 release. As a result, the kernel team would appreciate it if you could please test this newer 2.6.27 Ubuntu kernel. There are two ways you should be able to test:

1) If you are comfortable installing packages on your own, the linux-image-2.6.27-* package is currently available for you to install and test.

--or--

2) The upcoming Alpha5 for Intrepid Ibex 8.10 will contain this newer 2.6.27 Ubuntu kernel. Alpha5 is set to be released Thursday Sept 4. Please watch http://www.ubuntu.com/testing for Alpha5 to be announced. You should then be able to test via a LiveCD.

Please let us know immediately if this newer 2.6.27 kernel resolves the bug reported here or if the issue remains. More importantly, please open a new bug report for each new bug/regression introduced by the 2.6.27 kernel and tag the bug report with 'linux-2.6.27'. Also, please specifically note if the issue does or does not appear in the 2.6.26 kernel. Thanks again, we really appreciate your help and feedback.

arturj (arturj-freenet) wrote on 2008-09-23:

#38

I booted from Intrepid Ibex 8.10 Alpha5 Live-CD and did a "dmesg". All "AGP" and "IOMMU" relevant messages seem to be the same as with 8.04. I have done no suspend testing, since it needs nvidia binary drivers installed on my system to resume.

grep "AGP" dmesg-intrepid
[ 0.004000] No AGP bridge found
[ 0.608433] PCI-DMA: Disabling AGP.
[ 0.608662] PCI-DMA: Reserving 64MB of IOMMU area in the AGP aperture

grep IOMMU dmesg-intrepid
[ 0.004000] Please enable the IOMMU option in the BIOS setup
[ 0.608655] PCI-DMA: using GART IOMMU.
[ 0.608662] PCI-DMA: Reserving 64MB of IOMMU area in the AGP aperture

There are some differences here, though. Maybe that's it:

grep aperture dmesg-intrepid
[ 0.004000] Checking aperture...
[ 0.004000] Node 0: aperture @ 0 size 32 MB
[ 0.004000] Your BIOS doesn't leave a aperture memory hole
[ 0.004000] Mapping aperture over 65536 KB of RAM @ 20000000
[ 0.608649] PCI-DMA: aperture base @ 20000000 size 65536 KB
[ 0.608662] PCI-DMA: Reserving 64MB of IOMMU area in the AGP aperture

while 8.04 tells me:

grep aperture dmesg-hardy
[ 20.581072] Checking aperture...
[ 20.581130] CPU 0: aperture @ 7cee000000 size 32 MB
[ 20.590440] Your BIOS doesn't leave a aperture memory hole
[ 20.617981] Mapping aperture over 65536 KB of RAM @ 4000000
[ 21.283264] PCI-DMA: aperture base @ 4000000 size 65536 KB
[ 21.283394] PCI-DMA: Reserving 64MB of IOMMU area in the AGP aperture

So - can someone tell if that already helps?

Alan Groupe (alan-groupe) wrote on 2008-10-10:

#39

I have the same issue as everyone else here. Motherboard is an Asus M2N-MX SE+ with an AMD Athlon64 X2 5000+, not overclocked.

I downloaded and installed the current 8.10 beta. When suspended, the system still hung on resume, but after hitting reset, I got the GRUB menu and booted just fine, rather than the dreaded Grub error 17. So the new kernel does seem to take care of the critical error, at least for me.

While I still can't resume from suspend (hibernate is fine), I'm much less concerned about being able to use suspend than I am about blowing away my system if I suspend accidentally, so I think I'll go with disabling suspend from /etc/default, but just wanted to respond to Leann.

Paul Novotny (paul-novotny) wrote on 2008-10-16:

#40

Just wondering if this bug is still there with the latest kernel in hardy (2.6.24-21). I noticed that in the changes for this new kernel it fixes Bug #257293 (https://bugs.launchpad.net/ubuntu/+source/linux/+bug/257293), which seems very similar to this one. Has anyone tested this? Without the iommu=soft fix?

arturj (arturj-freenet) wrote on 2008-11-20:

#41

I am now running a fresh installation of Ubuntu 8.10 (final release) and was brave enough (due to some hints on the internet stating that the bug should have been resolved) to test it. So I suspended the system and tried to resume without the iommu=soft option. I no longer get GRUB error 17. I did this a few times and my filesystem is still intact. Can someone else confirm this?

ichudov (igor-chudov) wrote on 2008-11-20: Re: [Bug 203537] Re: Hardy -- Suspend RUINS FILESYSTEM

#42

I confirm this, I suspend several times a day on Intrepid

Leann Ogasawara (leannogasawara) wrote on 2008-12-15:

#43

Thanks everyone for testing Intrepid. ichudov, since you are the original bug reporter and can confirm this is resolved for you with Intrepid I'm marking this fixed released for the Intrepid release and will approve the Hardy nomination. Thanks.

Changed in linux:
status:	Triaged → Fix Released
importance:	Undecided → High
status:	New → Triaged

Launchpad Janitor (janitor) wrote on 2008-12-23: Kernel team bugs

#44

Per a decision made by the Ubuntu Kernel Team, bugs will no longer be assigned to the ubuntu-kernel-team in Launchpad as part of the bug triage process. The ubuntu-kernel-team is being unassigned from this bug report. Refer to https://wiki.ubuntu.com/KernelTeamBugPolicies for more information. Thanks.

Jim Lieb (lieb) on 2009-01-09

Changed in linux:
assignee:	nobody → lieb
status:	Triaged → In Progress

Jim Lieb (lieb) wrote on 2009-02-24:

#45

There is a workaround (iommu=soft) and this problem has been fixed in later kernels. Due to the version
span between those kernels and hardy, backporting a patch would be too risky. Users who need the
hardware iommu for performance should upgrade to get a later kernel.