[Dell Studio XPS 1640] Sudden Read-Only Filesystems

Bug #1063354 reported by Lars Kumbier
This bug affects 82 people
Affects: linux (Ubuntu)
Status: Won't Fix
Importance: High
Assigned to: Unassigned
Milestone: (none)

Bug Description

After upgrading to Ubuntu 12.10, my filesystems (a root and a home partition, both ext4) suddenly lock up and are remounted read-only. /var/log/syslog shows the following entries:

Oct 7 20:00:42 StudioXPS signond[3510]: signondaemon.cpp 345 init Failed to SUID root. Secure storage will not be available.
Oct 7 20:02:12 StudioXPS kernel: [ 249.193555] ata1.00: exception Emask 0x0 SAct 0x7 SErr 0x0 action 0x0
Oct 7 20:02:12 StudioXPS kernel: [ 249.193561] ata1.00: irq_stat 0x40000001
Oct 7 20:02:12 StudioXPS kernel: [ 249.193565] ata1.00: failed command: READ FPDMA QUEUED
Oct 7 20:02:12 StudioXPS kernel: [ 249.193572] ata1.00: cmd 60/20:00:90:6f:53/00:00:1a:00:00/40 tag 0 ncq 16384 in
Oct 7 20:02:12 StudioXPS kernel: [ 249.193572] res 41/40:20:98:6f:53/00:00:1a:00:00/40 Emask 0x409 (media error) <F>
Oct 7 20:02:12 StudioXPS kernel: [ 249.193575] ata1.00: status: { DRDY ERR }
Oct 7 20:02:12 StudioXPS kernel: [ 249.193578] ata1.00: error: { UNC }
Oct 7 20:02:12 StudioXPS kernel: [ 249.193581] ata1.00: failed command: WRITE FPDMA QUEUED
Oct 7 20:02:12 StudioXPS kernel: [ 249.193587] ata1.00: cmd 61/18:08:18:fb:0e/00:00:2b:00:00/40 tag 1 ncq 12288 out
Oct 7 20:02:12 StudioXPS kernel: [ 249.193587] res 41/40:08:98:6f:53/00:00:1a:00:00/40 Emask 0x9 (media error)
Oct 7 20:02:12 StudioXPS kernel: [ 249.193590] ata1.00: status: { DRDY ERR }
Oct 7 20:02:12 StudioXPS kernel: [ 249.193593] ata1.00: error: { UNC }
Oct 7 20:02:12 StudioXPS kernel: [ 249.193596] ata1.00: failed command: WRITE FPDMA QUEUED
Oct 7 20:02:12 StudioXPS kernel: [ 249.193602] ata1.00: cmd 61/d8:10:a0:bd:8b/00:00:0d:00:00/40 tag 2 ncq 110592 out
Oct 7 20:02:12 StudioXPS kernel: [ 249.193602] res 41/40:08:98:6f:53/00:00:1a:00:00/40 Emask 0x9 (media error)
Oct 7 20:02:12 StudioXPS kernel: [ 249.193605] ata1.00: status: { DRDY ERR }
Oct 7 20:02:12 StudioXPS kernel: [ 249.193607] ata1.00: error: { UNC }
Oct 7 20:02:12 StudioXPS kernel: [ 249.196606] ata1.00: configured for UDMA/100
Oct 7 20:02:12 StudioXPS kernel: [ 249.196622] sd 0:0:0:0: >[sda] Unhandled sense code
Oct 7 20:02:12 StudioXPS kernel: [ 249.196624] sd 0:0:0:0: >[sda]
Oct 7 20:02:12 StudioXPS kernel: [ 249.196626] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Oct 7 20:02:12 StudioXPS kernel: [ 249.196628] sd 0:0:0:0: >[sda]
Oct 7 20:02:12 StudioXPS kernel: [ 249.196629] Sense Key : Medium Error [current] [descriptor]
Oct 7 20:02:12 StudioXPS kernel: [ 249.196633] Descriptor sense data with sense descriptors (in hex):
Oct 7 20:02:12 StudioXPS kernel: [ 249.196634] 72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
Oct 7 20:02:12 StudioXPS kernel: [ 249.196642] 1a 53 6f 98
Oct 7 20:02:12 StudioXPS kernel: [ 249.196645] sd 0:0:0:0: >[sda]
Oct 7 20:02:12 StudioXPS kernel: [ 249.196648] Add. Sense: Unrecovered read error - auto reallocate failed
Oct 7 20:02:12 StudioXPS kernel: [ 249.196650] sd 0:0:0:0: >[sda] CDB:
Oct 7 20:02:12 StudioXPS kernel: [ 249.196651] Read(10): 28 00 1a 53 6f 90 00 00 20 00
Oct 7 20:02:12 StudioXPS kernel: [ 249.196658] end_request: I/O error, dev sda, sector 441675672
Oct 7 20:02:12 StudioXPS kernel: [ 249.196674] sd 0:0:0:0: >[sda] Unhandled sense code
Oct 7 20:02:12 StudioXPS kernel: [ 249.196676] sd 0:0:0:0: >[sda]
Oct 7 20:02:12 StudioXPS kernel: [ 249.196678] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Oct 7 20:02:12 StudioXPS kernel: [ 249.196679] sd 0:0:0:0: >[sda]
Oct 7 20:02:12 StudioXPS kernel: [ 249.196681] Sense Key : Medium Error [current] [descriptor]
Oct 7 20:02:12 StudioXPS kernel: [ 249.196683] Descriptor sense data with sense descriptors (in hex):
Oct 7 20:02:12 StudioXPS kernel: [ 249.196684] 72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
Oct 7 20:02:12 StudioXPS kernel: [ 249.196692] 1a 53 6f 98
Oct 7 20:02:12 StudioXPS kernel: [ 249.196695] sd 0:0:0:0: >[sda]
Oct 7 20:02:12 StudioXPS kernel: [ 249.196697] Add. Sense: Unrecovered read error - auto reallocate failed
Oct 7 20:02:12 StudioXPS kernel: [ 249.196699] sd 0:0:0:0: >[sda] CDB:
Oct 7 20:02:12 StudioXPS kernel: [ 249.196700] Write(10): 2a 00 2b 0e fb 18 00 00 18 00
Oct 7 20:02:12 StudioXPS kernel: [ 249.196706] end_request: I/O error, dev sda, sector 722402072
Oct 7 20:02:12 StudioXPS kernel: [ 249.196710] Buffer I/O error on device sda6, logical block 82899555
Oct 7 20:02:12 StudioXPS kernel: [ 249.196718] Buffer I/O error on device sda6, logical block 82899556
Oct 7 20:02:12 StudioXPS kernel: [ 249.196722] Buffer I/O error on device sda6, logical block 82899557
Oct 7 20:02:12 StudioXPS kernel: [ 249.196725] EXT4-fs warning (device sda6): ext4_end_bio:250: I/O error writing to inode 20709582 (offset 0 size 12288 starting block 90300262)
Oct 7 20:02:12 StudioXPS kernel: [ 249.196726] JBD2: Detected IO errors while flushing file data on sda6-8
Oct 7 20:02:12 StudioXPS kernel: [ 249.196737] sd 0:0:0:0: >[sda] Unhandled sense code
Oct 7 20:02:12 StudioXPS kernel: [ 249.196739] sd 0:0:0:0: >[sda]
Oct 7 20:02:12 StudioXPS kernel: [ 249.196740] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Oct 7 20:02:12 StudioXPS kernel: [ 249.196742] sd 0:0:0:0: >[sda]
Oct 7 20:02:12 StudioXPS kernel: [ 249.196743] Sense Key : Medium Error [current] [descriptor]
Oct 7 20:02:12 StudioXPS kernel: [ 249.196745] Descriptor sense data with sense descriptors (in hex):
Oct 7 20:02:12 StudioXPS kernel: [ 249.196746] 72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
Oct 7 20:02:12 StudioXPS kernel: [ 249.196754] 1a 53 6f 98
Oct 7 20:02:12 StudioXPS kernel: [ 249.196758] sd 0:0:0:0: >[sda]
Oct 7 20:02:12 StudioXPS kernel: [ 249.196759] Add. Sense: Unrecovered read error - auto reallocate failed
Oct 7 20:02:12 StudioXPS kernel: [ 249.196761] sd 0:0:0:0: >[sda] CDB:
Oct 7 20:02:12 StudioXPS kernel: [ 249.196762] Write(10): 2a 00 0d 8b bd a0 00 00 d8 00
Oct 7 20:02:12 StudioXPS kernel: [ 249.196768] end_request: I/O error, dev sda, sector 227261856
Oct 7 20:02:12 StudioXPS kernel: [ 249.196781] ata1: EH complete
Oct 7 20:02:12 StudioXPS kernel: [ 249.196810] Aborting journal on device sda6-8.
Oct 7 20:02:12 StudioXPS kernel: [ 249.197216] EXT4-fs error (device sda6): ext4_journal_start_sb:370: Detected aborted journal
Oct 7 20:02:12 StudioXPS kernel: [ 249.197219] EXT4-fs (sda6): Remounting filesystem read-only
Oct 7 20:02:13 StudioXPS kernel: [ 250.934678] ecryptfs_encrypt_page: Error attempting to write lower page; rc = [-30]
Oct 7 20:02:13 StudioXPS kernel: [ 250.934691] ecryptfs_write_end: Error encrypting page (upper index [0x0000000000000078])
Oct 7 20:02:13 StudioXPS kernel: [ 250.938886] ecryptfs_encrypt_page: Error attempting to write lower page; rc = [-30]
Oct 7 20:02:13 StudioXPS kernel: [ 250.938896] ecryptfs_write_end: Error encrypting page (upper index [0x0000000000000050])
Oct 7 20:02:13 StudioXPS kernel: [ 250.939062] ecryptfs_encrypt_page: Error attempting to write lower page; rc = [-30]
Oct 7 20:02:13 StudioXPS kernel: [ 250.939068] ecryptfs_writepage: Error encrypting page (upper index [0x0000000000000000])
Oct 7 20:02:21 StudioXPS kernel: [ 259.082126] ecryptfs_encrypt_page: Error attempting to write lower page; rc = [-30]
Oct 7 20:02:21 StudioXPS kernel: [ 259.082138] ecryptfs_write_end: Error encrypting page (upper index [0x0000000000000005])
Oct 7 20:02:21 StudioXPS kernel: [ 259.082257] ecryptfs_encrypt_page: Error attempting to write lower page; rc = [-30]
Oct 7 20:02:21 StudioXPS kernel: [ 259.082262] ecryptfs_write_end: Error encrypting page (upper index [0x0000000000000003])
Oct 7 20:02:21 StudioXPS kernel: [ 259.082376] ecryptfs_encrypt_page: Error attempting to write lower page; rc = [-30]
Oct 7 20:02:21 StudioXPS kernel: [ 259.082381] ecryptfs_write_end: Error encrypting page (upper index [0x0000000000000000])
Oct 7 20:05:16 StudioXPS kernel: [ 433.841434] ecryptfs_encrypt_page: Error attempting to write lower page; rc = [-30]
Oct 7 20:05:16 StudioXPS kernel: [ 433.841448] ecryptfs_write_end: Error encrypting page (upper index [0x00000000000000c9])
Oct 7 20:07:57 StudioXPS sudo: pam_ecryptfs: pam_sm_authenticate: /home/lars is already mounted

The hard drive is one month old and has no defects (AFAIK). The problem appears anywhere between immediately after boot and 3 hours into working. Remounting with mount -o remount,rw is not possible; the command aborts with an error. Since I will almost certainly lose data while working, this renders my system unusable for the moment. The problem did not occur when running 12.04.
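
The `cmd` and `res` lines in the log above pack the logical block address (LBA) into a colon-separated register dump. The following is a minimal sketch of how those fields can be decoded, assuming the usual libata layout `command/feature:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device`. The function name `taskfile_lba` is hypothetical and not part of any tool mentioned in this report.

```python
def taskfile_lba(taskfile: str) -> int:
    """Extract the 48-bit LBA from a libata taskfile dump.

    Assumed layout (an assumption stated in the lead-in):
    command/feature:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device
    """
    _command, low, high, _device = taskfile.split("/")
    _feature, _nsect, lbal, lbam, lbah = (int(b, 16) for b in low.split(":"))
    _hob_feature, _hob_nsect, hob_lbal, hob_lbam, hob_lbah = (
        int(b, 16) for b in high.split(":")
    )
    # Low three bytes come from the first group, high three from the "hob" group.
    return (
        hob_lbah << 40
        | hob_lbam << 32
        | hob_lbal << 24
        | lbah << 16
        | lbam << 8
        | lbal
    )


# Taskfile strings copied from the syslog excerpt above.
READ_CMD = "60/20:00:90:6f:53/00:00:1a:00:00/40"
READ_RES = "41/40:20:98:6f:53/00:00:1a:00:00/40"
WRITE_CMD = "61/18:08:18:fb:0e/00:00:2b:00:00/40"
```

Under that assumption, `taskfile_lba(READ_RES)` yields 441675672, which matches the sector in the first `end_request: I/O error` line, and `taskfile_lba(WRITE_CMD)` yields 722402072, which matches the second. All three `res` lines carry the same LBA bytes.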

ProblemType: Bug
DistroRelease: Ubuntu 12.10
Package: linux-image-3.5.0-17-generic 3.5.0-17.27
ProcVersionSignature: Ubuntu 3.5.0-17.27-generic 3.5.5
Uname: Linux 3.5.0-17-generic x86_64
ApportVersion: 2.6.1-0ubuntu1
Architecture: amd64
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC1: lars 2341 F.... pulseaudio
 /dev/snd/controlC0: lars 2341 F.... pulseaudio
Date: Sun Oct 7 20:00:11 2012
EcryptfsInUse: Yes
InstallationMedia: Ubuntu 12.10 "Quantal Quetzal" - Beta amd64 (20120926)
MachineType: Dell Inc. Studio XPS 1640
ProcFB: 0 radeondrmfb
ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-3.5.0-17-generic root=UUID=68856248-4726-45a0-84b2-670a468cce31 ro quiet splash
RelatedPackageVersions:
 linux-restricted-modules-3.5.0-17-generic N/A
 linux-backports-modules-3.5.0-17-generic N/A
 linux-firmware 1.94
RfKill:
 0: phy0: Wireless LAN
  Soft blocked: no
  Hard blocked: yes
SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 11/19/2009
dmi.bios.vendor: Dell Inc.
dmi.bios.version: A12
dmi.board.name: 0W497D
dmi.board.vendor: Dell Inc.
dmi.board.version: A12
dmi.chassis.type: 8
dmi.chassis.vendor: Dell Inc.
dmi.chassis.version: A12
dmi.modalias: dmi:bvnDellInc.:bvrA12:bd11/19/2009:svnDellInc.:pnStudioXPS1640:pvrA123:rvnDellInc.:rn0W497D:rvrA12:cvnDellInc.:ct8:cvrA12:
dmi.product.name: Studio XPS 1640
dmi.product.version: A123
dmi.sys.vendor: Dell Inc.

Brad Figg (brad-figg)
Changed in linux (Ubuntu):
status: New → Confirmed
Joseph Salisbury (jsalisbury) wrote :

Can you boot back into the 3.2 Precise kernel and see if the issue goes away?

Changed in linux (Ubuntu):
importance: Undecided → Medium
status: Confirmed → Incomplete
Lars Kumbier (derlars) wrote :

Hi Joseph,

I reinstalled 12.04 in a separate partition and have been working with it for the past 4 hours with hard-drive-intensive applications (which usually triggered the error a lot faster). So far, I cannot reproduce the problem on 12.04. I will continue to work with 12.04 tomorrow for a few hours and leave the computer with some torrents to work on.

Best Regards,
Lars

Lars Kumbier (derlars) wrote :

Hi Joseph,

The box has been processing torrents and my usual work without any occurrence of the problem, so I suspect the problem is a regression from 3.2 to 3.5. My kernel is currently 3.2.0-31.

What I tried so far on 12.10:
I tried setting the ide_generic option at boot time as described on the hard drive debugging page[1]; my root and/or home partition was still set to read-only.

Also, the output of mount still shows the partitions as mounted read-write, although they have been set to read-only. Both partitions are formatted as ext4 and I have to run fsck on pretty much every boot.

Best Regards,
Lars

[1] https://wiki.ubuntu.com/DebuggingIDEIssues
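
A mismatch between the mount output and the real state can be cross-checked against the kernel's own view in /proc/mounts. The sketch below parses text in the /proc/mounts format and lists mount points whose options contain `ro`. One possible explanation for the mismatch, which is an assumption and not confirmed anywhere in this thread, is that mount on these releases reports from /etc/mtab and not from the kernel. The function name `read_only_mounts` is hypothetical.

```python
from typing import List


def read_only_mounts(mounts_text: str) -> List[str]:
    """Return mount points that are mounted read-only.

    Expects text in the /proc/mounts format:
    device mount_point fs_type options dump pass
    """
    result = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip malformed or empty lines
        mount_point, options = fields[1], fields[3]
        # Compare whole options so "errors=remount-ro" is not mistaken for "ro".
        if "ro" in options.split(","):
            result.append(mount_point)
    return result


# Usage on a live system (not run here):
#     with open("/proc/mounts") as f:
#         print(read_only_mounts(f.read()))
```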

Joseph Salisbury (jsalisbury) wrote :

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v3.6 kernel[0] (Not a kernel in the daily directory) and install both the linux-image and linux-image-extra .deb packages.

Once you've tested the upstream kernel, please remove the 'needs-upstream-testing' tag. Please only remove that one tag and leave the other tags. This can be done by clicking on the yellow pencil icon next to the tag located at the bottom of the bug description and deleting the 'needs-upstream-testing' text.

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

If you are unable to test the mainline kernel, for example it will not boot, please add the tag: 'kernel-unable-to-test-upstream'.
Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.6-quantal/

tags: added: needs-upstream-testing
Lars Kumbier (derlars) wrote :

Hi Joseph,

The bug is confirmed on the Ubuntu 12.04 install with the latest 3.6.1-030601-generic kernel, as requested. Since this bug leads to sudden data loss, shouldn't it get a higher importance than Medium?

Best Regards,
Lars

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
tags: added: kernel-bug-exists-upstream
removed: needs-upstream-testing
Joseph Salisbury (jsalisbury) wrote :

So this bug did eventually happen with 12.04?

Changed in linux (Ubuntu):
importance: Medium → High
Lars Kumbier (derlars) wrote :

Hi Joseph,

Yes, I've installed the 3.6.1 kernel on the 12.04 installation. I've set up four partitions for the test: shared /boot, shared /home, and two separate root partitions, one for 12.04 and one for 12.10.

The bug does not happen on the 12.04 root with the default 3.2.0 kernel, but does with 3.6.1, so we can rule out an error in any of the ext4 tool packages, as they stayed the same. So it's a regression introduced somewhere after 3.2 and no later than 3.5.0. If possible, I would rather not try all versions in between. Is there any way around this?

Best Regards,
Lars

Joseph Salisbury (jsalisbury) wrote :

I should have provided more details to my question. Did this bug happen on the v3.2 kernel on 12.04? If not, I'd like to perform a kernel bisect to identify the commit that introduced this regression. It would be very helpful to know the earliest kernel where the issue started happening as well as the latest kernel that did not have this issue.

Can you test the following kernels and report back? We are looking for the first kernel version that has this bug:

v3.3 final: http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.3-precise/
v3.4 final: http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.4-quantal/
v3.5-rc4: http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.5-rc4-quantal/

You don't have to test every kernel, just up until the kernel that first has this bug.

Thanks in advance!
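
The narrowing step requested here is a binary search over an ordered list of kernels, and a commit-level bisect applies the same idea to individual commits. Below is a minimal sketch of that search, assuming the oldest version in the list is known good and the newest is known bad. The function name `first_bad` and the `is_bad` callback are hypothetical; in practice the callback means "boot this kernel and watch for the read-only remount".

```python
from typing import Callable, Sequence


def first_bad(versions: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Binary-search for the first version that shows the bug.

    Assumes versions are ordered oldest to newest, versions[0] is known
    good, and versions[-1] is known bad.
    """
    lo, hi = 0, len(versions) - 1  # lo: last known good, hi: first known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(versions[mid]):
            hi = mid
        else:
            lo = mid
    return versions[hi]


# Versions named in this thread, oldest to newest.
KERNELS = ["v3.2", "v3.3", "v3.4", "v3.5-rc4", "v3.6.1"]

# Usage (not run here): answer "y" if the kernel shows the bug.
#     first_bad(KERNELS, lambda v: input(f"Is {v} bad? [y/n] ") == "y")
```

Each answer halves the remaining range, so five candidate kernels need at most two test boots beyond the two known endpoints.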

tags: added: performing-bisect
mvidberg (marko-j) wrote :

I updated from 12.04 to 12.10 (AMD 64-bit) last night and have been having this exact same problem. I have a fairly new WD 750 GB Black HD with / and /home partitions. Throughout the day, either / or /home has suddenly gone into read-only mode, and all I can do is reboot to get it back to normal.

Sushi (sushi-addiction13) wrote :

I've had a problem since 12.04 (amd64) with the file system suddenly and randomly going read-only. I'm not sure it's exactly the same issue, but it probably is. I've been using GNOME for the past week instead of Unity because there were several nasty bugs in Unity that weren't fixed for over a year. If I get the same error under GNOME, I'll try to report it.

Lars Kumbier (derlars) wrote :

@marko-j, @sushi-addiction13: Could you also check the mainline kernels, as @jsalisbury suggested? I am currently unable to test the systems due to time constraints.

edward (deltorodata) wrote :

I ran into the same problem yesterday and couldn't do anything. I think I'm going to reinstall Kubuntu 12.04 today. All I can say is that I upgraded my PC from 12.04 to 12.10, then a window popped up telling me something like "your system is read-only" and I couldn't do anything. I reinstalled 12.10 from scratch. It worked fine until I installed some programs that I need, and then the problem came back.
I tried everything, but I always got the same "filesystem is ro" message. If someone knows how to repair this, or how to update from a chroot system, please let us know! Thanks!
Have a nice day!

mvidberg (marko-j) wrote :

I have been using 12.10 BUT booting with the 3.2.0-32 kernel, and I have not had the problem at all anymore. I just tried booting into 3.5.0-17 again and the problem happened within minutes. If I get a chance I will try installing some other kernels to test with.

Sushi (sushi-addiction13) wrote :

@Lars Scheithauer Sorry, can't test, wish I could help more. Mentally challenged after some vaccines 5 years ago... where am I :D
I can tell you that so far I have not experienced this problem using GNOME 3.6.
My computer runs 16 hours a day doing a lot of things. The problem happened about once a week on average under Unity in 12.04, but happens almost every day in 12.10.

edward (deltorodata) wrote :

I've reinstalled 12.04. I'm going to wait a few months.

Ryan Budney (delooper) wrote :

I'm having the same problem, on a Lenovo W530 running the 64-bit kernel. Is there anything I could provide, or do you have all the information you need? In a week of running 12.10 on my laptop 24-hours a day, this has happened twice.

Ryan Budney (delooper) wrote :

Weird. Just after I said it had only happened twice in a week, it happened a third time, minutes after rebooting from the second occurrence.

mvidberg (marko-j) wrote :

On the weekend I installed and tried booting with the kernels mentioned in a post above: v3.3 final, v3.4 final, and v3.5-rc4. I couldn't reproduce the problem with any of these. I am currently running v3.5-rc4 and will stick with that until this issue is resolved. Once again, I am running Ubuntu AMD64 with a 750 GB Western Digital (Black edition) drive which is partitioned into "/" and "/home" (both are ext4).

mvidberg (marko-j) wrote :

For other people experiencing this problem, I suggest installing smartmontools and doing some tests on your HD to rule out actual hardware problems. Also, try booting with an older kernel from the grub boot menu and see if your issue goes away (because if not, you may actually have failing hardware).
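
The SMART checks suggested above can be scripted around `smartctl` from smartmontools. This is a rough sketch, assuming `smartctl` is installed, that the drive is an ATA device at `/dev/sda`, and that the script runs with root privileges. The helper names are hypothetical; the flags used (`-H`, `-t short`, `-t long`, `-l selftest`, `-l error`) are standard `smartctl` options.

```python
import subprocess
from typing import List, Optional

HEALTH_MARKER = "SMART overall-health self-assessment test result:"


def smartctl_command(device: str, action: str) -> List[str]:
    """Build the smartctl argument list for a named action."""
    actions = {
        "health": ["-H"],                    # overall health verdict
        "short-test": ["-t", "short"],       # start a short self-test
        "long-test": ["-t", "long"],         # start an extended self-test
        "selftest-log": ["-l", "selftest"],  # results of past self-tests
        "error-log": ["-l", "error"],        # errors recorded by the drive
    }
    return ["smartctl", *actions[action], device]


def parse_health(output: str) -> Optional[str]:
    """Return the verdict (e.g. PASSED) from `smartctl -H` output, if present."""
    for line in output.splitlines():
        if line.startswith(HEALTH_MARKER):
            return line[len(HEALTH_MARKER):].strip()
    return None


def check_health(device: str = "/dev/sda") -> Optional[str]:
    """Run `smartctl -H` on the device; the device path is an assumption."""
    # smartctl encodes status bits in its exit code, so a non-zero exit
    # is not treated as a failure of the call itself.
    result = subprocess.run(
        smartctl_command(device, "health"), capture_output=True, text=True
    )
    return parse_health(result.stdout)
```

A passing health verdict alone does not rule out a failing drive, so the self-test and error logs are worth reading as well.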

Ryan Budney (delooper) wrote :

I've installed smartmontools and have run two short tests and am running a long test now. The SMART log knows of the recent problems but the tests are not discovering any problems. The problems recorded in the SMART log are "READ DMA" commands. Is there anything in particular I should be looking for?

Ryan Budney (delooper) wrote :

When I run the SMART tests directly from the system BIOS (without loading Ubuntu), the drive appears to fail all of them. It appears to be a hardware problem. Strange that it only started acting up now.

mvidberg (marko-j) wrote :

@jsalisbury are there any other kernels you would like to have tested? I have been running a number of days now on v3.5-rc4 without this problem coming up, but I can test with some others if needed.

Lars Kumbier (derlars) wrote :

@marko-j The idea behind the kernel testing is to find the first version that introduced the bug. If 3.5-rc4 works for you, could you please retest the latest kernel? Maybe the problem wasn't in the kernel but in a library after all, or it was introduced somewhere after 3.5-rc4.

mvidberg (marko-j) wrote :

@derlars I did try the latest kernel again and got the problem to happen once on boot already and another time after about 2.5 hours. I will try out some of the kernels between rc4 and latest.

mvidberg (marko-j) wrote :

I tried 3.5-rc5 and got the problem fairly quickly. Went back to 3.5-rc4 and noticed I was getting ext4 buffer write sda errors in syslog for that as well... so it might be that 3.5-rc4 doesn't actually work for me either, although I haven't had rc4 go into read-only mode at all yet. Since yesterday I've gone to using 3.4. Haven't had any sda errors at all in syslog anymore. I should have been paying attention to my syslog previously already instead of just waiting for the fs to go read-only.

Unfortunately, I can't test any more of these kernels, as more and more of my files are slowly getting corrupted, and this is my home office work computer, so I can't allow that. I will stay on 3.4 now. Maybe someone else can test 3.5-rc1 through 3.5-rc3.
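
Watching the syslog for early warning signs, as described above, can be automated. The sketch below counts the kernel messages that appear in the excerpts posted in this thread, grouped by message type and device. The pattern names and the function `count_disk_errors` are hypothetical, and the patterns only cover the message wording seen in this thread.

```python
import re
from collections import Counter
from typing import Iterable

# Message formats copied from the syslog excerpts in this thread.
PATTERNS = {
    "buffer_io_error": re.compile(r"Buffer I/O error on device (\w+)"),
    "io_error": re.compile(r"end_request: I/O error, dev (\w+)"),
    "journal_abort": re.compile(r"Aborting journal on device ([\w-]+)"),
    "remount_ro": re.compile(r"EXT4-fs \((\w+)\): Remounting filesystem read-only"),
}


def count_disk_errors(lines: Iterable[str]) -> Counter:
    """Count matching kernel messages, keyed by (message type, device)."""
    counts: Counter = Counter()
    for line in lines:
        for name, pattern in PATTERNS.items():
            match = pattern.search(line)
            if match:
                counts[(name, match.group(1))] += 1
    return counts


# Usage on a live system (not run here):
#     with open("/var/log/syslog", errors="replace") as f:
#         for key, n in count_disk_errors(f).most_common():
#             print(key, n)
```

A non-zero count for `buffer_io_error` or `io_error` is a reason to stop testing and back up, even if no filesystem has gone read-only yet.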

mvidberg (marko-j) wrote :

I just recently found out about the problems with ext4 corruption (as mentioned in http://lwn.net/Articles/521022/ ) and am now wondering if this is all related to that? Except that that bug is only supposed to happen in rare cases, nothing like what I've been seeing. Also don't understand why more people aren't having this issue... is it possible that it is just a failing HD for me as well? I will stay on 3.4 for now and report back here if I see anything suspicious in my syslog at all.

mvidberg (marko-j) wrote :

Noticed that kernel 3.5.0-18 was included in my updates today so I just tried booting with it. Within minutes I started getting errors in my /var/log/syslog as follows:

Nov 6 17:04:31 Woogie kernel: [ 1676.289748] Buffer I/O error on device sda3, logical block 96221046
Nov 6 17:04:31 Woogie kernel: [ 1676.289749] Buffer I/O error on device sda3, logical block 96221047
Nov 6 17:04:31 Woogie kernel: [ 1676.289750] Buffer I/O error on device sda3, logical block 96221048
Nov 6 17:04:31 Woogie kernel: [ 1676.289751] Buffer I/O error on device sda3, logical block 96221049
Nov 6 17:04:31 Woogie kernel: [ 1676.289752] Buffer I/O error on device sda3, logical block 96221050
Nov 6 17:04:31 Woogie kernel: [ 1676.289753] Buffer I/O error on device sda3, logical block 96221051
Nov 6 17:04:31 Woogie kernel: [ 1676.289754] Buffer I/O error on device sda3, logical block 96221052
Nov 6 17:04:31 Woogie kernel: [ 1676.289756] Buffer I/O error on device sda3, logical block 96221053
Nov 6 17:04:31 Woogie kernel: [ 1676.289757] Buffer I/O error on device sda3, logical block 96221054
Nov 6 17:04:31 Woogie kernel: [ 1676.289758] Buffer I/O error on device sda3, logical block 96221055
Nov 6 17:04:31 Woogie kernel: [ 1676.289759] EXT4-fs warning (device sda3): ext4_end_bio:250: I/O error writing to inode 15731887 (offset 342884352 size 524288 starting block 129425402)
Nov 6 17:04:31 Woogie kernel: [ 1676.289762] sd 0:0:0:0: [sda]
Nov 6 17:04:31 Woogie kernel: [ 1676.289763] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Nov 6 17:04:31 Woogie kernel: [ 1676.289764] sd 0:0:0:0: [sda]
Nov 6 17:04:31 Woogie kernel: [ 1676.289765] Sense Key : Aborted Command [current] [descriptor]
Nov 6 17:04:31 Woogie kernel: [ 1676.289766] Descriptor sense data with sense descriptors (in hex):
Nov 6 17:04:31 Woogie kernel: [ 1676.289767] 72 0b 00 00 00 00 00 0c 00 0a 80 00 00 00 00 00
Nov 6 17:04:31 Woogie kernel: [ 1676.289771] 3d b7 05 d7
Nov 6 17:04:31 Woogie kernel: [ 1676.289773] sd 0:0:0:0: [sda]
Nov 6 17:04:31 Woogie kernel: [ 1676.289773] Add. Sense: No additional sense information
Nov 6 17:04:31 Woogie kernel: [ 1676.289775] sd 0:0:0:0: [sda] CDB:
Nov 6 17:04:31 Woogie kernel: [ 1676.289775] Write(10): 2a 00 3d b7 05 d7 00 00 20 00
Nov 6 17:04:31 Woogie kernel: [ 1676.289779] end_request: I/O error, dev sda, sector 1035404759
Nov 6 17:04:31 Woogie kernel: [ 1676.289789] ata1: EH complete
Nov 6 17:04:31 Woogie kernel: [ 1676.289832] EXT4-fs (sda3): delayed block allocation failed for inode 15731887 at logical offset 84032 with max blocks 4 with error -5
Nov 6 17:04:31 Woogie kernel: [ 1676.289842] EXT4-fs (sda3): This should not happen!! Data will be lost

So back to kernel 3.4.0 I go (no errors in my syslog for that).
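
The numeric codes in these logs (`error -5` here, `rc = [-30]` in the ecryptfs lines of the original report) are negated errno values, which is how the kernel conventionally returns errors. A small sketch for translating them with Python's standard library follows. The function name `describe_kernel_error` is hypothetical, and the mapping of numbers to names depends on the platform it runs on.

```python
import errno
import os


def describe_kernel_error(code: int) -> str:
    """Translate a (possibly negated) errno value into name and description."""
    number = abs(code)
    name = errno.errorcode.get(number, "UNKNOWN")
    return f"{name}: {os.strerror(number)}"


if __name__ == "__main__":
    for code in (-5, -30):
        print(code, describe_kernel_error(code))
```

On Linux, 5 maps to EIO (an I/O error from the device) and 30 maps to EROFS (a write attempted on a read-only filesystem). That fits the sequence in the logs: I/O errors come first, and the writes rejected with -30 follow once the filesystem has been remounted read-only.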

Sushi (sushi-addiction13) wrote :

Happened again. It seems to happen less using Cinnamon rather than Unity. I lost the syslog on reboot, along with 12 files (1 GB) from a single directory I was not using at the time. Usually only the Firefox cache/data and whatever app is open at the time show up as file errors on boot (fsck?), but now it's other files too.
... and yes, my HDD is fine.

When your OS starts deleting your files, I think it's a critical bug.

ferlantz (fagmj74) wrote :

I have the same trouble with kernel 3.5.0-18, 3.5.0-17 works OK.

Kay (ksthiele) wrote :

This bug is so annoying; it is definitely critical, since it really makes Ubuntu unusable.

LaunchpadUser (lpusr) wrote :

I just updated from 12.04 to 12.10 (64-bit) and a few days later, all hell broke loose: the computer doesn't boot correctly, there are lots of boot failures and disk checks on every boot, and only after several boots does the login screen appear. But the system is read-only. I have an Intel 80 GB SSD with / and /home partitions.

raggar (mbaart) wrote :

I also have this problem. I used 12.04 without problems for 6 months; in 12.10 the system ended up with a read-only filesystem and now no longer boots.

I have an Intel 120 GB SSD with 4 partitions. Ubuntu 12.10 (64-bit) is installed on only one partition, including /home. It also has the boot flag.

tags: removed: performing-bisect
Henrique Maia (henriquemaia) wrote :

Another one bites the dust. Losing data is not ok. This is happening in shorter intervals by now.

I have two different HDDs of the same kind, one in use and the other just for backups. When I first got these errors I immediately swapped drives, thinking I was about to witness a spectacular hardware failure. Now the second drive is having the same problems, at shorter intervals.

This is the worst bug I've experienced in all my years as a Linux user. Not good at all.

gunwald (gunwald) wrote :

This bug drives me crazy because it makes me lose data! I experienced this bug for the first time when I installed Linux 3.5 on Ubuntu 12.04. I could get rid of it by going back to Linux 3.2. Now I am using Ubuntu 12.10 and the problem occurs constantly, but only on computers with SSDs. Currently I am running Ubuntu on a Zenbook UX 21. Please fix this bug; it's the worst bug Ubuntu has ever had.

Josh Wines (joshwines) wrote :

Seems a lot of people running into this issue are all using SSDs, myself included. Anybody else in this thread having the same thing happen using an old spinner drive?

Sushi (sushi-addiction13) wrote :

I'm not using an SSD. It is possible that it happens more with SSDs. I've found the bug happens more often when using Unity rather than Cinnamon or GNOME 3.6. It's weird. I'm thinking of going back to Unity to see if I can get info on the bug and what might trigger it, but I don't want to lose more files.

At ferlantz's (fagmj74) suggestion, I've been using the 3.5.0-17 kernel to see if that avoids the bug. No problems for the past few days.

Branislav Holý (branoholy) wrote :

I also had this bug in Xubuntu 12.10 and I don't have SSD. Now I'm using Xubuntu 12.04 and have no problem.

Wallo013 (walloo13) wrote :

I have an SSD, but the issue is with my standard HD, which is mounted at /home.
I actually returned to Ubuntu 12.04, since I could not afford losing data and/or having my /home become read-only after only 2 minutes.
I really don't understand why such a major bug is not taken seriously, and how filesystem corruption and loss of data are not considered a critical issue.

information type: Public → Public Security
information type: Public Security → Public
Tim Wochomurka (tim-wochomurka) wrote : Re: [Bug 1063354] Re: Sudden Read-Only Filesystems

Same here. I'm on an ancient Athlon XP machine with a SATA HDD, so...
On Feb 1, 2013 3:25 AM, "Lars Scheithauer" <email address hidden>
wrote:

> No, it's not. Hdd here.
> On 01.02.2013 06:20, "Mark Mandel" <email address hidden> wrote:
>
> > I had much the same thing a while back, and a co-worker suggested I
> > update the firmware on my Crucial M4 SSD drive, and I've not had a
> > problem since.
> >
> > Worth a shot!

Revision history for this message
Philip Busby (philip-busby) wrote : Re: Sudden Read-Only Filesystems

I thought I would say that I too am having the same problem, although am not using SSD. I am using a HDD drive. I am getting the problem on my Ubunto 12.10 partition and on my Mint14 partition.

My problems are not happening every day, which makes it more difficult to be of any help. However, I did get the impression that the problem appears after using a terminal session and su. It could be a coincidence, but I thought I would share the information.

Intel® Core™2 Duo CPU T7100 @ 1.80GHz × 2
OS 32-bit
Not sure where to find the details for the kernel version.

PhilB
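
For anyone else unsure where to find the kernel version: it can be read from a terminal. A minimal sketch, assuming a stock Ubuntu or Mint install:

```shell
# Print the running kernel release (for example 3.5.0-25-generic).
uname -r

# Print the machine architecture, to tell a 32-bit kernel (i686)
# from a 64-bit one (x86_64).
uname -m
```

Posting both lines together with the distribution release makes it easier to compare affected systems.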

mvidberg (marko-j) wrote :

I have been following this bug from close to when it was first posted (because I had the same issue myself), and I am now convinced that it has become a gathering place for people with different hardware issues that all end up producing the same noticeable symptom: the hard drive going into read-only mode (perhaps because the newer kernels are pickier about filesystem errors, which is a good thing).

Everybody having issues should do the following things:

1. Make sure you are using good quality 6 Gbps rated SATA cabling.
2. Make sure you have the power cable going directly from the power supply to the back of the HD and also have the SATA cable going directly from the motherboard to back of the HD. (no in-between connectors or hot-swap bracketing)
3. Make sure your power supply is rated at a high enough wattage to handle every card, storage device, etc. that you have connected.
4. Run SMART tests on your affected HD. This means installing smartmontools within Ubuntu and also running tests via your BIOS interface (if your BIOS has that available). This will rule out actual problems with the HD itself.
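
Step 4 could look roughly like the sketch below. The device name /dev/sda and the helper smart_suspect_attrs are assumptions for illustration and not part of smartmontools. The three attribute names are the ones hard disks commonly report; SSDs may use different names.

```shell
# Hypothetical helper: read "smartctl -A" output on stdin and print the
# sector-health attributes whose raw value (10th column) is above zero.
smart_suspect_attrs() {
    awk '$2 ~ /^(Reallocated_Sector_Ct|Current_Pending_Sector|Offline_Uncorrectable)$/ && $10 + 0 > 0 { print $2, $10 }'
}

# Typical use on the affected machine (as root; /dev/sda is an assumption):
#   apt-get install smartmontools
#   smartctl -t short /dev/sda       # start a short self-test
#   smartctl -l selftest /dev/sda    # read the result a few minutes later
#   smartctl -A /dev/sda | smart_suspect_attrs
```

No output from the helper only means these three counters are zero; it does not prove the drive is healthy.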

mvidberg (marko-j) wrote :

Missed an extra step to take if you are using an SSD: make sure you have the most up-to-date firmware (as mentioned by Mark above in this thread).

Probably not a bad idea for everyone to check if their motherboard BIOS is up-to-date as well.

Emily Gonyer (emilyyrose) wrote :

I'm having the same problem. It started on my husband's Gateway laptop with a 650GB HDD and then, 3 days later, started on mine, while I was in the midst of attempting to determine whether his hard drive was failing or whether a simple re-install would work. The latter seems to have fixed his... and I've told him *not* to update his Linux kernel. Debating whether it's worth it to do the same to mine now (mine's a Lenovo IdeaPad w/ 500GB HDD, both just over a year old).

Fernando Luís Santos (flsantos) wrote :

Same here with an Nvidia Ion. Any luck out there?

Bill Golz (e-bill) wrote :

This problem was plaguing me lately as well on my primary ext4 partition. I first thought the HDD was failing, but I wasn't getting any SMART errors. Also, the HDD this is happening to is only 8 months old. Just to be safe, I ran a ddrescue dump, and it ran through fine with no errors. I'm running Ubuntu on some old hardware: an ASUS P5N-E SLI motherboard (nForce 650i chipset) with a Core 2 Duo CPU. As suggested, I rebooted and selected an older kernel in GRUB, and have been running fine for almost a week now. This leads me to believe the problem is with the kernel and not my hardware.

Ulf Gunnarson (ulf-gunnarson) wrote :

Just got a Dell 6430s computer. It worked fine for a day, then started freezing with the same errors as previous posters. It can freeze almost at once after login, or it works for hours.

(256 GB SSD in the computer; ran a complete system check twice and found no errors on the disk or other hardware.)

The computer is unusable at the moment :-( since I can't rely on it at work.

Mark Davidson (outdoorshappy) wrote :

This may be related to this bug or it may not, but I will add it to the growing data-stream to try to help.

My system (Ubuntu 12.04, then Mythbuntu) was showing exactly the same symptoms as this bug describes. It is a dual-boot system with Win8 and Ubuntu. Win8 was stable, Ubuntu was not. SMART diagnostics were fine at first (a hint of where we are going ;-) ).

After 6 weeks of chasing the problem with multiple crashes, freezes, etc., the SMART diagnostics started coming up with some bad/suspect sectors. These were scattered around the disk in both the Win8 and Ubuntu partitions. They continued to grow until the drive failed specification, i.e. a bad drive.

I realize that without detailed diagnostics saved all through the process this is total speculation, but perhaps one hypothesis/place to look is the handling of bad/failing sectors on the disk. Throughout the process, Win8 never lost stability or data, but parts of Ubuntu would periodically crash due to read errors when trying to read those nearly failed, intermittent sectors. I don't know enough (nearly nothing) about the back end of this, but that may be a difference here.

I replaced the drive 2 weeks ago and have had no further crashes or freezes (at least at this point).

TeamFahQ (teamfahq1) wrote :

Not sure if this will help anyone on this issue, but I just experienced it as well. For my own personal reasons, I have my Android build environment on a separate partition. I opened it and decided to do some work on a project, and suddenly all my partitions became read-only. I found that after a reboot I was forced to run fsck, but I also found that if I did not try to change anything in my Android partition, the error would not happen. I also noticed that prior to this error, when I tried to compile any of my Android projects, my computer would become very slow and seemingly overloaded. So... after a reboot, I ran fsck /dev/sda3 (my Android partition), and now I am free of read-only errors: I am currently compiling a ROM, listening to YouTube videos, typing out this reply and have SuperTuxKart (I love that game... lol) running in the background with no lag. So, my system is back up and running like it should be. Oh, and for your information:
Machine: Acer Aspire 5560
Kernel: Linux 3.5.0-25-generic (x86_64)
Distribution: Linux Mint 14 Nadia
Processor: 4x AMD A6-3400M APU with Radeon(tm) HD Graphics

I know Linux Mint is not Ubuntu, but it is still based on 12.10
I'm sure if this is happening on your main partition (/dev/sda1 for me), you would need to do this from a live cd.
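
Since fsck must not be run on a mounted filesystem, a small guard can help before repeating the step above. This is only a sketch: is_mounted is a hypothetical helper, /dev/sda3 is the partition from this comment, and device names in /proc/mounts may differ (for example /dev/mapper/... with LVM).

```shell
# Hypothetical helper: succeed if DEVICE appears as a mounted source in the
# mount table (default /proc/mounts; a second argument overrides the file).
is_mounted() {
    awk -v dev="$1" '$1 == dev { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

# Run the check only when the partition is not mounted:
#   is_mounted /dev/sda3 || sudo fsck -f /dev/sda3
```

For the root partition the guard will always report "mounted", which is why a live CD is needed there.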

Cesarfps (sodazoe) wrote :

Hello! I recently upgraded from Kubuntu 12.04 to 12.10 and started to experience this issue right after the upgrade; I never experienced it with Precise Pangolin. I don't need to perform a specific task to trigger the error: it has happened both while working and while just using the web browser, and there is no sign to predict when it is about to occur. As my computer is my day-to-day working tool, this issue is very frustrating.

System info:

CPU: Sempron single core 1.8 GHz, 64 bit.
RAM: 3GB
OS: Kubuntu 12.10 (upgraded from 12.04).
Kernel: Linux 3.5.0-25-generic (x86_64)
Nvidia GeForce Go6150 graphics with the proprietary Nvidia driver.

I dual boot with Windows 7; my /home and / are on separate partitions, both formatted as ext4. I also ran fsck, and the error has become a bit less frequent, but it persists; also, the SMART analysis didn't report any errors.

description: updated
Kad Mann (nospam-nospam-nospam) wrote :

> Mark Davidson (outdoorshappy) wrote on 2013-02-21: #98
> a bad drive.

Certainly not in my case. My drive is a brand spanking new 512GB Samsung 840 Pro and SMART is squeaky clean.

> I realize that without detailed diagnostics

When you mean to say wild speculation, please say wild speculation.

djahjah (boreste) wrote :

It also happened to me when I upgraded my kernel in Ubuntu 12.04, and now with a fresh install of 12.10. SMART runs OK. But my question is: if it's a general kernel problem, why aren't more people being affected?

Tim Wochomurka (tim-wochomurka) wrote : Re: [Bug 1063354] Re: Sudden Read-Only Filesystems

I know a lot of discussion right now is leaning towards dual-boot/multiple
partition disks. Just pointing out that I'm experiencing the error on a
desktop with a single HDD with only one OS.

rami (lehtinen-rami) wrote : Re: Sudden Read-Only Filesystems

I circumvented my problem by disabling NCQ, so I don't know whether my problem is the same as the one reported.

/etc/default/grub
Code:
GRUB_CMDLINE_LINUX="libata.force=noncq"
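
For anyone trying the same workaround: the setting only takes effect after the GRUB configuration is regenerated and the machine is rebooted. A rough sketch of applying and verifying it follows; has_noncq is a hypothetical helper and sda is an assumed device name.

```shell
# Hypothetical helper: succeed if a kernel command line (read from stdin)
# carries noncq among the libata.force values.
has_noncq() {
    grep -Eq '(^| )libata\.force=([^ ]*[,:])?noncq(,| |$)'
}

# After editing /etc/default/grub, regenerate the boot menu and reboot:
#   sudo update-grub
# Then confirm the option reached the running kernel:
#   has_noncq < /proc/cmdline && echo "NCQ disabled via libata.force"
# With NCQ off, the queue depth of the disk should read 1:
#   cat /sys/block/sda/device/queue_depth
```

The kernel also logs a line such as "NCQ (not used)" for the drive at boot when the option is active.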

Also connecting through Intel P55 s-ata 3 Gb/s worked, so I suppose my problem is related to Marvell Technology Group Ltd. 88SE9123 PCIe SATA 6.0 Gb/s controller (rev 10)

http://markmail.org/message/dt7dnl3pubvwcmmm

https://bugzilla.kernel.org/show_bug.cgi?id=43160

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/550559

http://ubuntuforums.org/showthread.php?t=1396465

http://ubuntuforums.org/showthread.php?t=1480965

system info:
ubuntu 12.04 and 13.04
 ASUS P7P55D-E PRO mainboard with the Marvell 88SE9123 PCIe SATA 6.0 onboard controller
Samsung SSD 840 PRO Series (DXM03BQ0) ssd / 1T WDC WD10EZRX-00A8LB0 (01.01A10) hdd

- With the HDD the errors didn't occur so often (from once every week or two up to twice a day, depending), but with the SSD up to several times a day.

raggar (mbaart) wrote :

@Rami, Thanks, this helps! :-)

My problem looks solved with the solution Rami suggested (https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1063354/comments/105). I have tried it for about two weeks with no problems so far. Before, the read-only filesystem occurred about twice a week.

/etc/default/grub
Code:
GRUB_CMDLINE_LINUX="libata.force=noncq"

Daniel Butler (rekh127) wrote :

I started having this problem this weekend after I installed 13.04 (previously I was on 12.04, which was fine). Apparently random filesystem corruption leading to an unusable disk. It was happening pretty much every boot, taking 10 - 90 minutes before it went crazy. Adding "libata.force=noncq" to my grub config does seem to have fixed the problem. Let me know if I can help solve this with any logs or anything.

Daniel Butler (rekh127) wrote :

Hmm. Never mind. I guess I was over eager.... it was working fine for a couple of hours (after never lasting more than 90 minutes before, and very rarely more than an hour) but now it's not working again. I'm now wondering if I've got a different problem.

Oleg S. Lekshin (oleg-lekshin) wrote :

I have the same problem on my Acer Aspire TimelineX under Ubuntu 12.04. "libata.force=noncq" seems to work.
Daniel, have you run "update-grub" ?

Marcus Andersson (kamratpost) wrote :

'noncq' does not solve the problem for me. I still can't go beyond kernel 3.0.0 without a lockup after a few minutes.

Richard Andersson (richardandersson79) wrote :

'noncq' did not solve it for me either, not for 12.04-2 and not in 12.10. I had to pull my SSD and start a fresh install on an HDD. So far no read-only experience...

Intel DX58SO2 motherboard.
SATA controller: Intel Corporation 82801JI (ICH10 Family) SATA AHCI Controller
SSD: Intel 510 120Gb, 6Gbps.
Tried going through drive diagnostics, SMART looks fine, even did a low-level surface scan of the drive.
Bug experienced on the drive using both ext2 and ext4 (LUKS-encrypted and not) filesystems.

Sorry I can't test around too much as this computer is critical in my everyday work.

adam (capes-adam) wrote :

I currently dedicate a second Hard Drive to VirtualBox (1. I have a LOT of VM's, 2. I can use my primary HDD at full speed at the same time I am running a VM ) which runs Ext4 and this Read-Only bug only occurs when I run a specific VM. The VM is Windows XP Home Edition SP3 and every time I boot into it, the guest OS starts to give me "Delayed Write Failed" Errors and at the same time I lose write access to the partition. Haven't discovered any data loss nor does this VM contain any important data, annoying though. Am running the 3.5.0-17 kernel.

Georgi Georgiev (gkgeorgiev) wrote :

Dear All,

I also encountered the same issue with Ubuntu Server 13.04 while I was trying to checkout all Android projects.
Exactly like TeamFahQ (teamfahq1) wrote on 2013-02-22: #99

Richard Andersson (richardandersson79) wrote :

Update to my previous comment (#111). I installed an HDD (Western Digital Caviar Black) and since (2.5 months) there has been no problem... until just now. Rebooting does not solve it as it happens quickly after. My last installation has been without system encryption.

Sushi (sushi-addiction13) wrote :

O.K. seems many people still have this problem. The original bug report talks about /var/log/syslog.

I think there is another bug that I experience and many other people do too. There is no error in the log files. The system suddenly goes read only without writing anything to the logs. As far as my very limited knowledge of linux goes, that means I should check dmesg when this happens?
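
On the dmesg question: if /var/log lives on the filesystem that went read-only, syslog can no longer write there, so the kernel ring buffer is usually the only place left to look. A small sketch for pulling out the relevant lines follows. ro_events is a hypothetical helper, and the patterns are a guess at the most common messages, not an exhaustive list.

```shell
# Hypothetical helper: keep only kernel log lines (read from stdin) about
# ATA errors, ext4 errors, I/O errors, or a read-only remount.
ro_events() {
    grep -Ei 'ata[0-9]+(\.[0-9]+)?: |EXT4-fs (error|warning)|remount(ing)? .*read-only|I/O error'
}

# The ring buffer lives in memory, so it is still readable:
#   dmesg | ro_events
# To keep a copy, write it somewhere still writable (a USB stick or tmpfs;
# /dev/shm is an assumption here):
#   dmesg > /dev/shm/dmesg-readonly.txt
```

The saved copy has to be moved off tmpfs before rebooting, since /dev/shm does not survive a restart.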

My work around for this bug has been to switch to cinnamon and never shutdown/reboot unless I really must. I suspend the pc and now the read only filesystem only happens every 2 or 3 months. Then fsck still deletes some files on reboot, usually firefox cache files, sometimes icons and rarely other files. Like movies :( and yes my HDD is perfect.

I did read that there is a kernel issue, affecting people who frequently shut down/reboot their system, that causes this problem, but I can't find it again. Apparently it was a comment made by one of the kernel developers.

Defacto Seven (be-real) wrote :

Just adding my input. I had the same read-only filesystem problem for the first time this evening. No changes precipitated it. Running Debian wheezy. No new info that hasn't already been reported.

Defacto Seven (be-real) wrote :

Just a follow up... My problem is fixed, or at least figured out. Although many with this bug have similar log output, mine was definitely hardware. SATA hard drives have extremely fragile connectors, and the connector to the particular drive with the read-only problem was slightly pushed down by another connector. It finally broke the HD plug socket just enough to make a bad connection. It's practically imperceptible until you remove the connector and see the slight movement of the HD plug component. I reconnected the plug and put slight pressure underneath the component, started the computer and it fired right up. The system did a forced fsck and it is working with all data present and permissions corrected. Now I just need to dump all the data to a new HD.

This may be an issue for some and not for others but the outcome and the dmesg logs show ata errors on that drive. It would be very easy to miss the breakage since it's a very small variance, and it's bizarre that it causes the read-only problem as stated above but nothing else.

persa (leo-lienard) wrote :

Hi,

I had exactly same problem and I solved it.

My Ubuntu 13.04 is installed on an Intel SATA 2 SSD that was connected to the motherboard's SATA 3 connector (in the past, this drive held a Windows partition and showed no errors on the Marvell SATA 3 chipset, despite being a SATA 2 drive). I switched the SSD to a SATA 2 connector, and since then no problems...

I hope this may help you...

Rahul Jain (equites-vero) wrote :

Had the same bug running kernel 3.11 on Mint 15 (which is based on Raring). Everything was fine for a long time until suddenly one day while downloading a torrent!!! my filesystem locked to read-only and torrent stopped. Reboot solved it, but two days later, it happened again. Definitely not a hardware issue as Elementary on kernel 3.2 on another partition is running perfectly. No smartctl errors.

Felix Joussein (felix-joussein) wrote :

Hi, I can confirm Rahul Jain's observation:
I am also on Mint 15.
I had that problem since Mint 14, then not for a long while and yesterday it appeared again.
In my case, I am sure, my Hardware is OK, no smart errors on the SSD, and as this is a notebook, I don't think, there are any pin problems on the connectors.

I have noticed that this problem appears every time there's heavy SSD activity and lots of small data blocks are being encrypted and written to/read from the SSD.
In particular, yesterday I was enabling the calendar caching in my Thunderbird Lightning calendars.
This provoked massive writing and reading operations.
Another time that happened was again in Thunderbird, when I synchronized my 12GB IMAP mailbox.

I just tried to verify my theory about many little files/small data-blocks - I used bonnie++ to write many little files in a very fast manner.
Result: same dmesg output, ends in kernel panic...

This is a vital problem so a fix needs to be found asap.
regards and merry Christmas ,
Felix
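
A rough way to repeat this kind of small-file load without bonnie++ is sketched below. small_file_burst is a hypothetical helper, and the file count and size are arbitrary assumptions. It should only be run on a machine with current backups, since on an affected system it may trigger the read-only remount or worse.

```shell
# Hypothetical helper: write COUNT small files of SIZE bytes into DIR,
# flush them to disk, then print the number of files created.
small_file_burst() {
    dir=$1; count=$2; size=$3
    mkdir -p "$dir" || return 1
    i=0
    while [ "$i" -lt "$count" ]; do
        head -c "$size" /dev/urandom > "$dir/f$i" || return 1
        i=$((i + 1))
    done
    sync
    find "$dir" -type f | wc -l
}

# Example: 20000 files of 4 KiB each on the affected filesystem, while
# watching dmesg in a second terminal (the path is an assumption):
#   small_file_burst "$HOME/burst-test" 20000 4096
```

If the run completes without ata errors in dmesg, that argues against small writes alone being the trigger on that machine.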

Axel Pospischil (apos) wrote :

Hi there,

I am running Ubuntu 12.04 LTS on three machines:

1. An Intel Xeon with an ASUS motherboard, 32GB RAM, SSD (Samsung, older model)
2. A Lenovo ThinkPad 201s i7, 8GB RAM, SSD (Crucial)
3. A Lenovo ThinkPad W510 i7 720, 8GB RAM, SSD (Samsung EVO)

All systems are running under kernel 3.11 (linux-image-generic-lts-saucy , 3.11.0.15.14)

/etc/default/grub: GRUB_CMDLINE_LINUX_DEFAULT="quiet splash"

/etc/modprobe.d/options: EMPTY

The #3 (W510) randomly freezes with the following errors:
ata1: SATA max UDMA/133 abar m2048@0xf2627000 port 0xf2627100 irq 54
ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata1.00: ACPI cmd ef/02:00:00:00:00:a0 (SET FEATURES) succeeded
ata1.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE LOCK) filtered out
ata1.00: ACPI cmd ef/10:03:00:00:00:a0 (SET FEATURES) filtered out
ata1.00: ATA-9: Samsung SSD 840 EVO 500GB, EXT0BB0Q, max UDMA/133
ata1.00: 976773168 sectors, multi 16: LBA48 NCQ (depth 31/32), AA
ata1.00: ACPI cmd ef/02:00:00:00:00:a0 (SET FEATURES) succeeded
ata1.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE LOCK) filtered out
ata1.00: ACPI cmd ef/10:03:00:00:00:a0 (SET FEATURES) filtered out
ata1.00: configured for UDMA/133
[...]

These errors occur up to 5-10 times, then the SSD is set to readonly state like this:

ata1: EH complete
ata1: limiting SATA link speed to 1.5 Gbps
ata1.00: exception Emask 0x52 SAct 0x1 SErr 0x1a80d00 action 0x6 frozen
ata1.00: irq_stat 0x08000000, interface fatal error
ata1: SError: { UnrecovData Proto HostInt 10B8B BadCRC LinkSeq TrStaTrns }
ata1.00: failed command: READ FPDMA QUEUED
ata1.00: cmd 60/08:00:d8:b9:27/00:00:05:00:00/40 tag 0 ncq 4096 in
ata1.00: status: { DRDY }
ata1: hard resetting link
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310)

---------------------------------

I am now trying the following settings:

/etc/default/grub: GRUB_CMDLINE_LINUX_DEFAULT="quiet splash libata.force=noncq"

/etc/modprobe.d/options: options libata noacpi=1

/etc/default/tlp (which was on standard setting until now)

# Hard disk advanced power management level: 1(max saving)..254(off)
# Levels 1..127 may spin down the disk.
# Separate values for multiple devices with spaces.
DISK_APM_LEVEL_ON_AC="254 254"
DISK_APM_LEVEL_ON_BAT="254 254"

# SATA aggressive link power management (ALPM):
# min_power/medium_power/max_performance
SATA_LINKPWR_ON_AC=max_performance
SATA_LINKPWR_ON_BAT=max_performance

I will write ... after further testing.

Please note: system #1. and #2. are running without any problems!

Axel Pospischil (apos) wrote :

Unfortunately, with the settings from above I was not lucky this morning. When I turned on the W510 notebook (#3), it got stuck like before. I really think this is specific to this laptop, because the other computers with identical software are running flawlessly ... 24/7 (via suspend / resume).

I also have to add that all disks run on LVM-based cryptsetup!

> cat /var/log/syslog.1 | grep ata1 | cut -d "[" -f2

    1.419360] ata1: SATA max UDMA/133 abar m2048@0xf2627000 port 0xf2627100 irq 53
    1.737774] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
    1.741004] ata1.00: ACPI cmd ef/02:00:00:00:00:a0 (SET FEATURES) succeeded
    1.741014] ata1.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE LOCK) filtered out
    1.741020] ata1.00: ACPI cmd ef/10:03:00:00:00:a0 (SET FEATURES) filtered out
    1.741298] ata1.00: ATA-9: Samsung SSD 840 EVO 500GB, EXT0BB0Q, max UDMA/133
    1.741303] ata1.00: 976773168 sectors, multi 16: LBA48 NCQ (depth 31/32), AA
    1.742899] ata1.00: ACPI cmd ef/02:00:00:00:00:a0 (SET FEATURES) succeeded
    1.742907] ata1.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE LOCK) filtered ou
    1.742913] ata1.00: ACPI cmd ef/10:03:00:00:00:a0 (SET FEATURES) filtered out
    1.743209] ata1.00: configured for UDMA/133

I will investigate further

Axel Pospischil (apos) wrote :

[Problem probably solved]

I changed my fstab and removed the "defaults" entry.
Probably this was the problem: the "defaults" entry sets some mount options that interfere with the SSD.

fstab-NEW: /dev/mapper/vg--myvg-root / ext4 noatime,errors=remount-ro 0 1
fstab-OLD: /dev/mapper/vg--myvg-root / ext4 defaults,noatime,errors=remount-ro 0 1
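
One way to check whether dropping "defaults" changed anything is to compare the options the kernel actually applied. According to mount(8), "defaults" stands for rw, suid, dev, exec, auto, nouser and async, so with other options present the effective set may well be identical. A minimal sketch, where mount_opts is a hypothetical helper:

```shell
# Hypothetical helper: print the mount options in effect for a mount point.
# In the mount table the 2nd field is the mount point and the 4th field the
# option list. Reads /proc/mounts unless a second argument names a file.
mount_opts() {
    awk -v mp="$1" '$2 == mp { opts = $4 } END { if (opts != "") print opts }' "${2:-/proc/mounts}"
}

# Compare the output before and after editing /etc/fstab and rebooting:
#   mount_opts /
```

If both runs print the same list, the fstab change itself is unlikely to be what stopped the freezes.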

I left the libata option in modprobe and kernel boot options:
> cat /etc/modprobe.d/options
options libata noacpi=1

> cat /etc/default/grub
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash libata.force=noncq libata.noacpi=1"

Since two days the system is running without any filesystem-freezes.

I am attaching an ubuntu bug-report which can probably help to solve and analyse the problem.
The system is up to date as time of writing and all necessary updates are applied.

-------------------------------------------------------------------------------

I have to add that I am running a newer version of lvm2, which gives me the option of using TRIM:
ii lvm2 2.02.95-4ubuntu1.1~p Linux Logical Volume Manager
All necessary entries in crypttab and lvm.conf are done and TRIM is working (on demand).

> dmesg | grep libata
[ 0.000000] Command line: BOOT_IMAGE=/vmlinuz-3.11.0-15-generic root=/dev/mapper/vg--madagaskar-root ro quiet splash libata.force=noncq libata.noacpi=1
[ 0.517478] libata version 3.00 loaded.

> dmesg | grep ata1
[ 1.400125] ata1: SATA max UDMA/133 abar m2048@0xf2627000 port 0xf2627100 irq 53
[ 1.717874] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 1.719776] ata1.00: FORCE: horkage modified (noncq)
[ 1.719843] ata1.00: ATA-9: Samsung SSD 840 EVO 500GB, EXT0BB0Q, max UDMA/133
[ 1.719848] ata1.00: 976773168 sectors, multi 16: LBA48 NCQ (not used)
[ 1.720159] ata1.00: configured for UDMA/133
[12726.396872] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[12726.398974] ata1.00: configured for UDMA/133
[54682.379955] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[54682.382006] ata1.00: configured for UDMA/133

penalvch (penalvch) wrote :

Lars Kumbier, as per http://www.dell.com/support/drivers/us/en/04/DriverDetails?driverId=R301708&fileId=2731109636 an update is available for your BIOS (A15). If you update to this following https://help.ubuntu.com/community/BiosUpdate , does it change anything? If it doesn't, could you please both specify what happened, and just provide the output of the following terminal command:
sudo dmidecode -s bios-version && sudo dmidecode -s bios-release-date

Please note your current BIOS is already in the Bug Description, so posting this on the old BIOS would not be helpful.

For more on BIOS updates and linux, please see https://help.ubuntu.com/community/ReportingBugs#Bug_reporting_etiquette .

Thank you for your understanding.

tags: added: bios-outdated-a15
Changed in linux (Ubuntu):
status: Confirmed → Incomplete
summary: - Sudden Read-Only Filesystems
+ [Dell Studio XPS 1640] Sudden Read-Only Filesystems
tags: added: regression-release
Axel Pospischil (apos) wrote :

Hi there: Lenovo W510, Ubuntu 12.04 LTS, SSD

In my last comment I hoped the problem was solved (https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1063354/comments/123). Unfortunately it is not. Yesterday morning my filesystem was read-only again.

So I am now going to remove all noatime and all TRIM entries. I am reverting to the Precise version of lvm2 and I am resetting all ACPI settings for libata in Ubuntu and the BIOS.

Because this seems to be a specific bug for a Dell system, I am creating a new bug report for my machine.

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1265309

NW (ubuntu327) wrote :

I have an old tower which I use to test multiple operating systems. Each OS lives on a separate drive in a removable tray, so the drives can be swapped as needed. Once in a while the system would hang when the BIOS was set to auto-detect the drives at every boot, or I would see an occasional failure to mount the ATA boot device when Linux was started in verbose mode--and Windows would simply freeze randomly. The problem was traced to the power connector on a drive tray: I had to extract the pins from the connector with a special tool, cut off the wires, soak the pins in contact cleaner, and solder them back on, because the crimped connection and the corrosion made it unreliable.

http://en.wikipedia.org/wiki/Molex_connector#Disk_drive_connector_.28AMP_MATE-N-LOK_1-480424-0_Power_Connector.29

http://www.molex.com/molex/products/family?key=disk_drive_power_connector&channel=PRODUCTS&chanName=family&pageTitle=Introduction

I never had a problem with these connectors before, except for the ones in the Enermax trays (which seem to be made of the cheapest materials they could find.) Before I repaired the power connector, I encountered that read-only bug in Ubuntu. When this occurred, ALL physical volumes attached to the machine became read-only, including other hard drives and all external USB storage devices. Even new USB devices attached later were not writable. The only thing I could write to was a network share. If this happens on all affected platforms, it might give developers some idea of what to look for in the source code. I also wonder if some power management feature could be involved:

GRUB_CMDLINE_LINUX="libata.dma=0 libata.noacpi=1"
http://ubuntuforums.org/showthread.php?t=1892483

I believe this bug can be triggered by other things too, such as a system BIOS bug or AHCI preference, a drive firmware bug, defective electrolytic capacitors on an old mainboard, bad solder joints just about anywhere, or a defective (or overloaded) power supply. But in the case of SSDs it could also be a latency issue:

Why Solid-State Drives Slow Down As You Fill Them Up (Ubuntu should warn about this)
 "When filling up an empty drive, they found high write performance very early in the process and a significant drop as the write operations continued to fill up the drive... If you have a solid-state drive, you should try to avoid using more than 75% of its capacity."
http://www.howtogeek.com/165542/why-solid-state-drives-slow-down-as-you-fill-them-up/

(for general reference on dual-boot systems):
12 Things You Must Do When Running a Solid State Drive in Windows 7
http://www.maketecheasier.com/12-things-you-must-do-when-running-a-solid-state-drive-in-windows-7/

I suspect that people who experience read-only issues today were experiencing silent write retries in previous kernel versions and simply did not notice because the retry was successful. It seems like the common thread is that the drive was not ready to accept writes for some reason, and the kernel did not detect this condition. I tried to simulate this by removing power to the drive momentarily. During this time, CPU usage was very high, but it returned to normal when power was appl...

penalvch (penalvch) wrote :

NW, thank you for your comment. So your hardware and problem may be tracked, could you please file a new report with Ubuntu by executing the following in a terminal while booted into the default Ubuntu kernel (not a mainline one) via:
ubuntu-bug linux

For more on this, please read the official Ubuntu documentation:
Ubuntu Bug Control and Ubuntu Bug Squad: https://wiki.ubuntu.com/Bugs/BestPractices#X.2BAC8-Reporting.Focus_on_One_Issue
Ubuntu Kernel Team: https://wiki.ubuntu.com/KernelTeam/KernelTeamBugPolicies#Filing_Kernel_Bug_reports
Ubuntu Community: https://help.ubuntu.com/community/ReportingBugs#Bug_reporting_etiquette

When opening up the new report, please feel free to subscribe me to it.

Thank you for your understanding.

Helpful bug reporting tips:
https://wiki.ubuntu.com/ReportingBugs

Simon Déziel (sdeziel) wrote :

This bug is old and was initially reported with Ubuntu 12.10 which is long EOL. As such, I'll mark it as "won't fix", but please re-open if you can reproduce with a current system/kernel. Thanks

Changed in linux (Ubuntu):
status: Incomplete → Won't Fix
Displaying first 40 and last 40 of 129 comments.