[4.1.0 regression] Kernel oops - blk_update_request: I/O error when running udisks2 force_test_removal test

Bug #1478623 reported by Iain Lane
This bug affects 1 person
Affects          Status         Importance   Assigned to        Milestone
Linux            Unknown        Unknown
linux (Ubuntu)   Fix Released   High         Joseph Salisbury
  Wily           Fix Released   High         Joseph Salisbury

Bug Description

We noticed that the udisks2 autopkgtest is failing on i386 in the "forced removal" test, for example

  https://jenkins.qa.ubuntu.com/job/wily-adt-udisks2/ARCH=i386,label=adt/55/console

I reproduced this locally, on this kernel

ubuntu@autopkgtest:~$ uname -a
Linux autopkgtest 4.1.0-2-generic #2-Ubuntu SMP Wed Jul 22 18:17:34 UTC 2015 i686 i686 i686 GNU/Linux

and got the following spewed to console

fs: forced removal ... [ 303.782426] blk_update_request: I/O error, dev sdb, sector 0
[ 303.807400] BUG: unable to handle kernel paging request at 1afde000
[ 303.808010] IP: [<c1378626>] __percpu_counter_add+0x16/0x90
[ 303.808010] *pdpt = 000000000ee9b001 *pde = 0000000000000000
[ 303.808010] Oops: 0000 [#1] SMP
[ 303.808010] Modules linked in: scsi_debug btrfs xor raid6_pq ufs qnx4 hfsplus hfs minix ntfs msdos jfs xfs libcrc32c ppdev kvm_intel kvm joydev serio_raw 8250_fintek parport_pc parport i2c_piix4 mac_hid autofs4 psmouse pata_acpi floppy
[ 303.808010] CPU: 0 PID: 22573 Comm: umount Not tainted 4.1.0-2-generic #2-Ubuntu
[ 303.808010] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.8.1-20150604_211837-tipua 04/01/2014
[ 303.808010] task: dc75e5e0 ti: ce85a000 task.ti: ce85a000
[ 303.808010] EIP: 0060:[<c1378626>] EFLAGS: 00010092 CPU: 0
[ 303.808010] EIP is at __percpu_counter_add+0x16/0x90
[ 303.808010] EAX: 00000001 EBX: dbba9c08 ECX: 00000000 EDX: 00000001
[ 303.808010] ESI: ffffffff EDI: 00000000 EBP: ce85be6c ESP: ce85be54
[ 303.808010] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[ 303.808010] CR0: 8005003b CR2: 1afde000 CR3: 1e9c88c0 CR4: 000007f0
[ 303.808010] Stack:
[ 303.808010] c1aa7b40 00000001 00000000 dbba9be8 ffffffff decbfd60 ce85be88 c115bca4
[ 303.808010] 00000008 ddce5228 ddce5228 decbfd60 decbfd70 ce85bea4 c11e47f9 00000000
[ 303.808010] 00000286 ddce5228 d559cbd0 cdd41400 ce85beb4 c11e496f d3cc2000 00000000
[ 303.808010] Call Trace:
[ 303.808010] [<c115bca4>] account_page_dirtied+0x74/0xf0
[ 303.808010] [<c11e47f9>] __set_page_dirty+0x39/0xc0
[ 303.808010] [<c11e496f>] mark_buffer_dirty+0x4f/0xb0
[ 303.808010] [<c1246138>] ext4_commit_super+0x168/0x220
[ 303.808010] [<c1246db1>] ext4_put_super+0xc1/0x310
[ 303.808010] [<c11ce6a2>] ? dispose_list+0x32/0x40
[ 303.808010] [<c11cf2fa>] ? evict_inodes+0xda/0xf0
[ 303.808010] [<c11b6834>] generic_shutdown_super+0x64/0xe0
[ 303.808010] [<c11616d0>] ? unregister_shrinker+0x40/0x50
[ 303.808010] [<c11b6b2f>] kill_block_super+0x1f/0x70
[ 303.808010] [<c11b6e0d>] deactivate_locked_super+0x3d/0x70
[ 303.808010] [<c11b71f7>] deactivate_super+0x57/0x60
[ 303.808010] [<c11d2099>] cleanup_mnt+0x39/0x90
[ 303.808010] [<c11d2130>] __cleanup_mnt+0x10/0x20
[ 303.808010] [<c107dbf1>] task_work_run+0xb1/0xd0
[ 303.808010] [<c1010a63>] do_notify_resume+0x53/0x70
[ 303.808010] [<c1717d17>] work_notifysig+0x26/0x2b
[ 303.808010] Code: 24 8b 54 24 04 83 c4 08 5b 5e 5f 5d c3 90 8d b4 26 00 00 00 00 55 89 e5 57 56 53 89 c3 89 d0 83 ec 0c 89 45 ec 89 4d f0 8b 7b 14 <64> 8b 37 89 7d e8 89 f7 c1 ff 1f 01 c6 11 cf 8b 4d 08 c1 f9 1f
[ 303.808010] EIP: [<c1378626>] __percpu_counter_add+0x16/0x90 SS:ESP 0068:ce85be54
[ 303.808010] CR2: 000000001afde000
[ 303.808010] ---[ end trace bd84bb1a9fa3ebd4 ]---
[ 304.463429] scsi 2:0:0:0: Direct-Access Linux scsi_debug 0184 PQ: 0 ANSI: 6
[ 304.468098] sd 2:0:0:0: Attached scsi generic sg2 type 0
[ 304.472038] sd 2:0:0:0: [sdc] 585728 512-byte logical blocks: (299 MB/286 MiB)
[ 304.476079] sd 2:0:0:0: [sdc] Write Protect is off
[ 304.476661] sd 2:0:0:0: [sdc] Mode Sense: 73 00 10 08
[ 304.484045] sd 2:0:0:0: [sdc] Write cache: enabled, read cache: enabled, supports DPO and FUA
[ 304.572090] sd 2:0:0:0: [sdc] Attached SCSI disk
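
When comparing oopses across kernel builds it helps to reduce each one to the faulting function plus the reliable frames of the call trace. The following is a minimal sketch of that, not a general oops parser: it assumes the i386 "EIP is at" format shown above, the function name parse_oops is made up for this example, and the sample is a shortened copy of the log in this report. Frames the kernel prints with a leading "?" are addresses found on the stack that may not belong to the real call chain, so the sketch skips them.

```python
import re

# Shortened copy of the oops printed above (timestamps kept as logged).
SAMPLE = """\
[ 303.808010] EIP is at __percpu_counter_add+0x16/0x90
[ 303.808010] Call Trace:
[ 303.808010] [<c115bca4>] account_page_dirtied+0x74/0xf0
[ 303.808010] [<c11e47f9>] __set_page_dirty+0x39/0xc0
[ 303.808010] [<c11ce6a2>] ? dispose_list+0x32/0x40
[ 303.808010] Code: 24 8b 54
"""

EIP_RE = re.compile(r"EIP is at (\w+)\+0x[0-9a-f]+/0x[0-9a-f]+")
FRAME_RE = re.compile(r"\[<[0-9a-f]+>\] (\? )?([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def parse_oops(text):
    """Return (faulting_function, reliable_frames) for one oops."""
    fault = None
    frames = []
    in_trace = False
    for line in text.splitlines():
        m = EIP_RE.search(line)
        if m:
            fault = m.group(1)
        if "Call Trace:" in line:
            in_trace = True
            continue
        if in_trace:
            m = FRAME_RE.search(line)
            if m is None:
                # First non-frame line (here "Code:") ends the trace.
                in_trace = False
                continue
            if not m.group(1):
                # Keep only frames without the "?" marker.
                frames.append(m.group(2))
    return fault, frames


if __name__ == "__main__":
    fault, frames = parse_oops(SAMPLE)
    print("fault in:", fault)
    print("called via:", " <- ".join(frames))
```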

To reproduce, do something like this:

$ adt-buildvm-ubuntu-cloud -r wily -a i386 # download a cloud image, fettle it a bit to be more useful
$ qemu-system-i386 -m 2048 -net user -net nic,model=virtio -enable-kvm -nographic adt-wily*
# login with ubuntu/ubuntu

# apt-get update
# apt-get dist-upgrade
$ apt-get source udisks2
$ cd udisks2-*
$ cat debian/tests/control # install the test-deps
# debian/tests/upstream-system # runs the testsuite, when it gets to test_force_removal you should see this bug. the testsuite hangs too.

Brad Figg (brad-figg) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 1478623

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Martin Pitt (pitti)
Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Iain Lane (laney)
tags: added: bot-stop-nagging
Joseph Salisbury (jsalisbury) wrote : Re: Kernel oops - blk_update_request: I/O error when running udisks2 force_test_removal test

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v4.2 kernel[0].

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.2-rc4-unstable/

tags: added: kernel-da-key
Changed in linux (Ubuntu):
importance: Undecided → Medium
Iain Lane (laney) wrote :

Tried that, still fails

fs: forced removal ... [ 103.792793] blk_update_request: I/O error, dev sdb, sector 0
[ 103.804099] BUG: unable to handle kernel paging request at 1af23000
[ 103.805253] IP: [<c1357db4>] __percpu_counter_add+0x14/0xa0
[ 103.806244] *pdpt = 0000000015588001 *pde = 0000000000000000
[ 103.807201] Oops: 0000 [#1] SMP
[ 103.807776] Modules linked in: scsi_debug ppdev input_leds joydev serio_raw parport_pc 8250_fintek parport i2c_piix4 mac_hid autofs4 raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq raid1 raid0 multipath linear psmouse floppy pata_acpi
[ 103.808017] CPU: 0 PID: 1546 Comm: umount Not tainted 4.2.0-040200rc4-generic #201507271733
[ 103.808017] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.8.1-20150604_211837-tipua 04/01/2014
[ 103.808017] task: de8a9180 ti: d5d5c000 task.ti: d5d5c000
[ 103.808017] EIP: 0060:[<c1357db4>] EFLAGS: 00010096 CPU: 0
[ 103.808017] EIP is at __percpu_counter_add+0x14/0xa0
[ 103.808017] EAX: 00000000 EBX: c007b394 ECX: 00000000 EDX: 00000001
[ 103.808017] ESI: ffffffff EDI: ddd3d0c0 EBP: d5d5de60 ESP: d5d5de48
[ 103.808017] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[ 103.808017] CR0: 8005003b CR2: 1af23000 CR3: 1effb6c0 CR4: 000006b0
[ 103.808017] Stack:
[ 103.808017] c1a972c0 00000001 00000000 c007b364 ffffffff ddd3d0c0 d5d5de84 c11552be
[ 103.808017] 00000008 dc731800 dbd26c8c ddd3d0c0 ddd3d0c0 dbd26d70 dbd26d80 d5d5dea0
[ 103.808017] c11d994c dc731800 00000292 ddd3d0c0 dc731800 dbd26d70 d5d5deb8 c11d9b09
[ 103.808017] Call Trace:
[ 103.808017] [<c11552be>] account_page_dirtied+0xae/0x180
[ 103.808017] [<c11d994c>] __set_page_dirty+0x3c/0xb0
[ 103.808017] [<c11d9b09>] mark_buffer_dirty+0x69/0xf0
[ 103.808017] [<c1239ce9>] ext4_commit_super+0x169/0x220
[ 103.808017] [<c11f5d58>] ? mb_cache_shrink+0x28/0x190
[ 103.808017] [<c123a941>] ext4_put_super+0xc1/0x320
[ 103.808017] [<c11c26a2>] ? dispose_list+0x32/0x40
[ 103.808017] [<c11c325b>] ? evict_inodes+0x9b/0xf0
[ 103.808017] [<c11ac1b4>] generic_shutdown_super+0x64/0xe0
[ 103.808017] [<c115a4e0>] ? unregister_shrinker+0x40/0x50
[ 103.808017] [<c11ac48f>] kill_block_super+0x1f/0x70
[ 103.808017] [<c11ac73d>] deactivate_locked_super+0x3d/0x70
[ 103.808017] [<c11acc6a>] deactivate_super+0x3a/0x60
[ 103.808017] [<c11c5c69>] cleanup_mnt+0x39/0x80
[ 103.808017] [<c11c5cf0>] __cleanup_mnt+0x10/0x20
[ 103.808017] [<c107c171>] task_work_run+0x91/0xb0
[ 103.808017] [<c1011103>] do_notify_resume+0x53/0x80
[ 103.808017] [<c16fc497>] work_notifysig+0x26/0x2b
[ 103.808017] Code: b3 3e 3a 00 8b 44 24 04 8b 54 24 08 83 c4 0c 5b 5e 5f 5d c3 8d 76 00 55 89 e5 57 56 53 89 c3 83 ec 0c 89 55 ec 89 4d f0 8b 40 14 <64> 8b 30 89 f7 c1 ff 1f 01 d6 8b 55 08 11 cf 89 d1 c1 f9 1f 39
[ 103.808017] EIP: [<c1357db4>] __percpu_counter_add+0x14/0xa0 SS:ESP 0068:d5d5de48
[ 103.808017] CR2: 000000001af23000
[ 103.808017] ---[ end trace 4f2efe63047f95c9 ]---
[ 104.455856] scsi 2:0:0:0: Direct-Access Linux scsi_debug 0184 PQ: 0 ANSI: 6
[ 104.462147] sd 2:0:0:0: Attached scsi generic sg2 type 0
[ 104.468103] sd 2:0:0:0: [sdc] 585728 5...


tags: added: bug-exists-upstream
tags: added: kernel-bot-stop-nagging
removed: bot-stop-nagging
tags: added: bot-stop-nagging kernel-bug-exists-upstream
removed: bug-exists-upstream kernel-bot-stop-nagging
Joseph Salisbury (jsalisbury) wrote :

This issue appears to be an upstream bug, since you tested the latest upstream kernel. Would it be possible for you to open an upstream bug report[0]? That will allow the upstream Developers to examine the issue, and may provide a quicker resolution to the bug.

Please follow the instructions on the wiki page[0]. The first step is to email the appropriate mailing list. If no response is received, then a bug may be opened on bugzilla.kernel.org.

Once this bug is reported upstream, please add the tag: 'kernel-bug-reported-upstream'.

[0] https://wiki.ubuntu.com/Bugs/Upstream/kernel

Joseph Salisbury (jsalisbury) wrote :

Also, is there a prior kernel version that did not exhibit this bug? If so, we can perform a kernel bisect to identify the commit that introduced this issue.

Changed in linux (Ubuntu):
status: Confirmed → Triaged
Iain Lane (laney) wrote : Re: [Bug 1478623] Re: Kernel oops - blk_update_request: I/O error when running udisks2 force_test_removal test

On Wed, Jul 29, 2015 at 05:37:12PM -0000, Joseph Salisbury wrote:
> This issue appears to be an upstream bug, since you tested the latest
> upstream kernel. Would it be possible for you to open an upstream bug
> report[0]? That will allow the upstream Developers to examine the issue,
> and may provide a quicker resolution to the bug.
>
> Please follow the instructions on the wiki page[0]. The first step is to
> email the appropriate mailing list. If no response is received, then a
> bug may be opened on bugzilla.kernel.org.

Probably want a better reproducer than the udisks2 testsuite before
doing that, no?

--
Iain Lane [ <email address hidden> ]
Debian Developer [ <email address hidden> ]
Ubuntu Developer [ <email address hidden> ]

Martin Pitt (pitti) wrote : Re: Kernel oops - blk_update_request: I/O error when running udisks2 force_test_removal test

https://jenkins.qa.ubuntu.com/job/wily-adt-udisks2/44/ARCH=i386,label=adt/ was the last one that succeeded, with linux-image-4.0.0-4-generic 4.0.0-4.7. https://jenkins.qa.ubuntu.com/job/wily-adt-udisks2/45/ARCH=i386,label=adt/ is the first one that failed, with linux-image-4.1.0-1-generic 4.1.0-1.1.

So this is quite clearly a regression in 4.1.0-1.1.

summary: - Kernel oops - blk_update_request: I/O error when running udisks2
- force_test_removal test
+ [4.1.0 regression] Kernel oops - blk_update_request: I/O error when
+ running udisks2 force_test_removal test
Changed in linux (Ubuntu Vivid):
importance: Undecided → Medium
status: New → Triaged
tags: added: performing-bisect
no longer affects: linux (Ubuntu Vivid)
Changed in linux (Ubuntu Wily):
importance: Medium → High
status: Triaged → In Progress
assignee: nobody → Joseph Salisbury (jsalisbury)
Joseph Salisbury (jsalisbury) wrote :

There are three commits in v4.1 that make changes to the EIP indicated:

8c1903d xfs: inode and free block counters need to use __percpu_counter_compare
0d485ad xfs: use generic percpu counters for free block counter
501ab32 xfs: use generic percpu counters for inode counter

I built a test kernel with one of these commits (8c1903d) reverted. This test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1478623/

Can you test this kernel and see if it still exhibits this bug?

If it does, I'll build a test kernel with all three reverted. And if that still has the bug, I'll start a kernel bisect.

Iain Lane (laney) wrote :

Cheers Joseph - it's still broken with that kernel I'm afraid.

Joseph Salisbury (jsalisbury) wrote :

I attempted to revert the other two commits. However, there are a lot of prerequisite commits that would also need to be reverted, so I think a bisect would be faster.

I started a bisect between v4.0 final and v4.1 final. It will require testing about 8 - 10 test kernels.
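
Bisection halves the candidate range with each tested kernel, so the number of test kernels grows with the base-2 logarithm of the number of candidate commits. The sketch below shows only that arithmetic. The function name bisect_steps and the commit counts are illustrative round numbers, not the actual size of the v4.0 to v4.1 range, and a real git bisect over non-linear history can need a step or two more or fewer.

```python
def bisect_steps(n_commits):
    """Worst-case number of builds to isolate one bad commit among n_commits.

    Equivalent to ceil(log2(n_commits)), computed with integers to avoid
    floating-point rounding.
    """
    if n_commits < 1:
        raise ValueError("need at least one candidate commit")
    return (n_commits - 1).bit_length()


if __name__ == "__main__":
    # Illustrative counts only.
    for n in (256, 1024):
        print(n, "candidate commits ->", bisect_steps(n), "test kernels")
```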

I built the first test kernel up to commit:
d0a3997c0c3f9351e24029349dee65dd1d9e8d84

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1478623/

Can you test this kernel and see if it exhibits this bug? I'll build the next test kernel based on your results.

Iain Lane (laney) wrote : Re: [Bug 1478623] Re: [4.1.0 regression] Kernel oops - blk_update_request: I/O error when running udisks2 force_test_removal test

On Fri, Jul 31, 2015 at 05:57:58PM -0000, Joseph Salisbury wrote:
> I attempted to revert the other two commits. However, there are a lot
> of prereqs that also need to be removed. I think a bisect would be
> faster due to that.
>
> I started a bisect between v4.0 final and v4.1 final. It will require
> testing about 8 - 10 test kernels.
>
> I built the first test kernel up to commit:
> d0a3997c0c3f9351e24029349dee65dd1d9e8d84
>
> The test kernel can be downloaded from:
> http://kernel.ubuntu.com/~jsalisbury/lp1478623/
>
> Can you test this kernel and see if it exhibits this bug? I'll build
> the next test kernel based on your results.

I can do if you want, but it will be much faster if you (or someone who
can easily build the new kernels) does it...

If you like, I can supply a .img with the pre-reqs installed (by running
the steps I gave in the bug description).

Let me know what you think.

--
Iain Lane [ <email address hidden> ]
Debian Developer [ <email address hidden> ]
Ubuntu Developer [ <email address hidden> ]

Joseph Salisbury (jsalisbury) wrote :

Do you know whether I can perform the steps listed to reproduce the issue in a VM, or do I need dedicated hardware?

Iain Lane (laney) wrote :

On Mon, Aug 03, 2015 at 05:55:35PM -0000, Joseph Salisbury wrote:
> Do you know if I can perform the steps listed to reproduce the issue in a VM
> or do I need some dedicated hardware?

In a VM - the first command I listed (adt-buildvm-ubuntu-cloud from the
`autopkgtest' package) will download and construct one for you, then you
just need to boot into it and install the stuff to run udisks2's
autopkgtest.

--
Iain Lane [ <email address hidden> ]
Debian Developer [ <email address hidden> ]
Ubuntu Developer [ <email address hidden> ]

Joseph Salisbury (jsalisbury) wrote :

I have an environment set up to try to reproduce and bisect the bug. However, I'm getting an error when running the test. Have you seen this:

ubuntu@autopkgtest:~/udisks2-2.1.6$ sudo debian/tests/upstream-system
Traceback (most recent call last):
  File "<frozen importlib._bootstrap>", line 2158, in _find_spec
AttributeError: 'DynamicImporter' object has no attribute 'find_spec'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "src/tests/integration-test", line 61, in <module>
    from gi.repository import GLib, Gio, UDisks
  File "/usr/lib/python3/dist-packages/gi/importer.py", line 144, in find_module
    'introspection typelib not found' % namespace)
ImportError: cannot import name UDisks, introspection typelib not found

Iain Lane (laney) wrote :

On Fri, Aug 21, 2015 at 04:58:30PM -0000, Joseph Salisbury wrote:
> I have an environment set up to try and reproduce/bisect the bug.
> However, I'm getting an error when running the test. Have you seen
> this:
>
>
> ubuntu@autopkgtest:~/udisks2-2.1.6$ sudo debian/tests/upstream-system
> Traceback (most recent call last):
> File "<frozen importlib._bootstrap>", line 2158, in _find_spec
> AttributeError: 'DynamicImporter' object has no attribute 'find_spec'
>
> During handling of the above exception, another exception occurred:
>
> Traceback (most recent call last):
> File "src/tests/integration-test", line 61, in <module>
> from gi.repository import GLib, Gio, UDisks
> File "/usr/lib/python3/dist-packages/gi/importer.py", line 144, in find_module
> 'introspection typelib not found' % namespace)
> ImportError: cannot import name UDisks, introspection typelib not found

I guess you want gir1.2-udisks-2.0 - do you have this installed? The '@'
in the test-depends means that you should install the binary packages
produced by this source. Sorry for not being precise enough in the
description.
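
To make the '@' rule concrete, here is a minimal sketch of how a test's Depends field expands into the package list to install. The Depends string, the python3-gi entry and the function name expand_test_depends are hypothetical, and only two of the source's binary packages are listed; check the real values in debian/tests/control and debian/control. It also ignores version constraints and alternatives.

```python
def expand_test_depends(depends_field, binary_packages):
    """Expand a bare '@' in a Depends field into the source's binary packages."""
    result = []
    for item in depends_field.split(","):
        item = item.strip()
        if not item:
            continue
        if item == "@":
            # '@' stands for every binary package built from this source.
            result.extend(binary_packages)
        else:
            result.append(item)
    return result


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    binaries = ["udisks2", "gir1.2-udisks-2.0"]
    packages = expand_test_depends("@, python3-gi", binaries)
    print("apt-get install " + " ".join(packages))
```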

--
Iain Lane [ <email address hidden> ]
Debian Developer [ <email address hidden> ]
Ubuntu Developer [ <email address hidden> ]

Joseph Salisbury (jsalisbury) wrote :

Thanks, Iain. That got me a little further, but it seems I may still be missing something:

ubuntu@autopkgtest:~/udisks2-2.1.6$ sudo debian/tests/upstream-system
Testing installed system binaries
Traceback (most recent call last):
  File "src/tests/integration-test", line 1475, in <module>
    UDisksTestCase.init(logfile=args.logfile)
  File "src/tests/integration-test", line 108, in init
    for l in open('/usr/share/dbus-1/system-services/org.freedesktop.UDisks2.service'):
FileNotFoundError: [Errno 2] No such file or directory: '/usr/share/dbus-1/system-services/org.freedesktop.UDisks2.service'

(process:818): GLib-CRITICAL **: g_hash_table_destroy: assertion 'hash_table != NULL' failed

(process:818): GLib-CRITICAL **: g_hash_table_destroy: assertion 'hash_table != NULL' failed

Iain Lane (laney) wrote :

On Mon, Aug 24, 2015 at 06:14:48PM -0000, Joseph Salisbury wrote:
> Thanks, Iain. That got me a little further, but it seems I may still be
> missing something:
>
> ubuntu@autopkgtest:~/udisks2-2.1.6$ sudo debian/tests/upstream-system
> Testing installed system binaries
> Traceback (most recent call last):
> File "src/tests/integration-test", line 1475, in <module>
> UDisksTestCase.init(logfile=args.logfile)
> File "src/tests/integration-test", line 108, in init
> for l in open('/usr/share/dbus-1/system-services/org.freedesktop.UDisks2.service'):
> FileNotFoundError: [Errno 2] No such file or directory: '/usr/share/dbus-1/system-services/org.freedesktop.UDisks2.service'

sounds like you want udisks2 (install *all* binaries built from the
udisks2 source package):

laney@nightingale> dpkg -S /usr/share/dbus-1/system-services/org.freedesktop.UDisks2.service
udisks2: /usr/share/dbus-1/system-services/org.freedesktop.UDisks2.service

--
Iain Lane [ <email address hidden> ]
Debian Developer [ <email address hidden> ]
Ubuntu Developer [ <email address hidden> ]

ronny (ronny-standtke) wrote :

This is an upstream Linux kernel bug, see here:
https://bugzilla.kernel.org/show_bug.cgi?id=101011

Martin Pitt (pitti) wrote :

This seems fixed in wily now with the latest kernel: http://autopkgtest.ubuntu.com/packages/u/udisks2/wily/i386/ is happy again \o/

Changed in linux (Ubuntu Wily):
status: In Progress → Fix Released