xfstest sanity checks 17 fails on data-hole-data inside page

Bug #1709738 reported by bugproxy
Affects | Status | Importance | Assigned to | Milestone
The Ubuntu-power-systems project | Fix Released | Medium | Canonical Kernel Team |
linux (Ubuntu) | Fix Released | Medium | Joseph Salisbury |
Artful | Fix Released | Medium | Joseph Salisbury |

Bug Description

Problem Description
----------------------------
xfstests fails with Metadata corruption at leaf on ext4 filesystem

Environment
------------------
Kernel Build: 4.12.1-041201-generic
System Name : ltc-test-ci2
Model : 8247-22L
Platform : PowerNV ( P8 )
Issue observed in P9 also.

Uname output
-------------------
# uname -a
Linux ltc-test-ci2 4.12.1-041201-generic #201707121132 SMP Wed Jul 12 17:03:25 UTC 2017 ppc64le ppc64le ppc64le GNU/Linux

Steps to reproduce:
----------------------------------------
1. Create a loop device with ext4 filesystem
2. git clone git://git.kernel.org/pub/scm/fs/xfs/xfstests-dev.git; cd xfstests-dev
3. make
4. Create a local.config for running with created loop device
5. Run the xfstests-dev test: ./check tests/generic/445

generic/445 [failed, exit status 1] - output mismatch (see /root/harish/xfstests-dev/results//generic/445.out.bad)
    --- tests/generic/445.out 2017-07-13 06:04:36.244322946 -0400
    +++ /root/harish/xfstests-dev/results//generic/445.out.bad 2017-07-14 02:49:06.540352923 -0400
    @@ -1,2 +1,3 @@
     QA output created by 445
    -Silence is golden
    +seek sanity check failed!
    +(see /root/harish/xfstests-dev/results//generic/445.full for details)
    ...
    (Run 'diff -u tests/generic/445.out /root/harish/xfstests-dev/results//generic/445.out.bad' to see the entire diff)

Nothing observed in dmesg.
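
For readers unfamiliar with what the seek sanity check exercises: it verifies the offsets that lseek() returns for SEEK_DATA and SEEK_HOLE. The following is a minimal Python sketch, not the xfstests code, that walks a file with those two whence values. It builds a plain sparse file with truncate (an assumption made for portability), so it only illustrates a data-hole-data layout and is not expected to reproduce this bug. The helper names map_segments and make_data_hole_data and the 4096-byte block size are hypothetical.

```python
import errno
import os
import tempfile


def map_segments(fd):
    """Walk a file with SEEK_DATA/SEEK_HOLE and return (kind, start, end) tuples."""
    size = os.fstat(fd).st_size
    segments = []
    offset = 0
    while offset < size:
        try:
            data = os.lseek(fd, offset, os.SEEK_DATA)
        except OSError as exc:
            if exc.errno != errno.ENXIO:
                raise
            # ENXIO means there is no data at or after offset: trailing hole.
            segments.append(("hole", offset, size))
            break
        if data > offset:
            segments.append(("hole", offset, data))
        # Every file has an implicit hole at EOF, so this returns at most size.
        hole = os.lseek(fd, data, os.SEEK_HOLE)
        segments.append(("data", data, hole))
        offset = hole
    return segments


def make_data_hole_data(path, block=4096):
    """Create a sparse file with data, a gap, more data, and a trailing gap."""
    with open(path, "wb") as f:
        f.truncate(16 * block)
        f.seek(0)
        f.write(b"a" * block)
        f.seek(8 * block)
        f.write(b"b" * block)


def demo():
    """Build the sample file in a temporary directory and map its segments."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "seek-demo")
        make_data_hole_data(path)
        fd = os.open(path, os.O_RDONLY)
        try:
            return map_segments(fd)
        finally:
            os.close(fd)


if __name__ == "__main__":
    for kind, start, end in demo():
        print(f"{kind:4s} {start:8d} {end:8d}")
```

On a filesystem that reports holes, the walk would typically list data, hole, data, hole. A filesystem without hole support may report the whole file as a single data segment, which is also a valid answer.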

Full log is attached.

Note: Issue is also observed on distro kernel - 4.11.0-10-generic.

The test needs two devices: a test device and a scratch device. Make sure the TEST_DIR and SCRATCH_MNT directories exist before running the test.

# cat local.config
export TEST_DEV=/dev/loop0
export TEST_DIR=/mnt/test
export SCRATCH_DEV=/dev/loop1
export SCRATCH_MNT=/mnt/scratch
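
As a small aid for the note about creating the directories first, here is a hedged Python sketch of a pre-flight check for a local.config like the one above. It is not part of xfstests; the function names parse_local_config and preflight are hypothetical, and the parser only understands simple `export KEY=VALUE` lines.

```python
import os

# The four variables used in the local.config shown in this report.
REQUIRED = ("TEST_DEV", "TEST_DIR", "SCRATCH_DEV", "SCRATCH_MNT")


def parse_local_config(text):
    """Parse simple 'export KEY=VALUE' lines; '#' lines and blanks are skipped."""
    config = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("export "):
            line = line[len("export "):]
        key, sep, value = line.partition("=")
        if sep:
            config[key.strip()] = value.strip().strip("\"'")
    return config


def preflight(config):
    """Return a list of problems; an empty list means the basics look fine."""
    problems = [f"{key} is not set" for key in REQUIRED if key not in config]
    for key in ("TEST_DIR", "SCRATCH_MNT"):
        path = config.get(key)
        if path and not os.path.isdir(path):
            problems.append(f"{key}={path} is not a directory")
    return problems


if __name__ == "__main__":
    with open("local.config") as f:
        for problem in preflight(parse_local_config(f.read())):
            print(problem)
```

The check deliberately does not look at the devices themselves, since inspecting loop devices usually requires root.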

bugproxy (bugproxy) wrote : Test full log

Default Comment by Bridge

tags: added: architecture-ppc64le bugnameltc-156899 severity-medium targetmilestone-inin1710
Changed in ubuntu:
assignee: nobody → Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage)
affects: ubuntu → linux (Ubuntu)
Changed in ubuntu-power-systems:
importance: Undecided → Medium
assignee: nobody → Canonical Kernel Team (canonical-kernel-team)
Joseph Salisbury (jsalisbury) wrote :

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v4.13 kernel[0].

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.13-rc4

tags: added: kernel-da-key
Changed in linux (Ubuntu):
importance: Undecided → Medium
status: New → Incomplete
Changed in ubuntu-power-systems:
status: New → Incomplete
bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2017-08-14 01:33 EDT-------
(In reply to comment #6)
> Would it be possible for you to test the latest upstream kernel? Refer to
> https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v4.13
> kernel[0].
>

Issue not observed with upstream kernel, looks like it has been resolved with 4.13 kernel.

# ./check tests/generic/445
FSTYP -- ext4
PLATFORM -- Linux/ppc64le ltc-test-ci2 4.13.0-041300rc4-generic
MKFS_OPTIONS -- /dev/loop1
MOUNT_OPTIONS -- -o acl,user_xattr /dev/loop1 /mnt/scratch

generic/445 0s
Ran: generic/445
Passed all 1 tests

Thanks,
Harish

Joseph Salisbury (jsalisbury) wrote :

Thanks for testing the upstream kernel. Since the bug is fixed in 4.13-rc4, we should be able to perform a "Reverse" bisect to identify the commit that fixes the bug.

We first need to identify the last kernel version that had the bug and the first kernel version that did not.

Can you test the following kernel:
4.13-rc1: http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.13-rc1

Changed in linux (Ubuntu):
status: Incomplete → In Progress
Changed in ubuntu-power-systems:
status: Incomplete → In Progress
Changed in linux (Ubuntu):
assignee: Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage) → Joseph Salisbury (jsalisbury)
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2017-09-04 02:17 EDT-------
(In reply to comment #8)
> Can you test the following kernel:
> 4.13-rc1: http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.13-rc1

The test fails with 4.13-rc1.

FSTYP -- ext4
PLATFORM -- Linux/ppc64le alp4 4.13.0-041300rc1-generic
MKFS_OPTIONS -- /dev/loop1
MOUNT_OPTIONS -- -o acl,user_xattr /dev/loop1 /mnt/scratch

generic/445 [failed, exit status 1] - output mismatch (see /root/xfstests-dev/results//generic/445.out.bad)
    --- tests/generic/445.out 2017-08-14 00:15:40.110124192 -0500
    +++ /root/xfstests-dev/results//generic/445.out.bad 2017-09-04 01:16:00.288000000 -0500
    @@ -1,2 +1,3 @@
     QA output created by 445
    -Silence is golden
    +seek sanity check failed!
    +(see /root/xfstests-dev/results//generic/445.full for details)
    ...
    (Run 'diff -u tests/generic/445.out /root/xfstests-dev/results//generic/445.out.bad' to see the entire diff)
Ran: generic/445
Failures: generic/445
Failed 1 of 1 tests

Thanks,
Harish

Joseph Salisbury (jsalisbury) wrote :

Can you next test the following two kernels:

4.13-rc2: http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.13-rc2
4.13-rc3: http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.13-rc3

We should know the last bad kernel and first good kernel after getting these test results.

If 4.13-rc2 is good, you don't have to test 4.13-rc3.

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2017-09-06 08:51 EDT-------
(In reply to comment #10)
> Can you next test the following two kernels:
>
> 4.13-rc2: http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.13-rc2
> 4.13-rc3: http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.13-rc3

The issue is observed in both of the above kernels, which means 4.13-rc3 is the last bad kernel.

Thanks,
Harish

Joseph Salisbury (jsalisbury) wrote :

I started a "Reverse" kernel bisect between v4.13-rc3 (last bad) and v4.13-rc4 (first good). The kernel bisect will require testing of about 7-10 test kernels.

I built the first test kernel, up to the following commit:
a64c40e79fb20c15e42e184d7cde30f900d138eb

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1709738

Can you test that kernel and report back if it has the bug or not? I will build the next test kernel based on your test results.
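
To make the "reverse" bisect concrete, the following Python sketch shows the search being described: a binary search for the first commit where the bug is fixed, given one known-broken and one known-fixed endpoint. It is an illustration only, not the tooling used for this bug; the function name reverse_bisect, the integer stand-ins for commits, and the figure of 600 commits are all hypothetical. With git itself, the same idea is usually expressed by starting the bisect with custom terms, for example `git bisect start --term-old=broken --term-new=fixed`.

```python
import math


def reverse_bisect(commits, is_fixed):
    """Find the first fixed commit in a list ordered oldest to newest.

    Assumes commits[0] is known broken, commits[-1] is known fixed, and the
    bug stays fixed once fixed. Returns (first_fixed_commit, tested_commits).
    """
    lo, hi = 0, len(commits) - 1  # lo: last known broken, hi: first known fixed
    tested = []
    while hi - lo > 1:
        mid = (lo + hi) // 2
        tested.append(commits[mid])  # one test kernel is built per iteration
        if is_fixed(commits[mid]):
            hi = mid
        else:
            lo = mid
    return commits[hi], tested


if __name__ == "__main__":
    # Hypothetical range: 601 commits, with the fix landing at index 417.
    commits = list(range(601))
    first_fixed, tested = reverse_bisect(commits, lambda c: c >= 417)
    print("first fixed commit:", first_fixed)
    print("kernels tested:", len(tested))
    print("upper bound:", math.ceil(math.log2(len(commits) - 1)))
```

Each iteration halves the remaining range, so the number of test kernels grows with the base-2 logarithm of the commit count, which is why a range of several hundred commits needs only a handful of builds.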

Thanks in advance

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2017-09-07 01:02 EDT-------
(In reply to comment #12)
>
> The test kernel can be downloaded from:
> http://kernel.ubuntu.com/~jsalisbury/lp1709738

The link does not contain any ppc-specific kernel. Do you want to install the provided kernel and test?

Harish S

Seth Forshee (sforshee) wrote :

Note that we plan to switch artful to 4.13 soon, and the 4.12 kernel in artful-proposed is likely to be our last kernel based on 4.12. So it may not be worth the effort to bisect.

Joseph Salisbury (jsalisbury) wrote :

@Harish, I agree with Seth. The fix will make its way into Artful when we switch to the 4.13 kernel.

Manoj Iyer (manjo)
tags: added: triage-g
Joseph Salisbury (jsalisbury) wrote :

Does this bug still happen on Artful with the latest updates?

Changed in linux (Ubuntu):
status: In Progress → Incomplete
Changed in linux (Ubuntu Artful):
status: In Progress → Incomplete
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2017-11-08 11:21 EDT-------
(In reply to comment #16)
>
> Does this bug still happen on Artful with the latest updates?

Oh, somehow missed the above comment..

# ./check tests/generic/445
FSTYP -- ext4
PLATFORM -- Linux/ppc64le ltcalpine-lp9 4.13.0-16-generic
MKFS_OPTIONS -- /dev/loop1
MOUNT_OPTIONS -- -o acl,user_xattr /dev/loop1 /mnt/scratch

generic/445 0s
Ran: generic/445
Passed all 1 tests

Test passed with latest artful kernel. Issue is resolved.

Thanks,
Harish

Changed in linux (Ubuntu):
status: Incomplete → Fix Released
Changed in linux (Ubuntu Artful):
status: Incomplete → Fix Released
Manoj Iyer (manjo)
Changed in ubuntu-power-systems:
status: In Progress → Fix Released