Ubuntu16.04:KVM - xfs_log_force: error -5 returned after stress test running after a day #345

Bug #1775527 reported by bugproxy
This bug affects 1 person
Affects                          | Status  | Importance | Assigned to           | Milestone
The Ubuntu-power-systems project | Invalid | High       | Canonical Kernel Team |
linux (Ubuntu)                   | Invalid | High       | Canonical Kernel Team |
Xenial                           | Invalid | High       | Canonical Kernel Team |

Bug Description

== Comment: #0 - Chanh H. Nguyen <email address hidden> - 2016-09-08 15:01:30 ==
We installed Ubuntu 16.04.1 on our guest and are running an IO stress test against a virtual QLogic disk. After about 8 hours, the error "xfs_imap_to_bp: xfs_trans_read_buf() returned error -5" appears, followed by repeated "xfs_log_force: error -5 returned" messages that fill up the log.

Jul 19 19:03:41 tinyg2 kernel: [65047.974604] XFS (vdb9): metadata I/O error: block 0x2f140 ("xfs_trans_read_buf_map") error 5 numblks 32
Jul 19 19:03:41 tinyg2 kernel: [65047.975885] XFS (vdb9): xfs_imap_to_bp: xfs_trans_read_buf() returned error -5.
Jul 19 19:03:41 tinyg2 kernel: [65047.976069] XFS (vdb9): xfs_do_force_shutdown(0x8) called from line 3433 of file /build/linux-JQcyrX/linux-4.4.0/fs/xfs/xfs_inode.c. Return address = 0xd0000000037951f4
Jul 19 19:03:41 tinyg2 kernel: [65047.976457] XFS (vdb9): metadata I/O error: block 0x703d54 ("xlog_iodone") error 5 numblks 64
Jul 19 19:03:41 tinyg2 kernel: [65047.976662] XFS (vdb9): xfs_do_force_shutdown(0x2) called from line 1197 of file /build/linux-JQcyrX/linux-4.4.0/fs/xfs/xfs_log.c. Return address = 0xd0000000037a4c00
Jul 19 19:03:41 tinyg2 kernel: [65047.976995] XFS (vdb9): Log I/O Error Detected. Shutting down filesystem
Jul 19 19:03:41 tinyg2 kernel: [65047.977114] XFS (vdb9): Please umount the filesystem and rectify the problem(s)

This is followed by the errors below, which quickly fill up the dmesg log:
Jul 19 19:04:37 tinyg2 kernel: [65103.515386] XFS (vdb9): xfs_log_force: error -5 returned.
Jul 19 19:05:07 tinyg2 kernel: [65133.595369] XFS (vdb9): xfs_log_force: error -5 returned.
Jul 19 19:05:37 tinyg2 kernel: [65163.679358] XFS (vdb9): xfs_log_force: error -5 returned.
Jul 19 19:06:07 tinyg2 kernel: [65193.755383] XFS (vdb9): xfs_log_force: error -5 returned.
Jul 19 19:06:37 tinyg2 kernel: [65223.835396] XFS (vdb9): xfs_log_force: error -5 returned.
Jul 19 19:06:51 tinyg2 kernel: [65237.091455] EXT4-fs (loop1): mounting ext2 file system using the ext4 subsystem
Jul 19 19:06:51 tinyg2 kernel: [65237.093219] EXT4-fs (loop1): mounted filesystem without journal. Opts: (null)
Jul 19 19:06:51 tinyg2 kernel: [65237.623893] Process accounting resumed
Jul 19 19:07:07 tinyg2 kernel: [65253.919361] XFS (vdb9): xfs_log_force: error -5 returned.
Jul 19 19:07:37 tinyg2 kernel: [65283.995393] XFS (vdb9): xfs_log_force: error -5 returned.
Jul 19 19:08:08 tinyg2 kernel: [65314.075378] XFS (vdb9): xfs_log_force: error -5 returned.
Jul 19 19:08:38 tinyg2 kernel: [65344.155397] XFS (vdb9): xfs_log_force: error -5 returned.
Jul 19 19:09:08 tinyg2 kernel: [65374.235367] XFS (vdb9): xfs_log_force: error -5 returned.
Jul 19 19:09:38 tinyg2 kernel: [65404.319354] XFS (vdb9): xfs_log_force: error -5 returned.
Jul 19 19:10:08 tinyg2 kernel: [65434.395368] XFS (vdb9): xfs_log_force: error -5 returned.
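
When triaging a log like the one above, it can help to separate the first shutdown event from the repeated follow-up messages. The following is a minimal sketch, not a tool used in this report: the function name `summarize_xfs_errors` and both regular expressions are assumptions made for illustration. It relies only on the fact that errno 5 is EIO on Linux, so the "error 5" and "error -5" values in these messages indicate an I/O error reported by the layers below the filesystem.

```python
import errno
import re
from collections import Counter

# Matches the "XFS (<device>): <message>" part of a kernel log line.
XFS_LINE = re.compile(r"XFS \((?P<dev>[^)]+)\): (?P<msg>.*)")
# Matches "error 5" or "error -5" inside a message.
ERR_CODE = re.compile(r"error -?(?P<code>\d+)")


def summarize_xfs_errors(lines):
    """Return (log_force_counts, first_shutdown, errno_names) for log lines.

    log_force_counts: Counter of xfs_log_force messages per device.
    first_shutdown:   first xfs_do_force_shutdown line seen per device.
    errno_names:      symbolic names of the error codes that appear.
    """
    log_force_counts = Counter()
    first_shutdown = {}
    errno_names = set()
    for line in lines:
        match = XFS_LINE.search(line)
        if match is None:
            continue  # not an XFS message (for example the EXT4 lines)
        dev, msg = match.group("dev"), match.group("msg")
        code = ERR_CODE.search(msg)
        if code:
            number = int(code.group("code"))
            errno_names.add(errno.errorcode.get(number, "unknown"))
        if msg.startswith("xfs_log_force:"):
            log_force_counts[dev] += 1
        elif "xfs_do_force_shutdown" in msg and dev not in first_shutdown:
            first_shutdown[dev] = line.strip()
    return log_force_counts, first_shutdown, errno_names


if __name__ == "__main__":
    sample = [
        'Jul 19 19:03:41 tinyg2 kernel: [65047.974604] XFS (vdb9): metadata I/O error: '
        'block 0x2f140 ("xfs_trans_read_buf_map") error 5 numblks 32',
        "Jul 19 19:03:41 tinyg2 kernel: [65047.976069] XFS (vdb9): xfs_do_force_shutdown(0x8) "
        "called from line 3433 of file /build/linux-JQcyrX/linux-4.4.0/fs/xfs/xfs_inode.c. "
        "Return address = 0xd0000000037951f4",
        "Jul 19 19:04:37 tinyg2 kernel: [65103.515386] XFS (vdb9): xfs_log_force: error -5 returned.",
    ]
    counts, shutdowns, names = summarize_xfs_errors(sample)
    print(dict(counts), names)
    print(shutdowns)
```

The first shutdown line is usually the more useful one for diagnosis, since the later xfs_log_force messages only repeat that the filesystem is already shut down.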

== Comment: #1 - Chanh H. Nguyen <email address hidden> - 2016-09-08 15:02:27 ==
We recently re-created this bug on our new system "cressg2", which uses a different SAN and runs kernel -36.
My Ubuntu 16.04.1 guest, cressg2, is running the IO and other stress tests, and it hits the "xfs_log_force: error -5 returned" problem after running the tests for 15 hours. It uses our lab storage (SVC SAN storage configuration with access to a DS8000 volume) and switch, so compared to tinyg3 it is on different storage and a different fabric switch.

root@cressg2:~# uname -a
Linux cressg2 4.4.0-36-generic #55-Ubuntu SMP Thu Aug 11 18:00:57 UTC 2016 ppc64le ppc64le ppc64le GNU/Linux

[Wed Sep 7 10:26:48 2016] XFS (vdb8): xfs_log_force: error -5 returned.
[Wed Sep 7 10:27:18 2016] XFS (vdb8): xfs_log_force: error -5 returned.
[Wed Sep 7 10:27:32 2016] XFS (vdb8): xfs_log_force: error -5 returned.
[Wed Sep 7 10:27:33 2016] XFS (vdb8): xfs_log_force: error -5 returned.
[Wed Sep 7 10:27:33 2016] XFS (vdb8): xfs_log_force: error -5 returned.
[Wed Sep 7 10:27:48 2016] XFS (vdb8): xfs_log_force: error -5 returned.
[Wed Sep 7 10:27:48 2016] XFS (vdb8): xfs_log_force: error -5 returned.
[Wed Sep 7 10:27:49 2016] XFS (vdb8): xfs_log_force: error -5 returned.
[Wed Sep 7 10:27:56 2016] XFS (vdb8): xfs_log_force: error -5 returned.
[Wed Sep 7 10:28:18 2016] XFS (vdb8): xfs_log_force: error -5 returned.
[Wed Sep 7 10:28:48 2016] XFS (vdb8): xfs_log_force: error -5 returned.
[Wed Sep 7 10:29:18 2016] XFS (vdb8): xfs_log_force: error -5 returned.

bugproxy (bugproxy) wrote : sosreport of guest

Default Comment by Bridge

tags: added: architecture-ppc64le bugnameltc-146012 severity-high targetmilestone-inin16045
bugproxy (bugproxy) wrote : sosreport of the Host

Default Comment by Bridge

bugproxy (bugproxy) wrote : dmesg log

Default Comment by Bridge

bugproxy (bugproxy) wrote : kern.log

Default Comment by Bridge

Changed in ubuntu:
assignee: nobody → Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage)
affects: ubuntu → linux (Ubuntu)
bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2018-06-07 03:00 EDT-------
*** Bug 146317 has been marked as a duplicate of this bug. ***

summary: - ISST-LTE:Briggs:Ubuntu16.04:KVM - xfs_log_force: error -5 returned after
- stress test running after a day #345
+ Ubuntu16.04:KVM - xfs_log_force: error -5 returned after stress test
+ running after a day #345
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2018-06-07 04:39 EDT-------
Hello Canonical,

Can you please look into the XFS errors reported here?

tags: added: triage-g
Changed in ubuntu-power-systems:
importance: Undecided → High
assignee: nobody → Canonical Kernel Team (canonical-kernel-team)
Frank Heimes (fheimes) wrote :

I'm personally not an xfs expert, but let me leave these two comments:
- The test was unfortunately done on an old and outdated kernel from 16.04.1: 4.4.0-36,
  which has already been replaced by several newer kernels.
  Today we are already on level 4.4.0.127.133
  (and an even newer one is in the 'pipe', in proposed: 4.4.0.128.134):
     $ rmadison --arch=ppc64el linux-generic | grep xenial
      linux-generic | 4.4.0.21.22 | xenial | ppc64el
      linux-generic | 4.4.0.127.133 | xenial-security | ppc64el
      linux-generic | 4.4.0.127.133 | xenial-updates | ppc64el
      linux-generic | 4.4.0.128.134 | xenial-proposed | ppc64el
  So please always update the kernel to the latest version from <ubuntu-release>-updates
  (or even proposed) before running these kinds of tests.
  Just make sure you have access to the updates (Ubuntu archive mirror) and run (something like):
    sudo apt update && sudo apt full-upgrade && sudo apt autoremove
  (and if a kernel update was installed, reboot to activate it).
  So please re-run this test on the latest 4.4.0 version (today either 4.4.0.127.133
  or even 4.4.0.128.134 from proposed).
- Does this problem also happen with the (latest) upstream 4.4 kernel?
  Knowing this would help to narrow down the issue.
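
Before re-running a long stress test, the check suggested above (is the guest on the current kernel?) could be scripted. The following is a rough sketch only. The helper names `parse_release` and `is_older` are hypothetical, and it assumes that the meta-package version 4.4.0.127.133 corresponds to a kernel release string of the form 4.4.0-127-generic, which is not stated in this report.

```python
import re


def parse_release(release):
    """Split a release string such as '4.4.0-36-generic' into a tuple of ints."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)-(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return tuple(int(part) for part in match.groups())


def is_older(running, target):
    """Return True if the running release sorts before the target release."""
    return parse_release(running) < parse_release(target)


if __name__ == "__main__":
    # Values taken from this report; the target string is an assumed
    # translation of the 4.4.0.127.133 meta-package version.
    running = "4.4.0-36-generic"
    target = "4.4.0-127-generic"
    if is_older(running, target):
        print(f"{running} is older than {target}: update and reboot before testing")
    else:
        print(f"{running} is at or beyond {target}")
```

On a live system the running release could be read with `platform.release()` from the Python standard library instead of being hard-coded.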

Changed in ubuntu-power-systems:
status: New → Triaged
Joseph Salisbury (jsalisbury) wrote :

I'll mark this bug incomplete until there are test results using the latest Xenial 4.4 kernel.

tags: added: kernel-da-key
Changed in linux (Ubuntu):
status: New → Incomplete
Changed in linux (Ubuntu Xenial):
status: New → Incomplete
Changed in linux (Ubuntu):
importance: Undecided → High
Changed in linux (Ubuntu Xenial):
importance: Undecided → High
Frank Heimes (fheimes)
Changed in ubuntu-power-systems:
status: Triaged → Incomplete
Manoj Iyer (manjo)
Changed in linux (Ubuntu):
assignee: Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage) → Canonical Kernel Team (canonical-kernel-team)
Changed in linux (Ubuntu Xenial):
assignee: nobody → Canonical Kernel Team (canonical-kernel-team)
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2018-10-11 10:40 EDT-------
Given the age of this bug and the fact that there is currently no test setup to reproduce this issue on a recent level of code, I'm going to reject this at this point. If this issue is seen again on a recent level of Ubuntu, please reopen and we can then take a look. Thanks.

Andrew Cloke (andrew-cloke) wrote :

Thanks for the update. Marking as "invalid".

Changed in ubuntu-power-systems:
status: Incomplete → Invalid
Changed in linux (Ubuntu):
status: Incomplete → Invalid
Changed in linux (Ubuntu Xenial):
status: Incomplete → Invalid
bugproxy (bugproxy)
tags: added: targetmilestone-inin---
removed: targetmilestone-inin16045
Brad Figg (brad-figg)
tags: added: cscc