Comment 20 for bug 1020436

Peter Petrakis (peter-petrakis) wrote :

Hmm, we've got some conflicting indicators here.

lvs claims that the volumes are active. But the probe itself
is showing problems reading the volumes.

XFS is telling us that it cannot write its journal to disk.
[1039812.311433] Filesystem "dm-4": Log I/O Error Detected. Shutting down filesystem: dm-4

"fs/xfs/xfs_rw.c"

 90 /*
 91 * Force a shutdown of the filesystem instantly while keeping
 92 * the filesystem consistent. We don't do an unmount here; just shutdown
 93 * the shop, make sure that absolutely nothing persistent happens to
 94 * this filesystem after this point.
 95 */
 96 void
 97 xfs_do_force_shutdown(
...
127 if (flags & SHUTDOWN_CORRUPT_INCORE) {
128 xfs_cmn_err(XFS_PTAG_SHUTDOWN_CORRUPT, CE_ALERT, mp,
129 "Corruption of in-memory data detected. Shutting down filesystem: %s",
130 mp->m_fsname);
131 if (XFS_ERRLEVEL_HIGH <= xfs_error_level) {
132 xfs_stack_trace();
133 }
134 } else if (!(flags & SHUTDOWN_FORCE_UMOUNT)) {
135 if (logerror) {
136 xfs_cmn_err(XFS_PTAG_SHUTDOWN_LOGERROR, CE_ALERT, mp,
137 "Log I/O Error Detected. Shutting down filesystem: %s",
138 mp->m_fsname);

The logs conveniently tell us where it was called from too.
void
xlog_iodone(xfs_buf_t *bp)
{

...

        /*
         * Race to shutdown the filesystem if we see an error.
         */
        if (XFS_TEST_ERROR((XFS_BUF_GETERROR(bp)), l->l_mp,
                        XFS_ERRTAG_IODONE_IOERR, XFS_RANDOM_IODONE_IOERR)) {
                xfs_ioerror_alert("xlog_iodone", l->l_mp, bp, XFS_BUF_ADDR(bp));
                XFS_BUF_STALE(bp);
                xfs_force_shutdown(l->l_mp, SHUTDOWN_LOG_IO_ERROR);
                /*
                 * This flag will be propagated to the trans-committed
                 * callback routines to let them know that the log-commit
                 * didn't succeed.
                 */
                aborted = XFS_LI_ABORTED;

I assume dm-4 is the LV that XFS is mounted on; did you run the dd test on that?

I'm starting to wonder if the LVM device filter is lying to us: after failover, something
changes that misrepresents the LV, and then XFS bails out.

If you can perform that dd test successfully on every PV that backs dm-4, then there's
something wrong with the DM map for those LVs after failover occurs.
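
For reference, this is the kind of dd read test I mean (a rough sketch; the device
names below are examples, so substitute your actual LV, multipath map, and sd devices):

  # direct read from the LV, bypassing the page cache
  dd if=/dev/dm-4 of=/dev/null bs=1M count=100 iflag=direct

  # repeat for each multipath map and each underlying path, e.g.
  dd if=/dev/mapper/mpath0 of=/dev/null bs=1M count=100 iflag=direct
  dd if=/dev/sdb of=/dev/null bs=1M count=100 iflag=direct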

OK, what I need from you now is a before and after (same fault injection method) of:
0) ls -lR /dev/ > dev_major_minor.log
1) lvs -o lv_attr
2) pvdisplay -vvv
3) lvdisplay -vvv
4) dmsetup table -v
5) "dd test" on all block devices: lv, mp, sd
6) dmesg output

Please attach this as a single tarball that has a timestamp in the filename
and the following directory structure:

foo.tgz
  before/
  after/
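
Something along these lines will capture it all in one pass (a rough sketch only;
the device list for the dd test is an example and needs to match your LVs,
multipath maps, and sd devices):

  #!/bin/sh
  # usage: ./collect.sh before   (then, after fault injection: ./collect.sh after)
  DIR=$1
  mkdir -p "$DIR"
  ls -lR /dev/      > "$DIR/dev_major_minor.log"
  lvs -o lv_attr    > "$DIR/lvs.log" 2>&1
  pvdisplay -vvv    > "$DIR/pvdisplay.log" 2>&1
  lvdisplay -vvv    > "$DIR/lvdisplay.log" 2>&1
  dmsetup table -v  > "$DIR/dmsetup_table.log" 2>&1
  dmesg             > "$DIR/dmesg.log"
  # dd test over every relevant block device (example device names)
  for dev in /dev/dm-4 /dev/mapper/mpath0 /dev/sdb /dev/sdc; do
      echo "== $dev ==" >> "$DIR/dd_test.log"
      dd if="$dev" of=/dev/null bs=1M count=100 iflag=direct >> "$DIR/dd_test.log" 2>&1
  done

Then, once you have both runs:

  tar czf failover_$(date +%Y%m%d_%H%M).tgz before after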

If this all checks out, then what's probably happening is that when
multipath begins the failover process, there's enough of a delay that
XFS simply bails out early before I/O is ready to be sent down the
remaining paths. The group_by_prio path grouping policy may perform
better here and is something you can test.
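
If you want to try that, the change would look something like this in
/etc/multipath.conf (a sketch only; put it in your devices section instead
if that's how your config is laid out, then reload the maps with "multipath -r"):

  defaults {
          path_grouping_policy    group_by_prio
  }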

I looked at the XFS mount arguments and didn't find anything that would
make it more lenient in these situations.

If you can manage it, a LUN formatted with ext3 under these circumstances
would help rule out whether the filesystem is part of the problem.
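
Something like the following would set that up (destructive, so only on a
scratch LUN; the device name is a placeholder):

  mkfs.ext3 /dev/mapper/mpath_scratch
  mkdir -p /mnt/ext3test
  mount /dev/mapper/mpath_scratch /mnt/ext3test
  # then repeat the same fault injection and collection steps against it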

Thanks.