Comment 4 for bug 1030040

Peter Petrakis (peter-petrakis) wrote :

Hmm, a reproduction case is going to be necessary. There are some
weird things going on in syslog.2 that I'm unfamiliar with. Unfortunately,
I don't have access to a Symmetrix either. I really need to have multipathd running
in the foreground, with full logging to a file, to catch what went down.
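
Something along these lines should capture that. It's only a sketch: -d keeps multipathd in the foreground and -v sets the verbosity, but the level (3) and the log path are assumptions on my part, and run_multipathd_foreground is just an illustrative name. The regular multipathd daemon has to be stopped first.

```shell
#!/bin/sh
# Hypothetical log location; override by setting LOG before calling.
LOG=${LOG:-/tmp/multipathd-debug.log}

run_multipathd_foreground() {
    # -d  : do not daemonize, print messages to stdout/stderr
    # -v3 : verbosity level (3 is an assumed choice, raise or lower as needed)
    # stdout and stderr both go to the log file
    multipathd -d -v3 >"$LOG" 2>&1
}

# Usage (as root, with the normal multipathd service stopped):
#   run_multipathd_foreground
```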

From 7/27/12 00:38:48 -- 00:38:58 I just see lines and lines of this:
Jul 27 00:38:58 Linux51 multipathd: mpath12: sdm - emc_clariion_checker: Logical Unit is unbound or LUNZ

That's for just about every block device. This message means the path is down. It's a safeguard
against issuing IO to a LUNZ, which would block forever.

libmultipath/checkers/emc_clariion.c

        if ( /* LUN should at least be bound somewhere and not be LUNZ */
                sense_buffer[4] == 0x00) {
                MSG(c, "emc_clariion_checker: Logical Unit is unbound "
                    "or LUNZ");
                return PATH_DOWN;
        }

After a few filters...
Jul 27 05:45:36 Linux51 kernel: [188724.191711] scsi 6:0:0:0: Direct-Access DGC LUNZ 0532 PQ: 0 ANSI: 4
Jul 27 05:45:36 Linux51 kernel: [188724.193584] scsi 6:0:1:0: Direct-Access DGC LUNZ 0532 PQ: 0 ANSI: 4
Jul 27 05:45:43 Linux51 kernel: [188730.940803] scsi 5:0:0:0: Direct-Access DGC LUNZ 0532 PQ: 0 ANSI: 4
Jul 27 05:45:43 Linux51 kernel: [188730.942590] scsi 5:0:1:0: Direct-Access DGC LUNZ 0532 PQ: 0 ANSI: 4

ok, that's 4, but there were dozens in the logs, which means the rest were unbound.
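
For reference, a filter roughly like the following pulls the SCSI addresses of the LUNZ devices out of the kernel lines. This is a sketch that assumes the kernel line layout shown above; list_lunz_addresses is an illustrative name, not necessarily the filter I used.

```shell
#!/bin/sh
# Print the distinct H:C:T:L addresses the kernel reported as LUNZ.
# Assumes lines of the form:
#   ... kernel: [188724.191711] scsi 6:0:0:0: Direct-Access DGC LUNZ ...
list_lunz_addresses() {
    sed -n 's/.*kernel: \[[ 0-9.]*] scsi \([0-9:]*\): .*LUNZ.*/\1/p' |
        sort -u
}

# Usage:
#   zcat syslog.2.gz | list_lunz_addresses
```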

Then there's this:
Jul 27 00:39:49 Linux51 udevd[26799]: timeout '/sbin/blkid -o udev -p /dev/dm-53'
Jul 27 00:39:49 Linux51 udevd[26794]: timeout '/sbin/blkid -o udev -p /dev/dm-52'

Jul 27 00:39:50 Linux51 udevd[26848]: timeout: killing '/sbin/blkid -o udev -p /dev/dm-66' [26890]
Jul 27 00:39:50 Linux51 udevd[26857]: timeout: killing '/sbin/blkid -o udev -p /dev/dm-60' [26899]
...
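
To see which dm devices were affected, the timeouts can be reduced to a list of device nodes. A minimal sketch, assuming the udevd message format shown above (list_blkid_timeouts is an illustrative name):

```shell
#!/bin/sh
# Print each /dev/dm-N that appears in a udevd blkid timeout message, once.
# Matches both the "timeout '...'" and "timeout: killing '...'" variants.
list_blkid_timeouts() {
    sed -n "s|.*udevd\[[0-9]*]: timeout[^']*'/sbin/blkid [^']*\(/dev/dm-[0-9]*\)'.*|\1|p" |
        sort -u
}

# Usage:
#   zcat syslog.2.gz | list_blkid_timeouts
```

Comparing that list against the maps reported by multipath -ll might show whether the timed-out devices are the same ones whose paths were marked down.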

Something is off. If multipath did create two maps to the same device, I suspect it's a victim
of these path availability issues.

<11:37:45>multipath_conf$ zcat syslog.2.gz | grep unbound | sort -u | wc -l
43055

That can't be right.
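
One possible reason the number looks so large: sort -u compares whole lines, timestamps included, so it counts distinct log lines rather than distinct paths. A rough sketch for counting distinct map/path pairs instead, assuming every matching line follows the multipathd format quoted earlier (count_unbound_paths is an illustrative name):

```shell
#!/bin/sh
# Count distinct "map path" pairs that hit the unbound/LUNZ check.
# Assumes lines of the form:
#   ... multipathd: mpath12: sdm - emc_clariion_checker: Logical Unit is unbound or LUNZ
count_unbound_paths() {
    grep 'Logical Unit is unbound or LUNZ' |
        sed -n 's/.*multipathd: \([^:]*\): \([^ ]*\) - emc_clariion_checker.*/\1 \2/p' |
        sort -u |
        wc -l |
        tr -d ' '    # some wc implementations pad the count with spaces
}

# Usage:
#   zcat syslog.2.gz | count_unbound_paths
```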