Activity log for bug #1718397

Date Who What changed Old value New value Message
2017-09-20 11:09:24 bugproxy bug added bug
2017-09-20 11:09:26 bugproxy tags architecture-ppc64le bugnameltc-146489 severity-critical targetmilestone-inin16043
2017-09-20 11:09:31 bugproxy attachment added output of multipath -ll with verbosity increased to 6 https://bugs.launchpad.net/bugs/1718397/+attachment/4953447/+files/multipath-trace.log
2017-09-20 11:09:32 bugproxy attachment added test-case: io_setup.c https://bugs.launchpad.net/bugs/1718397/+attachment/4953448/+files/io_setup.c
2017-09-20 11:09:34 bugproxy attachment added test-case: io_setup_v2.c https://bugs.launchpad.net/bugs/1718397/+attachment/4953449/+files/file_146489.txt
2017-09-20 11:09:35 bugproxy ubuntu: assignee Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage)
2017-09-20 11:09:39 bugproxy affects ubuntu kernel-package (Ubuntu)
2017-09-20 11:10:49 Andrew Cloke bug task added ubuntu-power-systems
2017-09-20 11:11:00 Andrew Cloke ubuntu-power-systems: assignee Canonical Kernel Team (canonical-kernel-team)
2017-09-20 11:11:03 Andrew Cloke ubuntu-power-systems: importance Undecided Critical
2017-09-20 14:30:52 Mauricio Faria de Oliveira description
Problem Description
=================================
I am facing this issue with Texan Flash storage 840 disks presented through the Coho and Sailfish adapters. The disks behind the Coho adapter are 3G and the disks behind the Sailfish adapter are 12G. I can see those disks in the lsblk output but not in the output of the multipath -ll command.

0004:01:00.0 Coho: Saturn-X U78C9.001.WZS0060-P1-C6 0x10000090fa2a51f8 host10 Online
0004:01:00.1 Coho: Saturn-X U78C9.001.WZS0060-P1-C6 0x10000090fa2a51f9 host11 Online
0005:09:00.0 Sailfish: QLogic 8GB U78C9.001.WZS0060-P1-C9 0x21000024ff787778 host2 Online
0005:09:00.1 Sailfish: QLogic 8GB U78C9.001.WZS0060-P1-C9 0x21000024ff787779 host4 Online

root@luckyv1:/dev/disk# multipath -ll | grep "size=3.0G" -B 1
root@luckyv1:/dev/disk# multipath -ll | grep "size=12G" -B 1
root@luckyv1:/dev/disk#

== Comment: #3 - Luciano Chavez <chavez@us.ibm.com> - 2016-09-20 20:22:20 ==
I edited /etc/multipath.conf, added verbosity 6 to crank up the output, ran multipath -ll, and saved the output to a text file (attached). All the paths using the directio checker failed, and those using the tur checker seem to work.
Sep 20 20:07:36 | loading //lib/multipath/libcheckdirectio.so checker Sep 20 20:07:36 | loading //lib/multipath/libprioconst.so prioritizer Sep 20 20:07:36 | Discover device /sys/devices/pci0000:00/0000:00:00.0/0000:01:00.0/host3/rport-3:0-2/target3:0:0/3:0:0:0/block/sdai Sep 20 20:07:36 | sdai: udev property ID_WWN whitelisted Sep 20 20:07:36 | sdai: not found in pathvec Sep 20 20:07:36 | sdai: mask = 0x25 Sep 20 20:07:36 | sdai: dev_t = 66:32 Sep 20 20:07:36 | open '/sys/devices/pci0000:00/0000:00:00.0/0000:01:00.0/host3/rport-3:0-2/target3:0:0/3:0:0:0/block/sdai/size' Sep 20 20:07:36 | sdai: size = 20971520 Sep 20 20:07:36 | sdai: vendor = IBM Sep 20 20:07:36 | sdai: product = FlashSystem-9840 Sep 20 20:07:36 | sdai: rev = 1442 Sep 20 20:07:36 | sdai: h:b:t:l = 3:0:0:0 Sep 20 20:07:36 | SCSI target 3:0:0 -> FC rport 3:0-2 Sep 20 20:07:36 | sdai: tgt_node_name = 0x500507605e839800 Sep 20 20:07:36 | open '/sys/devices/pci0000:00/0000:00:00.0/0000:01:00.0/host3/rport-3:0-2/target3:0:0/3:0:0:0/state' Sep 20 20:07:36 | sdai: path state = running Sep 20 20:07:36 | sdai: get_state Sep 20 20:07:36 | sdai: path_checker = directio (internal default) Sep 20 20:07:36 | sdai: checker timeout = 30 ms (internal default) Sep 20 20:07:36 | io_setup failed Sep 20 20:07:36 | sdai: checker init failed == Comment: #4 - Luciano Chavez <chavez@us.ibm.com> - 2016-09-20 21:00:23 == Hello Mauricio, I edited /etc/multipath.conf to use the tur path checker but the output still shows directio. Ideas? devices { device { vendor "IBM " product "FlashSystem-9840" path_selector "round-robin 0" path_grouping_policy multibus path_checker tur rr_weight uniform no_path_retry fail failback immediate dev_loss_tmo 300 fast_io_fail_tmo 25 } } == Comment: #7 - Mauricio Faria De Oliveira <mauricfo@br.ibm.com> - 2016-09-27 18:32:57 == The function is failing at the io_setup() system call. 
@ checkers/directio.c int libcheck_init (struct checker * c) { unsigned long pgsize = getpagesize(); struct directio_context * ct; long flags; ct = malloc(sizeof(struct directio_context)); if (!ct) return 1; memset(ct, 0, sizeof(struct directio_context)); if (io_setup(1, &ct->ioctx) != 0) { condlog(1, "io_setup failed"); free(ct); return 1; } <...> The syscall is failing w/ EAGAIN # grep ^io_setup multipath_-v2_-d.strace io_setup(1, 0x100163c9130) = -1 EAGAIN (Resource temporarily unavailable) io_setup(1, 0x10015bae2c0) = -1 EAGAIN (Resource temporarily unavailable) io_setup(1, 0x100164d65a0) = -1 EAGAIN (Resource temporarily unavailable) io_setup(1, 0x10016429f20) = -1 EAGAIN (Resource temporarily unavailable) io_setup(1, 0x100163535c0) = -1 EAGAIN (Resource temporarily unavailable) io_setup(1, 0x10016368510) = -1 EAGAIN (Resource temporarily unavailable) <...> According to the manpage (man 2 io_setup) NAME io_setup - create an asynchronous I/O context DESCRIPTION The io_setup() system call creates an asynchronous I/O context suitable for concurrently processing nr_events operations. <...> ERRORS EAGAIN The specified nr_events exceeds the user's limit of available events, as defined in /proc/sys/fs/aio-max-nr. On luckyv1: root@luckyv1:~/mauricfo/bz146849/sep27# cat /proc/sys/fs/aio-max-nr 65536 root@luckyv1:~/mauricfo/bz146849/sep27# cat /proc/sys/fs/aio-nr 130560 According to linux's Documentation/sysctl/fs.txt [1] aio-nr & aio-max-nr: aio-nr is the running total of the number of events specified on the io_setup system call for all currently active aio contexts. If aio-nr reaches aio-max-nr then io_setup will fail with EAGAIN. Note that raising aio-max-nr does not result in the pre-allocation or re-sizing of any kernel data structures. Interestingly, aio-nr is greater than aio-max-nr. Hm. Increased aio-max-nr to 262144, and could get some more maps created. 
Accidentally killed multipathd and got this: root@luckyv1:~/mauricfo/bz146849/sep27# cat /proc/sys/fs/aio-max-nr 262144 root@luckyv1:~/mauricfo/bz146849/sep27# cat /proc/sys/fs/aio-nr 0 With just a few io_setup failures (previously there were hundreds..) root@luckyv1:~/mauricfo/bz146849/sep27# multipath -v2 Sep 27 17:11:22 | io_setup failed Sep 27 17:11:22 | io_setup failed Sep 27 17:11:22 | io_setup failed Sep 27 17:11:22 | io_setup failed Sep 27 17:11:22 | io_setup failed Sep 27 17:11:22 | io_setup failed Sep 27 17:11:22 | io_setup failed Sep 27 17:11:22 | io_setup failed Sep 27 17:11:22 | io_setup failed Sep 27 17:11:22 | io_setup failed Sep 27 17:11:22 | io_setup failed Sep 27 17:11:22 | io_setup failed Sep 27 17:11:22 | io_setup failed Sep 27 17:11:22 | io_setup failed Sep 27 17:11:22 | io_setup failed Sep 27 17:11:22 | mpathcs: ignoring map Sep 27 17:11:22 | mpathcs: ignoring map Then noticed a huge increase: root@luckyv1:~/mauricfo/bz146849/sep27# cat /proc/sys/fs/aio-nr 354560 multipathd was restarted automatically. The number decreases when multipathd is being stopped/shutdown: let this running: root@luckyv1:~# systemctl stop multipathd and watch it: root@luckyv1:~/mauricfo/bz146849/sep27# cat /proc/sys/fs/aio-nr 354560 root@luckyv1:~/mauricfo/bz146849/sep27# cat /proc/sys/fs/aio-nr 276480 root@luckyv1:~/mauricfo/bz146849/sep27# cat /proc/sys/fs/aio-nr 267520 root@luckyv1:~/mauricfo/bz146849/sep27# cat /proc/sys/fs/aio-nr 262400 ... 
root@luckyv1:~/mauricfo/bz146849/sep27# cat /proc/sys/fs/aio-nr 43520 root@luckyv1:~/mauricfo/bz146849/sep27# cat /proc/sys/fs/aio-nr 29440 root@luckyv1:~/mauricfo/bz146849/sep27# cat /proc/sys/fs/aio-nr 0 Start it, and aio-nr goes high: root@luckyv1:~# cat /proc/sys/fs/aio-nr; 0 root@luckyv1:~# systemctl start multipathd root@luckyv1:~# cat /proc/sys/fs/aio-nr; 523520 And it seems there's something very wrong w/ the number of requests that are successful, and the number reported in aio-nr: root@luckyv1:~/mauricfo/bz146849/sep27# grep -c 'io_setup.*= 0' strace.multipathd_-d_-s.io_setup 409 root@luckyv1:~/mauricfo/bz146849/sep27# cat /proc/sys/fs/aio-nr 519680 root@luckyv1:~/mauricfo/bz146849/sep27# killall -9 multipathd root@luckyv1:~/mauricfo/bz146849/sep27# cat /proc/sys/fs/aio-nr 0 All request are for a single (one) AIO context (nr_events parameter): root@luckyv1:~/mauricfo/bz146849/sep27# grep -m3 'io_setup.*= 0' strace.multipathd_-d_-s.io_setup 130184 io_setup(1, [70366829412352]) = 0 130184 io_setup(1, [70366818795520]) = 0 130184 io_setup(1, [70366818729984]) = 0 Checking further.. [1] https://www.kernel.org/doc/Documentation/sysctl/fs.txt == Comment: #8 - Mauricio Faria De Oliveira <mauricfo@br.ibm.com> - 2016-09-27 18:56:08 == This attached test-case demonstrates that for each io_setup() request of 1 nr_event, actually 1280 seem to be allocated. root@luckyv1:~/mauricfo/bz146849/sep27# gcc -o io_setup io_setup.c -laio root@luckyv1:~/mauricfo/bz146849/sep27# cat /proc/sys/fs/aio-nr 0 root@luckyv1:~/mauricfo/bz146849/sep27# ./io_setup & [1] 12352 io_setup rc = 0 sleeping 10 seconds... 
root@luckyv1:~/mauricfo/bz146849/sep27# cat /proc/sys/fs/aio-nr
1280
<...>
io_destroy rc = 0
[1]+ Done ./io_setup
root@luckyv1:~/mauricfo/bz146849/sep27# cat /proc/sys/fs/aio-nr
0

== Comment: #9 - Mauricio Faria De Oliveira <mauricfo@br.ibm.com> - 2016-09-27 19:17:13 ==
(In reply to comment #8)
> This attached test-case demonstrates that for each io_setup() request of 1
> nr_event, actually 1280 seem to be allocated.

The kernel code for the io_setup() syscall (in its call to ioctx_alloc()) indeed does that.

SYSCALL_DEFINE2(io_setup, unsigned, nr_events, aio_context_t __user *, ctxp)
...
        ioctx = ioctx_alloc(nr_events);
...
static struct kioctx *ioctx_alloc(unsigned nr_events)
...
        nr_events = max(nr_events, num_possible_cpus() * 4);
        nr_events *= 2;
...

The math is:

root@luckyv1:~# cat /sys/devices/system/cpu/possible
0-159
root@luckyv1:~# echo $((160*4*2))
1280

Now we need to understand why.

== Comment: #10 - Mauricio Faria De Oliveira <mauricfo@br.ibm.com> - 2016-09-27 19:22:44 ==
Anyway, the number of paths to FlashSystem-9840 in the system is 424.

root@luckyv1:~/mauricfo/bz146849/sep27# grep FlashSystem-9840 /sys/block/sd*/device/model | wc -l
424

For each one there's an io_setup() call to allocate 1 nr_event, which actually becomes 1280, so in total we have 542720, when the default aio-max-nr is 65536.

root@luckyv1:~/mauricfo/bz146849/sep27# echo $((424*1280))
542720

So, obviously, it will exceed the max amount available (it already did, for some reason, as aio-nr showed 130560).

The next step is to understand why this is done, whether this is a bug at all (you know, that multiplier of 160 * 4 is big on Power because we have lots of threads w/ P8/SMT8), and if so, whether there's a proper "fix" for this. It may very well be working correctly, in which case we'll document this along with instructions to increase the limit for FlashSystem-9840.
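The arithmetic in comments #9 and #10 can be sketched in C as follows. This is only an illustrative user-space model of the two kernel lines quoted above, not kernel code: the function names accounted_events and total_demand are hypothetical, and the CPU count is passed in as a parameter instead of being read from num_possible_cpus().

```c
#include <assert.h>

/* Hypothetical model of the pre-patch sizing quoted from ioctx_alloc():
 *   nr_events = max(nr_events, num_possible_cpus() * 4);
 *   nr_events *= 2;
 * possible_cpus stands in for num_possible_cpus(). */
unsigned long accounted_events(unsigned long nr_events,
                               unsigned long possible_cpus)
{
    unsigned long floor_events = possible_cpus * 4;

    assert(possible_cpus > 0);
    if (nr_events < floor_events)
        nr_events = floor_events;
    return nr_events * 2;
}

/* Total events charged to aio-nr if every path gets its own context. */
unsigned long total_demand(unsigned long paths, unsigned long nr_events,
                           unsigned long possible_cpus)
{
    return paths * accounted_events(nr_events, possible_cpus);
}
```

With the values from luckyv1 (160 possible CPUs, 424 paths, nr_events of 1), accounted_events(1, 160) gives 1280 and total_demand(424, 1, 160) gives 542720, matching the shell arithmetic in the comments.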
== Comment: #11 - Mauricio Faria De Oliveira <mauricfo@br.ibm.com> - 2016-09-27 19:30:47 ==
(In reply to comment #10)
> For each one there's an io_setup() call to allocate 1 nr_event, which
> actually becomes 1280, so we have in total... 542720, when the default
> aio-max-nr is 65536
<...>
> So, obviously, it will exceed the max amount available (it already did, for
> some reason, as aio-nr showed 130560).

Got that; the check is against 2x aio-max-nr:

        if (aio_nr + nr_events > (aio_max_nr * 2UL)

130560 + 1280 = 131840, which is greater than 65536 * 2 = 131072, so the check fails from that point on.
130560 (with the default aio-max-nr of 64k) is enough for 102 paths (130560 / 1280).

== Comment: #14 - LEKSHMI C. PILLAI <lekshmi.cpillai@in.ibm.com> - 2016-09-28 07:57:03 ==
Hi,

I applied the workaround of increasing aio-max-nr on luckyv1. I first changed the default value to 1048576, which didn't work, and then changed it to 4194304:

root@luckyv1:/test/lucky# cat /proc/sys/fs/aio-max-nr
4194304
root@luckyv1:/test/lucky#

How I did it: edited the value in /etc/sysctl.conf and ran sysctl -p /etc/sysctl.conf. After that I am able to see all the disks.

Thanks for the workaround.

Thanks,
Lekshmi

== Comment: #15 - Mauricio Faria De Oliveira <mauricfo@br.ibm.com> - 2016-09-28 08:44:39 ==
(In reply to comment #14)
> I applied the workaround of increasing aio-max-nr on luckyv1. I first changed
> the default value to 1048576, which didn't work, and then changed it to 4194304
<...>
> After that I am able to see all the disks

Yes, the actual value that works may vary depending on what multipathd is doing, and on when you ran which commands before/after multipathd was running. The point is, multipathd allocates a number of events. I'd expect 424 FS9840 paths * 1280 events/path = 542720 events (if multipathd doesn't allocate a few for itself, independent of paths), so theoretically, half of that (271360) or close should pass.
However, if multipathd was already running (ie, allocated a number of events), and /then/ you run multipath command to test, multipath would try to allocate /more/ events on top of those already allocated by multipathd, which would be much more likely to fail because a large number was already allocated. > Thanks for the workaround You're welcome! == Comment: #19 - Mauricio Faria De Oliveira <mauricfo@br.ibm.com> - 2016-10-03 10:44:10 == == Comment: #45 - Mauricio Faria De Oliveira <mauricfo@br.ibm.com> - 2017-09-19 18:32:10 == Verification of this commit with the linux-hwe-edge kernel in -proposed, using the attached test-case "io_setup_v2.c" commit 2a8a98673c13cb2a61a6476153acf8344adfa992 Author: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com> Date: Wed Jul 5 10:53:16 2017 -0300 fs: aio: fix the increment of aio-nr and counting against aio-max-nr Test-case (attached) $ sudo apt-get install gcc libaio-dev $ gcc -o io_setup_v2 io_setup_v2.c -laio Original kernel: - Only 409 io_contexts could be allocated, but that took 130880 [ div by 2, per bug] = 65440 slots out of 65535 $ uname -rv 4.11.0-14-generic #20~16.04.1-Ubuntu SMP Wed Aug 9 09:06:18 UTC 2017 $ ./io_setup_v2 1 65536 nr_events: 1, nr_requests: 65536 rc = -11, i = 409 ^Z [1]+ Stopped ./io_setup_v2 1 65536 $ cat /proc/sys/fs/aio-nr 130880 $ cat /proc/sys/fs/aio-max-nr 65536 $ kill %% Patched kernel: - Now 65515 io_contexts could be allocated out of 65535 (much better) (and reporting correctly, without div by 2.) 
$ uname -rv
4.11.0-140-generic #20~16.04.1+bz146489 SMP Tue Sep 19 17:46:15 CDT 2017
$ ./io_setup_v2 1 65536
nr_events: 1, nr_requests: 65536
rc = -12, i = 65515
^Z
[1]+ Stopped ./io_setup_v2 1 65536
$ cat /proc/sys/fs/aio-nr
65515
$ kill %%

Problem Description
=================================
I am facing this issue with Texan Flash storage 840 disks presented through the Coho and Sailfish adapters. The disks behind the Coho adapter are 3G and the disks behind the Sailfish adapter are 12G. I can see those disks in the lsblk output but not in the output of the multipath -ll command.

0004:01:00.0 Coho: Saturn-X U78C9.001.WZS0060-P1-C6 0x10000090fa2a51f8 host10 Online
0004:01:00.1 Coho: Saturn-X U78C9.001.WZS0060-P1-C6 0x10000090fa2a51f9 host11 Online
0005:09:00.0 Sailfish: QLogic 8GB U78C9.001.WZS0060-P1-C9 0x21000024ff787778 host2 Online
0005:09:00.1 Sailfish: QLogic 8GB U78C9.001.WZS0060-P1-C9 0x21000024ff787779 host4 Online

root@luckyv1:/dev/disk# multipath -ll | grep "size=3.0G" -B 1
root@luckyv1:/dev/disk# multipath -ll | grep "size=12G" -B 1
root@luckyv1:/dev/disk#

== Comment: #3 - Luciano Chavez <chavez@us.ibm.com> - 2016-09-20 20:22:20 ==
I edited /etc/multipath.conf, added verbosity 6 to crank up the output, ran multipath -ll, and saved the output to a text file (attached). All the paths using the directio checker failed, and those using the tur checker seem to work.
Sep 20 20:07:36 | loading //lib/multipath/libcheckdirectio.so checker Sep 20 20:07:36 | loading //lib/multipath/libprioconst.so prioritizer Sep 20 20:07:36 | Discover device /sys/devices/pci0000:00/0000:00:00.0/0000:01:00.0/host3/rport-3:0-2/target3:0:0/3:0:0:0/block/sdai Sep 20 20:07:36 | sdai: udev property ID_WWN whitelisted Sep 20 20:07:36 | sdai: not found in pathvec Sep 20 20:07:36 | sdai: mask = 0x25 Sep 20 20:07:36 | sdai: dev_t = 66:32 Sep 20 20:07:36 | open '/sys/devices/pci0000:00/0000:00:00.0/0000:01:00.0/host3/rport-3:0-2/target3:0:0/3:0:0:0/block/sdai/size' Sep 20 20:07:36 | sdai: size = 20971520 Sep 20 20:07:36 | sdai: vendor = IBM Sep 20 20:07:36 | sdai: product = FlashSystem-9840 Sep 20 20:07:36 | sdai: rev = 1442 Sep 20 20:07:36 | sdai: h:b:t:l = 3:0:0:0 Sep 20 20:07:36 | SCSI target 3:0:0 -> FC rport 3:0-2 Sep 20 20:07:36 | sdai: tgt_node_name = 0x500507605e839800 Sep 20 20:07:36 | open '/sys/devices/pci0000:00/0000:00:00.0/0000:01:00.0/host3/rport-3:0-2/target3:0:0/3:0:0:0/state' Sep 20 20:07:36 | sdai: path state = running Sep 20 20:07:36 | sdai: get_state Sep 20 20:07:36 | sdai: path_checker = directio (internal default) Sep 20 20:07:36 | sdai: checker timeout = 30 ms (internal default) Sep 20 20:07:36 | io_setup failed Sep 20 20:07:36 | sdai: checker init failed == Comment: #7 - Mauricio Faria De Oliveira <mauricfo@br.ibm.com> - 2016-09-27 18:32:57 == The function is failing at the io_setup() system call.  
@ checkers/directio.c  int libcheck_init (struct checker * c)  {   unsigned long pgsize = getpagesize();   struct directio_context * ct;   long flags;   ct = malloc(sizeof(struct directio_context));   if (!ct)           return 1;   memset(ct, 0, sizeof(struct directio_context));   if (io_setup(1, &ct->ioctx) != 0) {           condlog(1, "io_setup failed");           free(ct);           return 1;   }  <...> The syscall is failing w/ EAGAIN  # grep ^io_setup multipath_-v2_-d.strace  io_setup(1, 0x100163c9130) = -1 EAGAIN (Resource temporarily unavailable)  io_setup(1, 0x10015bae2c0) = -1 EAGAIN (Resource temporarily unavailable)  io_setup(1, 0x100164d65a0) = -1 EAGAIN (Resource temporarily unavailable)  io_setup(1, 0x10016429f20) = -1 EAGAIN (Resource temporarily unavailable)  io_setup(1, 0x100163535c0) = -1 EAGAIN (Resource temporarily unavailable)  io_setup(1, 0x10016368510) = -1 EAGAIN (Resource temporarily unavailable)  <...> According to the manpage (man 2 io_setup)  NAME         io_setup - create an asynchronous I/O context  DESCRIPTION         The io_setup() system call creates an asynchronous I/O context suitable for concurrently processing nr_events operations. <...>  ERRORS         EAGAIN The specified nr_events exceeds the user's limit of available events, as defined in /proc/sys/fs/aio-max-nr. On luckyv1:  root@luckyv1:~/mauricfo/bz146849/sep27# cat /proc/sys/fs/aio-max-nr  65536  root@luckyv1:~/mauricfo/bz146849/sep27# cat /proc/sys/fs/aio-nr  130560 According to linux's Documentation/sysctl/fs.txt [1]  aio-nr & aio-max-nr:  aio-nr is the running total of the number of events specified on the  io_setup system call for all currently active aio contexts. If aio-nr  reaches aio-max-nr then io_setup will fail with EAGAIN. Note that  raising aio-max-nr does not result in the pre-allocation or re-sizing  of any kernel data structures. Interestingly, aio-nr is greater than aio-max-nr. Hm. Increased aio-max-nr to 262144, and could get some more maps created. 
== Comment: #8 - Mauricio Faria De Oliveira <mauricfo@br.ibm.com> - 2016-09-27 18:56:08 == This attached test-case demonstrates that for each io_setup() request of 1 nr_event, actually 1280 seem to be allocated. root@luckyv1:~/mauricfo/bz146849/sep27# gcc -o io_setup io_setup.c -laio root@luckyv1:~/mauricfo/bz146849/sep27# cat /proc/sys/fs/aio-nr 0 root@luckyv1:~/mauricfo/bz146849/sep27# ./io_setup & [1] 12352 io_setup rc = 0 sleeping 10 seconds... root@luckyv1:~/mauricfo/bz146849/sep27# cat /proc/sys/fs/aio-nr 1280 <...> io_destroy rc = 0 [1]+ Done ./io_setup root@luckyv1:~/mauricfo/bz146849/sep27# cat /proc/sys/fs/aio-nr 0 == Comment: #45 - Mauricio Faria De Oliveira <mauricfo@br.ibm.com> - 2017-09-19 18:32:10 == Verification of this commit with the linux-hwe-edge kernel in -proposed, using the attached test-case "io_setup_v2.c"     commit 2a8a98673c13cb2a61a6476153acf8344adfa992     Author: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>     Date: Wed Jul 5 10:53:16 2017 -0300         fs: aio: fix the increment of aio-nr and counting against aio-max-nr Test-case (attached)     $ sudo apt-get install gcc libaio-dev     $ gcc -o io_setup_v2 io_setup_v2.c -laio Original kernel:     - Only 409 io_contexts could be allocated,     but that took 130880 [ div by 2, per bug] = 65440 slots out of 65535     $ uname -rv     4.11.0-14-generic #20~16.04.1-Ubuntu SMP Wed Aug 9 09:06:18 UTC 2017     $ ./io_setup_v2 1 65536     nr_events: 1, nr_requests: 65536     rc = -11, i = 409     ^Z     [1]+ Stopped ./io_setup_v2 1 65536     $ cat /proc/sys/fs/aio-nr     130880     $ cat /proc/sys/fs/aio-max-nr     65536     $ kill %% Patched kernel:     - Now 65515 io_contexts could be allocated out of 65535 (much better)       (and reporting correctly, without div by 2.)     
$ uname -rv     4.11.0-140-generic #20~16.04.1+bz146489 SMP Tue Sep 19 17:46:15 CDT 2017     $ ./io_setup_v2 1 65536     nr_events: 1, nr_requests: 65536     rc = -12, i = 65515     ^Z     [1]+ Stopped ./io_setup_v2 1 65536     $ cat /proc/sys/fs/aio-nr     65515     $ kill %%
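The pre-patch limit check quoted in comment #11 accounts for both failure points recorded in this report. A minimal C sketch follows, assuming a simplified model in which every io_setup() call is charged the same fixed number of events; the function name contexts_before_eagain is hypothetical, and this is not the kernel implementation.

```c
#include <assert.h>

/* Hypothetical user-space model of the pre-patch limit check quoted in
 * comment #11:
 *   aio_nr + nr_events > aio_max_nr * 2UL   ->  EAGAIN
 * events_per_ctx is the amount charged to aio-nr per io_setup() call
 * (1280 on luckyv1; 320 in the verification run, i.e. 130880 / 409).
 * Returns how many contexts are admitted before the first failure and
 * stores the resulting aio-nr value in *aio_nr_out (if non-NULL). */
unsigned long contexts_before_eagain(unsigned long events_per_ctx,
                                     unsigned long aio_max_nr,
                                     unsigned long *aio_nr_out)
{
    unsigned long aio_nr = 0;
    unsigned long admitted = 0;

    assert(events_per_ctx > 0);
    while (aio_nr + events_per_ctx <= aio_max_nr * 2UL) {
        aio_nr += events_per_ctx;
        admitted++;
    }
    if (aio_nr_out)
        *aio_nr_out = aio_nr;
    return admitted;
}
```

With aio_max_nr = 65536, a charge of 1280 events per context admits 102 contexts and leaves aio-nr at 130560, while a charge of 320 admits 409 contexts and leaves aio-nr at 130880. Both results match the values recorded in the comments for the unpatched kernels.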
2017-09-20 15:11:29 Mauricio Faria de Oliveira description
Problem Description
=================================
I am facing this issue with Texan Flash storage 840 disks presented through the Coho and Sailfish adapters. The disks behind the Coho adapter are 3G and the disks behind the Sailfish adapter are 12G. I can see those disks in the lsblk output but not in the output of the multipath -ll command.

0004:01:00.0 Coho: Saturn-X U78C9.001.WZS0060-P1-C6 0x10000090fa2a51f8 host10 Online
0004:01:00.1 Coho: Saturn-X U78C9.001.WZS0060-P1-C6 0x10000090fa2a51f9 host11 Online
0005:09:00.0 Sailfish: QLogic 8GB U78C9.001.WZS0060-P1-C9 0x21000024ff787778 host2 Online
0005:09:00.1 Sailfish: QLogic 8GB U78C9.001.WZS0060-P1-C9 0x21000024ff787779 host4 Online

root@luckyv1:/dev/disk# multipath -ll | grep "size=3.0G" -B 1
root@luckyv1:/dev/disk# multipath -ll | grep "size=12G" -B 1
root@luckyv1:/dev/disk#

== Comment: #3 - Luciano Chavez <chavez@us.ibm.com> - 2016-09-20 20:22:20 ==
I edited /etc/multipath.conf, added verbosity 6 to crank up the output, ran multipath -ll, and saved the output to a text file (attached). All the paths using the directio checker failed, and those using the tur checker seem to work.
Sep 20 20:07:36 | loading //lib/multipath/libcheckdirectio.so checker Sep 20 20:07:36 | loading //lib/multipath/libprioconst.so prioritizer Sep 20 20:07:36 | Discover device /sys/devices/pci0000:00/0000:00:00.0/0000:01:00.0/host3/rport-3:0-2/target3:0:0/3:0:0:0/block/sdai Sep 20 20:07:36 | sdai: udev property ID_WWN whitelisted Sep 20 20:07:36 | sdai: not found in pathvec Sep 20 20:07:36 | sdai: mask = 0x25 Sep 20 20:07:36 | sdai: dev_t = 66:32 Sep 20 20:07:36 | open '/sys/devices/pci0000:00/0000:00:00.0/0000:01:00.0/host3/rport-3:0-2/target3:0:0/3:0:0:0/block/sdai/size' Sep 20 20:07:36 | sdai: size = 20971520 Sep 20 20:07:36 | sdai: vendor = IBM Sep 20 20:07:36 | sdai: product = FlashSystem-9840 Sep 20 20:07:36 | sdai: rev = 1442 Sep 20 20:07:36 | sdai: h:b:t:l = 3:0:0:0 Sep 20 20:07:36 | SCSI target 3:0:0 -> FC rport 3:0-2 Sep 20 20:07:36 | sdai: tgt_node_name = 0x500507605e839800 Sep 20 20:07:36 | open '/sys/devices/pci0000:00/0000:00:00.0/0000:01:00.0/host3/rport-3:0-2/target3:0:0/3:0:0:0/state' Sep 20 20:07:36 | sdai: path state = running Sep 20 20:07:36 | sdai: get_state Sep 20 20:07:36 | sdai: path_checker = directio (internal default) Sep 20 20:07:36 | sdai: checker timeout = 30 ms (internal default) Sep 20 20:07:36 | io_setup failed Sep 20 20:07:36 | sdai: checker init failed == Comment: #7 - Mauricio Faria De Oliveira <mauricfo@br.ibm.com> - 2016-09-27 18:32:57 == The function is failing at the io_setup() system call.  
@ checkers/directio.c  int libcheck_init (struct checker * c)  {   unsigned long pgsize = getpagesize();   struct directio_context * ct;   long flags;   ct = malloc(sizeof(struct directio_context));   if (!ct)           return 1;   memset(ct, 0, sizeof(struct directio_context));   if (io_setup(1, &ct->ioctx) != 0) {           condlog(1, "io_setup failed");           free(ct);           return 1;   }  <...> The syscall is failing w/ EAGAIN  # grep ^io_setup multipath_-v2_-d.strace  io_setup(1, 0x100163c9130) = -1 EAGAIN (Resource temporarily unavailable)  io_setup(1, 0x10015bae2c0) = -1 EAGAIN (Resource temporarily unavailable)  io_setup(1, 0x100164d65a0) = -1 EAGAIN (Resource temporarily unavailable)  io_setup(1, 0x10016429f20) = -1 EAGAIN (Resource temporarily unavailable)  io_setup(1, 0x100163535c0) = -1 EAGAIN (Resource temporarily unavailable)  io_setup(1, 0x10016368510) = -1 EAGAIN (Resource temporarily unavailable)  <...> According to the manpage (man 2 io_setup)  NAME         io_setup - create an asynchronous I/O context  DESCRIPTION         The io_setup() system call creates an asynchronous I/O context suitable for concurrently processing nr_events operations. <...>  ERRORS         EAGAIN The specified nr_events exceeds the user's limit of available events, as defined in /proc/sys/fs/aio-max-nr. On luckyv1:  root@luckyv1:~/mauricfo/bz146849/sep27# cat /proc/sys/fs/aio-max-nr  65536  root@luckyv1:~/mauricfo/bz146849/sep27# cat /proc/sys/fs/aio-nr  130560 According to linux's Documentation/sysctl/fs.txt [1]  aio-nr & aio-max-nr:  aio-nr is the running total of the number of events specified on the  io_setup system call for all currently active aio contexts. If aio-nr  reaches aio-max-nr then io_setup will fail with EAGAIN. Note that  raising aio-max-nr does not result in the pre-allocation or re-sizing  of any kernel data structures. Interestingly, aio-nr is greater than aio-max-nr. Hm. Increased aio-max-nr to 262144, and could get some more maps created. 
== Comment: #8 - Mauricio Faria De Oliveira <mauricfo@br.ibm.com> - 2016-09-27 18:56:08 == This attached test-case demonstrates that for each io_setup() request of 1 nr_event, actually 1280 seem to be allocated. root@luckyv1:~/mauricfo/bz146849/sep27# gcc -o io_setup io_setup.c -laio root@luckyv1:~/mauricfo/bz146849/sep27# cat /proc/sys/fs/aio-nr 0 root@luckyv1:~/mauricfo/bz146849/sep27# ./io_setup & [1] 12352 io_setup rc = 0 sleeping 10 seconds... root@luckyv1:~/mauricfo/bz146849/sep27# cat /proc/sys/fs/aio-nr 1280 <...> io_destroy rc = 0 [1]+ Done ./io_setup root@luckyv1:~/mauricfo/bz146849/sep27# cat /proc/sys/fs/aio-nr 0 == Comment: #45 - Mauricio Faria De Oliveira <mauricfo@br.ibm.com> - 2017-09-19 18:32:10 == Verification of this commit with the linux-hwe-edge kernel in -proposed, using the attached test-case "io_setup_v2.c"     commit 2a8a98673c13cb2a61a6476153acf8344adfa992     Author: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>     Date: Wed Jul 5 10:53:16 2017 -0300         fs: aio: fix the increment of aio-nr and counting against aio-max-nr Test-case (attached)     $ sudo apt-get install gcc libaio-dev     $ gcc -o io_setup_v2 io_setup_v2.c -laio Original kernel:     - Only 409 io_contexts could be allocated,     but that took 130880 [ div by 2, per bug] = 65440 slots out of 65535     $ uname -rv     4.11.0-14-generic #20~16.04.1-Ubuntu SMP Wed Aug 9 09:06:18 UTC 2017     $ ./io_setup_v2 1 65536     nr_events: 1, nr_requests: 65536     rc = -11, i = 409     ^Z     [1]+ Stopped ./io_setup_v2 1 65536     $ cat /proc/sys/fs/aio-nr     130880     $ cat /proc/sys/fs/aio-max-nr     65536     $ kill %% Patched kernel:     - Now 65515 io_contexts could be allocated out of 65535 (much better)       (and reporting correctly, without div by 2.)     
    $ uname -rv
    4.11.0-140-generic #20~16.04.1+bz146489 SMP Tue Sep 19 17:46:15 CDT 2017
    $ ./io_setup_v2 1 65536
    nr_events: 1, nr_requests: 65536
    rc = -12, i = 65515
    ^Z
    [1]+ Stopped ./io_setup_v2 1 65536
    $ cat /proc/sys/fs/aio-nr
    65515
    $ kill %%

[Impact]

 * The number of available AIO contexts is severely limited on systems with a large number of possible CPUs (e.g., IBM POWER8 processors with 20ish cores * 8 threads/core, and other multithreaded server-class processors).

 * This prevents applications such as the multipath directio path checker from providing all of the available devices to the system.

 * Other applications which depend on AIO can be affected/limited.

 * The patch fixes how aio increments the number of active contexts (seen in /proc/sys/fs/aio-nr) and checks that against the global limit (seen in /proc/sys/fs/aio-max-nr).

[Test Case]

 * A synthetic test-case is attached (io_setup_v2.c) and demonstrated (original/patched kernels) in comment #4.

 * Trying to perform multipath discovery in debug/verbose mode (i.e., the "multipath -v3" command) with a sufficient number of individual paths using the "directio" path checker should demonstrate the problem/solution as well (i.e., the presence or absence of "io_setup failed" messages).

[Regression Potential]

 * Note the fix is trivial and has been tested by several users, and even led to the introduction of a new test-case in "libaio" (but that can never be a strong enough guarantee that no errors remain).

 * Applications which use aio with a small "nr_events" value as argument to "io_setup()" now have access to a much larger number of aio contexts; but hopefully those apps are already only requesting what they need, not trying to get more and more.

 * Applications which relied on the _incorrect_ behavior of '/proc/sys/fs/aio-nr' being possibly greater than '/proc/sys/fs/aio-max-nr' might have problems, but those apps should be fixed.

Problem Description
=================================
I am facing this issue with Texan Flash storage 840 disks presented through the Coho and Sailfish adapters. The disks behind the Coho adapter are 3G and the disks behind the Sailfish adapter are 12G. I can see those disks in the lsblk output but not in the output of the multipath -ll command.

0004:01:00.0 Coho: Saturn-X U78C9.001.WZS0060-P1-C6 0x10000090fa2a51f8 host10 Online
0004:01:00.1 Coho: Saturn-X U78C9.001.WZS0060-P1-C6 0x10000090fa2a51f9 host11 Online
0005:09:00.0 Sailfish: QLogic 8GB U78C9.001.WZS0060-P1-C9 0x21000024ff787778 host2 Online
0005:09:00.1 Sailfish: QLogic 8GB U78C9.001.WZS0060-P1-C9 0x21000024ff787779 host4 Online

root@luckyv1:/dev/disk# multipath -ll | grep "size=3.0G" -B 1
root@luckyv1:/dev/disk# multipath -ll | grep "size=12G" -B 1
root@luckyv1:/dev/disk#

== Comment: #3 - Luciano Chavez <chavez@us.ibm.com> - 2016-09-20 20:22:20 ==
I edited /etc/multipath.conf, added verbosity 6 to crank up the output, ran multipath -ll, and saved the output to a text file (attached). All the paths using the directio checker failed, and those using the tur checker seem to work.
Sep 20 20:07:36 | loading //lib/multipath/libcheckdirectio.so checker Sep 20 20:07:36 | loading //lib/multipath/libprioconst.so prioritizer Sep 20 20:07:36 | Discover device /sys/devices/pci0000:00/0000:00:00.0/0000:01:00.0/host3/rport-3:0-2/target3:0:0/3:0:0:0/block/sdai Sep 20 20:07:36 | sdai: udev property ID_WWN whitelisted Sep 20 20:07:36 | sdai: not found in pathvec Sep 20 20:07:36 | sdai: mask = 0x25 Sep 20 20:07:36 | sdai: dev_t = 66:32 Sep 20 20:07:36 | open '/sys/devices/pci0000:00/0000:00:00.0/0000:01:00.0/host3/rport-3:0-2/target3:0:0/3:0:0:0/block/sdai/size' Sep 20 20:07:36 | sdai: size = 20971520 Sep 20 20:07:36 | sdai: vendor = IBM Sep 20 20:07:36 | sdai: product = FlashSystem-9840 Sep 20 20:07:36 | sdai: rev = 1442 Sep 20 20:07:36 | sdai: h:b:t:l = 3:0:0:0 Sep 20 20:07:36 | SCSI target 3:0:0 -> FC rport 3:0-2 Sep 20 20:07:36 | sdai: tgt_node_name = 0x500507605e839800 Sep 20 20:07:36 | open '/sys/devices/pci0000:00/0000:00:00.0/0000:01:00.0/host3/rport-3:0-2/target3:0:0/3:0:0:0/state' Sep 20 20:07:36 | sdai: path state = running Sep 20 20:07:36 | sdai: get_state Sep 20 20:07:36 | sdai: path_checker = directio (internal default) Sep 20 20:07:36 | sdai: checker timeout = 30 ms (internal default) Sep 20 20:07:36 | io_setup failed Sep 20 20:07:36 | sdai: checker init failed == Comment: #7 - Mauricio Faria De Oliveira <mauricfo@br.ibm.com> - 2016-09-27 18:32:57 == The function is failing at the io_setup() system call.  
 @ checkers/directio.c

 int libcheck_init (struct checker * c)
 {
         unsigned long pgsize = getpagesize();
         struct directio_context * ct;
         long flags;

         ct = malloc(sizeof(struct directio_context));
         if (!ct)
                 return 1;
         memset(ct, 0, sizeof(struct directio_context));

         if (io_setup(1, &ct->ioctx) != 0) {
                 condlog(1, "io_setup failed");
                 free(ct);
                 return 1;
         }
 <...>

The syscall is failing w/ EAGAIN

 # grep ^io_setup multipath_-v2_-d.strace
 io_setup(1, 0x100163c9130) = -1 EAGAIN (Resource temporarily unavailable)
 io_setup(1, 0x10015bae2c0) = -1 EAGAIN (Resource temporarily unavailable)
 io_setup(1, 0x100164d65a0) = -1 EAGAIN (Resource temporarily unavailable)
 io_setup(1, 0x10016429f20) = -1 EAGAIN (Resource temporarily unavailable)
 io_setup(1, 0x100163535c0) = -1 EAGAIN (Resource temporarily unavailable)
 io_setup(1, 0x10016368510) = -1 EAGAIN (Resource temporarily unavailable)
 <...>

According to the manpage (man 2 io_setup)

 NAME
        io_setup - create an asynchronous I/O context

 DESCRIPTION
        The io_setup() system call creates an asynchronous I/O context suitable for concurrently processing nr_events operations.
 <...>

 ERRORS
        EAGAIN The specified nr_events exceeds the user's limit of available events, as defined in /proc/sys/fs/aio-max-nr.

On luckyv1:

 root@luckyv1:~/mauricfo/bz146849/sep27# cat /proc/sys/fs/aio-max-nr
 65536
 root@luckyv1:~/mauricfo/bz146849/sep27# cat /proc/sys/fs/aio-nr
 130560

According to linux's Documentation/sysctl/fs.txt [1]

 aio-nr & aio-max-nr:

 aio-nr is the running total of the number of events specified on the
 io_setup system call for all currently active aio contexts. If aio-nr
 reaches aio-max-nr then io_setup will fail with EAGAIN. Note that
 raising aio-max-nr does not result in the pre-allocation or re-sizing
 of any kernel data structures.

Interestingly, aio-nr is greater than aio-max-nr. Hm.

Increased aio-max-nr to 262144, and could get some more maps created.
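The comparison made by hand above (aio-nr against aio-max-nr) can be scripted. The following is only a minimal sketch, not part of the attached test cases; the helper names read_counter and report_aio_usage are hypothetical, and the paths are passed in so the same code can be pointed at /proc/sys/fs/aio-nr and /proc/sys/fs/aio-max-nr or at saved copies of them.

```c
#include <assert.h>
#include <stdio.h>

/* Read one unsigned decimal value from a file such as /proc/sys/fs/aio-nr.
 * Returns 0 on success, -1 if the file cannot be opened or parsed.
 * (Hypothetical helper, not taken from multipath-tools or the test cases.) */
int read_counter(const char *path, unsigned long *out)
{
    assert(path != NULL && out != NULL);

    FILE *f = fopen(path, "r");
    if (!f)
        return -1;

    int parsed = (fscanf(f, "%lu", out) == 1);
    fclose(f);
    return parsed ? 0 : -1;
}

/* Print both counters.
 * Returns 1 if aio-nr is greater than aio-max-nr (the state seen on luckyv1),
 * 0 if it is not, and -1 if either file could not be read. */
int report_aio_usage(const char *nr_path, const char *max_path)
{
    unsigned long nr, max_nr;

    if (read_counter(nr_path, &nr) != 0 ||
        read_counter(max_path, &max_nr) != 0)
        return -1;

    printf("aio-nr = %lu, aio-max-nr = %lu\n", nr, max_nr);
    if (nr > max_nr) {
        printf("aio-nr is greater than aio-max-nr\n");
        return 1;
    }
    return 0;
}
```

Calling report_aio_usage("/proc/sys/fs/aio-nr", "/proc/sys/fs/aio-max-nr") from a small main() would be expected to return 1 on a system in the state shown above (130560 against 65536).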

== Comment: #8 - Mauricio Faria De Oliveira <mauricfo@br.ibm.com> - 2016-09-27 18:56:08 ==

This attached test-case demonstrates that for each io_setup() request of 1 nr_event, actually 1280 seem to be allocated.

 root@luckyv1:~/mauricfo/bz146849/sep27# gcc -o io_setup io_setup.c -laio
 root@luckyv1:~/mauricfo/bz146849/sep27# cat /proc/sys/fs/aio-nr
 0
 root@luckyv1:~/mauricfo/bz146849/sep27# ./io_setup &
 [1] 12352
 io_setup rc = 0
 sleeping 10 seconds...
 root@luckyv1:~/mauricfo/bz146849/sep27# cat /proc/sys/fs/aio-nr
 1280
 <...>
 io_destroy rc = 0
 [1]+ Done ./io_setup
 root@luckyv1:~/mauricfo/bz146849/sep27# cat /proc/sys/fs/aio-nr
 0

== Comment: #45 - Mauricio Faria De Oliveira <mauricfo@br.ibm.com> - 2017-09-19 18:32:10 ==

Verification of this commit with the linux-hwe-edge kernel in -proposed, using the attached test-case "io_setup_v2.c"

    commit 2a8a98673c13cb2a61a6476153acf8344adfa992
    Author: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
    Date: Wed Jul 5 10:53:16 2017 -0300

        fs: aio: fix the increment of aio-nr and counting against aio-max-nr

Test-case (attached)

    $ sudo apt-get install gcc libaio-dev
    $ gcc -o io_setup_v2 io_setup_v2.c -laio

Original kernel:

    - Only 409 io_contexts could be allocated,
      but that took 130880 [ div by 2, per bug] = 65440 slots out of 65535

    $ uname -rv
    4.11.0-14-generic #20~16.04.1-Ubuntu SMP Wed Aug 9 09:06:18 UTC 2017

    $ ./io_setup_v2 1 65536
    nr_events: 1, nr_requests: 65536
    rc = -11, i = 409
    ^Z
    [1]+ Stopped ./io_setup_v2 1 65536

    $ cat /proc/sys/fs/aio-nr
    130880

    $ cat /proc/sys/fs/aio-max-nr
    65536

    $ kill %%

Patched kernel:

    - Now 65515 io_contexts could be allocated out of 65535 (much better)
      (and reporting correctly, without div by 2.)

    $ uname -rv
    4.11.0-140-generic #20~16.04.1+bz146489 SMP Tue Sep 19 17:46:15 CDT 2017

    $ ./io_setup_v2 1 65536
    nr_events: 1, nr_requests: 65536
    rc = -12, i = 65515
    ^Z
    [1]+ Stopped ./io_setup_v2 1 65536

    $ cat /proc/sys/fs/aio-nr
    65515

    $ kill %%
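
The figures in the verification above can be cross-checked with a little arithmetic. The sketch below is a simplified model and not kernel code. It assumes that the original kernel charged each io_setup(1) context a fixed number of slots (130880 / 409 = 320 on this machine) and compared aio-nr against twice aio-max-nr, which is one reading of the "div by 2, per bug" note. The function names contexts_until_limit and print_model are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Simplified model (an assumption, not kernel code): every context is charged
 * `charge` slots, and allocation stops once aio-nr would exceed `limit`.
 * Returns how many contexts fit under that limit. */
unsigned long contexts_until_limit(unsigned long charge, unsigned long limit)
{
    assert(charge > 0);
    return limit / charge;
}

/* Reproduce the numbers reported in the verification comment. */
void print_model(void)
{
    const unsigned long aio_max_nr = 65536;      /* from /proc/sys/fs/aio-max-nr */
    const unsigned long charge = 130880 / 409;   /* 320 slots per context, from the log */

    /* Original kernel: assumed limit of 2 * aio-max-nr, 320 slots per context. */
    unsigned long n = contexts_until_limit(charge, 2 * aio_max_nr);
    printf("original: %lu contexts, aio-nr = %lu\n", n, n * charge);

    /* Patched kernel: one slot per io_setup(1) context, counted against aio-max-nr. */
    printf("patched: up to %lu contexts\n", contexts_until_limit(1, aio_max_nr));
}
```

Under these assumptions the model gives 409 contexts and an aio-nr of 130880 for the original kernel, matching the log. For the patched kernel it gives an upper bound of 65536. The run above stopped at 65515 with rc = -12, which on Linux is ENOMEM rather than EAGAIN (-11), so that run appears to have ended for a reason other than the aio-max-nr limit.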
2017-09-20 20:10:42 Joseph Salisbury kernel-package (Ubuntu): importance Undecided Critical
2017-09-20 20:12:24 Joseph Salisbury tags architecture-ppc64le bugnameltc-146489 severity-critical targetmilestone-inin16043 architecture-ppc64le bugnameltc-146489 kernel-da-key severity-critical targetmilestone-inin16043
2017-09-20 20:12:48 Joseph Salisbury affects kernel-package (Ubuntu) linux (Ubuntu)
2017-09-20 20:12:48 Joseph Salisbury linux (Ubuntu): status New Triaged
2017-09-20 20:13:46 Joseph Salisbury nominated for series Ubuntu Xenial
2017-09-20 20:13:46 Joseph Salisbury bug task added linux (Ubuntu Xenial)
2017-09-20 20:13:46 Joseph Salisbury nominated for series Ubuntu Artful
2017-09-20 20:13:46 Joseph Salisbury bug task added linux (Ubuntu Artful)
2017-09-20 20:13:46 Joseph Salisbury nominated for series Ubuntu Zesty
2017-09-20 20:13:46 Joseph Salisbury bug task added linux (Ubuntu Zesty)
2017-09-20 20:13:53 Joseph Salisbury linux (Ubuntu Zesty): status New Triaged
2017-09-20 20:13:57 Joseph Salisbury linux (Ubuntu Xenial): status New Triaged
2017-09-20 20:14:01 Joseph Salisbury linux (Ubuntu Zesty): importance Undecided Critical
2017-09-20 20:14:04 Joseph Salisbury linux (Ubuntu Xenial): importance Undecided Critical
2017-09-21 05:01:55 Frank Heimes ubuntu-power-systems: status New Triaged
2017-09-21 05:08:15 bugproxy attachment added output of multipath -ll with verbosity increased to 6 https://bugs.launchpad.net/bugs/1718397/+attachment/4954138/+files/multipath-trace.log
2017-09-21 05:08:17 bugproxy attachment added test-case: io_setup.c https://bugs.launchpad.net/bugs/1718397/+attachment/4954139/+files/io_setup.c
2017-09-21 05:08:18 bugproxy attachment added test-case: io_setup_v2.c https://bugs.launchpad.net/bugs/1718397/+attachment/4954140/+files/file_146489.txt
2017-09-22 12:21:48 Seth Forshee linux (Ubuntu Artful): status Triaged Fix Committed
2017-09-26 16:33:50 Launchpad Janitor linux (Ubuntu Artful): status Fix Committed Fix Released
2017-09-26 18:04:37 Frank Heimes ubuntu-power-systems: status Triaged In Progress
2017-10-02 18:34:18 Manoj Iyer linux (Ubuntu Zesty): assignee Canonical Kernel Team (canonical-kernel-team)
2017-10-02 18:34:36 Manoj Iyer tags architecture-ppc64le bugnameltc-146489 kernel-da-key severity-critical targetmilestone-inin16043 architecture-ppc64le bugnameltc-146489 kernel-da-key severity-critical targetmilestone-inin16043 triage-a
2017-10-09 10:39:52 bugproxy tags architecture-ppc64le bugnameltc-146489 kernel-da-key severity-critical targetmilestone-inin16043 triage-a architecture-ppc64le targetmilestone-inin16043
2017-10-09 19:26:20 Thadeu Lima de Souza Cascardo linux (Ubuntu Zesty): status Triaged Fix Committed
2017-10-09 19:58:29 Thadeu Lima de Souza Cascardo linux (Ubuntu Xenial): status Triaged Fix Committed
2017-10-10 11:17:42 Frank Heimes ubuntu-power-systems: status In Progress Fix Committed
2017-10-18 08:57:02 Kleber Sacilotto de Souza tags architecture-ppc64le targetmilestone-inin16043 architecture-ppc64le targetmilestone-inin16043 verification-needed-xenial
2017-10-18 08:59:28 Kleber Sacilotto de Souza tags architecture-ppc64le targetmilestone-inin16043 verification-needed-xenial architecture-ppc64le targetmilestone-inin16043 verification-needed-xenial verification-needed-zesty
2017-10-30 19:26:53 Launchpad Janitor linux (Ubuntu Zesty): status Fix Committed Fix Released
2017-10-30 19:26:53 Launchpad Janitor cve linked 2017-1000252
2017-10-30 19:26:53 Launchpad Janitor cve linked 2017-10663
2017-10-30 19:26:53 Launchpad Janitor cve linked 2017-10911
2017-10-30 19:26:53 Launchpad Janitor cve linked 2017-11176
2017-10-30 19:26:53 Launchpad Janitor cve linked 2017-14340
2017-10-30 19:43:07 Launchpad Janitor linux (Ubuntu Xenial): status Fix Committed Fix Released
2017-10-30 20:07:50 Frank Heimes ubuntu-power-systems: status Fix Committed Fix Released
2017-10-30 20:35:20 Mauricio Faria de Oliveira tags architecture-ppc64le targetmilestone-inin16043 verification-needed-xenial verification-needed-zesty architecture-ppc64le targetmilestone-inin16043 verification-done-xenial verification-needed-zesty
2017-10-30 20:49:26 Mauricio Faria de Oliveira tags architecture-ppc64le targetmilestone-inin16043 verification-done-xenial verification-needed-zesty architecture-ppc64le targetmilestone-inin16043 verification-done-xenial verification-done-zesty
2017-11-27 14:50:27 bugproxy tags architecture-ppc64le targetmilestone-inin16043 verification-done-xenial verification-done-zesty architecture-ppc64le bugnameltc-146489 severity-critical targetmilestone-inin16043