Activity log for bug #1357368

Date Who What changed Old value New value Message
2014-08-15 13:20:21 Jeegn Chen bug added bug
2014-08-15 13:20:34 Jeegn Chen nova: assignee Jeegn Chen (jeegn-chen)
2014-08-15 13:30:00 Jeegn Chen description When a volume is attached to a VM in the source compute node through multipath, the related files in /dev/disk/by-path/ are like this stack@ubuntu-server12:~/devstack$ ls /dev/disk/by-path/*24 /dev/disk/by-path/ip-192.168.3.50:3260-iscsi-iqn.1992-04.com.emc:cx.fnm00124500890.a5-lun-24 /dev/disk/by-path/ip-192.168.4.51:3260-iscsi-iqn.1992-04.com.emc:cx.fnm00124500890.b4-lun-24 The information on its corresponding multipath device is like this stack@ubuntu-server12:~/devstack$ sudo multipath -l 3600601602ba03400921130967724e411 3600601602ba03400921130967724e411 dm-3 DGC,VRAID size=1.0G features='1 queue_if_no_path' hwhandler='1 alua' wp=rw |-+- policy='round-robin 0' prio=-1 status=active | `- 19:0:0:24 sdl 8:176 active undef running `-+- policy='round-robin 0' prio=-1 status=enabled `- 18:0:0:24 sdj 8:144 active undef running But when the VM is migrated to the destination, the related information is like the following example since we CANNOT guarantee that all nodes are able to access the same iSCSI portals and the same target LUN number. And the information is used to overwrite connection_info in the DB before the post live migration logic is executed. 
stack@ubuntu-server13:~/devstack$ ls /dev/disk/by-path/*24 /dev/disk/by-path/ip-192.168.3.51:3260-iscsi-iqn.1992-04.com.emc:cx.fnm00124500890.b5-lun-100 /dev/disk/by-path/ip-192.168.4.51:3260-iscsi-iqn.1992-04.com.emc:cx.fnm00124500890.b4-lun-100 stack@ubuntu-server12:~/devstack$ sudo multipath -l 3600601602ba03400921130967724e411 3600601602ba03400921130967724e411 dm-3 DGC,VRAID size=1.0G features='1 queue_if_no_path' hwhandler='1 alua' wp=rw |-+- policy='round-robin 0' prio=-1 status=active | `- 19:0:0:100 sdf 8:176 active undef running `-+- policy='round-robin 0' prio=-1 status=enabled `- 18:0:0:100 sdg 8:144 active undef running As a result, if post live migration in source side uses <IP>, <IQN> and <TARGET LUN Number> to find the devices to clean up, it may use 192.168.3.51, iqn.1992-04.com.emc:cx.fnm00124500890.a5 and 100. However, the correct one should be 192.168.3.50, iqn.1992-04.com.emc:cx.fnm00124500890.a5 and 24. Similar philosophy in (https://bugs.launchpad.net/nova/+bug/1327497) can be used to fix it: Leverage the unchanged multipath_id to find correct devices to delete. 
When a volume is attached to a VM in the source compute node through multipath, the related files in /dev/disk/by-path/ are like this stack@ubuntu-server12:~/devstack$ ls /dev/disk/by-path/*24 /dev/disk/by-path/ip-192.168.3.50:3260-iscsi-iqn.1992-04.com.emc:cx.fnm00124500890.a5-lun-24 /dev/disk/by-path/ip-192.168.4.51:3260-iscsi-iqn.1992-04.com.emc:cx.fnm00124500890.b4-lun-24 The information on its corresponding multipath device is like this stack@ubuntu-server12:~/devstack$ sudo multipath -l 3600601602ba03400921130967724e411 3600601602ba03400921130967724e411 dm-3 DGC,VRAID size=1.0G features='1 queue_if_no_path' hwhandler='1 alua' wp=rw |-+- policy='round-robin 0' prio=-1 status=active | `- 19:0:0:24 sdl 8:176 active undef running `-+- policy='round-robin 0' prio=-1 status=enabled   `- 18:0:0:24 sdj 8:144 active undef running But when the VM is migrated to the destination, the related information is like the following example since we CANNOT guarantee that all nodes are able to access the same iSCSI portals and the same target LUN number. And the information is used to overwrite connection_info in the DB before the post live migration logic is executed. 
stack@ubuntu-server13:~/devstack$ ls /dev/disk/by-path/*24 /dev/disk/by-path/ip-192.168.3.51:3260-iscsi-iqn.1992-04.com.emc:cx.fnm00124500890.b5-lun-100 /dev/disk/by-path/ip-192.168.4.51:3260-iscsi-iqn.1992-04.com.emc:cx.fnm00124500890.b4-lun-100 stack@ubuntu-server13:~/devstack$ sudo multipath -l 3600601602ba03400921130967724e411 3600601602ba03400921130967724e411 dm-3 DGC,VRAID size=1.0G features='1 queue_if_no_path' hwhandler='1 alua' wp=rw |-+- policy='round-robin 0' prio=-1 status=active | `- 19:0:0:100 sdf 8:176 active undef running `-+- policy='round-robin 0' prio=-1 status=enabled   `- 18:0:0:100 sdg 8:144 active undef running As a result, if post live migration in source side uses <IP>, <IQN> and <TARGET LUN Number> to find the devices to clean up, it may use 192.168.3.51, iqn.1992-04.com.emc:cx.fnm00124500890.a5 and 100. However, the correct one should be 192.168.3.50, iqn.1992-04.com.emc:cx.fnm00124500890.a5 and 24. Similar philosophy in (https://bugs.launchpad.net/nova/+bug/1327497) can be used to fix it: Leverage the unchanged multipath_id to find correct devices to delete.
2014-08-15 14:07:36 OpenStack Infra nova: status New In Progress
2014-10-30 08:54:50 OpenStack Infra nova: status In Progress Fix Committed
2014-11-12 12:46:31 Shintaro Mizuno bug added subscriber Shintaro Mizuno
2014-11-13 22:20:41 OpenStack Infra tags in-stable-juno
2014-12-04 23:32:43 Alan Pevec nominated for series nova/juno
2014-12-04 23:32:44 Alan Pevec bug task added nova/juno
2014-12-04 23:34:33 Alan Pevec nova/juno: status New Fix Committed
2014-12-04 23:34:33 Alan Pevec nova/juno: milestone 2014.2.1
2014-12-05 08:16:43 Alan Pevec nova/juno: status Fix Committed Fix Released
2014-12-18 20:10:42 Thierry Carrez nova: status Fix Committed Fix Released
2014-12-18 20:10:42 Thierry Carrez nova: milestone kilo-1
2015-04-30 09:16:29 Thierry Carrez nova: milestone kilo-1 2015.1.0
2016-05-17 15:10:36 Launchpad Janitor branch linked lp:~ubuntu-server-dev/nova/icehouse
2016-05-18 09:43:41 Martin Pitt bug task added nova (Ubuntu)
2016-05-18 09:43:49 Martin Pitt nominated for series Ubuntu Trusty
2016-05-18 09:43:49 Martin Pitt bug task added nova (Ubuntu Trusty)
2016-05-24 21:40:31 Martin Pitt nova (Ubuntu): status New Fix Released
2016-05-24 21:41:27 Martin Pitt nova (Ubuntu Trusty): status New Fix Committed
2016-05-24 21:41:30 Martin Pitt bug added subscriber Ubuntu Stable Release Updates Team
2016-05-24 21:41:32 Martin Pitt bug added subscriber SRU Verification
2016-05-24 21:41:36 Martin Pitt tags in-stable-juno in-stable-juno verification-needed
2016-07-01 23:29:13 Billy Olsen description When a volume is attached to a VM in the source compute node through multipath, the related files in /dev/disk/by-path/ are like this stack@ubuntu-server12:~/devstack$ ls /dev/disk/by-path/*24 /dev/disk/by-path/ip-192.168.3.50:3260-iscsi-iqn.1992-04.com.emc:cx.fnm00124500890.a5-lun-24 /dev/disk/by-path/ip-192.168.4.51:3260-iscsi-iqn.1992-04.com.emc:cx.fnm00124500890.b4-lun-24 The information on its corresponding multipath device is like this stack@ubuntu-server12:~/devstack$ sudo multipath -l 3600601602ba03400921130967724e411 3600601602ba03400921130967724e411 dm-3 DGC,VRAID size=1.0G features='1 queue_if_no_path' hwhandler='1 alua' wp=rw |-+- policy='round-robin 0' prio=-1 status=active | `- 19:0:0:24 sdl 8:176 active undef running `-+- policy='round-robin 0' prio=-1 status=enabled   `- 18:0:0:24 sdj 8:144 active undef running But when the VM is migrated to the destination, the related information is like the following example since we CANNOT guarantee that all nodes are able to access the same iSCSI portals and the same target LUN number. And the information is used to overwrite connection_info in the DB before the post live migration logic is executed. 
stack@ubuntu-server13:~/devstack$ ls /dev/disk/by-path/*24 /dev/disk/by-path/ip-192.168.3.51:3260-iscsi-iqn.1992-04.com.emc:cx.fnm00124500890.b5-lun-100 /dev/disk/by-path/ip-192.168.4.51:3260-iscsi-iqn.1992-04.com.emc:cx.fnm00124500890.b4-lun-100 stack@ubuntu-server13:~/devstack$ sudo multipath -l 3600601602ba03400921130967724e411 3600601602ba03400921130967724e411 dm-3 DGC,VRAID size=1.0G features='1 queue_if_no_path' hwhandler='1 alua' wp=rw |-+- policy='round-robin 0' prio=-1 status=active | `- 19:0:0:100 sdf 8:176 active undef running `-+- policy='round-robin 0' prio=-1 status=enabled   `- 18:0:0:100 sdg 8:144 active undef running As a result, if post live migration in source side uses <IP>, <IQN> and <TARGET LUN Number> to find the devices to clean up, it may use 192.168.3.51, iqn.1992-04.com.emc:cx.fnm00124500890.a5 and 100. However, the correct one should be 192.168.3.50, iqn.1992-04.com.emc:cx.fnm00124500890.a5 and 24. Similar philosophy in (https://bugs.launchpad.net/nova/+bug/1327497) can be used to fix it: Leverage the unchanged multipath_id to find correct devices to delete. 
[Impact] When a volume is attached to a VM in the source compute node through multipath, the related files in /dev/disk/by-path/ are like this stack@ubuntu-server12:~/devstack$ ls /dev/disk/by-path/*24 /dev/disk/by-path/ip-192.168.3.50:3260-iscsi-iqn.1992-04.com.emc:cx.fnm00124500890.a5-lun-24 /dev/disk/by-path/ip-192.168.4.51:3260-iscsi-iqn.1992-04.com.emc:cx.fnm00124500890.b4-lun-24 The information on its corresponding multipath device is like this stack@ubuntu-server12:~/devstack$ sudo multipath -l 3600601602ba03400921130967724e411 3600601602ba03400921130967724e411 dm-3 DGC,VRAID size=1.0G features='1 queue_if_no_path' hwhandler='1 alua' wp=rw |-+- policy='round-robin 0' prio=-1 status=active | `- 19:0:0:24 sdl 8:176 active undef running `-+- policy='round-robin 0' prio=-1 status=enabled   `- 18:0:0:24 sdj 8:144 active undef running But when the VM is migrated to the destination, the related information is like the following example since we CANNOT guarantee that all nodes are able to access the same iSCSI portals and the same target LUN number. And the information is used to overwrite connection_info in the DB before the post live migration logic is executed. 
stack@ubuntu-server13:~/devstack$ ls /dev/disk/by-path/*24 /dev/disk/by-path/ip-192.168.3.51:3260-iscsi-iqn.1992-04.com.emc:cx.fnm00124500890.b5-lun-100 /dev/disk/by-path/ip-192.168.4.51:3260-iscsi-iqn.1992-04.com.emc:cx.fnm00124500890.b4-lun-100 stack@ubuntu-server13:~/devstack$ sudo multipath -l 3600601602ba03400921130967724e411 3600601602ba03400921130967724e411 dm-3 DGC,VRAID size=1.0G features='1 queue_if_no_path' hwhandler='1 alua' wp=rw |-+- policy='round-robin 0' prio=-1 status=active | `- 19:0:0:100 sdf 8:176 active undef running `-+- policy='round-robin 0' prio=-1 status=enabled   `- 18:0:0:100 sdg 8:144 active undef running As a result, if post live migration on the source side uses <IP>, <IQN> and <TARGET LUN Number> to find the devices to clean up, it may use 192.168.3.51, iqn.1992-04.com.emc:cx.fnm00124500890.a5 and 100. However, the correct values are 192.168.3.50, iqn.1992-04.com.emc:cx.fnm00124500890.a5 and 24. An approach similar to the one in (https://bugs.launchpad.net/nova/+bug/1327497) can be used to fix it: leverage the unchanged multipath_id to find the correct devices to delete. [Test Case] Live migrate an instance that uses iSCSI multipath. Verify that the correct target is removed on the source hypervisor. [Regression Potential] Low: the fix is already included in the next release (Juno). The change introduces a check on a field already used by Fibre Channel multipath connections, which the iSCSI multipath code path did not use on cleanup. If the check fails, the existing behavior of not cleaning up iSCSI sessions/paths remains.
2016-07-01 23:29:20 Billy Olsen tags in-stable-juno verification-needed verification-needed
2016-07-01 23:29:31 Billy Olsen tags verification-needed in-stable-juno verification-done
2016-07-04 09:57:51 Martin Pitt removed subscriber Ubuntu Stable Release Updates Team
2016-07-04 09:59:50 Launchpad Janitor nova (Ubuntu Trusty): status Fix Committed Fix Released
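
The cleanup idea in the description (look up devices by the unchanged multipath ID rather than by IP, IQN and LUN) can be illustrated with a small sketch. This is not the nova fix itself; the function names `by_path_name` and `paths_for_multipath_id` are hypothetical, and the sample data is the source-node output quoted in the description with its line breaks restored. The parser assumes the `multipath -l` layout shown there (a header line containing the map ID and a `dm-N` name, followed by path lines of the form `H:C:T:L sdX major:minor`).

```python
import re

# Source-node data quoted in the bug description (line breaks restored).
SOURCE_BY_PATH = {
    "/dev/disk/by-path/ip-192.168.3.50:3260-iscsi-"
    "iqn.1992-04.com.emc:cx.fnm00124500890.a5-lun-24",
    "/dev/disk/by-path/ip-192.168.4.51:3260-iscsi-"
    "iqn.1992-04.com.emc:cx.fnm00124500890.b4-lun-24",
}

SOURCE_MULTIPATH_OUTPUT = """\
3600601602ba03400921130967724e411 dm-3 DGC,VRAID
size=1.0G features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
|-+- policy='round-robin 0' prio=-1 status=active
| `- 19:0:0:24 sdl 8:176 active undef running
`-+- policy='round-robin 0' prio=-1 status=enabled
  `- 18:0:0:24 sdj 8:144 active undef running
"""

# Assumed layouts: a map header mentions "dm-N"; a path line carries
# "- host:channel:target:lun device major:minor".
HEADER_LINE = re.compile(r"^\S.*\bdm-\d+\b")
PATH_LINE = re.compile(r"-\s+\d+:\d+:\d+:(\d+)\s+(\w+)\s+\d+:\d+")


def by_path_name(ip, iqn, lun, port=3260):
    """Build the by-path name that an <IP>, <IQN>, <LUN> lookup would use."""
    return f"/dev/disk/by-path/ip-{ip}:{port}-iscsi-{iqn}-lun-{lun}"


def paths_for_multipath_id(output, multipath_id):
    """Return (device, lun) pairs listed under the given multipath ID."""
    paths = []
    in_section = False
    for line in output.splitlines():
        match = PATH_LINE.search(line)
        if match:
            if in_section:
                paths.append((match.group(2), int(match.group(1))))
        elif HEADER_LINE.match(line):
            # Strip parentheses in case the ID is printed as "alias (id)".
            tokens = [tok.strip("()") for tok in line.split()]
            in_section = multipath_id in tokens
    return paths


if __name__ == "__main__":
    iqn = "iqn.1992-04.com.emc:cx.fnm00124500890.a5"
    # Values overwritten by the destination do not match any source device.
    print(by_path_name("192.168.3.51", iqn, 100) in SOURCE_BY_PATH)
    # The original source values do.
    print(by_path_name("192.168.3.50", iqn, 24) in SOURCE_BY_PATH)
    # The multipath ID still identifies the source paths.
    print(paths_for_multipath_id(SOURCE_MULTIPATH_OUTPUT,
                                 "3600601602ba03400921130967724e411"))
```

The point of the sketch is that the by-path lookup depends on values that can differ per node, while the multipath ID is the same on both nodes, so it remains a usable key after connection_info has been overwritten.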