Activity log for bug #1550400

Date Who What changed Old value New value Message
2016-02-26 15:58:52 Andreas Scheuring bug added bug
2016-02-26 16:07:35 Dariusz Smigiel summary Macvtap driver/agent migrates instances on an invalid pyhsical network Macvtap driver/agent migrates instances on an invalid physical network
2016-02-26 16:07:39 Dariusz Smigiel neutron: status New Incomplete
2016-02-26 16:15:12 Andreas Scheuring description More details to come soon Scenario: Host1 has physical_interface_mappings: physnet1:eth0 Host2 has physical_interface_mappings: physnet1:eth1, physnet2:eth0 A live migration of an instance hosted on host1 (connected to physnet1) to host2 succeeds. Libvirt just migrates the whole server with its domain.xml, and the macvtap is plugged into the target's eth0. The instance then no longer has access to its network, but instead has access to another physical network. This behavior is documented, but it needs to be fixed! The idea is to detect an invalid migration in the mechanism driver and in the agent, and to block port binding if an invalid migration is happening. The agent patch is currently in review [1]. We agreed with armax to merge it, enhance the documentation and keep this bugfix for tracking. [1] https://review.openstack.org/#/c/275306
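For illustration only, the mismatch described in the scenario would look roughly like this in the two hosts' macvtap agent configuration (assuming the usual macvtap_agent.ini layout and [macvtap] section; host names and interfaces are the hypothetical ones from the scenario):

    # host1 - /etc/neutron/plugins/ml2/macvtap_agent.ini (assumed path)
    [macvtap]
    physical_interface_mappings = physnet1:eth0

    # host2
    [macvtap]
    physical_interface_mappings = physnet1:eth1,physnet2:eth0

With these mappings, a macvtap port on physnet1 is plugged on eth0 on host1 but on eth1 on host2, while host2's eth0 belongs to physnet2, which is exactly how a migrated domain that still references eth0 ends up on the wrong physical network.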
2016-02-26 16:15:51 Andreas Scheuring neutron: assignee Andreas Scheuring (andreas-scheuring)
2016-02-26 17:12:46 Dariusz Smigiel neutron: status Incomplete In Progress
2016-02-26 17:14:01 Dariusz Smigiel neutron: importance Undecided High
2016-03-01 14:47:17 Andreas Scheuring description Scenario: Host1 has physical_interface_mappings: pyhsnet1:eth0 Host2 has physical_interface_mappings: physnet1:eth1, physnet2=eth0 Now Live migration from an instance hosted on host1 (connected to physnet1) to host2 succeeds. Libvirt just migrates the whole server with its domain.xml and the macvtap is plugged on the targets side eth0. Now the instance does not have access to its network anymore, but access to another physical network. The behavior is documented, however this needs to be fixed! The idea is to detect an invalid migration in the mechanism driver and in the agent and block port_binding if an invalid migration is happening. The agent patch is currently in review [1]. We agreed with armax to merge it, enhance documentation and have this bugfix for tracking. [1] https://review.openstack.org/#/c/275306 Scenario: Host1 has physical_interface_mappings: pyhsnet1:eth0 Host2 has physical_interface_mappings: physnet1:eth1, physnet2=eth0 Now Live migration from an instance hosted on host1 (connected to physnet1) to host2 succeeds. Libvirt just migrates the whole server with its domain.xml and the macvtap is plugged on the targets side eth0. Now the instance does not have access to its network anymore, but access to another physical network. The behavior is documented, however this needs to be fixed! Solve the problem ================= #1 Solve it in Nova pre live migration -------------------------------------- This would allow migration although physical_interface mappings are different. On pre live migration nova should change the binding:host to the migration target. This will trigger the portbinding and the mech driver which will update the vif_details with the right macvtap source device information. Libvirt can then adapt the migration-xml to reflect the changes. Two things a) Currently the update of the binding is done in post migration. Is there a reason for it? Could we already do that in pre-live migration? b) This is a nova change, which will definitively not land in Mitaka. It needs some discussion with nova folks. #2 Solve it in nova post live migration --------------------------------------- Once the migration is started, libvirt takes care of completely executing it. Ideally, libvirt would start migration but would wait with completion, until the portbinding has happened AND until the device has been reported as up. #3 Device renaming in the macvtap agent --------------------------------------- This would allow migration although physical_interface mappings are different. Instead of physical_interface_mapping = physnet1:eth0 use a physical_interface_mac_mapping = physnet1:00:11:22:33:44:55:66 #where 00:11:22:33:44:55:66 is the mac address of the interface to use On agent startup, the agent could rename the associated device to "physnet1" (or to some other generic value) that is consistent cross all hosts! We would need to document that this interface should not be used by any other application (that relies on the interface name) #4 Use generic vlan device names -------------------------------- This solves the problem only for vlan networks! For flat networks it still would exist Today, the agent generates the vlan device names like this: for eth1 eth1.<vlan-id>. We could get rid of this pattern and use network-uuid.vlan instead. Where nework-uuid are the first 10 chars of the id. But this would not solve the issue for flat networks. Therefore the device renaming like proposed in #3 would be required. 
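A minimal sketch of the naming scheme proposed in #4, assuming the device name is built from the first 10 characters of the network UUID plus the VLAN id (the helper name and the length check are illustrative, not taken from the actual agent code):

    def generic_vlan_device_name(network_id, vlan_id):
        """Build a host-independent VLAN device name, e.g. '2be9f80a-e.100'."""
        name = "%s.%s" % (network_id[:10], vlan_id)
        # Linux interface names are limited to 15 characters (IFNAMSIZ - 1).
        assert len(name) <= 15, "device name too long: %s" % name
        return name

    # The same name is produced on every host, regardless of which physical
    # interface backs the physical network there.
    print(generic_vlan_device_name("2be9f80a-e3ac-42e6-9249-5ccca241ad85", 100))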
Prevent invalid migration ========================= #1 Let Port binding fail ------------------------ The idea is to detect an invalid migration in the mechanism driver and let port binding fail. This approach has two problems: a) Port binding happens AFTER the migration has happened. In post live migration, nova requests updating binding:host_id to the target. But at that point the instance is already running on the target host. The binding will fail, but the migration has already happened. b) Detecting a migration in the mech driver is difficult. The idea is to use PortContext.original. This works if we add the original port to the context somewhere in the ml2 plugin (see proposed patch). The problem is that this is not a reliable indicator of a live migration. The original port is also added if scheduling on another node failed before and the current node is now picked. There was no live migration, but the PortContext.original port is set to another host. Maybe this can be solved, but it is worth mentioning here. #2 Let agent detect invalid migration ------------------------------------- An invalid migration could be detected in the agent, so that the agent does not set the device status to up. But this is too late, as the agent only detects the device after the migration has already started. There is no way to stop it then.
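As a rough sketch of the detection idea in #1 ("Let Port binding fail"), not the proposed patch itself; the helper name and the interface_mappings_by_host structure are hypothetical, and a real mechanism driver would obtain the per-host mappings from the agent DB instead:

    # Hypothetical check inside an ML2 mechanism driver's bind_port(): refuse
    # to bind when the host changed and the target host would plug the port on
    # a different physical interface than the source host did.
    def is_invalid_macvtap_migration(context, interface_mappings_by_host):
        original = context.original  # only populated if ml2 passes it in
        if not original:
            return False
        old_host = original.get('binding:host_id')
        new_host = context.current.get('binding:host_id')
        if not old_host or old_host == new_host:
            return False
        physnet = context.network.current.get('provider:physical_network')
        old_if = interface_mappings_by_host.get(old_host, {}).get(physnet)
        new_if = interface_mappings_by_host.get(new_host, {}).get(physnet)
        # Caveat from the bug: a differing original host does not necessarily
        # mean a live migration (e.g. a previously failed scheduling attempt),
        # so this check alone can produce false positives.
        return old_if != new_if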
2016-03-12 01:04:31 Armando Migliaccio neutron: milestone newton-1
2016-04-08 09:51:21 Andreas Scheuring description Scenario: Host1 has physical_interface_mappings: pyhsnet1:eth0 Host2 has physical_interface_mappings: physnet1:eth1, physnet2=eth0 Now Live migration from an instance hosted on host1 (connected to physnet1) to host2 succeeds. Libvirt just migrates the whole server with its domain.xml and the macvtap is plugged on the targets side eth0. Now the instance does not have access to its network anymore, but access to another physical network. The behavior is documented, however this needs to be fixed! Solve the problem ================= #1 Solve it in Nova pre live migration -------------------------------------- This would allow migration although physical_interface mappings are different. On pre live migration nova should change the binding:host to the migration target. This will trigger the portbinding and the mech driver which will update the vif_details with the right macvtap source device information. Libvirt can then adapt the migration-xml to reflect the changes. Two things a) Currently the update of the binding is done in post migration. Is there a reason for it? Could we already do that in pre-live migration? b) This is a nova change, which will definitively not land in Mitaka. It needs some discussion with nova folks. #2 Solve it in nova post live migration --------------------------------------- Once the migration is started, libvirt takes care of completely executing it. Ideally, libvirt would start migration but would wait with completion, until the portbinding has happened AND until the device has been reported as up. #3 Device renaming in the macvtap agent --------------------------------------- This would allow migration although physical_interface mappings are different. Instead of physical_interface_mapping = physnet1:eth0 use a physical_interface_mac_mapping = physnet1:00:11:22:33:44:55:66 #where 00:11:22:33:44:55:66 is the mac address of the interface to use On agent startup, the agent could rename the associated device to "physnet1" (or to some other generic value) that is consistent cross all hosts! We would need to document that this interface should not be used by any other application (that relies on the interface name) #4 Use generic vlan device names -------------------------------- This solves the problem only for vlan networks! For flat networks it still would exist Today, the agent generates the vlan device names like this: for eth1 eth1.<vlan-id>. We could get rid of this pattern and use network-uuid.vlan instead. Where nework-uuid are the first 10 chars of the id. But this would not solve the issue for flat networks. Therefore the device renaming like proposed in #3 would be required. Prevent invalid migration ========================= #1 Let Port binding fail ------------------------ The idea is to detect an invalid migration in the mechanism driver and let port binding fail. This approach has two problems a) Portbinding happens AFTER Migration happened. In post live migration nova requests to update the binding:host-id to the target. But when doing so, the instance is already running on the target host. The binding will fail, but the migration happened, though. b) Detecting a migration in the mech driver is difficult. The idea is to use the PortContext.original port. This works if we add the original port to the context somewhere in the ml2 plugin (see proposed patch). The problem is that this is not an indicator for a live migration. 
The original port will also be added if scheduling on another node failed before and now the current node is picked. There was no live migration but the PortContext.original port is set to another host. Maybe this can be solved but it's worthwhile mentioning here. #2 Let agent detect invalid migration ------------------------------------- An invalid migration could be detected in the agent to avoid the agent setting the device status to up. But this is too late, as the agent detects the device after migration already started. There is no way to stop it again. Scenario1 - Migration on wrong physical network =============================================== Host1 has physical_interface_mappings: pyhsnet1:eth0, physnet2=eth2 Host2 has physical_interface_mappings: physnet1:eth1, physnet2=eth0 Now Live migration from an instance hosted on host1 (connected to physnet1) to host2 succeeds. Libvirt just migrates the whole server with its domain.xml and the macvtap is plugged on the targets side eth0. Now the instance does not have access to its network anymore, but access to another physical network. The behavior is documented, however this needs to be fixed! Scenario2 - Migration fails =========================== Host1 has physical_interface_mappings: pyhsnet1:eth0 Host2 has physical_interface_mappings: physnet1:eth1 Let's assume a vlan setup. Let's assume a migration from host1 to host2. Host to does NOT have a interface eth0. Migration will fail in instance will remain active on the source, as nova plug on host2 failed to create a vlan device on eth0. If you have a flat network - definition of he libvirt xml will fail on host2. Two approaches are thinkable * Solve the problem (Scenario 1+2) * just prevent such an invalid migration (Scenario 1, and live with the fact of a failing migration in scenario 2) Solve the problem ================= #1 Solve it in Nova pre live migration -------------------------------------- This would allow migration although physical_interface mappings are different. On pre live migration nova should change the binding:host to the migration target. This will trigger the portbinding and the mech driver which will update the vif_details with the right macvtap source device information. Libvirt can then adapt the migration-xml to reflect the changes. Currently the update of the binding is done in post migration, after migration succeeded. Can we already do it in pre_live_migration and on failure undo it in rollback ? - There's no issue for the reference implementations - But there might be some external SDN Controllers, that might shut down ports on the source host as soon as the host_id is being updated. The alternative would be to allow a port to be bound to multiple hosts simultaneously. So in pre_live migration, nova would add a binding for the target host and in post_live_migration it would remove the original binding. This would require - simultaneous port binding. This will be achieved by https://bugs.launchpad.net/neutron/+bug/1367391 - allow such a binding for compute ports as well - Update APIs to reflect multiple port_bindings - Create / Update / Show Port - host_id is not reflect for DVR ports today [1] #2 Solve it in nova post live migration --------------------------------------- Once the migration is started, libvirt takes care of completely executing it. Ideally, libvirt would start migration but would wait with completion, until the portbinding has happened AND until the device has been reported as up. 
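On the Neutron side, the binding update that option #1 above relies on is just a port update of binding:host_id; roughly as follows (a sketch with python-neutronclient, with placeholder credentials and port id; in Nova this would go through the network API layer rather than a direct client call):

    from neutronclient.v2_0 import client

    neutron = client.Client(username='admin', password='secret',
                            tenant_name='admin',
                            auth_url='http://controller:5000/v2.0')

    port_id = 'UUID-of-the-instance-port'  # placeholder

    # Re-trigger port binding for the migration target host. The ML2 plugin
    # then runs the mechanism drivers again and refreshes binding:vif_details
    # (for macvtap: the source device to use on the target host).
    neutron.update_port(port_id, {'port': {'binding:host_id': 'host2'}})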
#3 Device renaming in the macvtap agent --------------------------------------- This would allow migration although physical_interface mappings are different. Instead of      physical_interface_mapping = physnet1:eth0 use a      physical_interface_mac_mapping = physnet1:00:11:22:33:44:55:66 #where 00:11:22:33:44:55:66 is the mac address of the interface to use On agent startup, the agent could rename the associated device to "physnet1" (or to some other generic value) that is consistent cross all hosts! We would need to document that this interface should not be used by any other application (that relies on the interface name) #4 Use generic vlan device names -------------------------------- This solves the problem only for vlan networks! For flat networks it still would exist Today, the agent generates the vlan device names like this: for eth1 eth1.<vlan-id>. We could get rid of this pattern and use network-uuid.vlan instead. Where nework-uuid are the first 10 chars of the id. But this would not solve the issue for flat networks. Therefore the device renaming like proposed in #3 would be required. Prevent invalid migration ========================= #1 Let Port binding fail ------------------------ The idea is to detect an invalid migration in the mechanism driver and let port binding fail. This approach has two problems a) Portbinding happens AFTER Migration happened. In post live migration nova requests to update the binding:host-id to the target. But when doing so, the instance is already running on the target host. The binding will fail, but the migration happened, though. b) Detecting a migration in the mech driver is difficult. The idea is to use the PortContext.original port. This works if we add the original port to the context somewhere in the ml2 plugin (see proposed patch). The problem is that this is not an indicator for a live migration. The original port will also be added if scheduling on another node failed before and now the current node is picked. There was no live migration but the PortContext.original port is set to another host. Maybe this can be solved but it's worthwhile mentioning here. #2 Let agent detect invalid migration ------------------------------------- An invalid migration could be detected in the agent to avoid the agent setting the device status to up. But this is too late, as the agent detects the device after migration already started. There is no way to stop it again. References ========== [1] curl -g -i -X GET http://192.168.122.30:9696/v2.0/ports/b780af01-a3c6-4279-a355-fa3289bc1ec3.json {"port": {"status": "ACTIVE", "binding:host_id": "", "description": "", "allowed_address_pairs": [], "extra_dhcp_opts": [], "updated_at": "2016-04-08T08:59:08", "device_owner": "network:router_interface_distributed", "port_security_enabled": false, "binding:profile": {}, "fixed_ips": [{"subnet_id": "7c57f270-46a9-4cde-a2a7-82e66da1d084", "ip_address": "10.0.0.1"}], "id": "b780af01-a3c6-4279-a355-fa3289bc1ec3", "security_groups": [], "device_id": "48120dd2-2523-4b11-9f67-8ea944d11012", "name": "", "admin_state_up": true, "network_id": "2be9f80a-e3ac-42e6-9249-5ccca241ad85", "dns_name": null, "binding:vif_details": {}, "binding:vnic_type": "normal", "binding:vif_type": "distributed", "tenant_id": "1c9d0fc21afc40a2959d3d3d4acca528", "mac_address": "fa:16:3e:a7:d8:80", "created_at": "2016-04-07T15:05:57"}}
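A minimal sketch of how the MAC-based mapping from option #3 could be resolved to an interface name on agent startup (a plain sysfs lookup; the physical_interface_mac_mapping option and the rename step are only the idea from the description, not existing agent code):

    import os

    def find_interface_by_mac(mac):
        """Return the kernel name of the interface with the given MAC, if any."""
        for dev in os.listdir('/sys/class/net'):
            try:
                with open('/sys/class/net/%s/address' % dev) as f:
                    if f.read().strip().lower() == mac.lower():
                        return dev
            except IOError:
                continue
        return None

    # e.g. resolve 00:11:22:33:44:55 -> 'eth0' on this host; the agent could
    # then rename the device (e.g. 'ip link set eth0 name physnet1') so the
    # name is consistent across all hosts.
    print(find_interface_by_mac('00:11:22:33:44:55'))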
2016-04-08 09:52:16 Andreas Scheuring description Scenario1 - Migration on wrong physical network =============================================== Host1 has physical_interface_mappings: pyhsnet1:eth0, physnet2=eth2 Host2 has physical_interface_mappings: physnet1:eth1, physnet2=eth0 Now Live migration from an instance hosted on host1 (connected to physnet1) to host2 succeeds. Libvirt just migrates the whole server with its domain.xml and the macvtap is plugged on the targets side eth0. Now the instance does not have access to its network anymore, but access to another physical network. The behavior is documented, however this needs to be fixed! Scenario2 - Migration fails =========================== Host1 has physical_interface_mappings: pyhsnet1:eth0 Host2 has physical_interface_mappings: physnet1:eth1 Let's assume a vlan setup. Let's assume a migration from host1 to host2. Host to does NOT have a interface eth0. Migration will fail in instance will remain active on the source, as nova plug on host2 failed to create a vlan device on eth0. If you have a flat network - definition of he libvirt xml will fail on host2. Two approaches are thinkable * Solve the problem (Scenario 1+2) * just prevent such an invalid migration (Scenario 1, and live with the fact of a failing migration in scenario 2) Solve the problem ================= #1 Solve it in Nova pre live migration -------------------------------------- This would allow migration although physical_interface mappings are different. On pre live migration nova should change the binding:host to the migration target. This will trigger the portbinding and the mech driver which will update the vif_details with the right macvtap source device information. Libvirt can then adapt the migration-xml to reflect the changes. Currently the update of the binding is done in post migration, after migration succeeded. Can we already do it in pre_live_migration and on failure undo it in rollback ? - There's no issue for the reference implementations - But there might be some external SDN Controllers, that might shut down ports on the source host as soon as the host_id is being updated. The alternative would be to allow a port to be bound to multiple hosts simultaneously. So in pre_live migration, nova would add a binding for the target host and in post_live_migration it would remove the original binding. This would require - simultaneous port binding. This will be achieved by https://bugs.launchpad.net/neutron/+bug/1367391 - allow such a binding for compute ports as well - Update APIs to reflect multiple port_bindings - Create / Update / Show Port - host_id is not reflect for DVR ports today [1] #2 Solve it in nova post live migration --------------------------------------- Once the migration is started, libvirt takes care of completely executing it. Ideally, libvirt would start migration but would wait with completion, until the portbinding has happened AND until the device has been reported as up. #3 Device renaming in the macvtap agent --------------------------------------- This would allow migration although physical_interface mappings are different. Instead of      physical_interface_mapping = physnet1:eth0 use a      physical_interface_mac_mapping = physnet1:00:11:22:33:44:55:66 #where 00:11:22:33:44:55:66 is the mac address of the interface to use On agent startup, the agent could rename the associated device to "physnet1" (or to some other generic value) that is consistent cross all hosts! 
We would need to document that this interface should not be used by any other application (that relies on the interface name) #4 Use generic vlan device names -------------------------------- This solves the problem only for vlan networks! For flat networks it still would exist Today, the agent generates the vlan device names like this: for eth1 eth1.<vlan-id>. We could get rid of this pattern and use network-uuid.vlan instead. Where nework-uuid are the first 10 chars of the id. But this would not solve the issue for flat networks. Therefore the device renaming like proposed in #3 would be required. Prevent invalid migration ========================= #1 Let Port binding fail ------------------------ The idea is to detect an invalid migration in the mechanism driver and let port binding fail. This approach has two problems a) Portbinding happens AFTER Migration happened. In post live migration nova requests to update the binding:host-id to the target. But when doing so, the instance is already running on the target host. The binding will fail, but the migration happened, though. b) Detecting a migration in the mech driver is difficult. The idea is to use the PortContext.original port. This works if we add the original port to the context somewhere in the ml2 plugin (see proposed patch). The problem is that this is not an indicator for a live migration. The original port will also be added if scheduling on another node failed before and now the current node is picked. There was no live migration but the PortContext.original port is set to another host. Maybe this can be solved but it's worthwhile mentioning here. #2 Let agent detect invalid migration ------------------------------------- An invalid migration could be detected in the agent to avoid the agent setting the device status to up. But this is too late, as the agent detects the device after migration already started. There is no way to stop it again. References ========== [1] curl -g -i -X GET http://192.168.122.30:9696/v2.0/ports/b780af01-a3c6-4279-a355-fa3289bc1ec3.json {"port": {"status": "ACTIVE", "binding:host_id": "", "description": "", "allowed_address_pairs": [], "extra_dhcp_opts": [], "updated_at": "2016-04-08T08:59:08", "device_owner": "network:router_interface_distributed", "port_security_enabled": false, "binding:profile": {}, "fixed_ips": [{"subnet_id": "7c57f270-46a9-4cde-a2a7-82e66da1d084", "ip_address": "10.0.0.1"}], "id": "b780af01-a3c6-4279-a355-fa3289bc1ec3", "security_groups": [], "device_id": "48120dd2-2523-4b11-9f67-8ea944d11012", "name": "", "admin_state_up": true, "network_id": "2be9f80a-e3ac-42e6-9249-5ccca241ad85", "dns_name": null, "binding:vif_details": {}, "binding:vnic_type": "normal", "binding:vif_type": "distributed", "tenant_id": "1c9d0fc21afc40a2959d3d3d4acca528", "mac_address": "fa:16:3e:a7:d8:80", "created_at": "2016-04-07T15:05:57"}} Scenario1 - Migration on wrong physical network - High Prio =========================================================== Host1 has physical_interface_mappings: pyhsnet1:eth0, physnet2=eth2 Host2 has physical_interface_mappings: physnet1:eth1, physnet2=eth0 Now Live migration from an instance hosted on host1 (connected to physnet1) to host2 succeeds. Libvirt just migrates the whole server with its domain.xml and the macvtap is plugged on the targets side eth0. Now the instance does not have access to its network anymore, but access to another physical network. The behavior is documented, however this needs to be fixed! 
Scenario2 - Migration fails - Low Prio ====================================== Host1 has physical_interface_mappings: pyhsnet1:eth0 Host2 has physical_interface_mappings: physnet1:eth1 Let's assume a vlan setup. Let's assume a migration from host1 to host2. Host to does NOT have a interface eth0. Migration will fail in instance will remain active on the source, as nova plug on host2 failed to create a vlan device on eth0. If you have a flat network - definition of he libvirt xml will fail on host2. Two approaches are thinkable * Solve the problem (Scenario 1+2) * just prevent such an invalid migration (let scenario 1 fail like scenario 2 fails today) Solve the problem ================= #1 Solve it in Nova pre live migration -------------------------------------- This would allow migration although physical_interface mappings are different. On pre live migration nova should change the binding:host to the migration target. This will trigger the portbinding and the mech driver which will update the vif_details with the right macvtap source device information. Libvirt can then adapt the migration-xml to reflect the changes. Currently the update of the binding is done in post migration, after migration succeeded. Can we already do it in pre_live_migration and on failure undo it in rollback ? - There's no issue for the reference implementations - But there might be some external SDN Controllers, that might shut down ports on the source host as soon as the host_id is being updated. The alternative would be to allow a port to be bound to multiple hosts simultaneously. So in pre_live migration, nova would add a binding for the target host and in post_live_migration it would remove the original binding. This would require - simultaneous port binding. This will be achieved by https://bugs.launchpad.net/neutron/+bug/1367391 - allow such a binding for compute ports as well - Update APIs to reflect multiple port_bindings   - Create / Update / Show Port   - host_id is not reflect for DVR ports today [1] #2 Solve it in nova post live migration --------------------------------------- Once the migration is started, libvirt takes care of completely executing it. Ideally, libvirt would start migration but would wait with completion, until the portbinding has happened AND until the device has been reported as up. #3 Device renaming in the macvtap agent --------------------------------------- This would allow migration although physical_interface mappings are different. Instead of      physical_interface_mapping = physnet1:eth0 use a      physical_interface_mac_mapping = physnet1:00:11:22:33:44:55:66 #where 00:11:22:33:44:55:66 is the mac address of the interface to use On agent startup, the agent could rename the associated device to "physnet1" (or to some other generic value) that is consistent cross all hosts! We would need to document that this interface should not be used by any other application (that relies on the interface name) #4 Use generic vlan device names -------------------------------- This solves the problem only for vlan networks! For flat networks it still would exist Today, the agent generates the vlan device names like this: for eth1 eth1.<vlan-id>. We could get rid of this pattern and use network-uuid.vlan instead. Where nework-uuid are the first 10 chars of the id. But this would not solve the issue for flat networks. Therefore the device renaming like proposed in #3 would be required. 
Prevent invalid migration ========================= #1 Let Port binding fail ------------------------ The idea is to detect an invalid migration in the mechanism driver and let port binding fail. This approach has two problems a) Portbinding happens AFTER Migration happened. In post live migration nova requests to update the binding:host-id to the target. But when doing so, the instance is already running on the target host. The binding will fail, but the migration happened, though. b) Detecting a migration in the mech driver is difficult. The idea is to use the PortContext.original port. This works if we add the original port to the context somewhere in the ml2 plugin (see proposed patch). The problem is that this is not an indicator for a live migration. The original port will also be added if scheduling on another node failed before and now the current node is picked. There was no live migration but the PortContext.original port is set to another host. Maybe this can be solved but it's worthwhile mentioning here. #2 Let agent detect invalid migration ------------------------------------- An invalid migration could be detected in the agent to avoid the agent setting the device status to up. But this is too late, as the agent detects the device after migration already started. There is no way to stop it again. References ========== [1] curl -g -i -X GET http://192.168.122.30:9696/v2.0/ports/b780af01-a3c6-4279-a355-fa3289bc1ec3.json {"port": {"status": "ACTIVE", "binding:host_id": "", "description": "", "allowed_address_pairs": [], "extra_dhcp_opts": [], "updated_at": "2016-04-08T08:59:08", "device_owner": "network:router_interface_distributed", "port_security_enabled": false, "binding:profile": {}, "fixed_ips": [{"subnet_id": "7c57f270-46a9-4cde-a2a7-82e66da1d084", "ip_address": "10.0.0.1"}], "id": "b780af01-a3c6-4279-a355-fa3289bc1ec3", "security_groups": [], "device_id": "48120dd2-2523-4b11-9f67-8ea944d11012", "name": "", "admin_state_up": true, "network_id": "2be9f80a-e3ac-42e6-9249-5ccca241ad85", "dns_name": null, "binding:vif_details": {}, "binding:vnic_type": "normal", "binding:vif_type": "distributed", "tenant_id": "1c9d0fc21afc40a2959d3d3d4acca528", "mac_address": "fa:16:3e:a7:d8:80", "created_at": "2016-04-07T15:05:57"}}
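For readability, the port show from reference [1] can be fetched and pretty-printed; a small sketch (endpoint and port id copied from the reference, the auth token is a placeholder, as a token header would normally be required):

    import json
    import requests

    url = ('http://192.168.122.30:9696/v2.0/ports/'
           'b780af01-a3c6-4279-a355-fa3289bc1ec3.json')
    resp = requests.get(url, headers={'X-Auth-Token': 'ADMIN_TOKEN'})
    port = resp.json()['port']

    # The point of reference [1]: binding:host_id is empty for this DVR port.
    print(json.dumps({'binding:host_id': port['binding:host_id'],
                      'binding:vif_type': port['binding:vif_type']},
                     indent=2))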
2016-04-08 09:54:49 Andreas Scheuring description Scenario1 - Migration on wrong physical network - High Prio =========================================================== Host1 has physical_interface_mappings: pyhsnet1:eth0, physnet2=eth2 Host2 has physical_interface_mappings: physnet1:eth1, physnet2=eth0 Now Live migration from an instance hosted on host1 (connected to physnet1) to host2 succeeds. Libvirt just migrates the whole server with its domain.xml and the macvtap is plugged on the targets side eth0. Now the instance does not have access to its network anymore, but access to another physical network. The behavior is documented, however this needs to be fixed! Scenario2 - Migration fails - Low Prio ====================================== Host1 has physical_interface_mappings: pyhsnet1:eth0 Host2 has physical_interface_mappings: physnet1:eth1 Let's assume a vlan setup. Let's assume a migration from host1 to host2. Host to does NOT have a interface eth0. Migration will fail in instance will remain active on the source, as nova plug on host2 failed to create a vlan device on eth0. If you have a flat network - definition of he libvirt xml will fail on host2. Two approaches are thinkable * Solve the problem (Scenario 1+2) * just prevent such an invalid migration (let scenario 1 fail like scenario 2 fails today) Solve the problem ================= #1 Solve it in Nova pre live migration -------------------------------------- This would allow migration although physical_interface mappings are different. On pre live migration nova should change the binding:host to the migration target. This will trigger the portbinding and the mech driver which will update the vif_details with the right macvtap source device information. Libvirt can then adapt the migration-xml to reflect the changes. Currently the update of the binding is done in post migration, after migration succeeded. Can we already do it in pre_live_migration and on failure undo it in rollback ? - There's no issue for the reference implementations - But there might be some external SDN Controllers, that might shut down ports on the source host as soon as the host_id is being updated. The alternative would be to allow a port to be bound to multiple hosts simultaneously. So in pre_live migration, nova would add a binding for the target host and in post_live_migration it would remove the original binding. This would require - simultaneous port binding. This will be achieved by https://bugs.launchpad.net/neutron/+bug/1367391 - allow such a binding for compute ports as well - Update APIs to reflect multiple port_bindings   - Create / Update / Show Port   - host_id is not reflect for DVR ports today [1] #2 Solve it in nova post live migration --------------------------------------- Once the migration is started, libvirt takes care of completely executing it. Ideally, libvirt would start migration but would wait with completion, until the portbinding has happened AND until the device has been reported as up. #3 Device renaming in the macvtap agent --------------------------------------- This would allow migration although physical_interface mappings are different. Instead of      physical_interface_mapping = physnet1:eth0 use a      physical_interface_mac_mapping = physnet1:00:11:22:33:44:55:66 #where 00:11:22:33:44:55:66 is the mac address of the interface to use On agent startup, the agent could rename the associated device to "physnet1" (or to some other generic value) that is consistent cross all hosts! 
We would need to document that this interface should not be used by any other application (that relies on the interface name) #4 Use generic vlan device names -------------------------------- This solves the problem only for vlan networks! For flat networks it still would exist Today, the agent generates the vlan device names like this: for eth1 eth1.<vlan-id>. We could get rid of this pattern and use network-uuid.vlan instead. Where nework-uuid are the first 10 chars of the id. But this would not solve the issue for flat networks. Therefore the device renaming like proposed in #3 would be required. Prevent invalid migration ========================= #1 Let Port binding fail ------------------------ The idea is to detect an invalid migration in the mechanism driver and let port binding fail. This approach has two problems a) Portbinding happens AFTER Migration happened. In post live migration nova requests to update the binding:host-id to the target. But when doing so, the instance is already running on the target host. The binding will fail, but the migration happened, though. b) Detecting a migration in the mech driver is difficult. The idea is to use the PortContext.original port. This works if we add the original port to the context somewhere in the ml2 plugin (see proposed patch). The problem is that this is not an indicator for a live migration. The original port will also be added if scheduling on another node failed before and now the current node is picked. There was no live migration but the PortContext.original port is set to another host. Maybe this can be solved but it's worthwhile mentioning here. #2 Let agent detect invalid migration ------------------------------------- An invalid migration could be detected in the agent to avoid the agent setting the device status to up. But this is too late, as the agent detects the device after migration already started. There is no way to stop it again. References ========== [1] curl -g -i -X GET http://192.168.122.30:9696/v2.0/ports/b780af01-a3c6-4279-a355-fa3289bc1ec3.json {"port": {"status": "ACTIVE", "binding:host_id": "", "description": "", "allowed_address_pairs": [], "extra_dhcp_opts": [], "updated_at": "2016-04-08T08:59:08", "device_owner": "network:router_interface_distributed", "port_security_enabled": false, "binding:profile": {}, "fixed_ips": [{"subnet_id": "7c57f270-46a9-4cde-a2a7-82e66da1d084", "ip_address": "10.0.0.1"}], "id": "b780af01-a3c6-4279-a355-fa3289bc1ec3", "security_groups": [], "device_id": "48120dd2-2523-4b11-9f67-8ea944d11012", "name": "", "admin_state_up": true, "network_id": "2be9f80a-e3ac-42e6-9249-5ccca241ad85", "dns_name": null, "binding:vif_details": {}, "binding:vnic_type": "normal", "binding:vif_type": "distributed", "tenant_id": "1c9d0fc21afc40a2959d3d3d4acca528", "mac_address": "fa:16:3e:a7:d8:80", "created_at": "2016-04-07T15:05:57"}} Scenario1 - Migration on wrong physical network - High Prio =========================================================== Host1 has physical_interface_mappings: pyhsnet1:eth0, physnet2=eth2 Host2 has physical_interface_mappings: physnet1:eth1, physnet2=eth0 Now Live migration from an instance hosted on host1 (connected to physnet1) to host2 succeeds. Libvirt just migrates the whole server with its domain.xml and the macvtap is plugged on the targets side eth0. Now the instance does not have access to its network anymore, but access to another physical network. The behavior is documented, however this needs to be fixed! 
Scenario2 - Migration fails - Low Prio ====================================== Host1 has physical_interface_mappings: pyhsnet1:eth0 Host2 has physical_interface_mappings: physnet1:eth1 Let's assume a vlan setup. Let's assume a migration from host1 to host2. Host to does NOT have a interface eth0. Migration will fail in instance will remain active on the source, as nova plug on host2 failed to create a vlan device on eth0. If you have a flat network - definition of he libvirt xml will fail on host2. Two approaches are thinkable * Solve the problem (Scenario 1+2) * just prevent such an invalid migration (let scenario 1 fail like scenario 2 fails today) Solve the problem ================= #1 Solve it in Nova pre live migration -------------------------------------- This would allow migration although physical_interface mappings are different. a) On pre live migration nova should change the binding:host to the migration target. This will trigger the portbinding and the mech driver which will update the vif_details with the right macvtap source device information. Libvirt can then adapt the migration-xml to reflect the changes. Currently the update of the binding is done in post migration, after migration succeeded. Can we already do it in pre_live_migration and on failure undo it in rollback ? - There's no issue for the reference implementations - But there might be some external SDN Controllers, that might shut down ports on the source host as soon as the host_id is being updated. b) The alternative would be to allow a port to be bound to multiple hosts simultaneously. So in pre_live migration, nova would add a binding for the target host and in post_live_migration it would remove the original binding. This would require - simultaneous port binding. This will be achieved by https://bugs.launchpad.net/neutron/+bug/1367391 - allow such a binding for compute ports as well - Update APIs to reflect multiple port_bindings   - Create / Update / Show Port   - host_id is not reflect for DVR ports today [1] #2 Solve it in nova post live migration --------------------------------------- Once the migration is started, libvirt takes care of completely executing it. Ideally, libvirt would start migration but would wait with completion, until the portbinding has happened AND until the device has been reported as up. #3 Device renaming in the macvtap agent --------------------------------------- This would allow migration although physical_interface mappings are different. Instead of      physical_interface_mapping = physnet1:eth0 use a      physical_interface_mac_mapping = physnet1:00:11:22:33:44:55:66 #where 00:11:22:33:44:55:66 is the mac address of the interface to use On agent startup, the agent could rename the associated device to "physnet1" (or to some other generic value) that is consistent cross all hosts! We would need to document that this interface should not be used by any other application (that relies on the interface name) #4 Use generic vlan device names -------------------------------- This solves the problem only for vlan networks! For flat networks it still would exist Today, the agent generates the vlan device names like this: for eth1 eth1.<vlan-id>. We could get rid of this pattern and use network-uuid.vlan instead. Where nework-uuid are the first 10 chars of the id. But this would not solve the issue for flat networks. Therefore the device renaming like proposed in #3 would be required. 
Prevent invalid migration ========================= #1 Let Port binding fail (not working) ------------------------------------- The idea is to detect an invalid migration in the mechanism driver and let port binding fail. This approach has two problems a) Portbinding happens AFTER Migration happened. In post live migration nova requests to update the binding:host-id to the target. But when doing so, the instance is already running on the target host. The binding will fail, but the migration happened, though. b) Detecting a migration in the mech driver is difficult. The idea is to use the PortContext.original port. This works if we add the original port to the context somewhere in the ml2 plugin (see proposed patch). The problem is that this is not an indicator for a live migration. The original port will also be added if scheduling on another node failed before and now the current node is picked. There was no live migration but the PortContext.original port is set to another host. Maybe this can be solved but it's worthwhile mentioning here. see patch https://review.openstack.org/293404 #2 Let agent detect invalid migration (not working) --------------------------------------------------- An invalid migration could be detected in the agent to avoid the agent setting the device status to up. But this is too late, as the agent detects the device after migration already started. There is no way to stop it again. see patch https://review.openstack.org/293403 References ========== [1] curl -g -i -X GET http://192.168.122.30:9696/v2.0/ports/b780af01-a3c6-4279-a355-fa3289bc1ec3.json {"port": {"status": "ACTIVE", "binding:host_id": "", "description": "", "allowed_address_pairs": [], "extra_dhcp_opts": [], "updated_at": "2016-04-08T08:59:08", "device_owner": "network:router_interface_distributed", "port_security_enabled": false, "binding:profile": {}, "fixed_ips": [{"subnet_id": "7c57f270-46a9-4cde-a2a7-82e66da1d084", "ip_address": "10.0.0.1"}], "id": "b780af01-a3c6-4279-a355-fa3289bc1ec3", "security_groups": [], "device_id": "48120dd2-2523-4b11-9f67-8ea944d11012", "name": "", "admin_state_up": true, "network_id": "2be9f80a-e3ac-42e6-9249-5ccca241ad85", "dns_name": null, "binding:vif_details": {}, "binding:vnic_type": "normal", "binding:vif_type": "distributed", "tenant_id": "1c9d0fc21afc40a2959d3d3d4acca528", "mac_address": "fa:16:3e:a7:d8:80", "created_at": "2016-04-07T15:05:57"}}
2016-04-08 10:56:05 Andreas Scheuring description Scenario1 - Migration on wrong physical network - High Prio =========================================================== Host1 has physical_interface_mappings: pyhsnet1:eth0, physnet2=eth2 Host2 has physical_interface_mappings: physnet1:eth1, physnet2=eth0 Now Live migration from an instance hosted on host1 (connected to physnet1) to host2 succeeds. Libvirt just migrates the whole server with its domain.xml and the macvtap is plugged on the targets side eth0. Now the instance does not have access to its network anymore, but access to another physical network. The behavior is documented, however this needs to be fixed! Scenario2 - Migration fails - Low Prio ====================================== Host1 has physical_interface_mappings: pyhsnet1:eth0 Host2 has physical_interface_mappings: physnet1:eth1 Let's assume a vlan setup. Let's assume a migration from host1 to host2. Host to does NOT have a interface eth0. Migration will fail in instance will remain active on the source, as nova plug on host2 failed to create a vlan device on eth0. If you have a flat network - definition of he libvirt xml will fail on host2. Two approaches are thinkable * Solve the problem (Scenario 1+2) * just prevent such an invalid migration (let scenario 1 fail like scenario 2 fails today) Solve the problem ================= #1 Solve it in Nova pre live migration -------------------------------------- This would allow migration although physical_interface mappings are different. a) On pre live migration nova should change the binding:host to the migration target. This will trigger the portbinding and the mech driver which will update the vif_details with the right macvtap source device information. Libvirt can then adapt the migration-xml to reflect the changes. Currently the update of the binding is done in post migration, after migration succeeded. Can we already do it in pre_live_migration and on failure undo it in rollback ? - There's no issue for the reference implementations - But there might be some external SDN Controllers, that might shut down ports on the source host as soon as the host_id is being updated. b) The alternative would be to allow a port to be bound to multiple hosts simultaneously. So in pre_live migration, nova would add a binding for the target host and in post_live_migration it would remove the original binding. This would require - simultaneous port binding. This will be achieved by https://bugs.launchpad.net/neutron/+bug/1367391 - allow such a binding for compute ports as well - Update APIs to reflect multiple port_bindings   - Create / Update / Show Port   - host_id is not reflect for DVR ports today [1] #2 Solve it in nova post live migration --------------------------------------- Once the migration is started, libvirt takes care of completely executing it. Ideally, libvirt would start migration but would wait with completion, until the portbinding has happened AND until the device has been reported as up. #3 Device renaming in the macvtap agent --------------------------------------- This would allow migration although physical_interface mappings are different. Instead of      physical_interface_mapping = physnet1:eth0 use a      physical_interface_mac_mapping = physnet1:00:11:22:33:44:55:66 #where 00:11:22:33:44:55:66 is the mac address of the interface to use On agent startup, the agent could rename the associated device to "physnet1" (or to some other generic value) that is consistent cross all hosts! 
We would need to document that this interface should not be used by any other application (that relies on the interface name) #4 Use generic vlan device names -------------------------------- This solves the problem only for vlan networks! For flat networks it still would exist Today, the agent generates the vlan device names like this: for eth1 eth1.<vlan-id>. We could get rid of this pattern and use network-uuid.vlan instead. Where nework-uuid are the first 10 chars of the id. But this would not solve the issue for flat networks. Therefore the device renaming like proposed in #3 would be required. Prevent invalid migration ========================= #1 Let Port binding fail (not working) ------------------------------------- The idea is to detect an invalid migration in the mechanism driver and let port binding fail. This approach has two problems a) Portbinding happens AFTER Migration happened. In post live migration nova requests to update the binding:host-id to the target. But when doing so, the instance is already running on the target host. The binding will fail, but the migration happened, though. b) Detecting a migration in the mech driver is difficult. The idea is to use the PortContext.original port. This works if we add the original port to the context somewhere in the ml2 plugin (see proposed patch). The problem is that this is not an indicator for a live migration. The original port will also be added if scheduling on another node failed before and now the current node is picked. There was no live migration but the PortContext.original port is set to another host. Maybe this can be solved but it's worthwhile mentioning here. see patch https://review.openstack.org/293404 #2 Let agent detect invalid migration (not working) --------------------------------------------------- An invalid migration could be detected in the agent to avoid the agent setting the device status to up. But this is too late, as the agent detects the device after migration already started. There is no way to stop it again. see patch https://review.openstack.org/293403 References ========== [1] curl -g -i -X GET http://192.168.122.30:9696/v2.0/ports/b780af01-a3c6-4279-a355-fa3289bc1ec3.json {"port": {"status": "ACTIVE", "binding:host_id": "", "description": "", "allowed_address_pairs": [], "extra_dhcp_opts": [], "updated_at": "2016-04-08T08:59:08", "device_owner": "network:router_interface_distributed", "port_security_enabled": false, "binding:profile": {}, "fixed_ips": [{"subnet_id": "7c57f270-46a9-4cde-a2a7-82e66da1d084", "ip_address": "10.0.0.1"}], "id": "b780af01-a3c6-4279-a355-fa3289bc1ec3", "security_groups": [], "device_id": "48120dd2-2523-4b11-9f67-8ea944d11012", "name": "", "admin_state_up": true, "network_id": "2be9f80a-e3ac-42e6-9249-5ccca241ad85", "dns_name": null, "binding:vif_details": {}, "binding:vnic_type": "normal", "binding:vif_type": "distributed", "tenant_id": "1c9d0fc21afc40a2959d3d3d4acca528", "mac_address": "fa:16:3e:a7:d8:80", "created_at": "2016-04-07T15:05:57"}} Scenario1 - Migration on wrong physical network - High Prio =========================================================== Host1 has physical_interface_mappings: pyhsnet1:eth0, physnet2=eth2 Host2 has physical_interface_mappings: physnet1:eth1, physnet2=eth0 Now Live migration from an instance hosted on host1 (connected to physnet1) to host2 succeeds. Libvirt just migrates the whole server with its domain.xml and the macvtap is plugged on the targets side eth0. 
Now the instance does not have access to its network anymore, but access to another physical network. The behavior is documented, however this needs to be fixed! Scenario2 - Migration fails - Low Prio ====================================== Host1 has physical_interface_mappings: pyhsnet1:eth0 Host2 has physical_interface_mappings: physnet1:eth1 Let's assume a vlan setup. Let's assume a migration from host1 to host2. Host to does NOT have a interface eth0. Migration will fail in instance will remain active on the source, as nova plug on host2 failed to create a vlan device on eth0. If you have a flat network - definition of he libvirt xml will fail on host2. Two approaches are thinkable * Solve the problem (Scenario 1+2) * just prevent such an invalid migration (let scenario 1 fail like scenario 2 fails today) Solve the problem ================= #1 Solve it in Nova pre live migration -------------------------------------- This would allow migration although physical_interface mappings are different. a) On pre live migration nova should change the binding:host to the migration target. This will trigger the portbinding and the mech driver which will update the vif_details with the right macvtap source device information. Libvirt can then adapt the migration-xml to reflect the changes. Currently the update of the binding is done in post migration, after migration succeeded. Can we already do it in pre_live_migration and on failure undo it in rollback ? - There's no issue for the reference implementations - But there might be some mechanisms for external SDN Controllers that might shut down ports on the source host as soon as the host_id is being updated. On the other hand, if controller rely on this mechanism, they will set the port up a little too late today, as the update host_id is sent after live migration succeeded. b) The alternative would be to allow a port to be bound to multiple hosts simultaneously. So in pre_live migration, nova would add a binding for the target host and in post_live_migration it would remove the original binding. This would require - simultaneous port binding. This will be achieved by https://bugs.launchpad.net/neutron/+bug/1367391 - allow such a binding for compute ports as well - Update APIs to reflect multiple port_bindings   - Create / Update / Show Port   - host_id is not reflect for DVR ports today [1] #2 Solve it in nova post live migration --------------------------------------- Once the migration is started, libvirt takes care of completely executing it. Ideally, libvirt would start migration but would wait with completion, until the portbinding has happened AND until the device has been reported as up. #3 Device renaming in the macvtap agent --------------------------------------- This would allow migration although physical_interface mappings are different. Instead of      physical_interface_mapping = physnet1:eth0 use a      physical_interface_mac_mapping = physnet1:00:11:22:33:44:55:66 #where 00:11:22:33:44:55:66 is the mac address of the interface to use On agent startup, the agent could rename the associated device to "physnet1" (or to some other generic value) that is consistent cross all hosts! We would need to document that this interface should not be used by any other application (that relies on the interface name) #4 Use generic vlan device names -------------------------------- This solves the problem only for vlan networks! 
For flat networks it still would exist Today, the agent generates the vlan device names like this: for eth1 eth1.<vlan-id>. We could get rid of this pattern and use network-uuid.vlan instead. Where nework-uuid are the first 10 chars of the id. But this would not solve the issue for flat networks. Therefore the device renaming like proposed in #3 would be required. Prevent invalid migration ========================= #1 Let Port binding fail (not working) ------------------------------------- The idea is to detect an invalid migration in the mechanism driver and let port binding fail. This approach has two problems a) Portbinding happens AFTER Migration happened. In post live migration nova requests to update the binding:host-id to the target. But when doing so, the instance is already running on the target host. The binding will fail, but the migration happened, though. b) Detecting a migration in the mech driver is difficult. The idea is to use the PortContext.original port. This works if we add the original port to the context somewhere in the ml2 plugin (see proposed patch). The problem is that this is not an indicator for a live migration. The original port will also be added if scheduling on another node failed before and now the current node is picked. There was no live migration but the PortContext.original port is set to another host. Maybe this can be solved but it's worthwhile mentioning here. see patch https://review.openstack.org/293404 #2 Let agent detect invalid migration (not working) --------------------------------------------------- An invalid migration could be detected in the agent to avoid the agent setting the device status to up. But this is too late, as the agent detects the device after migration already started. There is no way to stop it again. see patch https://review.openstack.org/293403 References ========== [1] curl -g -i -X GET http://192.168.122.30:9696/v2.0/ports/b780af01-a3c6-4279-a355-fa3289bc1ec3.json {"port": {"status": "ACTIVE", "binding:host_id": "", "description": "", "allowed_address_pairs": [], "extra_dhcp_opts": [], "updated_at": "2016-04-08T08:59:08", "device_owner": "network:router_interface_distributed", "port_security_enabled": false, "binding:profile": {}, "fixed_ips": [{"subnet_id": "7c57f270-46a9-4cde-a2a7-82e66da1d084", "ip_address": "10.0.0.1"}], "id": "b780af01-a3c6-4279-a355-fa3289bc1ec3", "security_groups": [], "device_id": "48120dd2-2523-4b11-9f67-8ea944d11012", "name": "", "admin_state_up": true, "network_id": "2be9f80a-e3ac-42e6-9249-5ccca241ad85", "dns_name": null, "binding:vif_details": {}, "binding:vnic_type": "normal", "binding:vif_type": "distributed", "tenant_id": "1c9d0fc21afc40a2959d3d3d4acca528", "mac_address": "fa:16:3e:a7:d8:80", "created_at": "2016-04-07T15:05:57"}}
2016-04-11 07:43:21 Andreas Scheuring description Scenario1 - Migration on wrong physical network - High Prio =========================================================== Host1 has physical_interface_mappings: pyhsnet1:eth0, physnet2=eth2 Host2 has physical_interface_mappings: physnet1:eth1, physnet2=eth0 Now Live migration from an instance hosted on host1 (connected to physnet1) to host2 succeeds. Libvirt just migrates the whole server with its domain.xml and the macvtap is plugged on the targets side eth0. Now the instance does not have access to its network anymore, but access to another physical network. The behavior is documented, however this needs to be fixed! Scenario2 - Migration fails - Low Prio ====================================== Host1 has physical_interface_mappings: pyhsnet1:eth0 Host2 has physical_interface_mappings: physnet1:eth1 Let's assume a vlan setup. Let's assume a migration from host1 to host2. Host to does NOT have a interface eth0. Migration will fail in instance will remain active on the source, as nova plug on host2 failed to create a vlan device on eth0. If you have a flat network - definition of he libvirt xml will fail on host2. Two approaches are thinkable * Solve the problem (Scenario 1+2) * just prevent such an invalid migration (let scenario 1 fail like scenario 2 fails today) Solve the problem ================= #1 Solve it in Nova pre live migration -------------------------------------- This would allow migration although physical_interface mappings are different. a) On pre live migration nova should change the binding:host to the migration target. This will trigger the portbinding and the mech driver which will update the vif_details with the right macvtap source device information. Libvirt can then adapt the migration-xml to reflect the changes. Currently the update of the binding is done in post migration, after migration succeeded. Can we already do it in pre_live_migration and on failure undo it in rollback ? - There's no issue for the reference implementations - But there might be some mechanisms for external SDN Controllers that might shut down ports on the source host as soon as the host_id is being updated. On the other hand, if controller rely on this mechanism, they will set the port up a little too late today, as the update host_id is sent after live migration succeeded. b) The alternative would be to allow a port to be bound to multiple hosts simultaneously. So in pre_live migration, nova would add a binding for the target host and in post_live_migration it would remove the original binding. This would require - simultaneous port binding. This will be achieved by https://bugs.launchpad.net/neutron/+bug/1367391 - allow such a binding for compute ports as well - Update APIs to reflect multiple port_bindings   - Create / Update / Show Port   - host_id is not reflect for DVR ports today [1] #2 Solve it in nova post live migration --------------------------------------- Once the migration is started, libvirt takes care of completely executing it. Ideally, libvirt would start migration but would wait with completion, until the portbinding has happened AND until the device has been reported as up. #3 Device renaming in the macvtap agent --------------------------------------- This would allow migration although physical_interface mappings are different. 
Instead of      physical_interface_mapping = physnet1:eth0 use a      physical_interface_mac_mapping = physnet1:00:11:22:33:44:55:66 #where 00:11:22:33:44:55:66 is the mac address of the interface to use On agent startup, the agent could rename the associated device to "physnet1" (or to some other generic value) that is consistent cross all hosts! We would need to document that this interface should not be used by any other application (that relies on the interface name) #4 Use generic vlan device names -------------------------------- This solves the problem only for vlan networks! For flat networks it still would exist Today, the agent generates the vlan device names like this: for eth1 eth1.<vlan-id>. We could get rid of this pattern and use network-uuid.vlan instead. Where nework-uuid are the first 10 chars of the id. But this would not solve the issue for flat networks. Therefore the device renaming like proposed in #3 would be required. Prevent invalid migration ========================= #1 Let Port binding fail (not working) ------------------------------------- The idea is to detect an invalid migration in the mechanism driver and let port binding fail. This approach has two problems a) Portbinding happens AFTER Migration happened. In post live migration nova requests to update the binding:host-id to the target. But when doing so, the instance is already running on the target host. The binding will fail, but the migration happened, though. b) Detecting a migration in the mech driver is difficult. The idea is to use the PortContext.original port. This works if we add the original port to the context somewhere in the ml2 plugin (see proposed patch). The problem is that this is not an indicator for a live migration. The original port will also be added if scheduling on another node failed before and now the current node is picked. There was no live migration but the PortContext.original port is set to another host. Maybe this can be solved but it's worthwhile mentioning here. see patch https://review.openstack.org/293404 #2 Let agent detect invalid migration (not working) --------------------------------------------------- An invalid migration could be detected in the agent to avoid the agent setting the device status to up. But this is too late, as the agent detects the device after migration already started. There is no way to stop it again. 
see patch https://review.openstack.org/293403 References ========== [1] curl -g -i -X GET http://192.168.122.30:9696/v2.0/ports/b780af01-a3c6-4279-a355-fa3289bc1ec3.json {"port": {"status": "ACTIVE", "binding:host_id": "", "description": "", "allowed_address_pairs": [], "extra_dhcp_opts": [], "updated_at": "2016-04-08T08:59:08", "device_owner": "network:router_interface_distributed", "port_security_enabled": false, "binding:profile": {}, "fixed_ips": [{"subnet_id": "7c57f270-46a9-4cde-a2a7-82e66da1d084", "ip_address": "10.0.0.1"}], "id": "b780af01-a3c6-4279-a355-fa3289bc1ec3", "security_groups": [], "device_id": "48120dd2-2523-4b11-9f67-8ea944d11012", "name": "", "admin_state_up": true, "network_id": "2be9f80a-e3ac-42e6-9249-5ccca241ad85", "dns_name": null, "binding:vif_details": {}, "binding:vnic_type": "normal", "binding:vif_type": "distributed", "tenant_id": "1c9d0fc21afc40a2959d3d3d4acca528", "mac_address": "fa:16:3e:a7:d8:80", "created_at": "2016-04-07T15:05:57"}} Scenario1 - Migration on wrong physical network - High Prio =========================================================== Host1 has physical_interface_mappings: pyhsnet1:eth0, physnet2=eth2 Host2 has physical_interface_mappings: physnet1:eth1, physnet2=eth0 Now Live migration from an instance hosted on host1 (connected to physnet1) to host2 succeeds. Libvirt just migrates the whole server with its domain.xml and the macvtap is plugged on the targets side eth0. Now the instance does not have access to its network anymore, but access to another physical network. The behavior is documented, however this needs to be fixed! Scenario2 - Migration fails - Low Prio ====================================== Host1 has physical_interface_mappings: pyhsnet1:eth0 Host2 has physical_interface_mappings: physnet1:eth1 Let's assume a vlan setup. Let's assume a migration from host1 to host2. Host to does NOT have a interface eth0. Migration will fail in instance will remain active on the source, as nova plug on host2 failed to create a vlan device on eth0. If you have a flat network - definition of he libvirt xml will fail on host2. Two approaches are thinkable * Solve the problem (Scenario 1+2) * just prevent such an invalid migration (let scenario 1 fail like scenario 2 fails today) Solve the problem ================= #1 Solve it in Nova pre live migration -------------------------------------- This would allow migration although physical_interface mappings are different. a) On pre live migration nova should change the binding:host to the migration target. This will trigger the portbinding and the mech driver which will update the vif_details with the right macvtap source device information. Libvirt can then adapt the migration-xml to reflect the changes. Currently the update of the binding is done in post migration, after migration succeeded. Can we already do it in pre_live_migration and on failure undo it in rollback ? - There's no issue for the reference implementations - See the prototype: https://review.openstack.org/301090 - But there might be some mechanisms for external SDN Controllers that might shut down ports on the source host as soon as the host_id is being updated. On the other hand, if controller rely on this mechanism, they will set the port up a little too late today, as the update host_id is sent after live migration succeeded. b) The alternative would be to allow a port to be bound to multiple hosts simultaneously. 
So in pre_live migration, nova would add a binding for the target host and in post_live_migration it would remove the original binding. This would require - simultaneous port binding. This will be achieved by https://bugs.launchpad.net/neutron/+bug/1367391 - allow such a binding for compute ports as well - Update APIs to reflect multiple port_bindings   - Create / Update / Show Port   - host_id is not reflect for DVR ports today [1] #2 Solve it in nova post live migration --------------------------------------- Once the migration is started, libvirt takes care of completely executing it. Ideally, libvirt would start migration but would wait with completion, until the portbinding has happened AND until the device has been reported as up. #3 Device renaming in the macvtap agent --------------------------------------- This would allow migration although physical_interface mappings are different. Instead of      physical_interface_mapping = physnet1:eth0 use a      physical_interface_mac_mapping = physnet1:00:11:22:33:44:55:66 #where 00:11:22:33:44:55:66 is the mac address of the interface to use On agent startup, the agent could rename the associated device to "physnet1" (or to some other generic value) that is consistent cross all hosts! We would need to document that this interface should not be used by any other application (that relies on the interface name) #4 Use generic vlan device names -------------------------------- This solves the problem only for vlan networks! For flat networks it still would exist Today, the agent generates the vlan device names like this: for eth1 eth1.<vlan-id>. We could get rid of this pattern and use network-uuid.vlan instead. Where nework-uuid are the first 10 chars of the id. But this would not solve the issue for flat networks. Therefore the device renaming like proposed in #3 would be required. Prevent invalid migration ========================= #1 Let Port binding fail (not working) ------------------------------------- The idea is to detect an invalid migration in the mechanism driver and let port binding fail. This approach has two problems a) Portbinding happens AFTER Migration happened. In post live migration nova requests to update the binding:host-id to the target. But when doing so, the instance is already running on the target host. The binding will fail, but the migration happened, though. b) Detecting a migration in the mech driver is difficult. The idea is to use the PortContext.original port. This works if we add the original port to the context somewhere in the ml2 plugin (see proposed patch). The problem is that this is not an indicator for a live migration. The original port will also be added if scheduling on another node failed before and now the current node is picked. There was no live migration but the PortContext.original port is set to another host. Maybe this can be solved but it's worthwhile mentioning here. see patch https://review.openstack.org/293404 #2 Let agent detect invalid migration (not working) --------------------------------------------------- An invalid migration could be detected in the agent to avoid the agent setting the device status to up. But this is too late, as the agent detects the device after migration already started. There is no way to stop it again. 
see patch https://review.openstack.org/293403 References ========== [1] curl -g -i -X GET http://192.168.122.30:9696/v2.0/ports/b780af01-a3c6-4279-a355-fa3289bc1ec3.json {"port": {"status": "ACTIVE", "binding:host_id": "", "description": "", "allowed_address_pairs": [], "extra_dhcp_opts": [], "updated_at": "2016-04-08T08:59:08", "device_owner": "network:router_interface_distributed", "port_security_enabled": false, "binding:profile": {}, "fixed_ips": [{"subnet_id": "7c57f270-46a9-4cde-a2a7-82e66da1d084", "ip_address": "10.0.0.1"}], "id": "b780af01-a3c6-4279-a355-fa3289bc1ec3", "security_groups": [], "device_id": "48120dd2-2523-4b11-9f67-8ea944d11012", "name": "", "admin_state_up": true, "network_id": "2be9f80a-e3ac-42e6-9249-5ccca241ad85", "dns_name": null, "binding:vif_details": {}, "binding:vnic_type": "normal", "binding:vif_type": "distributed", "tenant_id": "1c9d0fc21afc40a2959d3d3d4acca528", "mac_address": "fa:16:3e:a7:d8:80", "created_at": "2016-04-07T15:05:57"}}
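Editor's note: for illustration, a rough sketch of what the binding update in option #1 a) above could look like from nova's side, using python-neutronclient. The helper name and the way the client is constructed are assumptions for the example; the actual prototype is the review linked in the description.

    from neutronclient.v2_0 import client as neutron_client

    def move_port_binding(neutron, port_id, target_host):
        # Updating binding:host_id re-triggers port binding, so the macvtap
        # mech driver can recompute vif_details (source device) for the target.
        return neutron.update_port(
            port_id, {'port': {'binding:host_id': target_host}})

    # Sketch of a call in pre_live_migration (names are illustrative):
    # neutron = neutron_client.Client(session=keystone_session)
    # move_port_binding(neutron, port['id'], dest_compute_host)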
2016-04-11 11:54:39 Andreas Scheuring description Scenario1 - Migration on wrong physical network - High Prio =========================================================== Host1 has physical_interface_mappings: pyhsnet1:eth0, physnet2=eth2 Host2 has physical_interface_mappings: physnet1:eth1, physnet2=eth0 Now Live migration from an instance hosted on host1 (connected to physnet1) to host2 succeeds. Libvirt just migrates the whole server with its domain.xml and the macvtap is plugged on the targets side eth0. Now the instance does not have access to its network anymore, but access to another physical network. The behavior is documented, however this needs to be fixed! Scenario2 - Migration fails - Low Prio ====================================== Host1 has physical_interface_mappings: pyhsnet1:eth0 Host2 has physical_interface_mappings: physnet1:eth1 Let's assume a vlan setup. Let's assume a migration from host1 to host2. Host to does NOT have a interface eth0. Migration will fail in instance will remain active on the source, as nova plug on host2 failed to create a vlan device on eth0. If you have a flat network - definition of he libvirt xml will fail on host2. Two approaches are thinkable * Solve the problem (Scenario 1+2) * just prevent such an invalid migration (let scenario 1 fail like scenario 2 fails today) Solve the problem ================= #1 Solve it in Nova pre live migration -------------------------------------- This would allow migration although physical_interface mappings are different. a) On pre live migration nova should change the binding:host to the migration target. This will trigger the portbinding and the mech driver which will update the vif_details with the right macvtap source device information. Libvirt can then adapt the migration-xml to reflect the changes. Currently the update of the binding is done in post migration, after migration succeeded. Can we already do it in pre_live_migration and on failure undo it in rollback ? - There's no issue for the reference implementations - See the prototype: https://review.openstack.org/301090 - But there might be some mechanisms for external SDN Controllers that might shut down ports on the source host as soon as the host_id is being updated. On the other hand, if controller rely on this mechanism, they will set the port up a little too late today, as the update host_id is sent after live migration succeeded. b) The alternative would be to allow a port to be bound to multiple hosts simultaneously. So in pre_live migration, nova would add a binding for the target host and in post_live_migration it would remove the original binding. This would require - simultaneous port binding. This will be achieved by https://bugs.launchpad.net/neutron/+bug/1367391 - allow such a binding for compute ports as well - Update APIs to reflect multiple port_bindings   - Create / Update / Show Port   - host_id is not reflect for DVR ports today [1] #2 Solve it in nova post live migration --------------------------------------- Once the migration is started, libvirt takes care of completely executing it. Ideally, libvirt would start migration but would wait with completion, until the portbinding has happened AND until the device has been reported as up. #3 Device renaming in the macvtap agent --------------------------------------- This would allow migration although physical_interface mappings are different. 
Instead of      physical_interface_mapping = physnet1:eth0 use a      physical_interface_mac_mapping = physnet1:00:11:22:33:44:55:66 #where 00:11:22:33:44:55:66 is the mac address of the interface to use On agent startup, the agent could rename the associated device to "physnet1" (or to some other generic value) that is consistent cross all hosts! We would need to document that this interface should not be used by any other application (that relies on the interface name) #4 Use generic vlan device names -------------------------------- This solves the problem only for vlan networks! For flat networks it still would exist Today, the agent generates the vlan device names like this: for eth1 eth1.<vlan-id>. We could get rid of this pattern and use network-uuid.vlan instead. Where nework-uuid are the first 10 chars of the id. But this would not solve the issue for flat networks. Therefore the device renaming like proposed in #3 would be required. Prevent invalid migration ========================= #1 Let Port binding fail (not working) ------------------------------------- The idea is to detect an invalid migration in the mechanism driver and let port binding fail. This approach has two problems a) Portbinding happens AFTER Migration happened. In post live migration nova requests to update the binding:host-id to the target. But when doing so, the instance is already running on the target host. The binding will fail, but the migration happened, though. b) Detecting a migration in the mech driver is difficult. The idea is to use the PortContext.original port. This works if we add the original port to the context somewhere in the ml2 plugin (see proposed patch). The problem is that this is not an indicator for a live migration. The original port will also be added if scheduling on another node failed before and now the current node is picked. There was no live migration but the PortContext.original port is set to another host. Maybe this can be solved but it's worthwhile mentioning here. see patch https://review.openstack.org/293404 #2 Let agent detect invalid migration (not working) --------------------------------------------------- An invalid migration could be detected in the agent to avoid the agent setting the device status to up. But this is too late, as the agent detects the device after migration already started. There is no way to stop it again. 
see patch https://review.openstack.org/293403 References ========== [1] curl -g -i -X GET http://192.168.122.30:9696/v2.0/ports/b780af01-a3c6-4279-a355-fa3289bc1ec3.json {"port": {"status": "ACTIVE", "binding:host_id": "", "description": "", "allowed_address_pairs": [], "extra_dhcp_opts": [], "updated_at": "2016-04-08T08:59:08", "device_owner": "network:router_interface_distributed", "port_security_enabled": false, "binding:profile": {}, "fixed_ips": [{"subnet_id": "7c57f270-46a9-4cde-a2a7-82e66da1d084", "ip_address": "10.0.0.1"}], "id": "b780af01-a3c6-4279-a355-fa3289bc1ec3", "security_groups": [], "device_id": "48120dd2-2523-4b11-9f67-8ea944d11012", "name": "", "admin_state_up": true, "network_id": "2be9f80a-e3ac-42e6-9249-5ccca241ad85", "dns_name": null, "binding:vif_details": {}, "binding:vnic_type": "normal", "binding:vif_type": "distributed", "tenant_id": "1c9d0fc21afc40a2959d3d3d4acca528", "mac_address": "fa:16:3e:a7:d8:80", "created_at": "2016-04-07T15:05:57"}} Scenario1 - Migration on wrong physical network - High Prio =========================================================== Host1 has physical_interface_mappings: pyhsnet1:eth0, physnet2=eth2 Host2 has physical_interface_mappings: physnet1:eth1, physnet2=eth0 Now Live migration from an instance hosted on host1 (connected to physnet1) to host2 succeeds. Libvirt just migrates the whole server with its domain.xml and the macvtap is plugged on the targets side eth0. Now the instance does not have access to its network anymore, but access to another physical network. The behavior is documented, however this needs to be fixed! Scenario2 - Migration fails - Low Prio ====================================== Host1 has physical_interface_mappings: pyhsnet1:eth0 Host2 has physical_interface_mappings: physnet1:eth1 Let's assume a vlan setup. Let's assume a migration from host1 to host2. Host to does NOT have a interface eth0. Migration will fail in instance will remain active on the source, as nova plug on host2 failed to create a vlan device on eth0. If you have a flat network - definition of he libvirt xml will fail on host2. Two approaches are thinkable * Solve the problem (Scenario 1+2) * just prevent such an invalid migration (let scenario 1 fail like scenario 2 fails today) Solve the problem ================= #1 Solve it in Nova pre live migration -------------------------------------- This would allow migration although physical_interface mappings are different. a) On pre live migration nova should change the binding:host to the migration target. This will trigger the portbinding and the mech driver which will update the vif_details with the right macvtap source device information. Libvirt can then adapt the migration-xml to reflect the changes. Currently the update of the binding is done in post migration, after migration succeeded. Can we already do it in pre_live_migration and on failure undo it in rollback ? - There's no issue for the reference implementations - See the prototype: https://review.openstack.org/301090 - But there might be some mechanisms for external SDN Controllers that might shut down ports on the source host as soon as the host_id is being updated. On the other hand, if controller rely on this mechanism, they will set the port up a little too late today, as the update host_id is sent after live migration succeeded. b) The alternative would be to allow a port to be bound to multiple hosts simultaneously. 
So in pre_live migration, nova would add a binding for the target host and in post_live_migration it would remove the original binding. This would require - simultaneous port binding. This will be achieved by https://bugs.launchpad.net/neutron/+bug/1367391 - allow such a binding for compute ports as well - Update APIs to reflect multiple port_bindings   - Create / Update / Show Port   - host_id is not reflect for DVR ports today [1] #2 Moved to Prevent section --------------------------- #3 Device renaming in the macvtap agent --------------------------------------- This would allow migration although physical_interface mappings are different. Instead of      physical_interface_mapping = physnet1:eth0 use a      physical_interface_mac_mapping = physnet1:00:11:22:33:44:55:66 #where 00:11:22:33:44:55:66 is the mac address of the interface to use On agent startup, the agent could rename the associated device to "physnet1" (or to some other generic value) that is consistent cross all hosts! We would need to document that this interface should not be used by any other application (that relies on the interface name) #4 Use generic vlan device names -------------------------------- This solves the problem only for vlan networks! For flat networks it still would exist Today, the agent generates the vlan device names like this: for eth1 eth1.<vlan-id>. We could get rid of this pattern and use network-uuid.vlan instead. Where nework-uuid are the first 10 chars of the id. But this would not solve the issue for flat networks. Therefore the device renaming like proposed in #3 would be required. Prevent invalid migration ========================= #1 Let Port binding fail (not working) ------------------------------------- The idea is to detect an invalid migration in the mechanism driver and let port binding fail. This approach has two problems a) Portbinding happens AFTER Migration happened. In post live migration nova requests to update the binding:host-id to the target. But when doing so, the instance is already running on the target host. The binding will fail, but the migration happened, though. b) Detecting a migration in the mech driver is difficult. The idea is to use the PortContext.original port. This works if we add the original port to the context somewhere in the ml2 plugin (see proposed patch). The problem is that this is not an indicator for a live migration. The original port will also be added if scheduling on another node failed before and now the current node is picked. There was no live migration but the PortContext.original port is set to another host. Maybe this can be solved but it's worthwhile mentioning here. see patch https://review.openstack.org/293404 #2 Let agent detect invalid migration (not working) --------------------------------------------------- An invalid migration could be detected in the agent to avoid the agent setting the device status to up. But this is too late, as the agent detects the device after migration already started. There is no way to stop it again. see patch https://review.openstack.org/293403 #3 Solve it in nova post live migration ------------------------------------ The idea is, that nova starts the migration and then listens on plug_vif event that is emitted by neutron after the agent reported the device as up. Nova also waits for the portbinding to occur. 
If one of both runs into a timeout or fails, either the migration should be rolled back (if still possible) or the instance should be set into error state and the network locked down (which is the default for ovs - not sure about other right now). There are some patchsets out that try to achieve something similar, but for the ovs-hybrid plug only. For others it's much more complicated, as the agent will only report the device up after it occured on the target (after migration already started) https://review.openstack.org/246898 References ========== [1] curl -g -i -X GET http://192.168.122.30:9696/v2.0/ports/b780af01-a3c6-4279-a355-fa3289bc1ec3.json {"port": {"status": "ACTIVE", "binding:host_id": "", "description": "", "allowed_address_pairs": [], "extra_dhcp_opts": [], "updated_at": "2016-04-08T08:59:08", "device_owner": "network:router_interface_distributed", "port_security_enabled": false, "binding:profile": {}, "fixed_ips": [{"subnet_id": "7c57f270-46a9-4cde-a2a7-82e66da1d084", "ip_address": "10.0.0.1"}], "id": "b780af01-a3c6-4279-a355-fa3289bc1ec3", "security_groups": [], "device_id": "48120dd2-2523-4b11-9f67-8ea944d11012", "name": "", "admin_state_up": true, "network_id": "2be9f80a-e3ac-42e6-9249-5ccca241ad85", "dns_name": null, "binding:vif_details": {}, "binding:vnic_type": "normal", "binding:vif_type": "distributed", "tenant_id": "1c9d0fc21afc40a2959d3d3d4acca528", "mac_address": "fa:16:3e:a7:d8:80", "created_at": "2016-04-07T15:05:57"}}
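Editor's note: the #3 "solve it in nova post live migration" idea above relies on the plug_vif / device-up notification from neutron. As a simplified stand-in, the sketch below polls the port until it is bound to the target host and reported ACTIVE; the function name and timeout are illustrative, and polling is only a substitute for the event-based mechanism described in the text.

    import time

    def wait_for_port_on_target(neutron, port_id, target_host, timeout=120):
        """Poll until the port is bound to target_host and ACTIVE, else fail."""
        deadline = time.time() + timeout
        while time.time() < deadline:
            port = neutron.show_port(port_id)['port']
            if (port['binding:host_id'] == target_host
                    and port['status'] == 'ACTIVE'):
                return port
            time.sleep(2)
        raise RuntimeError('port %s not up on %s in time' % (port_id, target_host))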
2016-04-11 14:13:51 Andreas Scheuring description Scenario1 - Migration on wrong physical network - High Prio =========================================================== Host1 has physical_interface_mappings: pyhsnet1:eth0, physnet2=eth2 Host2 has physical_interface_mappings: physnet1:eth1, physnet2=eth0 Now Live migration from an instance hosted on host1 (connected to physnet1) to host2 succeeds. Libvirt just migrates the whole server with its domain.xml and the macvtap is plugged on the targets side eth0. Now the instance does not have access to its network anymore, but access to another physical network. The behavior is documented, however this needs to be fixed! Scenario2 - Migration fails - Low Prio ====================================== Host1 has physical_interface_mappings: pyhsnet1:eth0 Host2 has physical_interface_mappings: physnet1:eth1 Let's assume a vlan setup. Let's assume a migration from host1 to host2. Host to does NOT have a interface eth0. Migration will fail in instance will remain active on the source, as nova plug on host2 failed to create a vlan device on eth0. If you have a flat network - definition of he libvirt xml will fail on host2. Two approaches are thinkable * Solve the problem (Scenario 1+2) * just prevent such an invalid migration (let scenario 1 fail like scenario 2 fails today) Solve the problem ================= #1 Solve it in Nova pre live migration -------------------------------------- This would allow migration although physical_interface mappings are different. a) On pre live migration nova should change the binding:host to the migration target. This will trigger the portbinding and the mech driver which will update the vif_details with the right macvtap source device information. Libvirt can then adapt the migration-xml to reflect the changes. Currently the update of the binding is done in post migration, after migration succeeded. Can we already do it in pre_live_migration and on failure undo it in rollback ? - There's no issue for the reference implementations - See the prototype: https://review.openstack.org/301090 - But there might be some mechanisms for external SDN Controllers that might shut down ports on the source host as soon as the host_id is being updated. On the other hand, if controller rely on this mechanism, they will set the port up a little too late today, as the update host_id is sent after live migration succeeded. b) The alternative would be to allow a port to be bound to multiple hosts simultaneously. So in pre_live migration, nova would add a binding for the target host and in post_live_migration it would remove the original binding. This would require - simultaneous port binding. This will be achieved by https://bugs.launchpad.net/neutron/+bug/1367391 - allow such a binding for compute ports as well - Update APIs to reflect multiple port_bindings   - Create / Update / Show Port   - host_id is not reflect for DVR ports today [1] #2 Moved to Prevent section --------------------------- #3 Device renaming in the macvtap agent --------------------------------------- This would allow migration although physical_interface mappings are different. Instead of      physical_interface_mapping = physnet1:eth0 use a      physical_interface_mac_mapping = physnet1:00:11:22:33:44:55:66 #where 00:11:22:33:44:55:66 is the mac address of the interface to use On agent startup, the agent could rename the associated device to "physnet1" (or to some other generic value) that is consistent cross all hosts! 
We would need to document that this interface should not be used by any other application (that relies on the interface name) #4 Use generic vlan device names -------------------------------- This solves the problem only for vlan networks! For flat networks it still would exist Today, the agent generates the vlan device names like this: for eth1 eth1.<vlan-id>. We could get rid of this pattern and use network-uuid.vlan instead. Where nework-uuid are the first 10 chars of the id. But this would not solve the issue for flat networks. Therefore the device renaming like proposed in #3 would be required. Prevent invalid migration ========================= #1 Let Port binding fail (not working) ------------------------------------- The idea is to detect an invalid migration in the mechanism driver and let port binding fail. This approach has two problems a) Portbinding happens AFTER Migration happened. In post live migration nova requests to update the binding:host-id to the target. But when doing so, the instance is already running on the target host. The binding will fail, but the migration happened, though. b) Detecting a migration in the mech driver is difficult. The idea is to use the PortContext.original port. This works if we add the original port to the context somewhere in the ml2 plugin (see proposed patch). The problem is that this is not an indicator for a live migration. The original port will also be added if scheduling on another node failed before and now the current node is picked. There was no live migration but the PortContext.original port is set to another host. Maybe this can be solved but it's worthwhile mentioning here. see patch https://review.openstack.org/293404 #2 Let agent detect invalid migration (not working) --------------------------------------------------- An invalid migration could be detected in the agent to avoid the agent setting the device status to up. But this is too late, as the agent detects the device after migration already started. There is no way to stop it again. see patch https://review.openstack.org/293403 #3 Solve it in nova post live migration ------------------------------------ The idea is, that nova starts the migration and then listens on plug_vif event that is emitted by neutron after the agent reported the device as up. Nova also waits for the portbinding to occur. If one of both runs into a timeout or fails, either the migration should be rolled back (if still possible) or the instance should be set into error state and the network locked down (which is the default for ovs - not sure about other right now). There are some patchsets out that try to achieve something similar, but for the ovs-hybrid plug only. 
For others it's much more complicated, as the agent will only report the device up after it occured on the target (after migration already started) https://review.openstack.org/246898 References ========== [1] curl -g -i -X GET http://192.168.122.30:9696/v2.0/ports/b780af01-a3c6-4279-a355-fa3289bc1ec3.json {"port": {"status": "ACTIVE", "binding:host_id": "", "description": "", "allowed_address_pairs": [], "extra_dhcp_opts": [], "updated_at": "2016-04-08T08:59:08", "device_owner": "network:router_interface_distributed", "port_security_enabled": false, "binding:profile": {}, "fixed_ips": [{"subnet_id": "7c57f270-46a9-4cde-a2a7-82e66da1d084", "ip_address": "10.0.0.1"}], "id": "b780af01-a3c6-4279-a355-fa3289bc1ec3", "security_groups": [], "device_id": "48120dd2-2523-4b11-9f67-8ea944d11012", "name": "", "admin_state_up": true, "network_id": "2be9f80a-e3ac-42e6-9249-5ccca241ad85", "dns_name": null, "binding:vif_details": {}, "binding:vnic_type": "normal", "binding:vif_type": "distributed", "tenant_id": "1c9d0fc21afc40a2959d3d3d4acca528", "mac_address": "fa:16:3e:a7:d8:80", "created_at": "2016-04-07T15:05:57"}} Scenario1 - Migration on wrong physical network - High Prio =========================================================== Host1 has physical_interface_mappings: pyhsnet1:eth0, physnet2=eth2 Host2 has physical_interface_mappings: physnet1:eth1, physnet2=eth0 Now Live migration from an instance hosted on host1 (connected to physnet1) to host2 succeeds. Libvirt just migrates the whole server with its domain.xml and the macvtap is plugged on the targets side eth0. Now the instance does not have access to its network anymore, but access to another physical network. The behavior is documented, however this needs to be fixed! Scenario2 - Migration fails - Low Prio ====================================== Host1 has physical_interface_mappings: pyhsnet1:eth0 Host2 has physical_interface_mappings: physnet1:eth1 Let's assume a vlan setup. Let's assume a migration from host1 to host2. Host to does NOT have a interface eth0. Migration will fail in instance will remain active on the source, as nova plug on host2 failed to create a vlan device on eth0. If you have a flat network - definition of he libvirt xml will fail on host2. Two approaches are thinkable * Solve the problem (Scenario 1+2) * just prevent such an invalid migration (let scenario 1 fail like scenario 2 fails today) Solve the problem ================= #1 Solve it in Nova pre live migration -------------------------------------- This would allow migration although physical_interface mappings are different. a) On pre live migration nova should change the binding:host to the migration target. This will trigger the portbinding and the mech driver which will update the vif_details with the right macvtap source device information. Libvirt can then adapt the migration-xml to reflect the changes. Currently the update of the binding is done in post migration, after migration succeeded. Can we already do it in pre_live_migration and on failure undo it in rollback ? - There's no issue for the reference implementations - See the prototype: https://review.openstack.org/297100 - But there might be some mechanisms for external SDN Controllers that might shut down ports on the source host as soon as the host_id is being updated. On the other hand, if controller rely on this mechanism, they will set the port up a little too late today, as the update host_id is sent after live migration succeeded. 
b) The alternative would be to allow a port to be bound to multiple hosts simultaneously. So in pre_live migration, nova would add a binding for the target host and in post_live_migration it would remove the original binding. This would require - simultaneous port binding. This will be achieved by https://bugs.launchpad.net/neutron/+bug/1367391 - allow such a binding for compute ports as well - Update APIs to reflect multiple port_bindings   - Create / Update / Show Port   - host_id is not reflect for DVR ports today [1] #2 Moved to Prevent section --------------------------- #3 Device renaming in the macvtap agent --------------------------------------- This would allow migration although physical_interface mappings are different. Instead of      physical_interface_mapping = physnet1:eth0 use a      physical_interface_mac_mapping = physnet1:00:11:22:33:44:55:66 #where 00:11:22:33:44:55:66 is the mac address of the interface to use On agent startup, the agent could rename the associated device to "physnet1" (or to some other generic value) that is consistent cross all hosts! We would need to document that this interface should not be used by any other application (that relies on the interface name) #4 Use generic vlan device names -------------------------------- This solves the problem only for vlan networks! For flat networks it still would exist Today, the agent generates the vlan device names like this: for eth1 eth1.<vlan-id>. We could get rid of this pattern and use network-uuid.vlan instead. Where nework-uuid are the first 10 chars of the id. But this would not solve the issue for flat networks. Therefore the device renaming like proposed in #3 would be required. Prevent invalid migration ========================= #1 Let Port binding fail (not working) ------------------------------------- The idea is to detect an invalid migration in the mechanism driver and let port binding fail. This approach has two problems a) Portbinding happens AFTER Migration happened. In post live migration nova requests to update the binding:host-id to the target. But when doing so, the instance is already running on the target host. The binding will fail, but the migration happened, though. b) Detecting a migration in the mech driver is difficult. The idea is to use the PortContext.original port. This works if we add the original port to the context somewhere in the ml2 plugin (see proposed patch). The problem is that this is not an indicator for a live migration. The original port will also be added if scheduling on another node failed before and now the current node is picked. There was no live migration but the PortContext.original port is set to another host. Maybe this can be solved but it's worthwhile mentioning here. see patch https://review.openstack.org/293404 #2 Let agent detect invalid migration (not working) --------------------------------------------------- An invalid migration could be detected in the agent to avoid the agent setting the device status to up. But this is too late, as the agent detects the device after migration already started. There is no way to stop it again. see patch https://review.openstack.org/293403 #3 Solve it in nova post live migration ------------------------------------ The idea is, that nova starts the migration and then listens on plug_vif event that is emitted by neutron after the agent reported the device as up. Nova also waits for the portbinding to occur. 
If one of both runs into a timeout or fails, either the migration should be rolled back (if still possible) or the instance should be set into error state and the network locked down (which is the default for ovs - not sure about other right now). There are some patchsets out that try to achieve something similar, but for the ovs-hybrid plug only. For others it's much more complicated, as the agent will only report the device up after it occured on the target (after migration already started) https://review.openstack.org/246898 References ========== [1] curl -g -i -X GET http://192.168.122.30:9696/v2.0/ports/b780af01-a3c6-4279-a355-fa3289bc1ec3.json {"port": {"status": "ACTIVE", "binding:host_id": "", "description": "", "allowed_address_pairs": [], "extra_dhcp_opts": [], "updated_at": "2016-04-08T08:59:08", "device_owner": "network:router_interface_distributed", "port_security_enabled": false, "binding:profile": {}, "fixed_ips": [{"subnet_id": "7c57f270-46a9-4cde-a2a7-82e66da1d084", "ip_address": "10.0.0.1"}], "id": "b780af01-a3c6-4279-a355-fa3289bc1ec3", "security_groups": [], "device_id": "48120dd2-2523-4b11-9f67-8ea944d11012", "name": "", "admin_state_up": true, "network_id": "2be9f80a-e3ac-42e6-9249-5ccca241ad85", "dns_name": null, "binding:vif_details": {}, "binding:vnic_type": "normal", "binding:vif_type": "distributed", "tenant_id": "1c9d0fc21afc40a2959d3d3d4acca528", "mac_address": "fa:16:3e:a7:d8:80", "created_at": "2016-04-07T15:05:57"}}
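Editor's note: to make the device renaming idea in #3 more concrete, here is a rough sketch of how an agent could resolve a configured MAC to an interface and rename it at startup, using a sysfs lookup plus plain "ip link" calls. This is purely illustrative and not the agent's actual code; the function names and the example MAC are made up.

    import os
    import subprocess

    def find_device_by_mac(mac):
        """Return the interface whose address matches 'mac', or None."""
        for name in os.listdir('/sys/class/net'):
            with open('/sys/class/net/%s/address' % name) as f:
                if f.read().strip().lower() == mac.lower():
                    return name
        return None

    def rename_device(old_name, new_name):
        # The link has to be down while it is renamed.
        subprocess.check_call(['ip', 'link', 'set', 'dev', old_name, 'down'])
        subprocess.check_call(['ip', 'link', 'set', 'dev', old_name, 'name', new_name])
        subprocess.check_call(['ip', 'link', 'set', 'dev', new_name, 'up'])

    # e.g. for physical_interface_mac_mapping = physnet1:00:11:22:33:44:55
    # dev = find_device_by_mac('00:11:22:33:44:55'); rename_device(dev, 'physnet1')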
2016-06-03 19:31:35 Armando Migliaccio neutron: milestone newton-1 newton-2
2016-07-15 23:31:25 Armando Migliaccio neutron: milestone newton-2 newton-3
2016-07-18 12:42:32 Andreas Scheuring description Scenario1 - Migration on wrong physical network - High Prio =========================================================== Host1 has physical_interface_mappings: pyhsnet1:eth0, physnet2=eth2 Host2 has physical_interface_mappings: physnet1:eth1, physnet2=eth0 Now Live migration from an instance hosted on host1 (connected to physnet1) to host2 succeeds. Libvirt just migrates the whole server with its domain.xml and the macvtap is plugged on the targets side eth0. Now the instance does not have access to its network anymore, but access to another physical network. The behavior is documented, however this needs to be fixed! Scenario2 - Migration fails - Low Prio ====================================== Host1 has physical_interface_mappings: pyhsnet1:eth0 Host2 has physical_interface_mappings: physnet1:eth1 Let's assume a vlan setup. Let's assume a migration from host1 to host2. Host to does NOT have a interface eth0. Migration will fail in instance will remain active on the source, as nova plug on host2 failed to create a vlan device on eth0. If you have a flat network - definition of he libvirt xml will fail on host2. Two approaches are thinkable * Solve the problem (Scenario 1+2) * just prevent such an invalid migration (let scenario 1 fail like scenario 2 fails today) Solve the problem ================= #1 Solve it in Nova pre live migration -------------------------------------- This would allow migration although physical_interface mappings are different. a) On pre live migration nova should change the binding:host to the migration target. This will trigger the portbinding and the mech driver which will update the vif_details with the right macvtap source device information. Libvirt can then adapt the migration-xml to reflect the changes. Currently the update of the binding is done in post migration, after migration succeeded. Can we already do it in pre_live_migration and on failure undo it in rollback ? - There's no issue for the reference implementations - See the prototype: https://review.openstack.org/297100 - But there might be some mechanisms for external SDN Controllers that might shut down ports on the source host as soon as the host_id is being updated. On the other hand, if controller rely on this mechanism, they will set the port up a little too late today, as the update host_id is sent after live migration succeeded. b) The alternative would be to allow a port to be bound to multiple hosts simultaneously. So in pre_live migration, nova would add a binding for the target host and in post_live_migration it would remove the original binding. This would require - simultaneous port binding. This will be achieved by https://bugs.launchpad.net/neutron/+bug/1367391 - allow such a binding for compute ports as well - Update APIs to reflect multiple port_bindings   - Create / Update / Show Port   - host_id is not reflect for DVR ports today [1] #2 Moved to Prevent section --------------------------- #3 Device renaming in the macvtap agent --------------------------------------- This would allow migration although physical_interface mappings are different. Instead of      physical_interface_mapping = physnet1:eth0 use a      physical_interface_mac_mapping = physnet1:00:11:22:33:44:55:66 #where 00:11:22:33:44:55:66 is the mac address of the interface to use On agent startup, the agent could rename the associated device to "physnet1" (or to some other generic value) that is consistent cross all hosts! 
We would need to document that this interface should not be used by any other application (that relies on the interface name) #4 Use generic vlan device names -------------------------------- This solves the problem only for vlan networks! For flat networks it still would exist Today, the agent generates the vlan device names like this: for eth1 eth1.<vlan-id>. We could get rid of this pattern and use network-uuid.vlan instead. Where nework-uuid are the first 10 chars of the id. But this would not solve the issue for flat networks. Therefore the device renaming like proposed in #3 would be required. Prevent invalid migration ========================= #1 Let Port binding fail (not working) ------------------------------------- The idea is to detect an invalid migration in the mechanism driver and let port binding fail. This approach has two problems a) Portbinding happens AFTER Migration happened. In post live migration nova requests to update the binding:host-id to the target. But when doing so, the instance is already running on the target host. The binding will fail, but the migration happened, though. b) Detecting a migration in the mech driver is difficult. The idea is to use the PortContext.original port. This works if we add the original port to the context somewhere in the ml2 plugin (see proposed patch). The problem is that this is not an indicator for a live migration. The original port will also be added if scheduling on another node failed before and now the current node is picked. There was no live migration but the PortContext.original port is set to another host. Maybe this can be solved but it's worthwhile mentioning here. see patch https://review.openstack.org/293404 #2 Let agent detect invalid migration (not working) --------------------------------------------------- An invalid migration could be detected in the agent to avoid the agent setting the device status to up. But this is too late, as the agent detects the device after migration already started. There is no way to stop it again. see patch https://review.openstack.org/293403 #3 Solve it in nova post live migration ------------------------------------ The idea is, that nova starts the migration and then listens on plug_vif event that is emitted by neutron after the agent reported the device as up. Nova also waits for the portbinding to occur. If one of both runs into a timeout or fails, either the migration should be rolled back (if still possible) or the instance should be set into error state and the network locked down (which is the default for ovs - not sure about other right now). There are some patchsets out that try to achieve something similar, but for the ovs-hybrid plug only. 
For others it's much more complicated, as the agent will only report the device up after it occured on the target (after migration already started) https://review.openstack.org/246898 References ========== [1] curl -g -i -X GET http://192.168.122.30:9696/v2.0/ports/b780af01-a3c6-4279-a355-fa3289bc1ec3.json {"port": {"status": "ACTIVE", "binding:host_id": "", "description": "", "allowed_address_pairs": [], "extra_dhcp_opts": [], "updated_at": "2016-04-08T08:59:08", "device_owner": "network:router_interface_distributed", "port_security_enabled": false, "binding:profile": {}, "fixed_ips": [{"subnet_id": "7c57f270-46a9-4cde-a2a7-82e66da1d084", "ip_address": "10.0.0.1"}], "id": "b780af01-a3c6-4279-a355-fa3289bc1ec3", "security_groups": [], "device_id": "48120dd2-2523-4b11-9f67-8ea944d11012", "name": "", "admin_state_up": true, "network_id": "2be9f80a-e3ac-42e6-9249-5ccca241ad85", "dns_name": null, "binding:vif_details": {}, "binding:vnic_type": "normal", "binding:vif_type": "distributed", "tenant_id": "1c9d0fc21afc40a2959d3d3d4acca528", "mac_address": "fa:16:3e:a7:d8:80", "created_at": "2016-04-07T15:05:57"}} Scenario1 - Migration on wrong physical network - High Prio =========================================================== Host1 has physical_interface_mappings: pyhsnet1:eth0, physnet2=eth2 Host2 has physical_interface_mappings: physnet1:eth1, physnet2=eth0 Now Live migration from an instance hosted on host1 (connected to physnet1) to host2 succeeds. Libvirt just migrates the whole server with its domain.xml and the macvtap is plugged on the targets side eth0. Now the instance does not have access to its network anymore, but access to another physical network. The behavior is documented, however this needs to be fixed! Scenario2 - Migration fails - Low Prio ====================================== Host1 has physical_interface_mappings: pyhsnet1:eth0 Host2 has physical_interface_mappings: physnet1:eth1 Let's assume a vlan setup. Let's assume a migration from host1 to host2. Host to does NOT have a interface eth0. Migration will fail in instance will remain active on the source, as nova plug on host2 failed to create a vlan device on eth0. If you have a flat network - definition of he libvirt xml will fail on host2. Two approaches are thinkable * Solve the problem (Scenario 1+2) * just prevent such an invalid migration (let scenario 1 fail like scenario 2 fails today) Solve the problem ================= #1 Solve it in Nova pre live migration -------------------------------------- This would allow migration although physical_interface mappings are different. a) On pre live migration nova should change the binding:host to the migration target. This will trigger the portbinding and the mech driver which will update the vif_details with the right macvtap source device information. Libvirt can then adapt the migration-xml to reflect the changes. Currently the update of the binding is done in post migration, after migration succeeded. Can we already do it in pre_live_migration and on failure undo it in rollback ? - There's no issue for the reference implementations - See the prototype: https://review.openstack.org/297100 - But there might be some mechanisms for external SDN Controllers that might shut down ports on the source host as soon as the host_id is being updated. On the other hand, if controller rely on this mechanism, they will set the port up a little too late today, as the update host_id is sent after live migration succeeded. 
b) The alternative would be to allow a port to be bound to multiple hosts simultaneously. So in pre_live migration, nova would add a binding for the target host and in post_live_migration it would remove the original binding. This would require - simultaneous port binding. This will be achieved by https://bugs.launchpad.net/neutron/+bug/1367391 - allow such a binding for compute ports as well - Update APIs to reflect multiple port_bindings   - Create / Update / Show Port   - host_id is not reflect for DVR ports today [1] #2 Moved to Prevent section --------------------------- #3 Device renaming in the macvtap agent --------------------------------------- This would allow migration although physical_interface mappings are different. Instead of      physical_interface_mapping = physnet1:eth0 use a      physical_interface_mac_mapping = physnet1:00:11:22:33:44:55:66 #where 00:11:22:33:44:55:66 is the mac address of the interface to use On agent startup, the agent could rename the associated device to "physnet1" (or to some other generic value) that is consistent cross all hosts! We would need to document that this interface should not be used by any other application (that relies on the interface name) #4 Use generic vlan device names -------------------------------- This solves the problem only for vlan networks! For flat networks it still would exist Today, the agent generates the vlan device names like this: for eth1 eth1.<vlan-id>. We could get rid of this pattern and use network-uuid.vlan instead. Where nework-uuid are the first 10 chars of the id. But this would not solve the issue for flat networks. Therefore the device renaming like proposed in #3 would be required. Prevent invalid migration ========================= #1 Let Port binding fail (not working) ------------------------------------- The idea is to detect an invalid migration in the mechanism driver and let port binding fail. This approach has two problems a) Portbinding happens AFTER Migration happened. In post live migration nova requests to update the binding:host-id to the target. But when doing so, the instance is already running on the target host. The binding will fail, but the migration happened, though. b) Detecting a migration in the mech driver is difficult. The idea is to use the PortContext.original port. This works if we add the original port to the context somewhere in the ml2 plugin (see proposed patch). The problem is that this is not an indicator for a live migration. The original port will also be added if scheduling on another node failed before and now the current node is picked. There was no live migration but the PortContext.original port is set to another host. Maybe this can be solved but it's worthwhile mentioning here. see patch https://review.openstack.org/293404 #2 Let agent detect invalid migration (not working) --------------------------------------------------- An invalid migration could be detected in the agent to avoid the agent setting the device status to up. But this is too late, as the agent detects the device after migration already started. There is no way to stop it again. see patch https://review.openstack.org/293403 #3 Solve it in nova post live migration ------------------------------------ The idea is, that nova starts the migration and then listens on plug_vif event that is emitted by neutron after the agent reported the device as up. Nova also waits for the portbinding to occur. 
If either of the two runs into a timeout or fails, the migration should either be rolled back (if still possible) or the instance should be set into error state and the network locked down (which is the default for ovs - not sure about others right now). There are some patchsets out that try to achieve something similar, but for the ovs-hybrid plug only. For others it's much more complicated, as the agent will only report the device up after it appears on the target (after the migration has already started) https://review.openstack.org/246898 #4 Prohibit agent start with invalid mapping -------------------------------------------- Do not allow different mappings at all. a) Compare mapping at agent startup via RPC ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The first registered agent defines which mapping is valid. Send the agent's mapping via RPC to the server, which validates it against all other existing agents in the db. If the mapping is equal, continue with agent start. Otherwise exit. This approach is pretty racy. Just assume no agent exists at the beginning. Now 2 agents request validation of different mappings simultaneously. Both queries succeed, as none of them is written into the db yet. That happens some time later, when the first agent status is sent to the server. This could end up in different mappings! b) Verify mapping on every status_report ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The first registered agent defines which mapping is valid. Before writing a status report to the DB, the plugin could query existing agents and compare the config. This could happen in the mechanism drivers via a new mech_driver interface. Only if the mapping fits the other mappings is it written into the db. If the mapping is not equal, a new status is returned which will cause the agent to terminate. This approach also has some problems with potential races. We would need to lock the whole table (to avoid parallel inserts) to do the query, compare and write actions. Just locking a column is not sufficient. c) Having a master interface_mapping on the q-svc ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Have a new mech_macvtap specific config option on the controller that specifies the master mapping. Instead of comparing the requested mapping against already existing mappings, we could compare against that master mapping. This would work with approaches a) and b). But this will lead to duplicated configuration. References ========== [1] curl -g -i -X GET http://192.168.122.30:9696/v2.0/ports/b780af01-a3c6-4279-a355-fa3289bc1ec3.json {"port": {"status": "ACTIVE", "binding:host_id": "", "description": "", "allowed_address_pairs": [], "extra_dhcp_opts": [], "updated_at": "2016-04-08T08:59:08", "device_owner": "network:router_interface_distributed", "port_security_enabled": false, "binding:profile": {}, "fixed_ips": [{"subnet_id": "7c57f270-46a9-4cde-a2a7-82e66da1d084", "ip_address": "10.0.0.1"}], "id": "b780af01-a3c6-4279-a355-fa3289bc1ec3", "security_groups": [], "device_id": "48120dd2-2523-4b11-9f67-8ea944d11012", "name": "", "admin_state_up": true, "network_id": "2be9f80a-e3ac-42e6-9249-5ccca241ad85", "dns_name": null, "binding:vif_details": {}, "binding:vnic_type": "normal", "binding:vif_type": "distributed", "tenant_id": "1c9d0fc21afc40a2959d3d3d4acca528", "mac_address": "fa:16:3e:a7:d8:80", "created_at": "2016-04-07T15:05:57"}}
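Editor's note: a sketch of the server-side comparison described in a)/b) of #4 above: validate a starting or reporting macvtap agent's mappings against those of already registered agents. The 'interface_mappings' key in the agent's configurations dict and the function name are assumptions for illustration, and the race conditions discussed above are not addressed here.

    def mappings_conflict(new_mappings, registered_agents):
        """Return the first (physnet, new_device, existing_device) conflict, or None.

        new_mappings: dict like {'physnet1': 'eth0'} reported by the starting
        agent; registered_agents: agent dicts already stored by neutron-server.
        """
        for agent in registered_agents:
            existing = agent.get('configurations', {}).get('interface_mappings', {})
            for physnet, device in new_mappings.items():
                if physnet in existing and existing[physnet] != device:
                    return (physnet, device, existing[physnet])
        return None

With a master mapping as in c), registered_agents would simply be replaced by the single mapping read from the controller's config file.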
2016-07-18 12:56:29 Andreas Scheuring description Scenario1 - Migration on wrong physical network - High Prio =========================================================== Host1 has physical_interface_mappings: pyhsnet1:eth0, physnet2=eth2 Host2 has physical_interface_mappings: physnet1:eth1, physnet2=eth0 Now Live migration from an instance hosted on host1 (connected to physnet1) to host2 succeeds. Libvirt just migrates the whole server with its domain.xml and the macvtap is plugged on the targets side eth0. Now the instance does not have access to its network anymore, but access to another physical network. The behavior is documented, however this needs to be fixed! Scenario2 - Migration fails - Low Prio ====================================== Host1 has physical_interface_mappings: pyhsnet1:eth0 Host2 has physical_interface_mappings: physnet1:eth1 Let's assume a vlan setup. Let's assume a migration from host1 to host2. Host to does NOT have a interface eth0. Migration will fail in instance will remain active on the source, as nova plug on host2 failed to create a vlan device on eth0. If you have a flat network - definition of he libvirt xml will fail on host2. Two approaches are thinkable * Solve the problem (Scenario 1+2) * just prevent such an invalid migration (let scenario 1 fail like scenario 2 fails today) Solve the problem ================= #1 Solve it in Nova pre live migration -------------------------------------- This would allow migration although physical_interface mappings are different. a) On pre live migration nova should change the binding:host to the migration target. This will trigger the portbinding and the mech driver which will update the vif_details with the right macvtap source device information. Libvirt can then adapt the migration-xml to reflect the changes. Currently the update of the binding is done in post migration, after migration succeeded. Can we already do it in pre_live_migration and on failure undo it in rollback ? - There's no issue for the reference implementations - See the prototype: https://review.openstack.org/297100 - But there might be some mechanisms for external SDN Controllers that might shut down ports on the source host as soon as the host_id is being updated. On the other hand, if controller rely on this mechanism, they will set the port up a little too late today, as the update host_id is sent after live migration succeeded. b) The alternative would be to allow a port to be bound to multiple hosts simultaneously. So in pre_live migration, nova would add a binding for the target host and in post_live_migration it would remove the original binding. This would require - simultaneous port binding. This will be achieved by https://bugs.launchpad.net/neutron/+bug/1367391 - allow such a binding for compute ports as well - Update APIs to reflect multiple port_bindings   - Create / Update / Show Port   - host_id is not reflect for DVR ports today [1] #2 Moved to Prevent section --------------------------- #3 Device renaming in the macvtap agent --------------------------------------- This would allow migration although physical_interface mappings are different. Instead of      physical_interface_mapping = physnet1:eth0 use a      physical_interface_mac_mapping = physnet1:00:11:22:33:44:55:66 #where 00:11:22:33:44:55:66 is the mac address of the interface to use On agent startup, the agent could rename the associated device to "physnet1" (or to some other generic value) that is consistent cross all hosts! 
We would need to document that this interface should not be used by any other application (that relies on the interface name) #4 Use generic vlan device names -------------------------------- This solves the problem only for vlan networks! For flat networks it still would exist Today, the agent generates the vlan device names like this: for eth1 eth1.<vlan-id>. We could get rid of this pattern and use network-uuid.vlan instead. Where nework-uuid are the first 10 chars of the id. But this would not solve the issue for flat networks. Therefore the device renaming like proposed in #3 would be required. Prevent invalid migration ========================= #1 Let Port binding fail (not working) ------------------------------------- The idea is to detect an invalid migration in the mechanism driver and let port binding fail. This approach has two problems a) Portbinding happens AFTER Migration happened. In post live migration nova requests to update the binding:host-id to the target. But when doing so, the instance is already running on the target host. The binding will fail, but the migration happened, though. b) Detecting a migration in the mech driver is difficult. The idea is to use the PortContext.original port. This works if we add the original port to the context somewhere in the ml2 plugin (see proposed patch). The problem is that this is not an indicator for a live migration. The original port will also be added if scheduling on another node failed before and now the current node is picked. There was no live migration but the PortContext.original port is set to another host. Maybe this can be solved but it's worthwhile mentioning here. see patch https://review.openstack.org/293404 #2 Let agent detect invalid migration (not working) --------------------------------------------------- An invalid migration could be detected in the agent to avoid the agent setting the device status to up. But this is too late, as the agent detects the device after migration already started. There is no way to stop it again. see patch https://review.openstack.org/293403 #3 Solve it in nova post live migration ------------------------------------ The idea is, that nova starts the migration and then listens on plug_vif event that is emitted by neutron after the agent reported the device as up. Nova also waits for the portbinding to occur. If one of both runs into a timeout or fails, either the migration should be rolled back (if still possible) or the instance should be set into error state and the network locked down (which is the default for ovs - not sure about other right now). There are some patchsets out that try to achieve something similar, but for the ovs-hybrid plug only. For others it's much more complicated, as the agent will only report the device up after it occured on the target (after migration already started) https://review.openstack.org/246898 #4 Prohibit agent start with invalid mapping -------------------------------------------- Do not allow different mappings at all. a) Compare mapping at agent startup via RPC ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The first registered agent defines which mapping is valid. Send the agents mapping via RPC to the server, which validates it against all other existing agents in the db. If mapping is equal continue with agent start. Otherwise exit. This approach is pretty racy. Just assume no agent exists at the beginning. Now 2 agents request different mapping simultaneously. Both queries succeed, as non of them is written into the db yet. 
This happens some time later when the first agent status is sent to the server. This could end up in different mappings! b) Verify mapping on every status_report ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The first registered agent defines which mapping is valid. Before writing a status report to the DB, the plugin could query existing agents and compare the config. This could happen in the mechanism drivers via a new mech_driver interface. Only if the mappings fits to other mappings write it into db. If the mapping is not equal, a new status is being returned which will cause the agent to terminate. This approach also has some problems with potential races. We would need to lock the whole table (to avoid parallel inserts) to do the query, compare and write actions. Just locking a column is not sufficient c) Having a master interface_mapping on the q-svc ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Have a new mech_macvtap specific config option on the controller, that specifies the master mapping. Instead of comparing the requested mapping against already existing mappings, we could compare against that master mapping. This would work with approach a) and b). But this will lead to duplicated configuraiton. References ========== [1] curl -g -i -X GET http://192.168.122.30:9696/v2.0/ports/b780af01-a3c6-4279-a355-fa3289bc1ec3.json {"port": {"status": "ACTIVE", "binding:host_id": "", "description": "", "allowed_address_pairs": [], "extra_dhcp_opts": [], "updated_at": "2016-04-08T08:59:08", "device_owner": "network:router_interface_distributed", "port_security_enabled": false, "binding:profile": {}, "fixed_ips": [{"subnet_id": "7c57f270-46a9-4cde-a2a7-82e66da1d084", "ip_address": "10.0.0.1"}], "id": "b780af01-a3c6-4279-a355-fa3289bc1ec3", "security_groups": [], "device_id": "48120dd2-2523-4b11-9f67-8ea944d11012", "name": "", "admin_state_up": true, "network_id": "2be9f80a-e3ac-42e6-9249-5ccca241ad85", "dns_name": null, "binding:vif_details": {}, "binding:vnic_type": "normal", "binding:vif_type": "distributed", "tenant_id": "1c9d0fc21afc40a2959d3d3d4acca528", "mac_address": "fa:16:3e:a7:d8:80", "created_at": "2016-04-07T15:05:57"}} Scenario1 - Migration on wrong physical network - High Prio =========================================================== Host1 has physical_interface_mappings: pyhsnet1:eth0, physnet2=eth2 Host2 has physical_interface_mappings: physnet1:eth1, physnet2=eth0 Now Live migration from an instance hosted on host1 (connected to physnet1) to host2 succeeds. Libvirt just migrates the whole server with its domain.xml and the macvtap is plugged on the targets side eth0. Now the instance does not have access to its network anymore, but access to another physical network. The behavior is documented, however this needs to be fixed! Scenario2 - Migration fails - Low Prio ====================================== Host1 has physical_interface_mappings: pyhsnet1:eth0 Host2 has physical_interface_mappings: physnet1:eth1 Let's assume a vlan setup. Let's assume a migration from host1 to host2. Host to does NOT have a interface eth0. Migration will fail in instance will remain active on the source, as nova plug on host2 failed to create a vlan device on eth0. If you have a flat network - definition of he libvirt xml will fail on host2. 
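To make scenario 1 concrete, a pair of hypothetical macvtap agent configuration excerpts could look like this; the file path and section name follow the usual macvtap_agent.ini layout but are only illustrative here:

    # host1 - /etc/neutron/plugins/ml2/macvtap_agent.ini (illustrative)
    [macvtap]
    physical_interface_mappings = physnet1:eth0,physnet2:eth2

    # host2
    [macvtap]
    physical_interface_mappings = physnet1:eth1,physnet2:eth0

    # A guest attached to physnet1 and migrated from host1 to host2 keeps
    # its macvtap on "eth0", which on host2 belongs to physnet2.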
Two approaches are thinkable * Solve the problem (Scenario 1+2) * just prevent such an invalid migration (let scenario 1 fail like scenario 2 fails today) Solve the problem ================= #1 Solve it in Nova pre live migration -------------------------------------- This would allow migration although physical_interface mappings are different. a) On pre live migration nova should change the binding:host to the migration target. This will trigger the portbinding and the mech driver which will update the vif_details with the right macvtap source device information. Libvirt can then adapt the migration-xml to reflect the changes. Currently the update of the binding is done in post migration, after migration succeeded. Can we already do it in pre_live_migration and on failure undo it in rollback ? - There's no issue for the reference implementations - See the prototype: https://review.openstack.org/297100 - But there might be some mechanisms for external SDN Controllers that might shut down ports on the source host as soon as the host_id is being updated. On the other hand, if controller rely on this mechanism, they will set the port up a little too late today, as the update host_id is sent after live migration succeeded. b) The alternative would be to allow a port to be bound to multiple hosts simultaneously. So in pre_live migration, nova would add a binding for the target host and in post_live_migration it would remove the original binding. This would require - simultaneous port binding. This will be achieved by https://bugs.launchpad.net/neutron/+bug/1367391 - allow such a binding for compute ports as well - Update APIs to reflect multiple port_bindings   - Create / Update / Show Port   - host_id is not reflect for DVR ports today [1] #2 Moved to Prevent section --------------------------- #3 Device renaming in the macvtap agent --------------------------------------- This would allow migration although physical_interface mappings are different. Instead of      physical_interface_mapping = physnet1:eth0 use a      physical_interface_mac_mapping = physnet1:00:11:22:33:44:55:66 #where 00:11:22:33:44:55:66 is the mac address of the interface to use On agent startup, the agent could rename the associated device to "physnet1" (or to some other generic value) that is consistent cross all hosts! We would need to document that this interface should not be used by any other application (that relies on the interface name) #4 Use generic vlan device names -------------------------------- This solves the problem only for vlan networks! For flat networks it still would exist Today, the agent generates the vlan device names like this: for eth1 eth1.<vlan-id>. We could get rid of this pattern and use network-uuid.vlan instead. Where nework-uuid are the first 10 chars of the id. But this would not solve the issue for flat networks. Therefore the device renaming like proposed in #3 would be required. Prevent invalid migration ========================= #1 Let Port binding fail (not working) ------------------------------------- The idea is to detect an invalid migration in the mechanism driver and let port binding fail. This approach has two problems a) Portbinding happens AFTER Migration happened. In post live migration nova requests to update the binding:host-id to the target. But when doing so, the instance is already running on the target host. The binding will fail, but the migration happened, though. b) Detecting a migration in the mech driver is difficult. 
The idea is to use the PortContext.original port. This works if we add the original port to the context somewhere in the ml2 plugin (see proposed patch). The problem is that this is not an indicator for a live migration. The original port will also be added if scheduling on another node failed before and now the current node is picked. There was no live migration but the PortContext.original port is set to another host. Maybe this can be solved but it's worthwhile mentioning here. see patch https://review.openstack.org/293404 #2 Let agent detect invalid migration (not working) --------------------------------------------------- An invalid migration could be detected in the agent to avoid the agent setting the device status to up. But this is too late, as the agent detects the device after migration already started. There is no way to stop it again. see patch https://review.openstack.org/293403 #3 Solve it in nova post live migration ------------------------------------ The idea is, that nova starts the migration and then listens on plug_vif event that is emitted by neutron after the agent reported the device as up. Nova also waits for the portbinding to occur. If one of both runs into a timeout or fails, either the migration should be rolled back (if still possible) or the instance should be set into error state and the network locked down (which is the default for ovs - not sure about other right now). There are some patchsets out that try to achieve something similar, but for the ovs-hybrid plug only. For others it's much more complicated, as the agent will only report the device up after it occured on the target (after migration already started) https://review.openstack.org/246898 #4 Prohibit agent start with invalid mapping -------------------------------------------- Do not allow different mappings at all. a) Compare mapping at agent startup via RPC ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The first registered agent defines which mapping is valid.  Send the agents mapping via RPC to the server, which validates it against all other existing agents in the db. If mapping is equal continue with agent start. Otherwise exit. This approach is pretty racy. Just assume no agent exists at the beginning. Now 2 agents request different mapping simultaneously. Both queries succeed, as non of them is written into the db yet. This happens some time later when the first agent status is sent to the server. This could end up in different mappings! See patch [2] b) Verify mapping on every status_report ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The first registered agent defines which mapping is valid. Before writing a status report to the DB, the plugin could query existing agents and compare the config. This could happen in the mechanism drivers via a new mech_driver interface. Only if the mappings fits to other mappings write it into db. If the mapping is not equal, a new status is being returned which will cause the agent to terminate. This approach also has some problems with potential races. We would need to lock the whole table (to avoid parallel inserts) to do the query, compare and write actions. Just locking a column is not sufficient c) Having a master interface_mapping on the q-svc ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Have a new mech_macvtap specific config option on the controller, that specifies the master mapping. Instead of comparing the requested mapping against already existing mappings, we could compare against that master mapping. 
This would work with approaches a) and b). But this will lead to duplicated configuration. References ========== [1] curl -g -i -X GET http://192.168.122.30:9696/v2.0/ports/b780af01-a3c6-4279-a355-fa3289bc1ec3.json {"port": {"status": "ACTIVE", "binding:host_id": "", "description": "", "allowed_address_pairs": [], "extra_dhcp_opts": [], "updated_at": "2016-04-08T08:59:08", "device_owner": "network:router_interface_distributed", "port_security_enabled": false, "binding:profile": {}, "fixed_ips": [{"subnet_id": "7c57f270-46a9-4cde-a2a7-82e66da1d084", "ip_address": "10.0.0.1"}], "id": "b780af01-a3c6-4279-a355-fa3289bc1ec3", "security_groups": [], "device_id": "48120dd2-2523-4b11-9f67-8ea944d11012", "name": "", "admin_state_up": true, "network_id": "2be9f80a-e3ac-42e6-9249-5ccca241ad85", "dns_name": null, "binding:vif_details": {}, "binding:vnic_type": "normal", "binding:vif_type": "distributed", "tenant_id": "1c9d0fc21afc40a2959d3d3d4acca528", "mac_address": "fa:16:3e:a7:d8:80", "created_at": "2016-04-07T15:05:57"}} [2] https://review.openstack.org/342872
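A minimal agent-side sketch of option a) - validating the mapping once at agent start and refusing to run on a mismatch - might look like this; report_mappings stands in for whatever RPC call would actually be added and is purely hypothetical:

    import sys

    def validate_mappings_at_startup(report_mappings, local_mappings):
        # report_mappings is any callable that ships the local mappings to
        # the server and returns True if they match the mappings already
        # registered by other agents (hypothetical RPC).
        if not report_mappings(local_mappings):
            sys.exit("physical_interface_mappings differ from other agents "
                     "- refusing to start")
        # Note: as described above, two agents starting at the same time can
        # both pass this check before either mapping is persisted, which is
        # exactly the race discussed for option a).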
2016-07-19 07:16:25 Andreas Scheuring description Scenario1 - Migration on wrong physical network - High Prio =========================================================== Host1 has physical_interface_mappings: pyhsnet1:eth0, physnet2=eth2 Host2 has physical_interface_mappings: physnet1:eth1, physnet2=eth0 Now Live migration from an instance hosted on host1 (connected to physnet1) to host2 succeeds. Libvirt just migrates the whole server with its domain.xml and the macvtap is plugged on the targets side eth0. Now the instance does not have access to its network anymore, but access to another physical network. The behavior is documented, however this needs to be fixed! Scenario2 - Migration fails - Low Prio ====================================== Host1 has physical_interface_mappings: pyhsnet1:eth0 Host2 has physical_interface_mappings: physnet1:eth1 Let's assume a vlan setup. Let's assume a migration from host1 to host2. Host to does NOT have a interface eth0. Migration will fail in instance will remain active on the source, as nova plug on host2 failed to create a vlan device on eth0. If you have a flat network - definition of he libvirt xml will fail on host2. Two approaches are thinkable * Solve the problem (Scenario 1+2) * just prevent such an invalid migration (let scenario 1 fail like scenario 2 fails today) Solve the problem ================= #1 Solve it in Nova pre live migration -------------------------------------- This would allow migration although physical_interface mappings are different. a) On pre live migration nova should change the binding:host to the migration target. This will trigger the portbinding and the mech driver which will update the vif_details with the right macvtap source device information. Libvirt can then adapt the migration-xml to reflect the changes. Currently the update of the binding is done in post migration, after migration succeeded. Can we already do it in pre_live_migration and on failure undo it in rollback ? - There's no issue for the reference implementations - See the prototype: https://review.openstack.org/297100 - But there might be some mechanisms for external SDN Controllers that might shut down ports on the source host as soon as the host_id is being updated. On the other hand, if controller rely on this mechanism, they will set the port up a little too late today, as the update host_id is sent after live migration succeeded. b) The alternative would be to allow a port to be bound to multiple hosts simultaneously. So in pre_live migration, nova would add a binding for the target host and in post_live_migration it would remove the original binding. This would require - simultaneous port binding. This will be achieved by https://bugs.launchpad.net/neutron/+bug/1367391 - allow such a binding for compute ports as well - Update APIs to reflect multiple port_bindings   - Create / Update / Show Port   - host_id is not reflect for DVR ports today [1] #2 Moved to Prevent section --------------------------- #3 Device renaming in the macvtap agent --------------------------------------- This would allow migration although physical_interface mappings are different. Instead of      physical_interface_mapping = physnet1:eth0 use a      physical_interface_mac_mapping = physnet1:00:11:22:33:44:55:66 #where 00:11:22:33:44:55:66 is the mac address of the interface to use On agent startup, the agent could rename the associated device to "physnet1" (or to some other generic value) that is consistent cross all hosts! 
We would need to document that this interface should not be used by any other application (that relies on the interface name) #4 Use generic vlan device names -------------------------------- This solves the problem only for vlan networks! For flat networks it still would exist Today, the agent generates the vlan device names like this: for eth1 eth1.<vlan-id>. We could get rid of this pattern and use network-uuid.vlan instead. Where nework-uuid are the first 10 chars of the id. But this would not solve the issue for flat networks. Therefore the device renaming like proposed in #3 would be required. Prevent invalid migration ========================= #1 Let Port binding fail (not working) ------------------------------------- The idea is to detect an invalid migration in the mechanism driver and let port binding fail. This approach has two problems a) Portbinding happens AFTER Migration happened. In post live migration nova requests to update the binding:host-id to the target. But when doing so, the instance is already running on the target host. The binding will fail, but the migration happened, though. b) Detecting a migration in the mech driver is difficult. The idea is to use the PortContext.original port. This works if we add the original port to the context somewhere in the ml2 plugin (see proposed patch). The problem is that this is not an indicator for a live migration. The original port will also be added if scheduling on another node failed before and now the current node is picked. There was no live migration but the PortContext.original port is set to another host. Maybe this can be solved but it's worthwhile mentioning here. see patch https://review.openstack.org/293404 #2 Let agent detect invalid migration (not working) --------------------------------------------------- An invalid migration could be detected in the agent to avoid the agent setting the device status to up. But this is too late, as the agent detects the device after migration already started. There is no way to stop it again. see patch https://review.openstack.org/293403 #3 Solve it in nova post live migration ------------------------------------ The idea is, that nova starts the migration and then listens on plug_vif event that is emitted by neutron after the agent reported the device as up. Nova also waits for the portbinding to occur. If one of both runs into a timeout or fails, either the migration should be rolled back (if still possible) or the instance should be set into error state and the network locked down (which is the default for ovs - not sure about other right now). There are some patchsets out that try to achieve something similar, but for the ovs-hybrid plug only. For others it's much more complicated, as the agent will only report the device up after it occured on the target (after migration already started) https://review.openstack.org/246898 #4 Prohibit agent start with invalid mapping -------------------------------------------- Do not allow different mappings at all. a) Compare mapping at agent startup via RPC ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The first registered agent defines which mapping is valid.  Send the agents mapping via RPC to the server, which validates it against all other existing agents in the db. If mapping is equal continue with agent start. Otherwise exit. This approach is pretty racy. Just assume no agent exists at the beginning. Now 2 agents request different mapping simultaneously. Both queries succeed, as non of them is written into the db yet. 
This happens some time later when the first agent status is sent to the server. This could end up in different mappings! See patch [2] b) Verify mapping on every status_report ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The first registered agent defines which mapping is valid. Before writing a status report to the DB, the plugin could query existing agents and compare the config. This could happen in the mechanism drivers via a new mech_driver interface. Only if the mappings fits to other mappings write it into db. If the mapping is not equal, a new status is being returned which will cause the agent to terminate. This approach also has some problems with potential races. We would need to lock the whole table (to avoid parallel inserts) to do the query, compare and write actions. Just locking a column is not sufficient c) Having a master interface_mapping on the q-svc ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Have a new mech_macvtap specific config option on the controller, that specifies the master mapping. Instead of comparing the requested mapping against already existing mappings, we could compare against that master mapping. This would work with approach a) and b). But this will lead to duplicated configuraiton. References ========== [1] curl -g -i -X GET http://192.168.122.30:9696/v2.0/ports/b780af01-a3c6-4279-a355-fa3289bc1ec3.json {"port": {"status": "ACTIVE", "binding:host_id": "", "description": "", "allowed_address_pairs": [], "extra_dhcp_opts": [], "updated_at": "2016-04-08T08:59:08", "device_owner": "network:router_interface_distributed", "port_security_enabled": false, "binding:profile": {}, "fixed_ips": [{"subnet_id": "7c57f270-46a9-4cde-a2a7-82e66da1d084", "ip_address": "10.0.0.1"}], "id": "b780af01-a3c6-4279-a355-fa3289bc1ec3", "security_groups": [], "device_id": "48120dd2-2523-4b11-9f67-8ea944d11012", "name": "", "admin_state_up": true, "network_id": "2be9f80a-e3ac-42e6-9249-5ccca241ad85", "dns_name": null, "binding:vif_details": {}, "binding:vnic_type": "normal", "binding:vif_type": "distributed", "tenant_id": "1c9d0fc21afc40a2959d3d3d4acca528", "mac_address": "fa:16:3e:a7:d8:80", "created_at": "2016-04-07T15:05:57"}} [2] https://review.openstack.org/342872 Scenario1 - Migration on wrong physical network - High Prio =========================================================== Host1 has physical_interface_mappings: pyhsnet1:eth0, physnet2=eth2 Host2 has physical_interface_mappings: physnet1:eth1, physnet2=eth0 Now Live migration from an instance hosted on host1 (connected to physnet1) to host2 succeeds. Libvirt just migrates the whole server with its domain.xml and the macvtap is plugged on the targets side eth0. Now the instance does not have access to its network anymore, but access to another physical network. The behavior is documented, however this needs to be fixed! Scenario2 - Migration fails - Low Prio ====================================== Host1 has physical_interface_mappings: pyhsnet1:eth0 Host2 has physical_interface_mappings: physnet1:eth1 Let's assume a vlan setup. Let's assume a migration from host1 to host2. Host to does NOT have a interface eth0. Migration will fail in instance will remain active on the source, as nova plug on host2 failed to create a vlan device on eth0. If you have a flat network - definition of he libvirt xml will fail on host2. 
Two approaches are thinkable * Solve the problem (Scenario 1+2) * just prevent such an invalid migration (let scenario 1 fail like scenario 2 fails today) Solve the problem ================= #1 Solve it in Nova pre live migration -------------------------------------- This would allow migration although physical_interface mappings are different. a) On pre live migration nova should change the binding:host to the migration target. This will trigger the portbinding and the mech driver which will update the vif_details with the right macvtap source device information. Libvirt can then adapt the migration-xml to reflect the changes. Currently the update of the binding is done in post migration, after migration succeeded. Can we already do it in pre_live_migration and on failure undo it in rollback ? - There's no issue for the reference implementations - See the prototype: https://review.openstack.org/297100 - But there might be some mechanisms for external SDN Controllers that might shut down ports on the source host as soon as the host_id is being updated. On the other hand, if controller rely on this mechanism, they will set the port up a little too late today, as the update host_id is sent after live migration succeeded. b) The alternative would be to allow a port to be bound to multiple hosts simultaneously. So in pre_live migration, nova would add a binding for the target host and in post_live_migration it would remove the original binding. This would require - simultaneous port binding. This will be achieved by https://bugs.launchpad.net/neutron/+bug/1367391 - allow such a binding for compute ports as well - Update APIs to reflect multiple port_bindings   - Create / Update / Show Port   - host_id is not reflect for DVR ports today [1] #2 Moved to Prevent section --------------------------- #3 Device renaming in the macvtap agent --------------------------------------- This would allow migration although physical_interface mappings are different. Instead of      physical_interface_mapping = physnet1:eth0 use a      physical_interface_mac_mapping = physnet1:00:11:22:33:44:55:66 #where 00:11:22:33:44:55:66 is the mac address of the interface to use On agent startup, the agent could rename the associated device to "physnet1" (or to some other generic value) that is consistent cross all hosts! We would need to document that this interface should not be used by any other application (that relies on the interface name) #4 Use generic vlan device names -------------------------------- This solves the problem only for vlan networks! For flat networks it still would exist Today, the agent generates the vlan device names like this: for eth1 eth1.<vlan-id>. We could get rid of this pattern and use network-uuid.vlan instead. Where nework-uuid are the first 10 chars of the id. But this would not solve the issue for flat networks. Therefore the device renaming like proposed in #3 would be required. Prevent invalid migration ========================= #1 Let Port binding fail (not working) ------------------------------------- The idea is to detect an invalid migration in the mechanism driver and let port binding fail. This approach has two problems a) Portbinding happens AFTER Migration happened. In post live migration nova requests to update the binding:host-id to the target. But when doing so, the instance is already running on the target host. The binding will fail, but the migration happened, though. b) Detecting a migration in the mech driver is difficult. 
The idea is to use the PortContext.original port. This works if we add the original port to the context somewhere in the ml2 plugin (see proposed patch). The problem is that this is not an indicator for a live migration. The original port will also be added if scheduling on another node failed before and now the current node is picked. There was no live migration but the PortContext.original port is set to another host. Maybe this can be solved but it's worthwhile mentioning here. see patch https://review.openstack.org/293404 #2 Let agent detect invalid migration (not working) --------------------------------------------------- An invalid migration could be detected in the agent to avoid the agent setting the device status to up. But this is too late, as the agent detects the device after migration already started. There is no way to stop it again. see patch https://review.openstack.org/293403 #3 Solve it in nova post live migration ------------------------------------ The idea is, that nova starts the migration and then listens on plug_vif event that is emitted by neutron after the agent reported the device as up. Nova also waits for the portbinding to occur. If one of both runs into a timeout or fails, either the migration should be rolled back (if still possible) or the instance should be set into error state and the network locked down (which is the default for ovs - not sure about other right now). There are some patchsets out that try to achieve something similar, but for the ovs-hybrid plug only. For others it's much more complicated, as the agent will only report the device up after it occured on the target (after migration already started) https://review.openstack.org/246898 #4 Prohibit agent start with invalid mapping -------------------------------------------- Do not allow different mappings at all. How to trigger the validtion? * Have an RPC call from the Agent to the Neutron plugin at agent start. --> Less resource consumption, but extra rpc call * Use the regular agent status reports. --> Checking on every status report consumes a lot of resources (db query and potential mech_driver calls) What to compare? * Have a master interface mappings config option configured on the plugin/mech_driver. All agent mappings must match that mapping --> If the server changes the master mapping, there's no way to notify the agents (or it must get implemented) --> Config option duplication * Query the master mapping from the server and ignore local mapping if one has been configured. * Compare against existing mappings in database. The first agent that sends his mapping via status reports defines the valid mapping. --> We need explicit table locking (locking rows is not sufficient) to avoid races, especially for the cases where the first agents get added. Where to do the validation/gather data for validation? * In the mech driver --> Most natural way, but requires a new mech_driver interface --> Also a new plugin interface is required * In the rpc callbacks class --> As the validation depends on the mech_driver, we would have mech_driver specific code there. But we would get around new interfaces Proposal: * Agent has a new config option "safe_migration_mode" = True|False (Default False, to stay backward compatible) * If it is set, the servers master mapping is queried by the agent on agent start via RPC. * If it does not map the local mapping, the agents terminates * The RPC call will be made to the plugin, which then triggers all mechanism drivers. 
Those have a generic method like 'get_plugin_configuration()' or similar * If this method is not present, the code will just continue (to not break other drivers) * The plugin returns a dict mech_driver:configuration to the agent. If the mech_driver did not provide any configuration (because it is not required or the method is not implemented), it will not be part of the return dict. References ========== [1] curl -g -i -X GET http://192.168.122.30:9696/v2.0/ports/b780af01-a3c6-4279-a355-fa3289bc1ec3.json {"port": {"status": "ACTIVE", "binding:host_id": "", "description": "", "allowed_address_pairs": [], "extra_dhcp_opts": [], "updated_at": "2016-04-08T08:59:08", "device_owner": "network:router_interface_distributed", "port_security_enabled": false, "binding:profile": {}, "fixed_ips": [{"subnet_id": "7c57f270-46a9-4cde-a2a7-82e66da1d084", "ip_address": "10.0.0.1"}], "id": "b780af01-a3c6-4279-a355-fa3289bc1ec3", "security_groups": [], "device_id": "48120dd2-2523-4b11-9f67-8ea944d11012", "name": "", "admin_state_up": true, "network_id": "2be9f80a-e3ac-42e6-9249-5ccca241ad85", "dns_name": null, "binding:vif_details": {}, "binding:vnic_type": "normal", "binding:vif_type": "distributed", "tenant_id": "1c9d0fc21afc40a2959d3d3d4acca528", "mac_address": "fa:16:3e:a7:d8:80", "created_at": "2016-04-07T15:05:57"}} [2] https://review.openstack.org/342872
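The server-side part of that proposal could be sketched as follows; get_plugin_configuration() is the not-yet-existing mech_driver method named above, and collect_mech_driver_configuration is an invented helper:

    def collect_mech_driver_configuration(mech_drivers):
        # mech_drivers: dict of driver name -> driver instance. Drivers
        # without a get_plugin_configuration() method are skipped, so
        # existing drivers are not broken.
        configs = {}
        for name, driver in mech_drivers.items():
            getter = getattr(driver, 'get_plugin_configuration', None)
            if getter is None:
                continue
            configs[name] = getter()
        return configs

The agent would then look up its own driver's entry (e.g. the macvtap master mapping) in the returned dict and terminate on a mismatch.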
2016-07-19 07:20:06 Andreas Scheuring description Scenario1 - Migration on wrong physical network - High Prio =========================================================== Host1 has physical_interface_mappings: pyhsnet1:eth0, physnet2=eth2 Host2 has physical_interface_mappings: physnet1:eth1, physnet2=eth0 Now Live migration from an instance hosted on host1 (connected to physnet1) to host2 succeeds. Libvirt just migrates the whole server with its domain.xml and the macvtap is plugged on the targets side eth0. Now the instance does not have access to its network anymore, but access to another physical network. The behavior is documented, however this needs to be fixed! Scenario2 - Migration fails - Low Prio ====================================== Host1 has physical_interface_mappings: pyhsnet1:eth0 Host2 has physical_interface_mappings: physnet1:eth1 Let's assume a vlan setup. Let's assume a migration from host1 to host2. Host to does NOT have a interface eth0. Migration will fail in instance will remain active on the source, as nova plug on host2 failed to create a vlan device on eth0. If you have a flat network - definition of he libvirt xml will fail on host2. Two approaches are thinkable * Solve the problem (Scenario 1+2) * just prevent such an invalid migration (let scenario 1 fail like scenario 2 fails today) Solve the problem ================= #1 Solve it in Nova pre live migration -------------------------------------- This would allow migration although physical_interface mappings are different. a) On pre live migration nova should change the binding:host to the migration target. This will trigger the portbinding and the mech driver which will update the vif_details with the right macvtap source device information. Libvirt can then adapt the migration-xml to reflect the changes. Currently the update of the binding is done in post migration, after migration succeeded. Can we already do it in pre_live_migration and on failure undo it in rollback ? - There's no issue for the reference implementations - See the prototype: https://review.openstack.org/297100 - But there might be some mechanisms for external SDN Controllers that might shut down ports on the source host as soon as the host_id is being updated. On the other hand, if controller rely on this mechanism, they will set the port up a little too late today, as the update host_id is sent after live migration succeeded. b) The alternative would be to allow a port to be bound to multiple hosts simultaneously. So in pre_live migration, nova would add a binding for the target host and in post_live_migration it would remove the original binding. This would require - simultaneous port binding. This will be achieved by https://bugs.launchpad.net/neutron/+bug/1367391 - allow such a binding for compute ports as well - Update APIs to reflect multiple port_bindings   - Create / Update / Show Port   - host_id is not reflect for DVR ports today [1] #2 Moved to Prevent section --------------------------- #3 Device renaming in the macvtap agent --------------------------------------- This would allow migration although physical_interface mappings are different. Instead of      physical_interface_mapping = physnet1:eth0 use a      physical_interface_mac_mapping = physnet1:00:11:22:33:44:55:66 #where 00:11:22:33:44:55:66 is the mac address of the interface to use On agent startup, the agent could rename the associated device to "physnet1" (or to some other generic value) that is consistent cross all hosts! 
We would need to document that this interface should not be used by any other application (that relies on the interface name) #4 Use generic vlan device names -------------------------------- This solves the problem only for vlan networks! For flat networks it still would exist Today, the agent generates the vlan device names like this: for eth1 eth1.<vlan-id>. We could get rid of this pattern and use network-uuid.vlan instead. Where nework-uuid are the first 10 chars of the id. But this would not solve the issue for flat networks. Therefore the device renaming like proposed in #3 would be required. Prevent invalid migration ========================= #1 Let Port binding fail (not working) ------------------------------------- The idea is to detect an invalid migration in the mechanism driver and let port binding fail. This approach has two problems a) Portbinding happens AFTER Migration happened. In post live migration nova requests to update the binding:host-id to the target. But when doing so, the instance is already running on the target host. The binding will fail, but the migration happened, though. b) Detecting a migration in the mech driver is difficult. The idea is to use the PortContext.original port. This works if we add the original port to the context somewhere in the ml2 plugin (see proposed patch). The problem is that this is not an indicator for a live migration. The original port will also be added if scheduling on another node failed before and now the current node is picked. There was no live migration but the PortContext.original port is set to another host. Maybe this can be solved but it's worthwhile mentioning here. see patch https://review.openstack.org/293404 #2 Let agent detect invalid migration (not working) --------------------------------------------------- An invalid migration could be detected in the agent to avoid the agent setting the device status to up. But this is too late, as the agent detects the device after migration already started. There is no way to stop it again. see patch https://review.openstack.org/293403 #3 Solve it in nova post live migration ------------------------------------ The idea is, that nova starts the migration and then listens on plug_vif event that is emitted by neutron after the agent reported the device as up. Nova also waits for the portbinding to occur. If one of both runs into a timeout or fails, either the migration should be rolled back (if still possible) or the instance should be set into error state and the network locked down (which is the default for ovs - not sure about other right now). There are some patchsets out that try to achieve something similar, but for the ovs-hybrid plug only. For others it's much more complicated, as the agent will only report the device up after it occured on the target (after migration already started) https://review.openstack.org/246898 #4 Prohibit agent start with invalid mapping -------------------------------------------- Do not allow different mappings at all. How to trigger the validtion? * Have an RPC call from the Agent to the Neutron plugin at agent start. --> Less resource consumption, but extra rpc call * Use the regular agent status reports. --> Checking on every status report consumes a lot of resources (db query and potential mech_driver calls) What to compare? * Have a master interface mappings config option configured on the plugin/mech_driver. 
All agent mappings must match that mapping --> If the server changes the master mapping, there's no way to notify the agents (or it must get implemented) --> Config option duplication * Query the master mapping from the server and ignore local mapping if one has been configured. * Compare against existing mappings in database. The first agent that sends his mapping via status reports defines the valid mapping. --> We need explicit table locking (locking rows is not sufficient) to avoid races, especially for the cases where the first agents get added. Where to do the validation/gather data for validation? * In the mech driver --> Most natural way, but requires a new mech_driver interface --> Also a new plugin interface is required * In the rpc callbacks class --> As the validation depends on the mech_driver, we would have mech_driver specific code there. But we would get around new interfaces Proposal: * Agent has a new config option "safe_migration_mode" = True|False (Default False, to stay backward compatible) * If it is set, the servers master mapping is queried by the agent on agent start via RPC. * If it does not map the local mapping, the agents terminates * The RPC call will be made to the plugin, which then triggers all mechanism drivers. Those have a generic method like 'get_plugin_configuration()' or similar * If this method is not present, the code will just continue (to not break other drivers) * The plugins returns a dict mech_driver:configuration to the agent. If the mech_driver did not provide any configuration (as not required or method not implemented), it will not be part of the return dict. References ========== [1] curl -g -i -X GET http://192.168.122.30:9696/v2.0/ports/b780af01-a3c6-4279-a355-fa3289bc1ec3.json {"port": {"status": "ACTIVE", "binding:host_id": "", "description": "", "allowed_address_pairs": [], "extra_dhcp_opts": [], "updated_at": "2016-04-08T08:59:08", "device_owner": "network:router_interface_distributed", "port_security_enabled": false, "binding:profile": {}, "fixed_ips": [{"subnet_id": "7c57f270-46a9-4cde-a2a7-82e66da1d084", "ip_address": "10.0.0.1"}], "id": "b780af01-a3c6-4279-a355-fa3289bc1ec3", "security_groups": [], "device_id": "48120dd2-2523-4b11-9f67-8ea944d11012", "name": "", "admin_state_up": true, "network_id": "2be9f80a-e3ac-42e6-9249-5ccca241ad85", "dns_name": null, "binding:vif_details": {}, "binding:vnic_type": "normal", "binding:vif_type": "distributed", "tenant_id": "1c9d0fc21afc40a2959d3d3d4acca528", "mac_address": "fa:16:3e:a7:d8:80", "created_at": "2016-04-07T15:05:57"}} [2] https://review.openstack.org/342872 Scenario1 - Migration on wrong physical network - High Prio =========================================================== Host1 has physical_interface_mappings: pyhsnet1:eth0, physnet2=eth2 Host2 has physical_interface_mappings: physnet1:eth1, physnet2=eth0 Now Live migration from an instance hosted on host1 (connected to physnet1) to host2 succeeds. Libvirt just migrates the whole server with its domain.xml and the macvtap is plugged on the targets side eth0. Now the instance does not have access to its network anymore, but access to another physical network. The behavior is documented, however this needs to be fixed! Scenario2 - Migration fails - Low Prio ====================================== Host1 has physical_interface_mappings: pyhsnet1:eth0 Host2 has physical_interface_mappings: physnet1:eth1 Let's assume a vlan setup. Let's assume a migration from host1 to host2. Host to does NOT have a interface eth0. 
Migration will fail in instance will remain active on the source, as nova plug on host2 failed to create a vlan device on eth0. If you have a flat network - definition of he libvirt xml will fail on host2. Two approaches are thinkable * Solve the problem (Scenario 1+2) * just prevent such an invalid migration (let scenario 1 fail like scenario 2 fails today) Solve the problem ================= #1 Solve it in Nova pre live migration -------------------------------------- This would allow migration although physical_interface mappings are different. a) On pre live migration nova should change the binding:host to the migration target. This will trigger the portbinding and the mech driver which will update the vif_details with the right macvtap source device information. Libvirt can then adapt the migration-xml to reflect the changes. Currently the update of the binding is done in post migration, after migration succeeded. Can we already do it in pre_live_migration and on failure undo it in rollback ? - There's no issue for the reference implementations - See the prototype: https://review.openstack.org/297100 - But there might be some mechanisms for external SDN Controllers that might shut down ports on the source host as soon as the host_id is being updated. On the other hand, if controller rely on this mechanism, they will set the port up a little too late today, as the update host_id is sent after live migration succeeded. b) The alternative would be to allow a port to be bound to multiple hosts simultaneously. So in pre_live migration, nova would add a binding for the target host and in post_live_migration it would remove the original binding. This would require - simultaneous port binding. This will be achieved by https://bugs.launchpad.net/neutron/+bug/1367391 - allow such a binding for compute ports as well - Update APIs to reflect multiple port_bindings   - Create / Update / Show Port   - host_id is not reflect for DVR ports today [1] #2 Moved to Prevent section --------------------------- #3 Device renaming in the macvtap agent --------------------------------------- This would allow migration although physical_interface mappings are different. Instead of      physical_interface_mapping = physnet1:eth0 use a      physical_interface_mac_mapping = physnet1:00:11:22:33:44:55:66 #where 00:11:22:33:44:55:66 is the mac address of the interface to use On agent startup, the agent could rename the associated device to "physnet1" (or to some other generic value) that is consistent cross all hosts! We would need to document that this interface should not be used by any other application (that relies on the interface name) #4 Use generic vlan device names -------------------------------- This solves the problem only for vlan networks! For flat networks it still would exist Today, the agent generates the vlan device names like this: for eth1 eth1.<vlan-id>. We could get rid of this pattern and use network-uuid.vlan instead. Where nework-uuid are the first 10 chars of the id. But this would not solve the issue for flat networks. Therefore the device renaming like proposed in #3 would be required. Prevent invalid migration ========================= #1 Let Port binding fail (not working) ------------------------------------- The idea is to detect an invalid migration in the mechanism driver and let port binding fail. This approach has two problems a) Portbinding happens AFTER Migration happened. In post live migration nova requests to update the binding:host-id to the target. 
But when doing so, the instance is already running on the target host. The binding will fail, but the migration happened, though. b) Detecting a migration in the mech driver is difficult. The idea is to use the PortContext.original port. This works if we add the original port to the context somewhere in the ml2 plugin (see proposed patch). The problem is that this is not an indicator for a live migration. The original port will also be added if scheduling on another node failed before and now the current node is picked. There was no live migration but the PortContext.original port is set to another host. Maybe this can be solved but it's worthwhile mentioning here. see patch https://review.openstack.org/293404 #2 Let agent detect invalid migration (not working) --------------------------------------------------- An invalid migration could be detected in the agent to avoid the agent setting the device status to up. But this is too late, as the agent detects the device after migration already started. There is no way to stop it again. see patch https://review.openstack.org/293403 #3 Solve it in nova post live migration ------------------------------------ The idea is, that nova starts the migration and then listens on plug_vif event that is emitted by neutron after the agent reported the device as up. Nova also waits for the portbinding to occur. If one of both runs into a timeout or fails, either the migration should be rolled back (if still possible) or the instance should be set into error state and the network locked down (which is the default for ovs - not sure about other right now). There are some patchsets out that try to achieve something similar, but for the ovs-hybrid plug only. For others it's much more complicated, as the agent will only report the device up after it occured on the target (after migration already started) https://review.openstack.org/246898 #4 Prohibit agent start with invalid mapping -------------------------------------------- Do not allow different mappings at all. How to trigger the validtion? * Have an RPC call from the Agent to the Neutron plugin at agent start. --> Less resource consumption, but extra rpc call * Use the regular agent status reports. --> Checking on every status report consumes a lot of resources (db query and potential mech_driver calls) What to compare? * Have a master interface mappings config option configured on the plugin/mech_driver. All agent mappings must match that mapping --> If the server changes the master mapping, there's no way to notify the agents (or it must get implemented) --> Config option duplication * Query the master mapping from the server and ignore local mapping if one has been configured. * Compare against existing mappings in database. The first agent that sends his mapping via status reports defines the valid mapping. --> We need explicit table locking (locking rows is not sufficient) to avoid races, especially for the cases where the first agents get added. Where to do the validation/gather data for validation? * In the mech driver --> Most natural way, but requires a new mech_driver interface --> Also a new plugin interface is required * In the rpc callbacks class --> As the validation depends on the mech_driver, we would have mech_driver specific code there. But we would get around new interfaces Proposal: * Agent has a new config option "safe_migration_mode" = True|False (Default False, to stay backward compatible) * If it is set, the servers master mapping is queried by the agent on agent start via RPC. 
* If it does not match the local mapping, the agent terminates * The RPC call will be made to the plugin, which then triggers all mechanism drivers. Those have a generic method like 'get_plugin_configuration()' or similar * If this method is not present, the code will just continue (to not break other drivers) * The plugin returns a dict mech_driver:configuration to the agent. If the mech_driver did not provide any configuration (because it is not required or the method is not implemented), it will not be part of the return dict. * If the master mapping on the server got changed, but agents haven't been restarted, the local mapping will not be validated against the new master mapping again (which would require an agent restart) References ========== [1] curl -g -i -X GET http://192.168.122.30:9696/v2.0/ports/b780af01-a3c6-4279-a355-fa3289bc1ec3.json {"port": {"status": "ACTIVE", "binding:host_id": "", "description": "", "allowed_address_pairs": [], "extra_dhcp_opts": [], "updated_at": "2016-04-08T08:59:08", "device_owner": "network:router_interface_distributed", "port_security_enabled": false, "binding:profile": {}, "fixed_ips": [{"subnet_id": "7c57f270-46a9-4cde-a2a7-82e66da1d084", "ip_address": "10.0.0.1"}], "id": "b780af01-a3c6-4279-a355-fa3289bc1ec3", "security_groups": [], "device_id": "48120dd2-2523-4b11-9f67-8ea944d11012", "name": "", "admin_state_up": true, "network_id": "2be9f80a-e3ac-42e6-9249-5ccca241ad85", "dns_name": null, "binding:vif_details": {}, "binding:vnic_type": "normal", "binding:vif_type": "distributed", "tenant_id": "1c9d0fc21afc40a2959d3d3d4acca528", "mac_address": "fa:16:3e:a7:d8:80", "created_at": "2016-04-07T15:05:57"}} [2] https://review.openstack.org/342872
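For completeness, the device renaming idea from #3 of the "Solve the problem" section could be prototyped with plain sysfs lookups and iproute2 calls roughly like this (error handling omitted; the function names are invented):

    import os
    import subprocess

    def find_interface_by_mac(mac):
        # Walk /sys/class/net and return the interface with the given MAC.
        for name in os.listdir('/sys/class/net'):
            addr_file = '/sys/class/net/%s/address' % name
            if not os.path.isfile(addr_file):
                continue
            with open(addr_file) as f:
                if f.read().strip().lower() == mac.lower():
                    return name
        raise LookupError('no interface with MAC %s' % mac)

    def rename_to_physnet(mac, physnet):
        current = find_interface_by_mac(mac)
        if current == physnet:
            return
        # Renaming requires the link to be down for a moment.
        subprocess.check_call(['ip', 'link', 'set', 'dev', current, 'down'])
        subprocess.check_call(['ip', 'link', 'set', 'dev', current,
                               'name', physnet])
        subprocess.check_call(['ip', 'link', 'set', 'dev', physnet, 'up'])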
2016-07-19 07:22:20 Andreas Scheuring description
Scenario1 - Migration on wrong physical network - High Prio
===========================================================
Host1 has physical_interface_mappings: physnet1:eth0, physnet2:eth2
Host2 has physical_interface_mappings: physnet1:eth1, physnet2:eth0

Now live migration of an instance hosted on host1 (connected to physnet1) to host2 succeeds. Libvirt just migrates the whole server with its domain.xml, and the macvtap is plugged on the target's eth0. Now the instance no longer has access to its network, but to another physical network instead. The behavior is documented; however, this needs to be fixed!

Scenario2 - Migration fails - Low Prio
======================================
Host1 has physical_interface_mappings: physnet1:eth0
Host2 has physical_interface_mappings: physnet1:eth1

Let's assume a vlan setup and a migration from host1 to host2. Host2 does NOT have an interface eth0. The migration will fail and the instance will remain active on the source, as nova's plug on host2 fails to create a vlan device on eth0. With a flat network, defining the libvirt xml will fail on host2.

Two approaches are conceivable
* Solve the problem (Scenario 1+2)
* Just prevent such an invalid migration (let scenario 1 fail like scenario 2 fails today)

Solve the problem
=================

#1 Solve it in Nova pre live migration
--------------------------------------
This would allow migration even though the physical_interface_mappings differ.

a) On pre live migration, nova should change the binding:host to the migration target. This triggers port binding and the mech driver, which updates the vif_details with the right macvtap source device information. Libvirt can then adapt the migration-xml to reflect the changes. Currently the update of the binding is done in post migration, after the migration succeeded. Can we already do it in pre_live_migration and, on failure, undo it in rollback?
- There's no issue for the reference implementations - see the prototype: https://review.openstack.org/297100
- But there might be external SDN controllers that shut down ports on the source host as soon as the host_id is updated. On the other hand, if controllers rely on this mechanism, they already set the port up a little too late today, as the host_id update is sent after live migration succeeded.

b) The alternative would be to allow a port to be bound to multiple hosts simultaneously. So in pre_live_migration, nova would add a binding for the target host, and in post_live_migration it would remove the original binding. This would require
- simultaneous port binding. This will be achieved by https://bugs.launchpad.net/neutron/+bug/1367391
- allowing such a binding for compute ports as well
- updating the APIs to reflect multiple port_bindings
  - Create / Update / Show Port
  - host_id is not reflected for DVR ports today [1]

#2 Moved to Prevent section
---------------------------

#3 Device renaming in the macvtap agent
---------------------------------------
This would allow migration even though the physical_interface_mappings differ. Instead of
     physical_interface_mapping = physnet1:eth0
use a
     physical_interface_mac_mapping = physnet1:00:11:22:33:44:55
#where 00:11:22:33:44:55 is the mac address of the interface to use

On agent startup, the agent could rename the associated device to "physnet1" (or to some other generic value) that is consistent across all hosts! We would need to document that this interface should not be used by any other application (that relies on the interface name).

#4 Use generic vlan device names
--------------------------------
This solves the problem only for vlan networks! For flat networks it would still exist. Today, the agent generates vlan device names like this: for eth1, eth1.<vlan-id>. We could get rid of this pattern and use network-uuid.vlan instead, where network-uuid is the first 10 chars of the id. But this would not solve the issue for flat networks, therefore device renaming as proposed in #3 would still be required.

Prevent invalid migration
=========================

#1 Let Port binding fail (not working)
--------------------------------------
The idea is to detect an invalid migration in the mechanism driver and let port binding fail. This approach has two problems

a) Port binding happens AFTER the migration happened. In post live migration, nova requests to update the binding:host-id to the target. But when doing so, the instance is already running on the target host. The binding will fail, but the migration has already happened.

b) Detecting a migration in the mech driver is difficult. The idea is to use the PortContext.original port. This works if we add the original port to the context somewhere in the ml2 plugin (see proposed patch). The problem is that this is not an indicator for a live migration. The original port will also be added if scheduling on another node failed before and now the current node is picked. There was no live migration, but the PortContext.original port is set to another host. Maybe this can be solved, but it is worth mentioning here.

see patch https://review.openstack.org/293404

#2 Let agent detect invalid migration (not working)
---------------------------------------------------
An invalid migration could be detected in the agent to avoid the agent setting the device status to up. But this is too late, as the agent detects the device only after the migration has already started. There is no way to stop it again.

see patch https://review.openstack.org/293403

#3 Solve it in nova post live migration
---------------------------------------
The idea is that nova starts the migration and then listens for the plug_vif event that neutron emits after the agent reported the device as up. Nova also waits for the port binding to occur. If one of the two runs into a timeout or fails, either the migration should be rolled back (if still possible) or the instance should be set into error state and the network locked down (which is the default for ovs - not sure about others right now). There are some patchsets out that try to achieve something similar, but for the ovs-hybrid plug only. For others it's much more complicated, as the agent will only report the device up after it appeared on the target (after the migration already started). https://review.openstack.org/246898

#4 Prohibit agent start with invalid mapping
--------------------------------------------
Do not allow different mappings at all.

How to trigger the validation?
* Have an RPC call from the agent to the Neutron plugin at agent start. --> Less resource consumption, but an extra rpc call
* Use the regular agent status reports. --> Checking on every status report consumes a lot of resources (db query and potential mech_driver calls)

What to compare?
* Have a master interface mappings config option configured on the plugin/mech_driver. All agent mappings must match that mapping. --> If the server changes the master mapping, there's no way to notify the agents (or it must get implemented) --> Config option duplication
* Query the master mapping from the server and ignore the local mapping if one has been configured.
* Compare against existing mappings in the database. The first agent that sends its mapping via status reports defines the valid mapping. --> We need explicit table locking (locking rows is not sufficient) to avoid races, especially for the cases where the first agents get added.

Where to do the validation/gather the data for validation?
* In the mech driver --> Most natural way, but requires a new mech_driver interface --> Also a new plugin interface is required
* In the rpc callbacks class --> As the validation depends on the mech_driver, we would have mech_driver specific code there. But we would get around new interfaces

Proposal:
* The agent has a new config option "safe_migration_mode" = True|False (default False, to stay backward compatible)
* If it is set, the server's master mapping is queried by the agent on agent start via RPC.
* If it does not match the local mapping, the agent terminates.
* The RPC call is made to the plugin, which then triggers all mechanism drivers. Those have a generic method like 'get_plugin_configuration()' or similar.
* If this method is not present, the code will just continue (to not break other drivers).
* The plugin returns a dict mech_driver:configuration to the agent. If a mech_driver did not provide any configuration (not required or method not implemented), it will not be part of the returned dict.
* If the master mapping on the server got changed but the agents haven't been restarted, the local mapping will not be validated against the new master mapping again (that would require an agent restart).

References
==========
[1] curl -g -i -X GET http://192.168.122.30:9696/v2.0/ports/b780af01-a3c6-4279-a355-fa3289bc1ec3.json
{"port": {"status": "ACTIVE", "binding:host_id": "", "description": "", "allowed_address_pairs": [], "extra_dhcp_opts": [], "updated_at": "2016-04-08T08:59:08", "device_owner": "network:router_interface_distributed", "port_security_enabled": false, "binding:profile": {}, "fixed_ips": [{"subnet_id": "7c57f270-46a9-4cde-a2a7-82e66da1d084", "ip_address": "10.0.0.1"}], "id": "b780af01-a3c6-4279-a355-fa3289bc1ec3", "security_groups": [], "device_id": "48120dd2-2523-4b11-9f67-8ea944d11012", "name": "", "admin_state_up": true, "network_id": "2be9f80a-e3ac-42e6-9249-5ccca241ad85", "dns_name": null, "binding:vif_details": {}, "binding:vnic_type": "normal", "binding:vif_type": "distributed", "tenant_id": "1c9d0fc21afc40a2959d3d3d4acca528", "mac_address": "fa:16:3e:a7:d8:80", "created_at": "2016-04-07T15:05:57"}}
[2] https://review.openstack.org/342872

Scenario1 - Migration on wrong physical network - High Prio
===========================================================
Host1 has physical_interface_mappings: physnet1:eth0, physnet2:eth2
Host2 has physical_interface_mappings: physnet1:eth1, physnet2:eth0

Now live migration of an instance hosted on host1 (connected to physnet1) to host2 succeeds. Libvirt just migrates the whole server with its domain.xml, and the macvtap is plugged on the target's eth0. Now the instance no longer has access to its network, but to another physical network instead. The behavior is documented; however, this needs to be fixed!

Scenario2 - Migration fails - Low Prio
======================================
Host1 has physical_interface_mappings: physnet1:eth0
Host2 has physical_interface_mappings: physnet1:eth1

Let's assume a vlan setup and a migration from host1 to host2. Host2 does NOT have an interface eth0. The migration will fail and the instance will remain active on the source, as nova's plug on host2 fails to create a vlan device on eth0. With a flat network, defining the libvirt xml will fail on host2.

Two approaches are conceivable
* Solve the problem (Scenario 1+2)
* Just prevent such an invalid migration (let scenario 1 fail like scenario 2 fails today)

Solve the problem
=================

#1 Solve it in Nova pre live migration
--------------------------------------
This would allow migration even though the physical_interface_mappings differ.

a) On pre live migration, nova should change the binding:host to the migration target. This triggers port binding and the mech driver, which updates the vif_details with the right macvtap source device information. Libvirt can then adapt the migration-xml to reflect the changes. Currently the update of the binding is done in post migration, after the migration succeeded. Can we already do it in pre_live_migration and, on failure, undo it in rollback?
- There's no issue for the reference implementations - see the prototype: https://review.openstack.org/297100
- But there might be external SDN controllers that shut down ports on the source host as soon as the host_id is updated. On the other hand, if controllers rely on this mechanism, they already set the port up a little too late today, as the host_id update is sent after live migration succeeded.

b) The alternative would be to allow a port to be bound to multiple hosts simultaneously. So in pre_live_migration, nova would add a binding for the target host, and in post_live_migration it would remove the original binding. This would require
- simultaneous port binding. This will be achieved by https://bugs.launchpad.net/neutron/+bug/1367391
- allowing such a binding for compute ports as well
- updating the APIs to reflect multiple port_bindings
  - Create / Update / Show Port
  - host_id is not reflected for DVR ports today [1]

#2 Moved to Prevent section
---------------------------

#3 Device renaming in the macvtap agent
---------------------------------------
This would allow migration even though the physical_interface_mappings differ. Instead of
     physical_interface_mapping = physnet1:eth0
use a
     physical_interface_mac_mapping = physnet1:00:11:22:33:44:55
#where 00:11:22:33:44:55 is the mac address of the interface to use

On agent startup, the agent could rename the associated device to "physnet1" (or to some other generic value) that is consistent across all hosts! We would need to document that this interface should not be used by any other application (that relies on the interface name).

#4 Use generic vlan device names
--------------------------------
This solves the problem only for vlan networks! For flat networks it would still exist. Today, the agent generates vlan device names like this: for eth1, eth1.<vlan-id>. We could get rid of this pattern and use network-uuid.vlan instead, where network-uuid is the first 10 chars of the id. But this would not solve the issue for flat networks, therefore device renaming as proposed in #3 would still be required.
Prevent invalid migration
=========================

#1 Let Port binding fail (not working)
--------------------------------------
The idea is to detect an invalid migration in the mechanism driver and let port binding fail. This approach has two problems

a) Port binding happens AFTER the migration happened. In post live migration, nova requests to update the binding:host-id to the target. But when doing so, the instance is already running on the target host. The binding will fail, but the migration has already happened.

b) Detecting a migration in the mech driver is difficult. The idea is to use the PortContext.original port. This works if we add the original port to the context somewhere in the ml2 plugin (see proposed patch). The problem is that this is not an indicator for a live migration. The original port will also be added if scheduling on another node failed before and now the current node is picked. There was no live migration, but the PortContext.original port is set to another host. Maybe this can be solved, but it is worth mentioning here.

see patch https://review.openstack.org/293404

#2 Let agent detect invalid migration (not working)
---------------------------------------------------
An invalid migration could be detected in the agent to avoid the agent setting the device status to up. But this is too late, as the agent detects the device only after the migration has already started. There is no way to stop it again.

see patch https://review.openstack.org/293403

#3 Solve it in nova post live migration
---------------------------------------
The idea is that nova starts the migration and then listens for the plug_vif event that neutron emits after the agent reported the device as up. Nova also waits for the port binding to occur. If one of the two runs into a timeout or fails, either the migration should be rolled back (if still possible) or the instance should be set into error state and the network locked down (which is the default for ovs - not sure about others right now). There are some patchsets out that try to achieve something similar, but for the ovs-hybrid plug only. For others it's much more complicated, as the agent will only report the device up after it appeared on the target (after the migration already started). https://review.openstack.org/246898

#4 Prohibit agent start with invalid mapping
--------------------------------------------
Do not allow different mappings at all.

How to trigger the validation?
* Have an RPC call from the agent to the Neutron plugin at agent start. --> Less resource consumption, but an extra rpc call
* Use the regular agent status reports. --> Checking on every status report consumes a lot of resources (db query and potential mech_driver calls)

What to compare?
* Have a master interface mappings config option configured on the plugin/mech_driver. All agent mappings must match that mapping. --> If the server changes the master mapping, there's no way to notify the agents (or it must get implemented) --> Config option duplication
* Query the master mapping from the server and compare on the agent side, or ignore the local mapping entirely if one has been configured.
* Compare against existing mappings in the database. The first agent that sends its mapping via status reports defines the valid mapping. --> We need explicit table locking (locking rows is not sufficient) to avoid races, especially for the cases where the first agents get added.

Where to do the validation/gather the data for validation?
* In the mech driver --> Most natural way, but requires a new mech_driver interface --> Also a new plugin interface is required
* In the rpc callbacks class --> As the validation depends on the mech_driver, we would have mech_driver specific code there. But we would get around new interfaces
* In the agent

Proposal:
* The agent has a new config option "safe_migration_mode" = True|False (default False, to stay backward compatible)
* If it is set, the server's master mapping is queried by the agent on agent start via RPC.
* If it does not match the local mapping, the agent terminates.
* The RPC call is made to the plugin, which then triggers all mechanism drivers. Those have a generic method like 'get_plugin_configuration()' or similar.
* If this method is not present, the code will just continue (to not break other drivers).
* The plugin returns a dict mech_driver:configuration to the agent. If a mech_driver did not provide any configuration (not required or method not implemented), it will not be part of the returned dict.
* If the master mapping on the server got changed but the agents haven't been restarted, the local mapping will not be validated against the new master mapping again (that would require an agent restart).

References
==========
[1] curl -g -i -X GET http://192.168.122.30:9696/v2.0/ports/b780af01-a3c6-4279-a355-fa3289bc1ec3.json
{"port": {"status": "ACTIVE", "binding:host_id": "", "description": "", "allowed_address_pairs": [], "extra_dhcp_opts": [], "updated_at": "2016-04-08T08:59:08", "device_owner": "network:router_interface_distributed", "port_security_enabled": false, "binding:profile": {}, "fixed_ips": [{"subnet_id": "7c57f270-46a9-4cde-a2a7-82e66da1d084", "ip_address": "10.0.0.1"}], "id": "b780af01-a3c6-4279-a355-fa3289bc1ec3", "security_groups": [], "device_id": "48120dd2-2523-4b11-9f67-8ea944d11012", "name": "", "admin_state_up": true, "network_id": "2be9f80a-e3ac-42e6-9249-5ccca241ad85", "dns_name": null, "binding:vif_details": {}, "binding:vnic_type": "normal", "binding:vif_type": "distributed", "tenant_id": "1c9d0fc21afc40a2959d3d3d4acca528", "mac_address": "fa:16:3e:a7:d8:80", "created_at": "2016-04-07T15:05:57"}}
[2] https://review.openstack.org/342872
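As a rough illustration of the proposal above, the following sketch shows what the agent-side startup check could look like; safe_migration_mode, get_plugin_configuration() and the layout of the returned dict are taken from the proposal text and do not exist in Neutron today:

    import sys

    def validate_mappings_on_start(conf, plugin_rpc, local_mappings):
        """Terminate the agent if its mapping contradicts the server's master mapping."""
        if not conf.safe_migration_mode:
            return  # default False: behave exactly as today

        # Hypothetical RPC: the plugin asks every mech driver for its configuration;
        # drivers that do not implement the call are simply absent from the dict.
        server_conf = plugin_rpc.get_plugin_configuration()
        master = server_conf.get('macvtap', {}).get('physical_interface_mappings')
        if master is None:
            return  # no master mapping configured on the server, nothing to check

        if master != local_mappings:
            sys.exit("physical_interface_mappings %s does not match the master "
                     "mapping %s configured on the server - refusing to start"
                     % (local_mappings, master))

    # would be called once from the agent's start-up path (illustrative)

Note that the check runs only once at startup, which matches the last bullet of the proposal: a later change of the master mapping is not noticed until the agent is restarted.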
2016-07-25 09:57:05 Andreas Scheuring description
Scenario1 - Migration on wrong physical network - High Prio
===========================================================
Host1 has physical_interface_mappings: physnet1:eth0, physnet2:eth2
Host2 has physical_interface_mappings: physnet1:eth1, physnet2:eth0

Now live migration of an instance hosted on host1 (connected to physnet1) to host2 succeeds. Libvirt just migrates the whole server with its domain.xml, and the macvtap is plugged on the target's eth0. Now the instance no longer has access to its network, but to another physical network instead. The behavior is documented; however, this needs to be fixed!

Scenario2 - Migration fails - Low Prio
======================================
Host1 has physical_interface_mappings: physnet1:eth0
Host2 has physical_interface_mappings: physnet1:eth1

Let's assume a vlan setup and a migration from host1 to host2. Host2 does NOT have an interface eth0. The migration will fail and the instance will remain active on the source, as nova's plug on host2 fails to create a vlan device on eth0. With a flat network, defining the libvirt xml will fail on host2.

Two approaches are conceivable
* Solve the problem (Scenario 1+2)
* Just prevent such an invalid migration (let scenario 1 fail like scenario 2 fails today)

Solve the problem
=================

#1 Solve it in Nova pre live migration
--------------------------------------
This would allow migration even though the physical_interface_mappings differ.

a) On pre live migration, nova should change the binding:host to the migration target. This triggers port binding and the mech driver, which updates the vif_details with the right macvtap source device information. Libvirt can then adapt the migration-xml to reflect the changes. Currently the update of the binding is done in post migration, after the migration succeeded. Can we already do it in pre_live_migration and, on failure, undo it in rollback?
- There's no issue for the reference implementations - see the prototype: https://review.openstack.org/297100
- But there might be external SDN controllers that shut down ports on the source host as soon as the host_id is updated. On the other hand, if controllers rely on this mechanism, they already set the port up a little too late today, as the host_id update is sent after live migration succeeded.

b) The alternative would be to allow a port to be bound to multiple hosts simultaneously. So in pre_live_migration, nova would add a binding for the target host, and in post_live_migration it would remove the original binding. This would require
- simultaneous port binding. This will be achieved by https://bugs.launchpad.net/neutron/+bug/1367391
- allowing such a binding for compute ports as well
- updating the APIs to reflect multiple port_bindings
  - Create / Update / Show Port
  - host_id is not reflected for DVR ports today [1]

#2 Moved to Prevent section
---------------------------

#3 Device renaming in the macvtap agent
---------------------------------------
This would allow migration even though the physical_interface_mappings differ. Instead of
     physical_interface_mapping = physnet1:eth0
use a
     physical_interface_mac_mapping = physnet1:00:11:22:33:44:55
#where 00:11:22:33:44:55 is the mac address of the interface to use

On agent startup, the agent could rename the associated device to "physnet1" (or to some other generic value) that is consistent across all hosts! We would need to document that this interface should not be used by any other application (that relies on the interface name).

#4 Use generic vlan device names
--------------------------------
This solves the problem only for vlan networks! For flat networks it would still exist. Today, the agent generates vlan device names like this: for eth1, eth1.<vlan-id>. We could get rid of this pattern and use network-uuid.vlan instead, where network-uuid is the first 10 chars of the id. But this would not solve the issue for flat networks, therefore device renaming as proposed in #3 would still be required.

Prevent invalid migration
=========================

#1 Let Port binding fail (not working)
--------------------------------------
The idea is to detect an invalid migration in the mechanism driver and let port binding fail. This approach has two problems

a) Port binding happens AFTER the migration happened. In post live migration, nova requests to update the binding:host-id to the target. But when doing so, the instance is already running on the target host. The binding will fail, but the migration has already happened.

b) Detecting a migration in the mech driver is difficult. The idea is to use the PortContext.original port. This works if we add the original port to the context somewhere in the ml2 plugin (see proposed patch). The problem is that this is not an indicator for a live migration. The original port will also be added if scheduling on another node failed before and now the current node is picked. There was no live migration, but the PortContext.original port is set to another host. Maybe this can be solved, but it is worth mentioning here.

see patch https://review.openstack.org/293404

#2 Let agent detect invalid migration (not working)
---------------------------------------------------
An invalid migration could be detected in the agent to avoid the agent setting the device status to up. But this is too late, as the agent detects the device only after the migration has already started. There is no way to stop it again.

see patch https://review.openstack.org/293403

#3 Solve it in nova post live migration
---------------------------------------
The idea is that nova starts the migration and then listens for the plug_vif event that neutron emits after the agent reported the device as up. Nova also waits for the port binding to occur. If one of the two runs into a timeout or fails, either the migration should be rolled back (if still possible) or the instance should be set into error state and the network locked down (which is the default for ovs - not sure about others right now). There are some patchsets out that try to achieve something similar, but for the ovs-hybrid plug only. For others it's much more complicated, as the agent will only report the device up after it appeared on the target (after the migration already started). https://review.openstack.org/246898

#4 Prohibit agent start with invalid mapping
--------------------------------------------
Do not allow different mappings at all.

How to trigger the validation?
* Have an RPC call from the agent to the Neutron plugin at agent start. --> Less resource consumption, but an extra rpc call
* Use the regular agent status reports. --> Checking on every status report consumes a lot of resources (db query and potential mech_driver calls)

What to compare?
* Have a master interface mappings config option configured on the plugin/mech_driver. All agent mappings must match that mapping. --> If the server changes the master mapping, there's no way to notify the agents (or it must get implemented) --> Config option duplication
* Query the master mapping from the server and compare on the agent side, or ignore the local mapping entirely if one has been configured.
* Compare against existing mappings in the database. The first agent that sends its mapping via status reports defines the valid mapping. --> We need explicit table locking (locking rows is not sufficient) to avoid races, especially for the cases where the first agents get added.

Where to do the validation/gather the data for validation?
* In the mech driver --> Most natural way, but requires a new mech_driver interface --> Also a new plugin interface is required
* In the rpc callbacks class --> As the validation depends on the mech_driver, we would have mech_driver specific code there. But we would get around new interfaces
* In the agent

Proposal:
* The agent has a new config option "safe_migration_mode" = True|False (default False, to stay backward compatible)
* If it is set, the server's master mapping is queried by the agent on agent start via RPC.
* If it does not match the local mapping, the agent terminates.
* The RPC call is made to the plugin, which then triggers all mechanism drivers. Those have a generic method like 'get_plugin_configuration()' or similar.
* If this method is not present, the code will just continue (to not break other drivers).
* The plugin returns a dict mech_driver:configuration to the agent. If a mech_driver did not provide any configuration (not required or method not implemented), it will not be part of the returned dict.
* If the master mapping on the server got changed but the agents haven't been restarted, the local mapping will not be validated against the new master mapping again (that would require an agent restart).

References
==========
[1] curl -g -i -X GET http://192.168.122.30:9696/v2.0/ports/b780af01-a3c6-4279-a355-fa3289bc1ec3.json
{"port": {"status": "ACTIVE", "binding:host_id": "", "description": "", "allowed_address_pairs": [], "extra_dhcp_opts": [], "updated_at": "2016-04-08T08:59:08", "device_owner": "network:router_interface_distributed", "port_security_enabled": false, "binding:profile": {}, "fixed_ips": [{"subnet_id": "7c57f270-46a9-4cde-a2a7-82e66da1d084", "ip_address": "10.0.0.1"}], "id": "b780af01-a3c6-4279-a355-fa3289bc1ec3", "security_groups": [], "device_id": "48120dd2-2523-4b11-9f67-8ea944d11012", "name": "", "admin_state_up": true, "network_id": "2be9f80a-e3ac-42e6-9249-5ccca241ad85", "dns_name": null, "binding:vif_details": {}, "binding:vnic_type": "normal", "binding:vif_type": "distributed", "tenant_id": "1c9d0fc21afc40a2959d3d3d4acca528", "mac_address": "fa:16:3e:a7:d8:80", "created_at": "2016-04-07T15:05:57"}}
[2] https://review.openstack.org/342872

Scenario1 - Migration on wrong physical network - High Prio
===========================================================
Host1 has physical_interface_mappings: physnet1:eth0, physnet2:eth2
Host2 has physical_interface_mappings: physnet1:eth1, physnet2:eth0

Now live migration of an instance hosted on host1 (connected to physnet1) to host2 succeeds. Libvirt just migrates the whole server with its domain.xml, and the macvtap is plugged on the target's eth0. Now the instance no longer has access to its network, but to another physical network instead. The behavior is documented; however, this needs to be fixed!

Scenario2 - Migration fails - Low Prio
======================================
Host1 has physical_interface_mappings: physnet1:eth0
Host2 has physical_interface_mappings: physnet1:eth1

Let's assume a vlan setup and a migration from host1 to host2. Host2 does NOT have an interface eth0. The migration will fail and the instance will remain active on the source, as nova's plug on host2 fails to create a vlan device on eth0. With a flat network, defining the libvirt xml will fail on host2.

Two approaches are conceivable
* Solve the problem (Scenario 1+2)
* Just prevent such an invalid migration (let scenario 1 fail like scenario 2 fails today)

Solve the problem
=================

#1 Solve it in Nova pre live migration
--------------------------------------
This would allow migration even though the physical_interface_mappings differ.

a) On pre live migration, nova should change the binding:host to the migration target. This triggers port binding and the mech driver, which updates the vif_details with the right macvtap source device information. Libvirt can then adapt the migration-xml to reflect the changes. Currently the update of the binding is done in post migration, after the migration succeeded. Can we already do it in pre_live_migration and, on failure, undo it in rollback?
- There's no issue for the reference implementations - see the prototype: https://review.openstack.org/297100
- But there might be external SDN controllers that shut down ports on the source host as soon as the host_id is updated. On the other hand, if controllers rely on this mechanism, they already set the port up a little too late today, as the host_id update is sent after live migration succeeded.

b) The alternative would be to allow a port to be bound to multiple hosts simultaneously. So in pre_live_migration, nova would add a binding for the target host, and in post_live_migration it would remove the original binding. This would require
- simultaneous port binding. This will be achieved by https://bugs.launchpad.net/neutron/+bug/1367391
- allowing such a binding for compute ports as well
- updating the APIs to reflect multiple port_bindings
  - Create / Update / Show Port
  - host_id is not reflected for DVR ports today [1]

#2 Moved to Prevent section
---------------------------

#3 Device renaming in the macvtap agent
---------------------------------------
This would allow migration even though the physical_interface_mappings differ. Instead of
     physical_interface_mapping = physnet1:eth0
use a
     physical_interface_mac_mapping = physnet1:00:11:22:33:44:55
#where 00:11:22:33:44:55 is the mac address of the interface to use

On agent startup, the agent could rename the associated device to "physnet1" (or to some other generic value) that is consistent across all hosts! We would need to document that this interface should not be used by any other application (that relies on the interface name).

#4 Use generic vlan device names
--------------------------------
This solves the problem only for vlan networks! For flat networks it would still exist. Today, the agent generates vlan device names like this: for eth1, eth1.<vlan-id>. We could get rid of this pattern and use network-uuid.vlan instead, where network-uuid is the first 10 chars of the id. But this would not solve the issue for flat networks, therefore device renaming as proposed in #3 would still be required.
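To illustrate the renaming step proposed in #3, here is a small sketch that looks up the device belonging to a physical_interface_mac_mapping entry via /sys/class/net and renames it with iproute2; the mapping option itself is only a proposal, and a real agent would use its own interface utilities rather than subprocess calls:

    import os
    import subprocess

    def rename_device_for_physnet(physnet, mac):
        """Find the interface with the given MAC and rename it to the physnet name."""
        for dev in os.listdir('/sys/class/net'):
            with open('/sys/class/net/%s/address' % dev) as f:
                if f.read().strip().lower() == mac.lower():
                    # renaming a link generally requires it to be down
                    subprocess.check_call(['ip', 'link', 'set', 'dev', dev, 'down'])
                    subprocess.check_call(['ip', 'link', 'set', 'dev', dev, 'name', physnet])
                    subprocess.check_call(['ip', 'link', 'set', 'dev', physnet, 'up'])
                    return physnet
        raise RuntimeError('no interface with MAC %s found for %s' % (mac, physnet))

    # physical_interface_mac_mapping = physnet1:00:11:22:33:44:55 would lead to:
    # rename_device_for_physnet('physnet1', '00:11:22:33:44:55')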
Prevent invalid migration
=========================

#1 Let Port binding fail
------------------------
The idea is to detect an invalid migration in the mechanism driver and let port binding fail. This approach has two problems

a) Port binding happens AFTER the migration happened. In post live migration, nova requests to update the binding:host-id to the target. But when doing so, the instance is already running on the target host. The binding will fail, but the migration has already happened. --> But at least the instance would be in error state and the user is aware of that! In addition, we might drop all traffic related to this instance.

b) Detecting a migration in the mech driver is difficult. The idea is to use the PortContext.original port. This works if we add the original port to the context somewhere in the ml2 plugin (see proposed patch). The problem is that this is not an indicator for a live migration. The original port will also be added if scheduling on another node failed before and now the current node is picked. There was no live migration, but the PortContext.original port is set to another host. Maybe this can be solved, but it is worth mentioning here. --> In the worst case, use the profile information added with https://review.openstack.org/#/c/275073/

see patch https://review.openstack.org/293404

#2 Let agent detect invalid migration (not working)
---------------------------------------------------
An invalid migration could be detected in the agent to avoid the agent setting the device status to up. But this is too late, as the agent detects the device only after the migration has already started. There is no way to stop it again.

see patch https://review.openstack.org/293403

#3 Solve it in nova post live migration
---------------------------------------
The idea is that nova starts the migration and then listens for the plug_vif event that neutron emits after the agent reported the device as up. Nova also waits for the port binding to occur. If one of the two runs into a timeout or fails, either the migration should be rolled back (if still possible) or the instance should be set into error state and the network locked down (which is the default for ovs - not sure about others right now). There are some patchsets out that try to achieve something similar, but for the ovs-hybrid plug only. For others it's much more complicated, as the agent will only report the device up after it appeared on the target (after the migration already started). https://review.openstack.org/246898

#4 Prohibit agent start with invalid mapping
--------------------------------------------
Do not allow different mappings at all.

How to trigger the validation?
* Have an RPC call from the agent to the Neutron plugin at agent start. --> Less resource consumption, but an extra rpc call
* Use the regular agent status reports. --> Checking on every status report consumes a lot of resources (db query and potential mech_driver calls)

What to compare?
* Have a master interface mappings config option configured on the plugin/mech_driver. All agent mappings must match that mapping. --> If the server changes the master mapping, there's no way to notify the agents (or it must get implemented) --> Config option duplication
* Query the master mapping from the server and compare on the agent side, or ignore the local mapping entirely if one has been configured.
* Compare against existing mappings in the database. The first agent that sends its mapping via status reports defines the valid mapping. --> We need explicit table locking (locking rows is not sufficient) to avoid races, especially for the cases where the first agents get added.

Where to do the validation/gather the data for validation?
* In the mech driver --> Most natural way, but requires a new mech_driver interface --> Also a new plugin interface is required
* In the rpc callbacks class --> As the validation depends on the mech_driver, we would have mech_driver specific code there. But we would get around new interfaces
* In the agent

Proposal:
* The agent has a new config option "safe_migration_mode" = True|False (default False, to stay backward compatible)
* If it is set, the server's master mapping is queried by the agent on agent start via RPC.
* If it does not match the local mapping, the agent terminates.
* The RPC call is made to the plugin, which then triggers all mechanism drivers. Those have a generic method like 'get_plugin_configuration()' or similar.
* If this method is not present, the code will just continue (to not break other drivers).
* The plugin returns a dict mech_driver:configuration to the agent. If a mech_driver did not provide any configuration (not required or method not implemented), it will not be part of the returned dict.
* If the master mapping on the server got changed but the agents haven't been restarted, the local mapping will not be validated against the new master mapping again (that would require an agent restart).

References
==========
[1] curl -g -i -X GET http://192.168.122.30:9696/v2.0/ports/b780af01-a3c6-4279-a355-fa3289bc1ec3.json
{"port": {"status": "ACTIVE", "binding:host_id": "", "description": "", "allowed_address_pairs": [], "extra_dhcp_opts": [], "updated_at": "2016-04-08T08:59:08", "device_owner": "network:router_interface_distributed", "port_security_enabled": false, "binding:profile": {}, "fixed_ips": [{"subnet_id": "7c57f270-46a9-4cde-a2a7-82e66da1d084", "ip_address": "10.0.0.1"}], "id": "b780af01-a3c6-4279-a355-fa3289bc1ec3", "security_groups": [], "device_id": "48120dd2-2523-4b11-9f67-8ea944d11012", "name": "", "admin_state_up": true, "network_id": "2be9f80a-e3ac-42e6-9249-5ccca241ad85", "dns_name": null, "binding:vif_details": {}, "binding:vnic_type": "normal", "binding:vif_type": "distributed", "tenant_id": "1c9d0fc21afc40a2959d3d3d4acca528", "mac_address": "fa:16:3e:a7:d8:80", "created_at": "2016-04-07T15:05:57"}}
[2] https://review.openstack.org/342872
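To make the check discussed under "Prevent invalid migration #1" concrete, here is a small self-contained sketch; plain dicts stand in for ml2's PortContext, and HOST_MAPPINGS is a hypothetical server-side view of each agent's physical_interface_mappings (how to obtain that data is exactly the open question of #4):

    # Illustrative only - not existing mechanism driver code.
    HOST_MAPPINGS = {
        'host1': {'physnet1': 'eth0', 'physnet2': 'eth2'},
        'host2': {'physnet1': 'eth1', 'physnet2': 'eth0'},
    }

    def binding_allowed(original_port, current_port, physical_network):
        """Refuse a binding whose host change lands on a different source interface."""
        old_host = (original_port or {}).get('binding:host_id')
        new_host = current_port.get('binding:host_id')
        if not old_host or old_host == new_host:
            return True  # no host change visible, bind as usual
        old_dev = HOST_MAPPINGS.get(old_host, {}).get(physical_network)
        new_dev = HOST_MAPPINGS.get(new_host, {}).get(physical_network)
        return old_dev == new_dev

    # Scenario 1: physnet1 is eth0 on host1 but eth1 on host2 -> refuse to bind.
    print(binding_allowed({'binding:host_id': 'host1'},
                          {'binding:host_id': 'host2'}, 'physnet1'))  # False

As points a) and b) above note, such a check only fires after the migration has already happened and cannot by itself distinguish a live migration from a reschedule.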
2016-08-17 23:16:41 Armando Migliaccio neutron: status In Progress Confirmed
2016-08-26 15:21:22 OpenStack Infra neutron: status Confirmed In Progress
2016-09-01 20:09:01 Armando Migliaccio neutron: milestone newton-3 newton-rc1
2016-09-13 01:08:09 Armando Migliaccio neutron: milestone newton-rc1 ocata-1
2016-10-20 13:58:14 OpenStack Infra neutron: assignee Andreas Scheuring (andreas-scheuring) Gary Kotton (garyk)
2016-10-21 10:06:08 OpenStack Infra neutron: assignee Gary Kotton (garyk) Andreas Scheuring (andreas-scheuring)
2016-11-16 22:40:52 Armando Migliaccio neutron: milestone ocata-1 ocata-2
2016-11-28 10:50:50 Andreas Scheuring neutron: importance High Medium
2016-11-28 11:44:25 OpenStack Infra neutron: assignee Andreas Scheuring (andreas-scheuring) Gary Kotton (garyk)
2017-01-06 16:03:08 Armando Migliaccio neutron: milestone ocata-2 ocata-3
2017-01-26 23:39:55 Armando Migliaccio neutron: milestone ocata-3 ocata-rc1
2017-02-06 23:41:48 Armando Migliaccio neutron: assignee Gary Kotton (garyk) Andreas Scheuring (andreas-scheuring)
2017-02-06 23:42:40 Armando Migliaccio neutron: milestone ocata-rc1 pike-1
2017-05-18 01:19:23 Armando Migliaccio neutron: milestone pike-1 pike-2
2022-10-19 11:20:08 Rodolfo Alonso neutron: status In Progress Won't Fix