Activity log for bug #1940668

Date Who What changed Old value New value Message
2021-08-20 14:41:53 Ilya Popov bug added bug
2021-08-20 14:43:32 Ilya Popov description (typo fix in step 2: "Prepare to flavors" corrected to "Prepare two flavors") New value:

Description
===========
Reproduced in Ussuri; master has the same code.

When nova-compute starts fitting an instance's NUMA topology onto the host's NUMA topology, it uses the host cells list. This list contains cell objects from cell 0 up to cell N, always sorted by cell id (N depends on the number of host NUMA nodes). The only case in which the order of this list changes is an instance without a PCI device requirement: if the instance does not need a PCI device tied to a specific NUMA node, the host cells list is reordered to place cells with PCI capabilities at the end. If all NUMA cells have PCI capabilities, the order is not changed.

As a result, Nova always tries to place the instance's first NUMA node on host NUMA node id 0. If huge pages are used and several instances with fewer NUMA nodes than the host has are placed, NUMA node id 0 is completely exhausted, and instances with a larger number of NUMA nodes (for example, equal to the host's NUMA node count) then fail to fit on this host. To mitigate this issue, NUMA node memory usage should be taken into account.

May be related also to https://bugs.launchpad.net/nova/+bug/1738501

Steps to reproduce
==================
1. Configure OpenStack to use 2MB huge pages and allocate huge pages on a compute host (let's say compute 1) during boot. For Ussuri this is described at https://docs.openstack.org/nova/ussuri/admin/huge-pages.html
2. Prepare two flavors to test the issue: one with hw:mem_page_size='2MB' and hw:numa_nodes='1', and a second with hw:mem_page_size='2MB' and hw:numa_nodes='N', where N is the number of NUMA nodes on the compute host used for testing. The compute host should have more than one NUMA node. The first flavor's RAM should be large enough to exhaust compute NUMA node 0 RAM with a small number of instances; let's say 6 instances of flavor 1 exhaust compute NUMA node 0 RAM. Flavor 2 RAM should equal flavor 1 RAM multiplied by N (the number of NUMA nodes on compute 1).
3. Start 6 instances of the first flavor (with 1 NUMA node defined) on compute 1 (with an availability zone hint pointing to compute 1). RAM of NUMA node 0 on compute 1 will be exhausted.
4. Try to start an instance of the second flavor. The instance will not start, failing with "...was re-scheduled: Insufficient compute resources: Requested instance NUMA topology cannot fit the given host NUMA topology".

How it should work:
===================
Memory usage of NUMA nodes should be taken into account to reduce the number of errors of this kind.
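The failure mode in the reproduction steps above can be illustrated with a small, self-contained simulation of id-ordered greedy placement (all numbers and names here are hypothetical illustrations, not nova code):

```python
# Hypothetical illustration of the placement failure described above.
# Host: 2 NUMA nodes, each with 48 GiB of free 2MB-huge-page memory.
free = {0: 48, 1: 48}  # GiB of free huge-page memory per host NUMA node

def fit(request_gib_per_node):
    """Greedy fit that always tries host node 0 first (current behaviour:
    cells are considered in cell-id order). Returns the chosen host node
    per instance NUMA node, or None if the instance cannot fit."""
    nodes = sorted(free)  # candidate cells ordered by id: 0, 1, ...
    placement = []
    for need in request_gib_per_node:
        for n in nodes:
            if n not in placement and free[n] >= need:
                placement.append(n)
                break
        else:
            return None  # this instance NUMA node fits nowhere
    for n, need in zip(placement, request_gib_per_node):
        free[n] -= need
    return placement

# 6 single-node instances of flavor 1 (8 GiB each) all land on node 0.
for _ in range(6):
    assert fit([8]) == [0]

# Node 0 is now exhausted; a 2-node instance of flavor 2 (8 GiB per
# NUMA node) needs memory on two distinct nodes and fails, even though
# node 1 is still completely free.
print(free)         # {0: 0, 1: 48}
print(fit([8, 8]))  # None
```

This matches step 4: the big instance is rejected although the host as a whole has more than enough free huge pages.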
2021-08-20 14:44:17 Ilya Popov description (wording fix in step 2: "Compute should have node number more than 1" changed to "Compute should have NUMA node number more than 1"; otherwise the description is unchanged from the previous entry)
2021-08-20 20:10:11 Ilya Popov description (appended to "How it should work": "So it is needed to use first NUMA nodes with more free RAM available.", i.e. NUMA nodes with more free RAM should be tried first; otherwise the description is unchanged from the previous entry)
2021-08-23 11:38:39 Ilya Popov nova: assignee Ilya Popov (ilya-p)
2021-08-23 14:07:31 OpenStack Infra nova: status New In Progress
2022-02-01 07:27:37 Ilya Popov description (added two more possibly related bugs: https://bugs.launchpad.net/nova/+bug/1887377 and https://bugs.launchpad.net/nova/+bug/1893121; otherwise the description is unchanged from the previous entry)
2022-02-01 07:29:25 Ilya Popov description (punctuation fix: "May be related also to" changed to "May be related also to:"; otherwise the description is unchanged from the previous entry)
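The fix proposed in the final description (try NUMA nodes with more free RAM first) amounts to changing the candidate order from "by cell id" to "by free memory, descending". A minimal sketch of that ordering, using a hypothetical stand-in class rather than nova's real host-cell objects:

```python
from dataclasses import dataclass

@dataclass
class Cell:
    # Hypothetical stand-in for a host NUMA cell; not nova's real class.
    id: int
    free_memory: int  # MiB of free (huge-page) memory on this node

def order_cells(cells):
    """Sort candidate host cells by free memory, descending, so that the
    least-used NUMA nodes are tried first; cell id breaks ties to keep
    the ordering deterministic."""
    return sorted(cells, key=lambda c: (-c.free_memory, c.id))

cells = [Cell(0, 0), Cell(1, 49152), Cell(2, 16384)]
print([c.id for c in order_cells(cells)])  # [1, 2, 0]
```

With this ordering, the single-node instances from the reproduction steps would be spread across nodes instead of stacking on node 0, leaving room for instances that span every host NUMA node.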
2022-10-24 12:42:39 Bartosz Bezak bug added subscriber Bartosz Bezak