Server group anti-affinity no longer works

Bug #1863190 reported by Michael Johnson
Affects: OpenStack Compute (nova)
Status: Invalid
Importance: Undecided
Assigned to: Unassigned

Bug Description

Server group anti-affinity is no longer working, at least in the simple case. I am able to boot two VMs in an anti-affinity server group on a devstack that has only one compute host. Previously this would fail, or would require soft-anti-affinity to be enabled.

$ openstack host list
+-----------+-----------+----------+
| Host Name | Service   | Zone     |
+-----------+-----------+----------+
| devstack2 | scheduler | internal |
| devstack2 | conductor | internal |
| devstack2 | conductor | internal |
| devstack2 | compute   | nova     |
+-----------+-----------+----------+

$ openstack compute service list
+----+----------------+-----------+----------+---------+-------+----------------------------+
| ID | Binary         | Host      | Zone     | Status  | State | Updated At                 |
+----+----------------+-----------+----------+---------+-------+----------------------------+
| 3  | nova-scheduler | devstack2 | internal | enabled | up    | 2020-02-14T00:59:15.000000 |
| 6  | nova-conductor | devstack2 | internal | enabled | up    | 2020-02-14T00:59:16.000000 |
| 1  | nova-conductor | devstack2 | internal | enabled | up    | 2020-02-14T00:59:19.000000 |
| 3  | nova-compute   | devstack2 | nova     | enabled | up    | 2020-02-14T00:59:17.000000 |
+----+----------------+-----------+----------+---------+-------+----------------------------+

$ openstack server list
+--------------------------------------+----------------------------------------------+--------+-----------------------------------------------+---------------------+------------+
| ID                                   | Name                                         | Status | Networks                                      | Image               | Flavor     |
+--------------------------------------+----------------------------------------------+--------+-----------------------------------------------+---------------------+------------+
| a44febef-330c-4db5-b220-959cbbff8f8c | amphora-1bc97ddb-80da-446a-bce3-0c867c1fc258 | ACTIVE | lb-mgmt-net=192.168.0.58; public=172.24.4.200 | amphora-x64-haproxy | m1.amphora |
| de776347-0cf4-47d5-bb37-17fb37d79f2e | amphora-433abe98-fd8e-4e4f-ac11-4f76bbfc7aba | ACTIVE | lb-mgmt-net=192.168.0.199; public=172.24.4.11 | amphora-x64-haproxy | m1.amphora |
+--------------------------------------+----------------------------------------------+--------+-----------------------------------------------+---------------------+------------+

$ openstack server group show ddbc8544-c664-4da4-8fd8-32f6bd01e960
+----------+----------------------------------------------------------------------------+
| Field    | Value                                                                      |
+----------+----------------------------------------------------------------------------+
| id       | ddbc8544-c664-4da4-8fd8-32f6bd01e960                                       |
| members  | a44febef-330c-4db5-b220-959cbbff8f8c, de776347-0cf4-47d5-bb37-17fb37d79f2e |
| name     | octavia-lb-cc40d031-6ce9-475f-81b4-0a6096178834                            |
| policies | anti-affinity                                                              |
+----------+----------------------------------------------------------------------------+

Steps to reproduce:
1. Boot a devstack.
2. Create an anti-affinity server group.
3. Boot two VMs in that server group before the first reaches ACTIVE (a sketch follows this list).
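
A minimal sketch of the parallel boot, assuming a cirros image, the m1.nano flavor, and placeholder group/network IDs (all illustrative; substitute your own):

$ openstack server group create --policy anti-affinity repro-group
$ # Background both creates so the second request is scheduled before
$ # the first instance reaches ACTIVE:
$ openstack server create --image cirros-0.4.0-x86_64-disk --flavor m1.nano \
    --hint group=<GROUP_ID> --nic net-id=<NET_ID> vm-one &
$ openstack server create --image cirros-0.4.0-x86_64-disk --flavor m1.nano \
    --hint group=<GROUP_ID> --nic net-id=<NET_ID> vm-two &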

Expected Behavior:

The second VM boot should fail with an error similar to "not enough hosts".

Actual Behavior:

The second VM boots with no error; the two instances in the server group are on the same host.

Environment:
Nova version (current Ussuri): 0d3aeb0287a0619695c9b9e17c2dec49099876a5
commit 0d3aeb0287a0619695c9b9e17c2dec49099876a5 (HEAD -> master, origin/master, origin/HEAD)
Merge: 1fcd74730d 65825ebfbd
Author: Zuul <email address hidden>
Date: Thu Feb 13 14:25:10 2020 +0000

    Merge "Make RBD imagebackend flatten method idempotent"

Fresh devstack install, however I have another devstack from August that is also showing this behavior.

Revision history for this message
Michael Johnson (johnsom) wrote :

devstack@n-* log files.

Revision history for this message
Michael Johnson (johnsom) wrote :

As was requested in IRC: if I wait until the first instance goes to ACTIVE, the second build will go to ERROR as expected (a sketch of that serialized flow follows).
This isn't a good workaround for us, as waiting for ACTIVE can take up to five minutes in some clouds, and we want to get the second instance started much faster than that.
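
A sketch of the serialized flow being described, with the same placeholder image/flavor/network values as the reproduction above:

$ openstack server create --image cirros-0.4.0-x86_64-disk --flavor m1.nano \
    --hint group=<GROUP_ID> --nic net-id=<NET_ID> one
$ # Wait for "one" to reach ACTIVE; by then it has a host assigned, so the
$ # scheduler's anti-affinity filter can exclude that host for "two":
$ while [ "$(openstack server show one -f value -c status)" != "ACTIVE" ]; do sleep 5; done
$ openstack server create --image cirros-0.4.0-x86_64-disk --flavor m1.nano \
    --hint group=<GROUP_ID> --nic net-id=<NET_ID> two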

This previously worked as expected, but I don't know exactly when it started allowing two instances on the same host when using "hard" anti-affinity.

Revision history for this message
Adam Harwell (adam-harwell) wrote :

What I see in my cloud is that one of the two will schedule and build, and the other will schedule, but fail to build with a rescheduling error:

```
{'message': 'Build of instance 417e19c2-e2a5-48e0-8ce5-0f087c5f6091 was re-scheduled: Anti-affinity instance group policy was violated.', 'code': 500, 'details': 'Traceback (most recent call last):\n File "/opt/openstack/venv/nova/lib/python2.7/site-packages/nova/compute/manager.py", line 1941, in _do_build_and_run_instance\n filter_properties, request_spec)\n File "/opt/openstack/venv/nova/lib/python2.7/site-packages/nova/compute/manager.py", line 2230, in _build_and_run_instance\n instance_uuid=instance.uuid, reason=six.text_type(e))\nRescheduledException: Build of instance 417e19c2-e2a5-48e0-8ce5-0f087c5f6091 was re-scheduled: Build of instance 417e19c2-e2a5-48e0-8ce5-0f087c5f6091 was re-scheduled: Anti-affinity instance group policy was violated.\n', 'created': '2020-02-21T03:43:18Z'}
```

This is with hard anti-affinity.
With soft anti-affinity, no reschedule would be forced, so the policy would simply never take effect.

Revision history for this message
melanie witt (melwitt) wrote :

Apologies for just now coming back to this -- it completely slipped my mind :(

The behavior Adam described is correct and expected in the "parallel requests for hard anti-affinity" scenario. The two requests race and initially land on the same compute host. One of them "wins"; the other fails what we call the "late affinity check" in nova-compute on the compute host, is rescheduled, and then fails if no other host is available.

Adam, do you recall what release version of nova you used when you did your test? Was it master/Ussuri or an older release?

Revision history for this message
Balazs Gibizer (balazs-gibizer) wrote :

The late affinity check that fixes the race is an upcall that is simply not possible with the default cellsv2 setup [1].

In a non-resource-constrained cloud there is another way to limit the possible race: you can set [filter_scheduler]/host_subset_size [2] to a value greater than 1 so that parallel scheduling requests are less likely to select the same host as the target (see the sketch after the references).

[1] https://docs.openstack.org/nova/latest/user/cellsv2-layout.html#operations-requiring-upcalls
[2] https://docs.openstack.org/nova/train/configuration/config.html#filter_scheduler.host_subset_size
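
A minimal nova.conf sketch of this mitigation (the value 5 is illustrative; size it to your host count):

[filter_scheduler]
# With a subset size N > 1 the scheduler picks randomly among the N
# best-weighed hosts instead of always taking the single top host, so
# parallel requests are less likely to pick the same target. Default: 1.
host_subset_size = 5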

Revision history for this message
melanie witt (melwitt) wrote :

The late affinity check does work in a single-cell cellsv2 setup: [workarounds]/disable_group_policy_check_upcall defaults to False, and the check works as long as nova-scheduler and nova-compute are on the same message queue. The single-cell cellsv2 setup is the most common deployment and is what devstack uses.

In a multi-cell cellsv2 setup though, it is true that the late affinity check is not possible regardless of the [workarounds]/disable_group_policy_check_upcall config option setting because nova-scheduler and nova-compute would not be connected to the same message queue.

So, I think we still need to investigate what is going on and verify whether/how a regression has occurred.

Revision history for this message
melanie witt (melwitt) wrote :

I finally got a chance to try and reproduce this on a devstack and can now see what you have reported.

$ git log -1
commit e20e731630c1b337daf4446286bb6c8e761025e3 (HEAD -> master, origin/master, origin/HEAD)
Merge: fc159ac91b 998475f5bd
Author: Zuul <email address hidden>
Date: Wed Mar 11 19:03:28 2020 +0000

    Merge "nova-net: Remove unused nova-network objects"

I created a server group with anti-affinity policy and booted two servers at the same time in separate terminal windows. (Note that this depends on your timing -- if you are "too slow" you will see one go to ERROR with "No valid host" and the other to ACTIVE because the affinity check at the scheduler will catch it).

$ openstack server group create --policy anti-affinity anti-affinity
$ openstack server group list
+--------------------------------------+---------------+---------------+
| ID                                   | Name          | Policies      |
+--------------------------------------+---------------+---------------+
| 85f266b7-3fe9-492a-b80b-7c74b7ea1a73 | anti-affinity | anti-affinity |
+--------------------------------------+---------------+---------------+
$ openstack server create --image 50668455-013d-4daf-80b3-dc2ae225663f --flavor 42 --hint group=85f266b7-3fe9-492a-b80b-7c74b7ea1a73 --nic net-id=01524af1-e35d-4ed9-a411-1fee224fb07c one
$ openstack server create --image 50668455-013d-4daf-80b3-dc2ae225663f --flavor 42 --hint group=85f266b7-3fe9-492a-b80b-7c74b7ea1a73 --nic net-id=01524af1-e35d-4ed9-a411-1fee224fb07c two
$ openstack server list
+--------------------------------------+------+--------+------------------------+--------------------------+---------+
| ID                                   | Name | Status | Networks               | Image                    | Flavor  |
+--------------------------------------+------+--------+------------------------+--------------------------+---------+
| 8dd746d3-6201-4a3b-a3b1-71c854ff8721 | one  | ACTIVE | shared=192.168.233.168 | cirros-0.4.0-x86_64-disk | m1.nano |
| 6fe23607-0875-498f-b52b-50c910bc1b61 | two  | ACTIVE | shared=192.168.233.240 | cirros-0.4.0-x86_64-disk | m1.nano |
+--------------------------------------+------+--------+------------------------+--------------------------+---------+

BUT then I noticed in the /etc/nova/nova-cpu.conf:

[workarounds]
disable_group_policy_check_upcall = True

This will disable the late affinity check (which preserves affinity policy enforcement in the case of racing parallel requests) in nova-compute.

The default value is False [1] but it is set to True in devstack in the gate [2] because the gate is configured to exercise the multiple cell service topology [3] and run with a "superconductor". With multiple cells with each cell using their own separate message queue, the late affinity check can't work.

But in devstack there is only a single message queue, so it is possible to use [workarounds]disable_group_policy_check_upcall = False in /etc/nova/nova-cpu.conf. You will want to set this if you want affinity races to be handled, as sketched below.
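
A sketch of that setting in /etc/nova/nova-cpu.conf (restart nova-compute afterwards for it to take effect):

[workarounds]
# False (the default) lets nova-compute perform the late affinity check
# and reschedule an instance that violates the server group policy.
disable_group_policy_check_upcall = False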

When using [workarounds]disable_group_policy_check_upcall = False with a multi-tier conductor setup, you'll also need to set t...


Revision history for this message
melanie witt (melwitt) wrote :

I must correct parts of my earlier comment 7:

> With multiple cells with each cell using their own separate message queue, the late affinity check can't work.

> When running with multiple cells, it presently is not possible to enforce affinity policy properly in a race situation. To support this, affinity support needs to be implemented in the placement service.

This is incorrect. It is possible to enforce affinity policy in a race situation with multiple cells if cell conductors are configured to set [api_database]connection and computes do not set [workarounds]disable_group_policy_check_upcall (a configuration sketch follows).
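
A sketch of that configuration; the connection URL is purely illustrative (host, credentials, and database name are assumptions):

# In each cell conductor's nova.conf:
[api_database]
# Giving the cell conductor access to the API database enables the
# upcall that the late affinity check relies on.
connection = mysql+pymysql://nova:CELL_DB_PASS@controller/nova_api

# And in each nova-compute's config, leave the workaround at its default:
[workarounds]
disable_group_policy_check_upcall = False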

This is not an ideal configuration with multiple cells, however, as cells are meant to be isolated from the upper layers of the deployment. That is why we will need affinity support to be added to the placement service: to be able to fully enforce affinity policies without needing to configure cell conductors to access the API database.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to nova (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/715092

Revision history for this message
melanie witt (melwitt) wrote :

I've proposed a doc update ^ related to this bug report.

Closing this as Invalid because server group affinity has not regressed, as explained in comment 7 and comment 8.

Changed in nova:
status: New → Invalid
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to nova (master)

Reviewed: https://review.opendev.org/715092
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=df216de6d9b195782be3cfc2d51296f3c4442b54
Submitter: Zuul
Branch: master

commit df216de6d9b195782be3cfc2d51296f3c4442b54
Author: melanie witt <email address hidden>
Date: Wed Mar 25 23:02:42 2020 +0000

    Add info about affinity requests to the troubleshooting doc

    We had recent bug report about a possible regression related to
    affinity policy enforcement with parallel server create requests.

    It turned out not to be a regression but because of the complexity
    around affinity enforcement, it might help to add a section to the
    compute troubleshooting doc about it which we could refer to in the
    future.

    Related-Bug: #1863190

    Change-Id: I508c48183a7205d46e13154d4e92d31dfa7f7d78
