"Unable to create the network. No available network found in maximum allowed attempts." during rally stress test

Bug #1940073 reported by Krzysztof Klimonda
This bug affects 1 person
Affects: neutron
Status: Fix Released
Importance: Medium
Assigned to: Rodolfo Alonso
Milestone: (none listed)

Bug Description

When running the rally scenario NeutronNetworks.create_and_delete_networks with a concurrency of 60, the following error is observed:

--8<--8<--8<--
2021-08-16 11:28:41.526 710 ERROR oslo_db.api [req-61e1d9da-1bad-4410-94ce-d2945c13a2d5 05971ba84eac4b8eb176bd935909f9d0 03904310315c47c7b33178da2bfc99a2 - default default] DB exceeded retry limit.: oslo_db.exception.RetryRequest: Unable to create the network. No available network found in maximum allowed attempts.
2021-08-16 11:28:41.526 710 ERROR oslo_db.api Traceback (most recent call last):
2021-08-16 11:28:41.526 710 ERROR oslo_db.api File "/var/lib/kolla/venv/lib/python3.6/site-packages/oslo_db/api.py", line 142, in wrapper
2021-08-16 11:28:41.526 710 ERROR oslo_db.api return f(*args, **kwargs)
2021-08-16 11:28:41.526 710 ERROR oslo_db.api File "/var/lib/kolla/venv/lib/python3.6/site-packages/neutron_lib/db/api.py", line 183, in wrapped
2021-08-16 11:28:41.526 710 ERROR oslo_db.api LOG.debug("Retry wrapper got retriable exception: %s", e)
2021-08-16 11:28:41.526 710 ERROR oslo_db.api File "/var/lib/kolla/venv/lib/python3.6/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2021-08-16 11:28:41.526 710 ERROR oslo_db.api self.force_reraise()
2021-08-16 11:28:41.526 710 ERROR oslo_db.api File "/var/lib/kolla/venv/lib/python3.6/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2021-08-16 11:28:41.526 710 ERROR oslo_db.api six.reraise(self.type_, self.value, self.tb)
2021-08-16 11:28:41.526 710 ERROR oslo_db.api File "/var/lib/kolla/venv/lib/python3.6/site-packages/six.py", line 703, in reraise
2021-08-16 11:28:41.526 710 ERROR oslo_db.api raise value
2021-08-16 11:28:41.526 710 ERROR oslo_db.api File "/var/lib/kolla/venv/lib/python3.6/site-packages/neutron_lib/db/api.py", line 179, in wrapped
2021-08-16 11:28:41.526 710 ERROR oslo_db.api return f(*dup_args, **dup_kwargs)
2021-08-16 11:28:41.526 710 ERROR oslo_db.api File "/var/lib/kolla/venv/lib/python3.6/site-packages/neutron/plugins/ml2/plugin.py", line 1053, in create_network
2021-08-16 11:28:41.526 710 ERROR oslo_db.api result, mech_context = self._create_network_db(context, network)
2021-08-16 11:28:41.526 710 ERROR oslo_db.api File "/var/lib/kolla/venv/lib/python3.6/site-packages/neutron/plugins/ml2/plugin.py", line 1012, in _create_network_db
2021-08-16 11:28:41.526 710 ERROR oslo_db.api tenant_id)
2021-08-16 11:28:41.526 710 ERROR oslo_db.api File "/var/lib/kolla/venv/lib/python3.6/site-packages/neutron/plugins/ml2/managers.py", line 226, in create_network_segments
2021-08-16 11:28:41.526 710 ERROR oslo_db.api context, filters=filters)
2021-08-16 11:28:41.526 710 ERROR oslo_db.api File "/var/lib/kolla/venv/lib/python3.6/site-packages/neutron/plugins/ml2/managers.py", line 312, in _allocate_tenant_net_segment
2021-08-16 11:28:41.526 710 ERROR oslo_db.api segment = self._allocate_segment(context, network_type, filters)
2021-08-16 11:28:41.526 710 ERROR oslo_db.api File "/var/lib/kolla/venv/lib/python3.6/site-packages/neutron/plugins/ml2/managers.py", line 308, in _allocate_segment
2021-08-16 11:28:41.526 710 ERROR oslo_db.api return driver.obj.allocate_tenant_segment(context, filters)
2021-08-16 11:28:41.526 710 ERROR oslo_db.api File "/var/lib/kolla/venv/lib/python3.6/site-packages/neutron/plugins/ml2/drivers/type_tunnel.py", line 391, in allocate_tenant_segment
2021-08-16 11:28:41.526 710 ERROR oslo_db.api alloc = self.allocate_partially_specified_segment(context, **filters)
2021-08-16 11:28:41.526 710 ERROR oslo_db.api File "/var/lib/kolla/venv/lib/python3.6/site-packages/neutron/plugins/ml2/drivers/helpers.py", line 153, in allocate_partially_specified_segment
2021-08-16 11:28:41.526 710 ERROR oslo_db.api exceptions.NoNetworkFoundInMaximumAllowedAttempts())
2021-08-16 11:28:41.526 710 ERROR oslo_db.api oslo_db.exception.RetryRequest: Unable to create the network. No available network found in maximum allowed attempts.
2021-08-16 11:28:41.526 710 ERROR oslo_db.api
--8<--8<--8<--

Slawek Kaplonski (slaweq) wrote :

Did you get a 500 error in that case, or something similar?
I think we should fix the handling of that error, but if you want your tests to run properly, you will probably also need to adjust the range of available VLANs in your Neutron configuration.

Changed in neutron:
status: New → Confirmed
importance: Undecided → Medium
tags: added: api
Krzysztof Klimonda (kklimonda) wrote :

Yes, that error translates into a 503 response. I've already adjusted my Neutron configuration for OVN Geneve tunnels, like this:

[ml2_type_geneve]
vni_ranges = 1001:10000
max_header_size = 38

# SELECT COUNT(*) FROM ml2_geneve_allocations;
9000

Also, while the test is running, the result of the following query never grows out of control:

# SELECT COUNT(*) FROM ml2_geneve_allocations WHERE allocated = 1;
~66

I'm assuming that's because networks are constantly created and deleted on each iteration.
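
As a quick sanity check on the numbers above, here is a minimal sketch that computes how many VNIs a vni_ranges value covers and what fraction is in use. The helper names count_vnis and utilisation are hypothetical (they are not part of Neutron), and the sketch assumes the ranges are inclusive on both ends, which matches 1001:10000 yielding the 9000 rows reported by the first query.

```python
def count_vnis(vni_ranges: str) -> int:
    """Count the VNIs covered by a comma-separated list of min:max ranges.

    Assumes both ends of each range are inclusive.
    """
    total = 0
    for entry in vni_ranges.split(","):
        low, high = (int(part) for part in entry.strip().split(":"))
        total += high - low + 1
    return total


def utilisation(allocated: int, vni_ranges: str) -> float:
    """Fraction of the configured VNIs that are currently allocated."""
    return allocated / count_vnis(vni_ranges)


if __name__ == "__main__":
    print(count_vnis("1001:10000"))                   # 9000
    print(f"{utilisation(66, '1001:10000'):.2%}")     # share of VNIs in use
```

With roughly 66 allocated rows out of 9000, the pool is far from exhausted, which suggests the failure comes from concurrent requests competing for the same rows rather than from running out of VNIs.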

tags: added: low-hanging-fruit
Rodolfo Alonso (rodolfo-alonso-hernandez) wrote :

Hello Krzysztof:

What version of Neutron are you running? Please check if you have [1] in your environment.

Regards.

[1] https://review.opendev.org/q/I953062d9ee8ee5ee9a9f07aff4a8222ac63ed525

Krzysztof Klimonda (kklimonda) wrote :

That's the stable/ussuri branch from late July, and yes, I already have this patch applied.

Rodolfo Alonso (rodolfo-alonso-hernandez) wrote :

Hello Krzysztof:

If you are not using the network segment range plugin, this is quite strange, because the method "get_random_unallocated_segment" returns a random segment ID from the 9000 you have. When several network creations run in parallel, it is very unlikely that those operations clash by picking the same VNI.

If you are using the network segment range plugin, check the IDs assigned per segment.

In any case, it could be useful to add a debug log after [1] in your environment. If the segment is not allocated, it would help to see which segment ID the method tried to use to create this segment. It should be a random VNI from the non-allocated records in "ml2_geneve_allocations". You can also query the DB to print the "ml2_geneve_allocations" record with this VNI.

Regards.

[1] https://github.com/openstack/neutron/blob/c32a5f2192af66b833fa1cac12af07ed19ad9ef2/neutron/plugins/ml2/drivers/helpers.py#L152-L156

Krzysztof Klimonda (kklimonda) wrote :

I'm testing with stable/ussuri, which doesn't have get_random_unallocated_segment; the logged error shows that Neutron is trying to use consecutive VNI numbers.

It seems https://review.opendev.org/c/openstack/neutron/+/804999 was never merged into stable/ussuri; the backport got stuck trying to pass the Zuul tests.

Rodolfo Alonso (rodolfo-alonso-hernandez) wrote :

Hello Krzysztof:

Now that [1] is merged (back to Train), can you run your tests again? This patch should improve the situation, as initially reported in [2]. If the segment number is selected randomly, it is unlikely that two concurrent requests try to allocate the same VNI. The larger the segment range, the less often API requests clash.

Regards.

[1] https://review.opendev.org/q/Id3f71611a00e69c4f22340ca4d05d95e4373cf69
[2] https://bugs.launchpad.net/neutron/+bug/1920923
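
To illustrate why random selection helps, here is a rough model (not Neutron code) comparing the two selection strategies discussed in this thread. It assumes every concurrent request reads the same snapshot of free segments before any of them commits; the function names are hypothetical. With sequential selection all requests target the same first free row, so all but one lose; with random selection the losers are rare, and the chance that any pair clashes at all follows the birthday problem.

```python
import random


def clash_probability(free_segments: int, concurrent_requests: int) -> float:
    """Probability that at least two requests pick the same segment,
    when each request picks uniformly at random (birthday problem)."""
    if concurrent_requests > free_segments:
        return 1.0
    p_all_unique = 1.0
    for i in range(concurrent_requests):
        p_all_unique *= (free_segments - i) / free_segments
    return 1.0 - p_all_unique


def simulate_round(free_segments: int, concurrent_requests: int,
                   randomize: bool, rng: random.Random) -> int:
    """Return how many requests lose the race in one round.

    A request loses when another request in the same round picked the
    same segment; only one request per segment can win.
    """
    if randomize:
        picks = [rng.randrange(free_segments) for _ in range(concurrent_requests)]
    else:
        # Sequential selection: everyone sees the same first free segment.
        picks = [0] * concurrent_requests
    return len(picks) - len(set(picks))


if __name__ == "__main__":
    rng = random.Random(0)
    print("sequential losers:", simulate_round(9000, 60, False, rng))  # 59
    print("random losers:    ", simulate_round(9000, 60, True, rng))
    print("P(any clash), random:", round(clash_probability(9000, 60), 3))
```

Under this model, with 9000 free VNIs and 60 concurrent requests, the probability that at least one pair clashes is a bit under 0.2 per round, but a clash affects only the requests involved and is resolved by a retry. With sequential selection, 59 of 60 requests must retry in every round, which makes exhausting the retry limit far more likely.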

Changed in neutron:
status: Confirmed → Fix Committed
assignee: nobody → Rodolfo Alonso (rodolfo-alonso-hernandez)
Changed in neutron:
status: Fix Committed → Fix Released