[fullstack] Race condition when updating the router port information and updating the network MTU
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
neutron | Expired | Medium | Unassigned |
Bug Description
In [1] we can see a race condition in the L3 agent between the processing of a router port and the processing of an update to the network this port belongs to. Ordered list of events:
1) In [2] 03:05:13.563: The router starts the updating process.
Starting router update for bc221d89-
3, priority 1, update_id 0ce8cd8d-
This process is asynchronous.
2) In [2] 03:05:22.318: BaseRouterInfo.
adding internal network: prefix(qr-), port(
d15ca83e-
3) In [3] 03:05:23.348: The network MTU is updated.
Request body: {'network': {'mtu': 1499}} prepare_
This event is received and processed in the L3 agent in L3NATAgent.
4) In [2] 03:05:26.671: The port d15ca83e-
appending port {'id': 'd15ca83e-
'mtu': 1500} to internal_ports cache
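The ordering above can be replayed with a small toy model. This is only a sketch of the suspected interleaving, not the L3 agent code: `ToyRouterInfo`, `network_update`, `finish_port_add` and the IDs `port-1` and `net-1` are hypothetical names made up for illustration, and the assumption that the network update handler only refreshes ports already present in the cache is mine, not something confirmed by the logs.

```python
# Toy replay of the event ordering from the bug description.
# All names here are hypothetical; none of them are Neutron APIs.


class ToyRouterInfo:
    """Holds a cache of internal ports, loosely like 'internal_ports'."""

    def __init__(self):
        self.internal_ports = []

    def network_update(self, network_id, mtu):
        # Assumption: the handler only refreshes ports already in the cache.
        for port in self.internal_ports:
            if port["network_id"] == network_id:
                port["mtu"] = mtu

    def finish_port_add(self, port_snapshot):
        # The snapshot was taken when the router update started.
        self.internal_ports.append(port_snapshot)


def replay_events():
    ri = ToyRouterInfo()

    # Events 1 and 2: the router update starts and the port is read
    # while the network MTU is still 1500.
    snapshot = {"id": "port-1", "network_id": "net-1", "mtu": 1500}

    # Event 3: the network MTU update arrives. The cache is still empty,
    # so there is nothing to refresh.
    ri.network_update("net-1", 1499)

    # Event 4: the stale snapshot is appended to the cache.
    ri.finish_port_add(snapshot)
    return ri.internal_ports


ports = replay_events()
print(ports[0]["mtu"])  # 1500, although the network MTU is now 1499
```

If events 3 and 4 are swapped in this model, the cached port ends up with MTU 1499, which is why the failure depends on the timing of the two updates.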
LOGS:
[1] https:/
[2] L3 agent: https:/
[3] Neutron server: https:/
description: updated
Changed in neutron:
assignee: nobody → Rodolfo Alonso (rodolfo-alonso-hernandez)
tags: added: gate-failure
tags: added: fullstack
Changed in neutron:
assignee: Rodolfo Alonso (rodolfo-alonso-hernandez) → nobody
So the error is that router-interface-add is not fully completed when the mtu-update already arrives, right?
According to logstash this seems to be rare in the gate (zero hits in the last 10 days, the linked logs are older than that). In real environments we'd need router-interface-adds and mtu-updates for the same network in close proximity to each other, which also sounds rare. So I'm setting the importance to low, but clearly we have a bug here.