periodic master scen1 standalone fails/timeout 'manage firewall rules'
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
tripleo | Fix Released | Critical | Unassigned | | |
Bug Description
At [1][2] the periodic-
* 2022-08-04 23:28:27.803309 | primary | fatal: [undercloud]: FAILED! => {"changed": false, "msg": "async task did not complete within the requested time - 5700s"}
Digging a bit, in both cases it looks like the "Manage firewall rules" task takes over 40 minutes, causing the timeout.
From [1]
* 2022-08-04 13:37:31.522883 | fa163e5e-
2022-08-04 14:19:39.965488 | fa163e5e-
2022-08-04 14:19:39.966843 | fa163e5e-
From [2]
* 2022-08-04 21:49:42.321436 | fa163ece-
2022-08-04 22:31:48.302603 | fa163ece-
2022-08-04 22:31:48.304079 | fa163ece-
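The "over 40 mins" figure can be checked from the first and last timestamps of each excerpt above. A minimal sketch, assuming only the timestamps quoted in this report (the helper name `elapsed_seconds` is made up for illustration and is not part of any tripleo tooling):

```python
from datetime import datetime

# Timestamp layout used in the job output quoted above
FMT = "%Y-%m-%d %H:%M:%S.%f"


def elapsed_seconds(start: str, end: str) -> float:
    """Return the number of seconds between two log timestamps."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return delta.total_seconds()


# First and last timestamps copied from the excerpts for [1] and [2]
runs = {
    "[1]": ("2022-08-04 13:37:31.522883", "2022-08-04 14:19:39.965488"),
    "[2]": ("2022-08-04 21:49:42.321436", "2022-08-04 22:31:48.302603"),
}

for ref, (start, end) in runs.items():
    secs = elapsed_seconds(start, end)
    print(f"{ref}: {secs:.0f}s (~{secs / 60:.1f} min)")
```

Both runs come out at roughly 42 minutes spent between the first and last quoted timestamps.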
This is a promotion blocker for the master centos9 integration line
[1] https:/
[2] https:/
[3] https:/
Changed in tripleo: | |
status: | Triaged → Fix Released |
tags: | removed: promotion-blocker |
I've quickly checked the host logs[1].
We can indeed see the first ansible-ansible.builtin.iptables call at 17:37:31, and the last one at 18:19:39. There are a lot of things in between, but I don't think those unrelated actions have any impact on the rule application.
AFAIK, here are the changes related to the firewall:
- tripleo_iptables action allows filtering out "nft_"-prefixed parameters: https://review.opendev.org/c/openstack/tripleo-ansible/+/850221
This is needed in order to allow some better rules within nftables
- actually accept connections on lo and dhcpv6: https://review.opendev.org/c/openstack/tripleo-ansible/+/850620
While this doesn't have any actual effect on the way we manage iptables, it's needed for nftables because it drops packets via a chain policy instead of the (terrible) "DROP" rule that affects only the NEW states.
- better logging for nftables: https://review.opendev.org/c/openstack/tripleo-ansible/+/850222
This only affects nftables based deploy
So I don't think any of the mentioned patches has an actual impact on the time it takes to apply the rules :(. Maybe an issue with the VM resources?
[1] https://logserver.rdoproject.org/openstack-periodic-integration-main/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-9-scenario001-standalone-master/2a01470/logs/undercloud/var/log/extra/journal.txt.gz