Ansible may run more handlers than originally planned

Bug #1863510 reported by Radosław Piliszek
36
This bug affects 6 people
Affects Status Importance Assigned to Milestone
kolla-ansible
Triaged
Medium
Radosław Piliszek
Train
Triaged
Medium
Unassigned
Ussuri
Triaged
Medium
Unassigned
Victoria
Triaged
Medium
Radosław Piliszek

Bug Description

For example in check-containers.yml there is a task which checks container parameters. Supposedly it notifies only the restart handler for the service which has any changes. However, this is not the case - all containers on such a host are restarted when one has a differing config (limited to that role). It seems loops with "notify" notify all the handlers when at least one item "changed".

bug in Ansible: https://github.com/ansible/ansible/issues/22579

Identified places:
- check_containers
- copy config.json

Revision history for this message
Radosław Piliszek (yoctozepto) wrote :

Aaand it seems it is a known issue in Ansible :/ https://github.com/ansible/ansible/issues/22579

Changed in kolla-ansible:
status: New → Triaged
importance: Undecided → Medium
Revision history for this message
Radosław Piliszek (yoctozepto) wrote :

We could work around this by using include_tasks in loop as it lifts most of those pesky ansible quirks related to looped tasks...

description: updated
Revision history for this message
Mark Goddard (mgoddard) wrote :

I was certain that this worked, but I just tested and it does not :(

Revision history for this message
Mark Goddard (mgoddard) wrote :

This breaks quite a lot of our configuration and container checking code. I guess this is only a problem since we (I) removed all the handler conditional logic.

Revision history for this message
Radosław Piliszek (yoctozepto) wrote :

Could be; maybe some of these deserve discussion during meeting?

Revision history for this message
Radosław Piliszek (yoctozepto) wrote :

landed with Train's: https://review.opendev.org/647699 [Simplify handler conditionals]

Revision history for this message
Mark Goddard (mgoddard) wrote :

Introduced by https://review.opendev.org/647699, which landed in Train.

Revision history for this message
Radosław Piliszek (yoctozepto) wrote :

Some points to remember (as discussed during meeting):
1) the issue is in looping with notify - don't rely on it solely
2) there could be performance issues with include_tasks at scale
3) there is also an issue that interrupted deploy/reconfigure may *not* restart containers if it failed between config update and planned restart (due to no change happening later)
  3a) we could do some touch dance
  3b) "kolla_set_configs --check" or similar
4) at some point in time we might want to switch to podman which relies on systemd (no idea how it affects this but nice to remember)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to kolla-ansible (master)

Fix proposed to branch: master
Review: https://review.opendev.org/723342

Changed in kolla-ansible:
assignee: nobody → Radosław Piliszek (yoctozepto)
status: Triaged → In Progress
description: updated
Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Fix proposed to branch: master
Review: https://review.opendev.org/744207

Changed in kolla-ansible:
assignee: nobody → Radosław Piliszek (yoctozepto)
status: Triaged → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on kolla-ansible (master)

Change abandoned by Radosław Piliszek (<email address hidden>) on branch: master
Review: https://review.opendev.org/744207
Reason: superseded by https://github.com/ansible/ansible/pull/71118/files and https://review.opendev.org/745164

Revision history for this message
Radosław Piliszek (yoctozepto) wrote :
Revision history for this message
Michal Arbet (michalarbet) wrote :

Shouldn't we put a little pressure on ansible dev and start another one discussion about this ?
For me it sounds very funny to reimplement all in kolla while it's only needed to patch one line in ansible.

Revision history for this message
Michal Arbet (michalarbet) wrote :

Well, not everything, but definitely a bigger part than one line

Revision history for this message
Radosław Piliszek (yoctozepto) wrote :

As far as I am concerned, feel welcome to discuss this further with the Ansible team. We gave up because we don't have time to invest in learning all Ansible internals to prove our reasoning is right.

Mark Goddard (mgoddard)
Changed in kolla-ansible:
status: In Progress → Triaged
Revision history for this message
dalekseev (dalekseev) wrote :

so why don't we use workaround in kolla-ansible if it didn't work out with ansible guys anyway ?

Revision history for this message
Radosław Piliszek (yoctozepto) wrote :

Lack of time to finish the implementation. It is just still pending...

Revision history for this message
Roman Krček (r-krcek) wrote :

Hello, is this issue/optimalization still relevant? I would like to take a crack at it. :)

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Duplicates of this bug

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.