Handlers are not delegated properly for periodic nova db archiving

Bug #2034583 reported by Damian Dąbrowski
This bug affects 2 people
Affects: OpenStack-Ansible
Status: Fix Released
Importance: Critical
Assigned to: Dmitriy Rabotyagov

Bug Description

Problem description:

In [1] we implemented a feature that allows users to periodically trim old records from the nova database.
Unfortunately, when this feature is enabled, the handler from the systemd_service role is not delegated to the first nova_conductor host; it is executed on a compute node instead.
The timer is not expected to exist on a compute node, so the playbook fails with an error:

```
RUNNING HANDLER [systemd_service : Restart service {{ services_results.item.service_name | replace(' ', '_') }}] ****
failed: [aio1] (item=) => {"ansible_loop_var": "template_argument", "changed": false, "msg": "Could not find the requested service nova-archive-deleted.timer: host", "template_argument": ""}
```
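
The behaviour can be shown in isolation with a minimal sketch like the one below. The host names `compute01` and `conductor01` and the handler name are hypothetical, and the playbook is not taken from os_nova. It only assumes, as this report describes, that a handler notified by a delegated task runs on the play host rather than on the delegate.

```yaml
# Sketch only: illustrates where a handler runs after a delegated task.
- name: Illustrate handler placement after a delegated task
  hosts: compute01                # hypothetical compute host
  gather_facts: false
  tasks:
    - name: Pretend to deploy a timer on the conductor
      ansible.builtin.command: hostname
      delegate_to: conductor01    # hypothetical conductor host
      changed_when: true          # force "changed" so the handler is notified
      notify: Report handler host

  handlers:
    - name: Report handler host
      # No delegate_to here, so nothing tells Ansible to run this on conductor01.
      ansible.builtin.command: hostname
      changed_when: false
```

Running this with `-v` prints the host name returned by each command; per this report, the handler would be expected to report the compute host, not the conductor.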

Steps to reproduce:

1. Define `nova_archive_deleted: True` in the OSA config (it is not enabled by default)
2. Execute `openstack-ansible /opt/openstack-ansible/playbooks/os-nova-install.yml`
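
Step 1 could look like the fragment below. The file path is an assumption (the usual location for OSA overrides); only the variable name comes from this report.

```yaml
# /etc/openstack_deploy/user_variables.yml  (assumed override file)
# Enable the periodic nova db archiving feature described above.
nova_archive_deleted: true
```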

Proposed solution:

So far I haven't been able to find any solution.

[1] https://opendev.org/openstack/openstack-ansible-os_nova/commit/efe64725e177649ae9fe9624c51601d76fea2438

description: updated
summary: - Handlers not delegated properly for periodic nova db archiving
+ Handlers are not delegated properly for periodic nova db archiving
Revision history for this message
Shanjiayi (odeshen520) wrote :

In ansible-role-systemd_service/tasks/main.yml, the handler is notified at line 88. The nova-archive-deleted.timer unit had not been generated by the time the service was restarted.

Revision history for this message
Dmitriy Rabotyagov (noonedeadpunk) wrote :

There are more issues with the current behaviour. For example, a compute node might try to execute DB migrations:

TASK [os_nova : Perform online data migrations] ********************************************************************************************************************************************************************************************
fatal: [compute02 -> {{ nova_conductor_setup_host }}]: FAILED! => {"msg": "The conditional check 'hostvars[nova_conductor_setup_host]['ansible_local']['openstack_ansible']['nova']['need_online_data_migrations'] | bool' failed. The error was: error while evaluating conditional (hostvars[nova_conductor_setup_host]['ansible_local']['openstack_ansible']['nova']['need_online_data_migrations'] | bool): 'dict object' has no attribute 'openstack_ansible'\n\nThe error appears to be in '/etc/ansible/roles/os_nova/tasks/nova_db_post_setup.yml': line 34, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n# It should be considered successfully completed only when the exit status is 0.\n- name: Perform online data migrations\n ^ here\n"}

And it fails since facts are not delegated in this scenario.

Changed in openstack-ansible:
status: New → Triaged
importance: Undecided → Critical
assignee: nobody → Dmitriy Rabotyagov (noonedeadpunk)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to openstack-ansible (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/openstack-ansible/+/897568

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to openstack-ansible (master)
Changed in openstack-ansible:
status: Triaged → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to openstack-ansible (master)

Reviewed: https://review.opendev.org/c/openstack/openstack-ansible/+/897568
Committed: https://opendev.org/openstack/openstack-ansible/commit/61ea7a82073ea5886ad7f27f9dbcf6456233ab67
Submitter: "Zuul (22348)"
Branch: master

commit 61ea7a82073ea5886ad7f27f9dbcf6456233ab67
Author: Dmitriy Rabotyagov <email address hidden>
Date: Fri Oct 6 17:38:13 2023 +0200

    Remove common nova playbook

    Code of os-nova-install has been refactored to include content from the
    common nova playbook. This allows us to be more flexible in executed
    tasks and simplify logic.

    Related-Bug: #2034583
    Change-Id: I21fe061d93cf77c97f8fa6d0003219595459e1c3

Changed in openstack-ansible:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to openstack-ansible (master)

Reviewed: https://review.opendev.org/c/openstack/openstack-ansible/+/897570
Committed: https://opendev.org/openstack/openstack-ansible/commit/a44f1212c38ae01894e4a825fa69b2512e363dba
Submitter: "Zuul (22348)"
Branch: master

commit a44f1212c38ae01894e4a825fa69b2512e363dba
Author: Dmitriy Rabotyagov <email address hidden>
Date: Fri Oct 6 17:52:55 2023 +0200

    Run nova db post setup from nova playbook

    We need to run specific tasks, like online migrations or cells discovery
    after all tasks have finished against nova conductor hosts.

    This can't be done with the role logic, as we run computes the last,
    and delegation to conductors does not work nicely since handlers are
    not delegated.

    Closes-Bug: #2034583
    Change-Id: Ic4486cf90310dc81af15b9297e84c078e612c0c2
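
For illustration, the approach in this commit message (running post-setup from the playbook instead of delegating from the role) might be sketched as a separate play like the one below. This is not the content of the merged patch: the play layout and the `nova_conductor[0]` host pattern are assumptions, and only the `nova_db_post_setup.yml` file name appears earlier in this report.

```yaml
# Sketch only: not the content of the merged patch.
- name: Run nova DB post-setup after all nova hosts are deployed
  hosts: "nova_conductor[0]"    # assumed: first host of the conductor group
  gather_facts: true            # gather facts for this host within this play
  tasks:
    - name: Include post-setup tasks directly on the conductor
      ansible.builtin.include_role:
        name: os_nova
        tasks_from: nova_db_post_setup.yml
```

Because the play targets the conductor directly, neither tasks nor handlers need `delegate_to`, and the conditional reads facts of the host it actually runs on.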

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to openstack-ansible (stable/2023.1)

Related fix proposed to branch: stable/2023.1
Review: https://review.opendev.org/c/openstack/openstack-ansible/+/898493

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to openstack-ansible (stable/2023.1)

Fix proposed to branch: stable/2023.1
Review: https://review.opendev.org/c/openstack/openstack-ansible/+/898494

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to openstack-ansible (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/openstack-ansible/+/898557

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to openstack-ansible (stable/2023.1)

Reviewed: https://review.opendev.org/c/openstack/openstack-ansible/+/898493
Committed: https://opendev.org/openstack/openstack-ansible/commit/6187ce0cfd5a58f5159f68cfb8ede80a92ccdd79
Submitter: "Zuul (22348)"
Branch: stable/2023.1

commit 6187ce0cfd5a58f5159f68cfb8ede80a92ccdd79
Author: Dmitriy Rabotyagov <email address hidden>
Date: Fri Oct 6 17:38:13 2023 +0200

    Remove common nova playbook

    Code of os-nova-install has been refactored to include content from the
    common nova playbook. This allows us to be more flexible in executed
    tasks and simplify logic.

    Related-Bug: #2034583
    Change-Id: I21fe061d93cf77c97f8fa6d0003219595459e1c3
    (cherry picked from commit 61ea7a82073ea5886ad7f27f9dbcf6456233ab67)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to openstack-ansible (master)

Reviewed: https://review.opendev.org/c/openstack/openstack-ansible/+/898557
Committed: https://opendev.org/openstack/openstack-ansible/commit/607d237e50737161f6f8341e34e2572ee79c2179
Submitter: "Zuul (22348)"
Branch: master

commit 607d237e50737161f6f8341e34e2572ee79c2179
Author: Dmitriy Rabotyagov <email address hidden>
Date: Tue Oct 17 17:06:31 2023 +0200

    Fix vars-file include for os-nova-install

    This is a follow-up patch to [1]. Somehow, this issue was not catched by
    our CI on master, likely due to some safeguard logic in ansible-core 2.15

    Though, it's still worth to align vars_files include and do that properly.

    [1] https://review.opendev.org/c/openstack/openstack-ansible/+/897570

    Related-Bug: #2034583
    Change-Id: I130aab09610f594e0d67db5082c8ff28c9298661

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to openstack-ansible-os_nova (master)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to openstack-ansible (stable/2023.1)

Reviewed: https://review.opendev.org/c/openstack/openstack-ansible/+/898494
Committed: https://opendev.org/openstack/openstack-ansible/commit/29c33cece6a1aa7729f8239a1e9ccffb74ab3795
Submitter: "Zuul (22348)"
Branch: stable/2023.1

commit 29c33cece6a1aa7729f8239a1e9ccffb74ab3795
Author: Dmitriy Rabotyagov <email address hidden>
Date: Fri Oct 6 17:52:55 2023 +0200

    Run nova db post setup from nova playbook

    We need to run specific tasks, like online migrations or cells discovery
    after all tasks have finished against nova conductor hosts.

    This can't be done with the role logic, as we run computes the last,
    and delegation to conductors does not work nicely since handlers are
    not delegated.

    Closes-Bug: #2034583
    Change-Id: Ic4486cf90310dc81af15b9297e84c078e612c0c2
    (cherry picked from commit a44f1212c38ae01894e4a825fa69b2512e363dba)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to openstack-ansible-os_nova (master)

Reviewed: https://review.opendev.org/c/openstack/openstack-ansible-os_nova/+/898749
Committed: https://opendev.org/openstack/openstack-ansible-os_nova/commit/4aa65eb60691c50c9022c22848e3338c8f350af3
Submitter: "Zuul (22348)"
Branch: master

commit 4aa65eb60691c50c9022c22848e3338c8f350af3
Author: Dmitriy Rabotyagov <email address hidden>
Date: Wed Oct 18 19:04:46 2023 +0200

    Fix logic of discovering hosts by service

    For quite some time, we relate usage of --by-service flag for
    nova-manage cell_v2 discover_hosts command to the used nova_virt_type.
    However, we run db_post_setup tasks only once and delegating to the
    conductor host. With latest changes to the logic, when this task in
    included from the playbook level it makes even less sense, since
    definition of nova_virt_type for conductor is weird and wrong.

    Instead, we attempt to detect if ironic is in use by checking hostvars
    of all compute nodes for that. It will include host_vars, group_vars,
    all sort of extra variables, etc.

    Thus, ironic hosts should be better discovered now with nova-manage
    command.

    Related-Bug: #2034583
    Change-Id: I3deea859a4017ff96919290ba50cb375c0f960ea
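
The hostvars check described in this commit message could be sketched as the task-list fragment below. The fact name `example_ironic_in_use` and the `nova_compute` group name are assumptions made for illustration; the actual expression used in the role may differ.

```yaml
# Sketch only: task-list fragment, not the role's actual implementation.
- name: Detect whether any compute host uses the ironic virt type
  ansible.builtin.set_fact:
    # Hypothetical fact name; true if at least one compute host has
    # nova_virt_type set to "ironic" in any variable source.
    example_ironic_in_use: >-
      {{ groups['nova_compute'] | default([])
         | map('extract', hostvars)
         | selectattr('nova_virt_type', 'defined')
         | selectattr('nova_virt_type', 'equalto', 'ironic')
         | list | length > 0 }}

- name: Show the discovery command that would be used
  ansible.builtin.debug:
    msg: >-
      nova-manage cell_v2 discover_hosts
      {{ '--by-service' if (example_ironic_in_use | bool) else '' }}
```

Filtering on `defined` first keeps hosts that do not set `nova_virt_type` from breaking the comparison.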

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to openstack-ansible-os_nova (stable/2023.1)

Related fix proposed to branch: stable/2023.1
Review: https://review.opendev.org/c/openstack/openstack-ansible-os_nova/+/898779

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to openstack-ansible-os_nova (stable/zed)

Related fix proposed to branch: stable/zed
Review: https://review.opendev.org/c/openstack/openstack-ansible-os_nova/+/898780

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to openstack-ansible-os_nova (stable/2023.1)

Reviewed: https://review.opendev.org/c/openstack/openstack-ansible-os_nova/+/898779
Committed: https://opendev.org/openstack/openstack-ansible-os_nova/commit/77c06d452048b837984118d801f1e62122f3b7e6
Submitter: "Zuul (22348)"
Branch: stable/2023.1

commit 77c06d452048b837984118d801f1e62122f3b7e6
Author: Dmitriy Rabotyagov <email address hidden>
Date: Wed Oct 18 19:04:46 2023 +0200

    Fix logic of discovering hosts by service

    For quite some time, we relate usage of --by-service flag for
    nova-manage cell_v2 discover_hosts command to the used nova_virt_type.
    However, we run db_post_setup tasks only once and delegating to the
    conductor host. With latest changes to the logic, when this task in
    included from the playbook level it makes even less sense, since
    definition of nova_virt_type for conductor is weird and wrong.

    Instead, we attempt to detect if ironic is in use by checking hostvars
    of all compute nodes for that. It will include host_vars, group_vars,
    all sort of extra variables, etc.

    Thus, ironic hosts should be better discovered now with nova-manage
    command.

    Related-Bug: #2034583
    Change-Id: I3deea859a4017ff96919290ba50cb375c0f960ea
    (cherry picked from commit 4aa65eb60691c50c9022c22848e3338c8f350af3)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/openstack-ansible 27.2.0

This issue was fixed in the openstack/openstack-ansible 27.2.0 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to openstack-ansible-os_nova (stable/zed)

Reviewed: https://review.opendev.org/c/openstack/openstack-ansible-os_nova/+/898780
Committed: https://opendev.org/openstack/openstack-ansible-os_nova/commit/c377b904ad0e2839f5ae3dd611059a4d75a2d4bd
Submitter: "Zuul (22348)"
Branch: stable/zed

commit c377b904ad0e2839f5ae3dd611059a4d75a2d4bd
Author: Dmitriy Rabotyagov <email address hidden>
Date: Wed Oct 18 19:04:46 2023 +0200

    Fix logic of discovering hosts by service

    For quite some time, we relate usage of --by-service flag for
    nova-manage cell_v2 discover_hosts command to the used nova_virt_type.
    However, we run db_post_setup tasks only once and delegating to the
    conductor host. With latest changes to the logic, when this task in
    included from the playbook level it makes even less sense, since
    definition of nova_virt_type for conductor is weird and wrong.

    Instead, we attempt to detect if ironic is in use by checking hostvars
    of all compute nodes for that. It will include host_vars, group_vars,
    all sort of extra variables, etc.

    Thus, ironic hosts should be better discovered now with nova-manage
    command.

    Related-Bug: #2034583
    Change-Id: I3deea859a4017ff96919290ba50cb375c0f960ea
    (cherry picked from commit 4aa65eb60691c50c9022c22848e3338c8f350af3)

tags: added: in-stable-zed
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/openstack-ansible 28.0.0.0rc1

This issue was fixed in the openstack/openstack-ansible 28.0.0.0rc1 release candidate.
