"fetch the archive" play from tripleo-transfer is not reliable for huge files

Bug #1908425 reported by Jose Luis Franco
This bug affects 1 person
Affects: tripleo
Importance: High
Assigned to: Jose Luis Franco

Bug Description

Upstream bug based on https://bugzilla.redhat.com/show_bug.cgi?id=1904681

During an FFU (Queens -> Train) upgrade of a TripleO deployment with a separate Database role, a customer hit a situation where command [1] failed because of a timeout. Analysis of the Python logs showed that the "fetch the archive" play had been initiated but was still in progress after a few hours.

On the director node we saw an active ansible-playbook process that used ~2GB of RAM but wasn't actually doing anything. Our first assumption was that the DB archive was too big, but it was only ~7.2GB and there was plenty of free space on both the controller node and the director node.

It appears to be known, to some extent, that fetch plays use a different mechanism and create extra load when the "become" parameter is used.
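
A rough sketch of the size overhead involved, assuming the fetch goes through Ansible's slurp-based path when "become" is set (the Ansible fetch module notes describe this behaviour and advise against "become" for large files). slurp returns file content base64-encoded, so the payload grows by about a third and is handled in memory rather than streamed. The function name and the archive size constant are illustrative only and are not taken from tripleo-ansible.

```python
import base64


def base64_encoded_size(raw_bytes: int) -> int:
    """Return the size in bytes of the padded base64 encoding of raw_bytes bytes."""
    # Standard base64 maps every group of 3 input bytes (rounded up) to 4 characters.
    return 4 * ((raw_bytes + 2) // 3)


# Sanity check against the real encoder on a small input.
assert base64_encoded_size(10) == len(base64.b64encode(b"x" * 10))

# Assumption: the ~7.2GB archive from the report, taken as decimal gigabytes.
ARCHIVE_BYTES = 7_200_000_000
encoded = base64_encoded_size(ARCHIVE_BYTES)

print(f"raw: {ARCHIVE_BYTES / 1e9:.1f} GB, base64: {encoded / 1e9:.1f} GB")
# prints "raw: 7.2 GB, base64: 9.6 GB"
```

This only illustrates the encoding overhead under the stated assumption; it is not a confirmed root-cause analysis of the hang.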

[1]
"openstack overcloud external-upgrade run --stack overcloud --tags system_upgrade_transfer_data -y"

Changed in tripleo:
importance: Undecided → High
milestone: none → wallaby-2
Changed in tripleo:
milestone: wallaby-2 → wallaby-3
Changed in tripleo:
milestone: wallaby-3 → wallaby-rc1
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-ansible 2.2.0

This issue was fixed in the openstack/tripleo-ansible 2.2.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-ansible 0.7.0

This issue was fixed in the openstack/tripleo-ansible 0.7.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-ansible 1.5.3

This issue was fixed in the openstack/tripleo-ansible 1.5.3 release.

OpenStack Infra (hudson-openstack) wrote : Change abandoned on tripleo-ansible (master)

Change abandoned by "Jose Luis Franco <email address hidden>" on branch: master
Review: https://review.opendev.org/c/openstack/tripleo-ansible/+/767177
Reason: Issue fixed by https://review.opendev.org/c/openstack/tripleo-ansible/+/771657

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-ansible 3.1.0

This issue was fixed in the openstack/tripleo-ansible 3.1.0 release.

Changed in tripleo:
milestone: wallaby-rc1 → xena-1
Changed in tripleo:
milestone: xena-1 → xena-2
Changed in tripleo:
milestone: xena-2 → xena-3