"fetch the archive" play from tripleo-transfer is not reliable for huge files
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
tripleo | Triaged | High | Jose Luis Franco |
Bug Description
Upstream bug based on https:/
During an FFU (Queens → Train) upgrade of a TripleO deployment with a separate Database control role, the customer hit a situation where command [1] failed with a timeout. When we analyzed the Python logs, it turned out that the "fetch the archive" play had been initiated but was still in progress after a few hours.
On the director node we saw an active ansible-playbook process using ~2 GB of RAM but not actually doing anything. Our first assumption was that the DB archive was too big, but it was only ~7.2 GB, and there was plenty of free space on both the controller node and the director node.
It seems to be known, to some extent, that fetch plays use a different mechanism and create extra load when the "become" parameter is used.
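A sketch of the problematic pattern and one possible workaround (the task names and paths below are illustrative, not taken from tripleo-transfer itself). Per the Ansible `fetch` module documentation, when `become` is enabled `fetch` falls back to slurping the remote file, which means the whole file is read and base64-encoded in memory before transfer; that behavior would be consistent with the ~2 GB RSS and the apparent hang on a ~7 GB archive.

```yaml
# Hypothetical minimal reproduction of the pattern (illustrative only):
- name: fetch the archive
  fetch:
    src: /var/lib/mysql-backup/db-archive.tar.gz   # large (~7 GB) file
    dest: /var/tmp/transfer/
    flat: true
  become: true   # with become, fetch uses slurp: the remote file is
                 # buffered whole, base64-encoded, before being copied

# One possible alternative (a sketch, not the fix actually merged):
# synchronize wraps rsync, which streams the file in chunks instead
# of buffering it in memory.
- name: fetch the archive with rsync
  synchronize:
    mode: pull
    src: /var/lib/mysql-backup/db-archive.tar.gz
    dest: /var/tmp/transfer/
```

Whether the released fix in tripleo-ansible took this approach is not stated here; the sketch only illustrates why large files and `become` interact badly with `fetch`.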
[1]
"openstack overcloud external-upgrade run --stack overcloud --tags system_
Changed in tripleo:
- importance: Undecided → High
- milestone: none → wallaby-2

Changed in tripleo:
- milestone: wallaby-2 → wallaby-3

Changed in tripleo:
- milestone: wallaby-3 → wallaby-rc1

Changed in tripleo:
- milestone: wallaby-rc1 → xena-1

Changed in tripleo:
- milestone: xena-1 → xena-2

Changed in tripleo:
- milestone: xena-2 → xena-3
This issue was fixed in the openstack/tripleo-ansible 2.2.0 release.