Collect logs after a failed DC upgrade are too big

Bug #1885778 reported by Al Bailey
This bug affects 1 person
Affects: StarlingX
Status: Fix Released
Importance: Medium
Assigned to: Eric MacDonald
Milestone: (none shown)

Bug Description

Brief Description
-----------------
I recently examined a collect log from controller-0, taken while it was attempting a Distributed Cloud (DC) upgrade.
It included large files, such as an ISO, that should be filtered out of the collect.
The resulting collect files are very large and could be reduced in size.

Severity
--------
Minor.

Steps to Reproduce
------------------
1. Attempt a DC subcloud upgrade.
2. Run collect afterwards.

Expected Behavior
------------------
The collect should not be overly large

Actual Behavior
----------------
The collect for ALL NODES was:
4229007360 Jun 26 21:16 ALL_NODES_20200625.190233.tar

which is over 4 GB:
4.0G ALL_NODES_20200625.190233.tar

Inside it, the controller-0 tarball accounts for nearly all of that space:
4185975726 2020-06-25 14:09 ALL_NODES_20200625.190233/controller-0_20200625.190233.tgz

Extracting that tarball shows at least one very large file:
2.1G ./var/extra/ansible/subcloud6/bootimage.iso

All of the following files are larger than 50M:
 find . -type f -size +50M
./var/extra/platform/iso/20.06/shared/LiveOS/squashfs.img
./var/extra/platform/iso/20.06/shared/Packages/pxe-network-installer-1.0-28.tis.x86_64.rpm
./var/extra/platform/iso/20.06/shared/Packages/kubernetes-master-1.18.1-1.tis.5.x86_64.rpm
./var/extra/platform/iso/20.06/shared/Packages/linux-firmware-20190429-72.gitddde598.el7.noarch.rpm
./var/extra/platform/iso/20.06/nodes/subcloud6/bootimage.iso
./var/extra/ansible/subcloud6/bootimage.iso
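
The same check can be made without unpacking a multi-gigabyte tarball first. The following is a minimal sketch, not part of the collect tool: `list_large_members` is a hypothetical helper name, and it assumes the GNU tar verbose listing format, in which the third column is the size in bytes and the sixth is the member name (names containing spaces would be truncated).

```shell
#!/bin/sh
# Hypothetical helper: list members of a gzip-compressed tarball whose size
# exceeds a threshold, without extracting anything.
# Assumes GNU tar's `-tv` layout: perms owner/group size date time name
list_large_members() {
    tarball=$1
    min_bytes=${2:-52428800}   # default threshold: 50 MiB
    tar -tzvf "$tarball" |
        awk -v min="$min_bytes" '$3 + 0 > min + 0 { print $3, $6 }'
}

# Example (path taken from the report above):
#   list_large_members ALL_NODES_20200625.190233/controller-0_20200625.190233.tgz
```

Each output line is a byte count followed by the member path, so the result can be piped through `sort -n` to rank the largest files.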

Presumably the iso/20.06 folder can be filtered out of the collect.

ls -al ./var/extra/platform/iso/20.06/nodes/
total 0
drwxr-xr-x 3 abailey users 23 Jun 25 13:58 .
drwxr-xr-x 4 abailey users 33 Jun 25 13:58 ..
drwxr-xr-x 2 abailey users 125 Jun 25 13:58 subcloud6

ls -al ./var/extra/platform/iso/20.06/shared/
total 104
drwxr-xr-x 5 abailey users 72 Jun 25 13:58 .
drwxr-xr-x 4 abailey users 33 Jun 25 13:58 ..
-rw-r--r-- 1 abailey users 2901 Jun 25 13:58 isolinux.cfg
drwxr-xr-x 2 abailey users 26 Jun 23 21:30 LiveOS
drwxr-xr-x 2 abailey users 69632 Jun 23 21:30 Packages
drwxr-xr-x 2 abailey users 4096 Jun 23 21:33 repodata

Reproducibility
---------------
Seen once

System Configuration
--------------------
DC system performing an upgrade

Branch/Pull Time/Commit
-----------------------
June 25, 2020

Last Pass
---------
N/A

Timestamp/Logs
--------------
Contact Al Bailey for a sample collect (too large to attach)

Test Activity
-------------
Developer Testing

Workaround
----------
N/A

Changed in starlingx:
assignee: nobody → Eric MacDonald (rocksolidmtce)
Ghada Khalil (gkhalil)
description: updated
tags: added: stx.tools
Ghada Khalil (gkhalil)
tags: added: stx.distcloud
Eric MacDonald (rocksolidmtce) wrote :

These temporary files are normally cleaned up automatically by the system controller's subcloud installation software, so there must be an error path that bypasses the cleanup.

The /etc/collect.d/collect_sysinv_.sh script includes these temporary files through the following command:

    cp -a /opt/platform ${extradir}
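
To confirm on a live system that a failed install left images behind, one could look for ISO files under the tree that this command copies. This is only a sketch: `find_leftover_isos` is a hypothetical helper name, and the default path is taken from the `cp -a` command above.

```shell
#!/bin/sh
# Hypothetical helper: report ISO images left under a directory tree,
# with their disk usage in KiB. Prints nothing when no ISO is present.
find_leftover_isos() {
    find "${1:-/opt/platform}" -type f -name '*.iso' -exec du -k {} +
}

# Example:
#   find_leftover_isos /opt/platform
```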

OpenStack Infra (hudson-openstack) wrote : Fix proposed to utilities (master)

Fix proposed to branch: master
Review: https://review.opendev.org/739520

Changed in starlingx:
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to utilities (master)

Reviewed: https://review.opendev.org/739520
Committed: https://git.openstack.org/cgit/starlingx/utilities/commit/?id=7c076a390f99cb72623da7168ae64e2947a25080
Submitter: Zuul
Branch: master

commit 7c076a390f99cb72623da7168ae64e2947a25080
Author: Eric MacDonald <email address hidden>
Date: Mon Jul 6 11:24:10 2020 -0400

    Exclude temporary subcloud install iso files from collect

    The collect tool is unnecessarily including temp subcloud
    install iso files and related other large tmp files.

    This update changes the raw 'cp ...' to 'rsync ...'
    with the --exclude option to filter out iso files from
    the collect_dc and collect_sysinv collect scripts.

    Change-Id: I3575c3193a24f376dcd006c3e5015c551023c69a
    Closes-Bug: 1885778
    Signed-off-by: Eric MacDonald <email address hidden>
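
The cp-to-rsync change described in the commit message can be sketched roughly as follows. This is not the merged code: `copy_without_isos` is a hypothetical wrapper, and the `*.iso` exclude pattern is an assumption based on the commit description.

```shell
#!/bin/sh
# Hypothetical wrapper: copy a directory tree in archive mode while
# skipping ISO images. The '*.iso' pattern is an assumption.
copy_without_isos() {
    src=$1
    dest=$2
    # No trailing slash on src, so the directory itself is created under
    # dest, as with `cp -a src dest` when dest already exists.
    rsync -a --exclude '*.iso' "$src" "$dest"
}

# Example, mirroring the original command:
#   copy_without_isos /opt/platform "${extradir}"
```

An exclude pattern without a slash is matched against file names at any depth, so both bootimage.iso copies listed in the report would be skipped; other large files such as squashfs.img would need their own patterns.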

Changed in starlingx:
status: In Progress → Fix Released
Ghada Khalil (gkhalil)
Changed in starlingx:
importance: Undecided → Medium
tags: added: stx.4.0
OpenStack Infra (hudson-openstack) wrote : Fix proposed to utilities (f/centos8)

Fix proposed to branch: f/centos8
Review: https://review.opendev.org/c/starlingx/utilities/+/792213
