corrupted dynamic_inventory.py backup file

Bug #1750233 reported by Shannon Mitchell
10
This bug affects 2 people
Affects Status Importance Assigned to Milestone
OpenStack-Ansible
Fix Released
High
Shannon Mitchell

Bug Description

I have been seeing the dynamic_inventory.py runs randomly break more recently. It seems to happen more often on larger environments with
multiple people/scripts accessing the inventory for various tasks.
It looks like the /etc/openstack_deploy/backup_openstack_inventory.tar archive is getting corrupted causing the inventory script to bomb on
all future calls. The workaround is to blow it away the tar file and
rerun the dynamic_inventory.py script to re-generate.

You can simulate by running multiple calls to it at once.

root@shan5464-aio:/opt/openstack-ansible/inventory# cat test.sh
#!/bin/bash

rm /etc/openstack_deploy/backup_openstack_inventory.tar
for ITER in {1..40}; do
  /opt/openstack-ansible/inventory/dynamic_inventory.py > /dev/null 2>&1 &
done
wait

root@shan5464-aio:/opt/openstack-ansible/inventory# ./test.sh

root@shan5464-aio:/opt/openstack-ansible/inventory# /opt/openstack-ansible/inventory/dynamic_inventory.py
Traceback (most recent call last):
  File "/opt/openstack-ansible/inventory/dynamic_inventory.py", line 80, in <module>
    output = generate.main(**all_args)
  File "/opt/openstack-ansible/osa_toolkit/generate.py", line 1075, in main
    inventory, inv_path = filesys.load_inventory(config, INVENTORY_SKEL)
  File "/opt/openstack-ansible/osa_toolkit/filesystem.py", line 247, in load_inventory
    _make_backup(load_path, file_loaded)
  File "/opt/openstack-ansible/osa_toolkit/filesystem.py", line 156, in _make_backup
    with tarfile.open(inventory_backup_file, 'a') as tar:
  File "/usr/lib/python2.7/tarfile.py", line 1711, in open
    return cls.taropen(name, mode, fileobj, **kwargs)
  File "/usr/lib/python2.7/tarfile.py", line 1721, in taropen
    return cls(name, mode, fileobj, **kwargs)
  File "/usr/lib/python2.7/tarfile.py", line 1601, in __init__
    raise ReadError(str(e))
tarfile.ReadError: empty header

root@shan5464-aio:/opt/openstack-ansible/inventory# rm /etc/openstack_deploy/backup_openstack_inventory.tar

root@shan5464-aio:/opt/openstack-ansible/inventory# /opt/openstack-ansible/inventory/dynamic_inventory.py | head -n 4
{
    "_meta": {
        "hostvars": {
            "shan5464-aio": {

Looks like it may need to handle locking in osa_toolkit/filesystem.py or a way to run it read-only to keep the inventory/backup files from being generated on every call.

description: updated
Revision history for this message
Jean-Philippe Evrard (jean-philippe-evrard) wrote :

That looks terribad. Thanks for the bug report. We will triage this appropriately.
Could you tell us on which branch are you?

Revision history for this message
Shannon Mitchell (shannon-mitchell) wrote :

We were seeing it in newton in one of our larger development environments with a lot of activity. The example in the this ticket is from master, so I would assume most have the same issue.

Revision history for this message
Shannon Mitchell (shannon-mitchell) wrote :

Looks like it was pulled out of the main dplaybooks/inventory/dynamic_inventory.py script somewhere between newton and master and placed into osa_toolkit/filesystem.py. The overall code for most of it looks the same. Below is a snippet of the backup portion.

# grep 'def make_backup' ./openstack-ansible/playbooks/inventory/dynamic_inventory.py -A 11
def make_backup(config_path, inventory_file_path):
    # Create a backup of all previous inventory files as a tar archive
    inventory_backup_file = os.path.join(
        config_path,
        'backup_openstack_inventory.tar'
    )
    with tarfile.open(inventory_backup_file, 'a') as tar:
        basename = os.path.basename(inventory_file_path)
        backup_name = get_backup_name(basename)
        tar.add(inventory_file_path, arcname=backup_name)
    logger.debug("Backup written to %s", inventory_backup_file)

# grep 'def _make_backup' /opt/openstack-ansible/osa_toolkit/filesystem.py -A17
def _make_backup(backup_path, source_file_path):
    """Create a backup of all previous inventory files as a tar archive

    :param backup_path: where to store the backup file
    :param source_file_path: path of file to backup
    :return:
    """

    inventory_backup_file = os.path.join(
        backup_path,
        'backup_openstack_inventory.tar'
    )
    with tarfile.open(inventory_backup_file, 'a') as tar:
        basename = os.path.basename(source_file_path)
        backup_name = _get_backup_name(basename)
        tar.add(source_file_path, arcname=backup_name)
    logger.debug("Backup written to {}".format(inventory_backup_file))

Changed in openstack-ansible:
status: New → Confirmed
importance: Undecided → High
Revision history for this message
Matt Thompson (mattt416) wrote :

Looks like another user ran into this issue. From #openstack-ansible:

JohnnyOSA | Hey all -- after doing a simple typo with using the --limit flag with an ansible ad-hoc command (using --limit galera_container[12 instead of --limit galera_container[1]), the inventory python tools have broken and yield the
          | following error: http://paste.openstack.org/show/679696/

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to openstack-ansible (master)

Fix proposed to branch: master
Review: https://review.openstack.org/565950

Changed in openstack-ansible:
assignee: nobody → Kevin Carter (kevin-carter)
status: Confirmed → In Progress
Changed in openstack-ansible:
assignee: Kevin Carter (kevin-carter) → Jean-Philippe Evrard (jean-philippe-evrard)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Fix proposed to branch: master
Review: https://review.openstack.org/637441

Changed in openstack-ansible:
assignee: Jean-Philippe Evrard (jean-philippe-evrard) → Shannon Mitchell (shannon-mitchell)
Revision history for this message
Shannon Mitchell (shannon-mitchell) wrote :

I tried the patch from https://review.openstack.org/565950 as we encountered the issue again recently. The patch does seem to reduce the number of times corruption happens, but it still happens.

I was surprised to see that it looks like a backup is created every time the script is ran. This results in a large number of backups that may contain the same inventory. Checking the inventory for modifications and saving only if needed should reduce the writes that are happening while the backup is taking place in a separate process. Also creating the backup to right before a save, should provide more meaningful backups for administrators.

This is the first time for me digging into the python code on this(and I'm no dev), so I might be missing the reasoning behind the backups being done this way. I put in https://review.openstack.org/#/c/637441/ for these changes. The end result when testing was the inventory and backup was only written to when the conf files were changed. Also any action taken with inventory-manage.py that runs a save will create the backup as well. It also seems to have done away with the corruption issues.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to openstack-ansible (stable/rocky)

Fix proposed to branch: stable/rocky
Review: https://review.openstack.org/638289

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to openstack-ansible (stable/queens)

Fix proposed to branch: stable/queens
Review: https://review.openstack.org/638291

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to openstack-ansible (stable/pike)

Fix proposed to branch: stable/pike
Review: https://review.openstack.org/638292

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to openstack-ansible (stable/queens)

Reviewed: https://review.openstack.org/638291
Committed: https://git.openstack.org/cgit/openstack/openstack-ansible/commit/?id=c7fbe594d0bdddedddb2aa2977b1ad1e7bdbe5b3
Submitter: Zuul
Branch: stable/queens

commit c7fbe594d0bdddedddb2aa2977b1ad1e7bdbe5b3
Author: Shannon Mitchell <email address hidden>
Date: Sun Feb 17 16:59:29 2019 -0600

    Dynamic inventory backup corruption fix

    When multiple users and process are accessing the dynamic inventory
    the inventory backup and json can get corrupted. This change checks
    for inventory modifictions and only saves if needed. The backup
    is also moved to right before the actual save.

    Change-Id: Ifd348ddd9c21526f5b523963dd1fd247edd6b109
    Closes-Bug: #1750233

tags: added: in-stable-queens
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to openstack-ansible (stable/pike)

Reviewed: https://review.openstack.org/638292
Committed: https://git.openstack.org/cgit/openstack/openstack-ansible/commit/?id=b2208fcfcef596c11247ef8475f0993122be377e
Submitter: Zuul
Branch: stable/pike

commit b2208fcfcef596c11247ef8475f0993122be377e
Author: Shannon Mitchell <email address hidden>
Date: Sun Feb 17 16:59:29 2019 -0600

    Dynamic inventory backup corruption fix

    When multiple users and process are accessing the dynamic inventory
    the inventory backup and json can get corrupted. This change checks
    for inventory modifictions and only saves if needed. The backup
    is also moved to right before the actual save.

    Change-Id: Ifd348ddd9c21526f5b523963dd1fd247edd6b109
    Closes-Bug: #1750233

tags: added: in-stable-pike
tags: added: in-stable-rocky
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to openstack-ansible (stable/rocky)

Reviewed: https://review.openstack.org/638289
Committed: https://git.openstack.org/cgit/openstack/openstack-ansible/commit/?id=87a4b7b5dd8cc0701931f0c535c74b2213ddaf3e
Submitter: Zuul
Branch: stable/rocky

commit 87a4b7b5dd8cc0701931f0c535c74b2213ddaf3e
Author: Shannon Mitchell <email address hidden>
Date: Sun Feb 17 16:59:29 2019 -0600

    Dynamic inventory backup corruption fix

    When multiple users and process are accessing the dynamic inventory
    the inventory backup and json can get corrupted. This change checks
    for inventory modifictions and only saves if needed. The backup
    is also moved to right before the actual save.

    Change-Id: Ifd348ddd9c21526f5b523963dd1fd247edd6b109
    Closes-Bug: #1750233

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to openstack-ansible (master)

Reviewed: https://review.openstack.org/637441
Committed: https://git.openstack.org/cgit/openstack/openstack-ansible/commit/?id=71a067abf26365e2f46d0d6e217f50e047a25a2e
Submitter: Zuul
Branch: master

commit 71a067abf26365e2f46d0d6e217f50e047a25a2e
Author: Shannon Mitchell <email address hidden>
Date: Sun Feb 17 16:59:29 2019 -0600

    Dynamic inventory backup corruption fix

    When multiple users and process are accessing the dynamic inventory
    the inventory backup and json can get corrupted. This change checks
    for inventory modifictions and only saves if needed. The backup
    is also moved to right before the actual save.

    Change-Id: Ifd348ddd9c21526f5b523963dd1fd247edd6b109
    Closes-Bug: #1750233

Changed in openstack-ansible:
status: In Progress → Fix Released
Changed in openstack-ansible:
status: Fix Released → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/openstack-ansible 18.1.5

This issue was fixed in the openstack/openstack-ansible 18.1.5 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/openstack-ansible 16.0.27

This issue was fixed in the openstack/openstack-ansible 16.0.27 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/openstack-ansible 17.1.9

This issue was fixed in the openstack/openstack-ansible 17.1.9 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/openstack-ansible 19.0.0.0b1

This issue was fixed in the openstack/openstack-ansible 19.0.0.0b1 development milestone.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to openstack-ansible (master)

Reviewed: https://review.openstack.org/565950
Committed: https://git.openstack.org/cgit/openstack/openstack-ansible/commit/?id=ac28ad132922b582679c20c0f2346987c25bc98a
Submitter: Zuul
Branch: master

commit ac28ad132922b582679c20c0f2346987c25bc98a
Author: Kevin Carter <email address hidden>
Date: Wed May 2 23:29:50 2018 -0500

    Automatically prune the inventory backup

    The inventory backup process takes the running inventory json file and
    adds it to a tar archive. This process has no limits and will add files
    to the tar archive until that is no longer possible and limited by the
    underlying operating system. This change automatically prunes the backup
    file and retains only the last 15 inventory files. This should provide
    the same backup capabilities we've had without trying to saving
    archives indefinitely.

    > It should be noted that this change is using a subprocess call to
      prune the tar file. This is being done because the "tarfile" library
      does not provide an interface for deleting a file within an archive.

    Change-Id: Ida5a9be0d0910c223fe05401bc4f75aef100e456
    Closes-Bug: #1750233
    Signed-off-by: Kevin Carter <email address hidden>

Changed in openstack-ansible:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/openstack-ansible 19.0.0.0rc1

This issue was fixed in the openstack/openstack-ansible 19.0.0.0rc1 release candidate.

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.