centos-8 ussuri scenario 0 upgrade Callback Exception controller upgrade run

Bug #1888488 reported by Marios Andreou
This bug affects 1 person
Affects: tripleo
Status: Fix Released
Importance: Low
Assigned to: Marios Andreou
Milestone: (none)

Bug Description

At [1][2] the newly added centos-8 scenario 0 upgrade job is green, but the controller upgrade run log contains a number of "Callback Exception" traces. One example:

        2020-07-21 11:13:35 | [WARNING]: Failure using method (v2_playbook_on_handler_task_start) in callback
        2020-07-21 11:13:35 | plugin (<ansible.plugins.callback.tripleo_profile_tasks.CallbackModule object
        2020-07-21 11:13:35 | at 0x7f6f6b1af128>): maximum recursion depth exceeded while calling a Python
        2020-07-21 11:13:35 | object
        2020-07-21 11:13:35 | Callback Exception:
        2020-07-21 11:13:35 | File "/usr/lib/python3.6/site-packages/ansible/executor/task_queue_manager.py", line 327, in send_callback
        2020-07-21 11:13:35 | method(*new_args, **kwargs)
        2020-07-21 11:13:35 | File "/usr/lib/python3.6/site-packages/ansible/plugins/callback/profile_tasks.py", line 164, in v2_playbook_on_handler_task_start
        2020-07-21 11:13:35 | self._record_task(task)
        2020-07-21 11:13:35 | File "/usr/share/ansible/plugins/callback/tripleo_profile_tasks.py", line 76, in _record_task

Every overcloud_upgrade_foo.log file also starts with the following; I don't know whether it is related:

        2020-07-21 11:03:54 | /usr/lib64/python3.6/importlib/_bootstrap.py:219: ImportWarning: can't resolve package from __spec__ or __package__, falling back on __name__ and __path__
        2020-07-21 11:03:54 | return f(*args, **kwds)
        2020-07-21 11:03:54 | --- Logging error ---
        2020-07-21 11:03:54 | Traceback (most recent call last):
        2020-07-21 11:03:54 | File "/usr/lib64/python3.6/logging/__init__.py", line 998, in emit
        2020-07-21 11:03:54 | self.flush()
        2020-07-21 11:03:54 | File "/usr/lib64/python3.6/logging/__init__.py", line 978, in flush
        2020-07-21 11:03:54 | self.stream.flush()
        2020-07-21 11:03:54 | BrokenPipeError: [Errno 32] Broken pipe
        2020-07-21 11:03:54 | Call stack:
        2020-07-21 11:03:54 | File "/usr/bin/openstack", line 10, in <module>
        2020-07-21 11:03:54 | sys.exit(main())
        2020-07-21 11:03:54 | File "/usr/lib/python3.6/site-packages/openstackclient/shell.py", line 153, in main
        2020-07-21 11:03:54 | return OpenStackShell().run(argv)
        2020-07-21 11:03:54 | File "/usr/lib/python3.6/site-packages/osc_lib/shell.py", line 149, in run
        2020-07-21 11:03:54 | self.log.info("END return value: %s", ret_val)
        2020-07-21 11:03:54 | Message: 'END return value: %s'
        2020-07-21 11:03:54 | Arguments: (0,)
        2020-07-21 11:03:54 | Exception ignored in: <_io.TextIOWrapper name='<stdout>' mode='w' encoding='UTF-8'>
        2020-07-21 11:03:54 | BrokenPipeError: [Errno 32] Broken pipe
        2020-07-21 11:03:54 | Running major upgrade prepare step

I also note that the centos-*7* job (train, for example, at [3]) shows similar-looking Callback Exceptions.

I am not yet convinced that these are *real* errors, so I would like a sanity check from the upgrades folks or anyone else familiar with this. As far as I can see from a brief dig, the relevant code lives in tripleo-ansible/tripleo-validations/validations-common [4].

If they aren't real, let's use this bug to remove or otherwise suppress them, please.

[1] https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_bab/739457/8/check/tripleo-ci-centos-8-scenario000-multinode-oooq-container-upgrades-ussuri/babae60/logs/undercloud/home/zuul/overcloud_upgrade_run_Controller.log
[2] https://8657757db0c2321fb483-9dc687b1ca3b864db38ca15564d88e40.ssl.cf1.rackcdn.com/739457/8/check/tripleo-ci-centos-8-scenario000-multinode-oooq-container-upgrades-ussuri/c1727f7/logs/undercloud/home/zuul/overcloud_upgrade_run_Controller.log
[3] https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_77b/735818/1/gate/tripleo-ci-centos-7-scenario000-multinode-oooq-container-upgrades/77bfe95/logs/undercloud/home/zuul/overcloud_upgrade_run_Controller.log
[4] http://codesearch.openstack.org/?q=def%20v2_playbook_on_task_start

Bogdan Dobrelya (bogdando) wrote :

The callback plugin looks like it is missing the backport of Ied39aaef9c65c65f33cceb99071c53af7f9aa464:
https://review.opendev.org/#/c/733750/1/tripleo_ansible/ansible_plugins/callback/tripleo_profile_tasks.py

OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-ansible (stable/ussuri)

Fix proposed to branch: stable/ussuri
Review: https://review.opendev.org/742400

Alex Schultz (alex-schultz) wrote : Re: centos-8 scenario 0 upgrade Callback Exception controller upgrade run

For the record, callback exceptions do not actually affect Ansible execution. Ansible will just stop running the broken callback.
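
To illustrate the general idea, here is a minimal sketch of a dispatcher that isolates a failing callback so the remaining callbacks (and the run itself) carry on. This is not Ansible's actual code: `send_callback` below is a simplified stand-in that only borrows its name from the traceback in the description, and `BrokenCallback`, `CountingCallback` and `on_task_start` are hypothetical names. It also does not try to model how Ansible treats the failing plugin afterwards.

```python
import logging

logger = logging.getLogger("callback_dispatch_sketch")


class BrokenCallback:
    """Hypothetical plugin whose handler recurses forever."""

    def on_task_start(self, task):
        # Never terminates, so Python raises RecursionError
        # ("maximum recursion depth exceeded").
        return self.on_task_start(task)


class CountingCallback:
    """Hypothetical well-behaved plugin that records what it sees."""

    def __init__(self):
        self.seen = []

    def on_task_start(self, task):
        self.seen.append(task)


def send_callback(callbacks, method_name, *args):
    """Call method_name on every callback, isolating failures.

    Returns a list of (plugin class name, exception class name) pairs
    for the callbacks that raised.
    """
    failures = []
    for callback in callbacks:
        method = getattr(callback, method_name, None)
        if method is None:
            continue
        try:
            method(*args)
        except Exception as exc:  # RecursionError is a subclass of Exception
            logger.warning(
                "Failure using method (%s) in callback plugin (%r): %s",
                method_name, callback, exc,
            )
            failures.append((type(callback).__name__, type(exc).__name__))
    return failures
```

With a dispatcher shaped like this, a plugin that blows the recursion limit produces a warning, while any callback listed after it still receives the event.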

Changed in tripleo:
importance: Critical → Low
summary: - centos-8 scenario 0 upgrade Callback Exception controller upgrade run
+ centos-8 ussuri scenario 0 upgrade Callback Exception controller upgrade run

OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-ansible (stable/ussuri)

Reviewed: https://review.opendev.org/742400
Committed: https://git.openstack.org/cgit/openstack/tripleo-ansible/commit/?id=8794829156e0fb2ad085fa21b104e6e319b896f5
Submitter: Zuul
Branch: stable/ussuri

commit 8794829156e0fb2ad085fa21b104e6e319b896f5
Author: Rabi Mishra <email address hidden>
Date: Fri Jun 5 11:41:03 2020 +0530

    Fix tripleo_profile_tasks callback plugin _output()

    Regression from https://review.opendev.org/#/c/733394/

    Partial-Bug: #1888488
    Change-Id: Ied39aaef9c65c65f33cceb99071c53af7f9aa464
    (cherry picked from commit f0dbd1d0f5eb0122007b3a4ca100764646a68d98)

tags: added: in-stable-ussuri

Marios Andreou (marios-b) wrote :

Thanks very much, Bogdan & Rabi. After https://review.opendev.org/742400 merged I reran the test in https://review.opendev.org/#/c/739457/ and, looking at the latest logs, I no longer see any 'Callback Exception':

        * https://32bd1141f7c2886be59d-3e8a0a8e79810b59118da02fe60f5024.ssl.cf1.rackcdn.com/739457/9/check/tripleo-ci-centos-8-scenario000-multinode-oooq-container-upgrades-ussuri/a0de4e0/logs/undercloud/home/zuul/overcloud_upgrade_run_Controller.log

I am going to mark this as Fix Released; please move it back if you disagree. Thanks.
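
For anyone who wants to repeat the log check on another job, a small helper along these lines may be enough. It is only a sketch: the function name `count_callback_exceptions` is made up for illustration, and it assumes the log has already been downloaded and is read line by line.

```python
import re

# Match the marker regardless of case, since it shows up in the logs as
# "Callback Exception".
MARKER = re.compile(r"callback exception", re.IGNORECASE)


def count_callback_exceptions(lines):
    """Return how many lines mention a callback exception."""
    return sum(1 for line in lines if MARKER.search(line))
```

Usage would be something like `with open("overcloud_upgrade_run_Controller.log") as f: print(count_callback_exceptions(f))`; a count of 0 means the log contains no such lines.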

Changed in tripleo:
status: Triaged → Fix Released