HOST failure recovery fails with reserved host

Bug #1819578 reported by Shilpa Devharakar
This bug affects 1 person
Affects: masakari
Status: Fix Released
Importance: Critical
Assigned to: Shilpa Devharakar

Bug Description
================
If the host recovery flow is of type 'reserved host', the task fails to execute because it attempts to persist a Host object.

Steps to reproduce:
===================
Note: A multi-node setup is needed to verify this.

1. Create a failover segment with recovery_method set to 'reserved_host'.
2. Add hosts to the segment created above, with one of them configured as a reserved host.
3. Create a host notification for the failed host:
curl -g -i -X POST http://<IP>/instance-ha/v1/notifications -H "X-Auth-Token: <AUTH_TYPE>" -H "Content-Type: application/json" -H "Accept: application/json" -d '{"notification": {"type": "COMPUTE_HOST", "hostname": "<FAILURE_HOST>", "generated_time": "2017-06-13 15:34:55", "payload": {"event": "STOPPED"}}}'
4. The notification is processed and ends with status ERROR.
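
For anyone scripting the reproduction, the request from step 3 can be assembled in Python 3 as well. This is only a sketch: the helper names build_host_notification and build_request are hypothetical, and the base URL and token are placeholders, exactly as in the curl command. It builds the request without sending it, so the payload can be inspected first.

```python
import json
import urllib.request


def build_host_notification(hostname, generated_time, event="STOPPED"):
    """Return the COMPUTE_HOST notification body from step 3 as a dict."""
    return {
        "notification": {
            "type": "COMPUTE_HOST",
            "hostname": hostname,
            "generated_time": generated_time,
            "payload": {"event": event},
        }
    }


def build_request(base_url, token, body):
    """Build (but do not send) the POST request to the notifications endpoint."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/notifications",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "X-Auth-Token": token,
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
    )


if __name__ == "__main__":
    # Placeholder values; replace with the real endpoint, token and host name.
    body = build_host_notification("failed-host", "2017-06-13 15:34:55")
    req = build_request("http://controller/instance-ha/v1", "token", body)
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # To actually send the notification: urllib.request.urlopen(req)
```

Serializing the body with json.dumps also guarantees that the hostname is emitted as a quoted JSON string.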

Masakari-engine Error logs:
===========================
2019-03-12 11:26:52.137 INFO masakari.engine.manager [req-4727ffbc-6167-4558-958c-1add50f0b5f0 admin admin] Processing notification c1942b2c-09ac-47bc-9d16-19a3d0881f82 of type: COMPUTE_HOST
2019-03-12 11:26:52.138 DEBUG oslo_db.api [req-4727ffbc-6167-4558-958c-1add50f0b5f0 admin admin] Loading backend 'sqlalchemy' from 'masakari.db.sqlalchemy.api' from (pid=28985) _load_backend /usr/local/lib/python2.7/dist-packages/oslo_db/api.py:261
2019-03-12 11:26:52.152 DEBUG oslo_db.sqlalchemy.engines [req-4727ffbc-6167-4558-958c-1add50f0b5f0 admin admin] MySQL server mode set to STRICT_TRANS_TABLES,STRICT_ALL_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,TRADITIONAL,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION from (pid=28985) _check_effective_sql_mode /usr/local/lib/python2.7/dist-packages/oslo_db/sqlalchemy/engines.py:307
Mar 12 11:27:31 masakari-engine[28985]: 2019-03-12 11:27:31.491 DEBUG passlib.utils.compat [req-4727ffbc-6167-4558-958c-1add50f0b5f0 admin admin] loaded lazy attr 'SafeConfigParser': <class ConfigParser.SafeConfigParser at 0x7f940cea5940> from (pid=28985) __getattr__ /usr/local/lib/python2.7/dist-packages/passlib/utils/compat/__init__.py:418
Mar 12 11:27:31 masakari-engine[28985]: 2019-03-12 11:27:31.491 DEBUG passlib.utils.compat [req-4727ffbc-6167-4558-958c-1add50f0b5f0 admin admin] loaded lazy attr 'NativeStringIO': <built-in function StringIO> from (pid=28985) __getattr__ /usr/local/lib/python2.7/dist-packages/passlib/utils/compat/__init__.py:418
Mar 12 11:27:31 masakari-engine[28985]: 2019-03-12 11:27:31.492 DEBUG passlib.utils.compat [req-4727ffbc-6167-4558-958c-1add50f0b5f0 admin admin] loaded lazy attr 'BytesIO': <built-in function StringIO> from (pid=28985) __getattr__ /usr/local/lib/python2.7/dist-packages/passlib/utils/compat/__init__.py:418
Mar 12 11:27:31 masakari-engine[28985]: 2019-03-12 11:27:31.738 INFO masakari.engine.manager [req-4727ffbc-6167-4558-958c-1add50f0b5f0 admin admin] Notification c1942b2c-09ac-47bc-9d16-19a3d0881f82 exits with status: error.
Mar 12 11:27:31 masakari-engine[28985]: 2019-03-12 11:27:31.745 DEBUG masakari.utils [req-4727ffbc-6167-4558-958c-1add50f0b5f0 admin admin] Releasing lock: masakari-138394aa-64f3-485d-b7d2-4833b59cd6c2 on resource: do_process_notification from (pid=28985) inner /opt/stack/masakari/masakari/utils.py:271

Note: This issue was observed after the changes in https://review.openstack.org/#/c/640755/ were merged.

Changed in masakari:
assignee: nobody → Shilpa Devharakar (shilpasd)
Tushar Patil (tpatil)
Changed in masakari:
importance: Undecided → Critical
OpenStack Infra (hudson-openstack) wrote : Fix proposed to masakari (master)

Fix proposed to branch: master
Review: https://review.openstack.org/644930

Changed in masakari:
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to masakari (master)

Reviewed: https://review.openstack.org/644930
Committed: https://git.openstack.org/cgit/openstack/masakari/commit/?id=5e037db22e45b06d470df6a67ec3c6ba67e4d349
Submitter: Zuul
Branch: master

commit 5e037db22e45b06d470df6a67ec3c6ba67e4d349
Author: shilpa <email address hidden>
Date: Tue Mar 12 12:02:55 2019 +0530

    Updated rh host workflow for recovery workflow details

    After merge of [1] host recovery for reserved host is getting failed
    as it attempts to persists Host object.

    This patch addressed this issue by sending list of names of reserved host
    instead of host object to recovery flow.

    [1]: https://review.openstack.org/#/c/640755/

    Change-Id: Ifd008b89ff639e2e3bd8229830b5a20dced1c31b
    Closes-Bug: #1819578
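
The idea behind the fix described in the commit message can be illustrated with a small stand-alone sketch. It is not masakari code: the Host class, persist_flow_inputs and host_names are hypothetical, and it assumes (the report does not say so explicitly) that persisting flow inputs means serializing them to JSON. Under that assumption, passing objects fails while passing a list of host names works.

```python
import json


class Host:
    """Hypothetical stand-in for a Host object; not masakari's real class."""

    def __init__(self, name, reserved):
        self.name = name
        self.reserved = reserved


def persist_flow_inputs(inputs):
    """Pretend to persist flow inputs by serializing them to JSON (assumption)."""
    return json.dumps(inputs)


def host_names(hosts):
    """Reduce Host objects to plain names, which serialize without trouble."""
    return [host.name for host in hosts]


if __name__ == "__main__":
    reserved_hosts = [Host("rh-1", True), Host("rh-2", True)]

    # Passing the objects themselves: json.dumps raises TypeError.
    try:
        persist_flow_inputs({"reserved_host_list": reserved_hosts})
    except TypeError as exc:
        print("persisting Host objects failed:", exc)

    # Passing only the names: serialization succeeds.
    print(persist_flow_inputs({"reserved_host_list": host_names(reserved_hosts)}))
```

With names only, the recovery flow would have to look the hosts up again when it needs more than the name; that trade-off is implied by the sketch, not stated in the report.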

Changed in masakari:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/masakari 7.0.0.0rc1

This issue was fixed in the openstack/masakari 7.0.0.0rc1 release candidate.
