online_data_migrations exceptions quietly masked

Bug #1796192 reported by iain MacDonnell
This bug affects 1 person
Affects                   Status        Importance  Assigned to      Milestone
Cinder                    Fix Released  Medium      iain MacDonnell
OpenStack Compute (nova)  Fix Released  Medium      iain MacDonnell
Rocky                     Fix Released  Medium      iain MacDonnell

Bug Description

When online_data_migrations raise exceptions, nova-manage and cinder-manage catch the exception, print an uninformative "something didn't work" message, and move on. This causes two problems:

1) The user or admin has no way to see what actually failed, because the exception is not logged.
2) The command returns exit status 0, as if all possible migrations had completed successfully. Failures can therefore go unnoticed, especially when the command is run by automation.

description: updated
Changed in nova:
assignee: nobody → iain MacDonnell (imacdonn)
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/608091

tags: added: nova-manage
Matt Riedemann (mriedem)
Changed in cinder:
status: New → Confirmed
importance: Undecided → Medium
Changed in nova:
importance: Undecided → Medium
Changed in nova:
assignee: iain MacDonnell (imacdonn) → Matt Riedemann (mriedem)
Changed in cinder:
assignee: nobody → iain MacDonnell (imacdonn)
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/608091
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=3eea37b85b1abc72786a2b24baf01141b4d95f08
Submitter: Zuul
Branch: master

commit 3eea37b85b1abc72786a2b24baf01141b4d95f08
Author: imacdonn <email address hidden>
Date: Thu Oct 4 21:27:18 2018 +0000

    Handle online_data_migrations exceptions

    When online_data_migrations raise exceptions, nova/cinder-manage catches
    the exceptions, prints fairly useless "something didn't work" messages,
    and moves on. Two issues:

    1) The user(/admin) has no way to see what actually failed (exception
       detail is not logged)

    2) The command returns exit status 0, as if all possible migrations have
       been completed successfully - this can cause failures to get missed,
       especially if automated

    This change adds logging of the exceptions, and introduces a new exit
    status of 2, which indicates that no updates took effect in the last
    batch attempt, but some are (still) failing, which requires intervention.

    Change-Id: Ib684091af0b19e62396f6becc78c656c49a60504
    Closes-Bug: #1796192

Changed in nova:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (master)

Fix proposed to branch: master
Review: https://review.openstack.org/611463

Changed in cinder:
status: Confirmed → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/rocky)

Fix proposed to branch: stable/rocky
Review: https://review.openstack.org/611701

OpenStack Infra (hudson-openstack) wrote : Fix merged to cinder (master)

Reviewed: https://review.openstack.org/611463
Committed: https://git.openstack.org/cgit/openstack/cinder/commit/?id=d47486d317e8cdb1cdab50d73f6484289ab082d4
Submitter: Zuul
Branch: master

commit d47486d317e8cdb1cdab50d73f6484289ab082d4
Author: imacdonn <email address hidden>
Date: Wed Oct 17 22:48:18 2018 +0000

    cinder-manage online_data_migrations fixes

    Addresses some issues with this command:

    1) When used without the --max-count option, the summary table will
       always show zero migrations run, because it only accounts for the
       last batch, and the loop only exits when the last batch does no work.

    2) "remaining" counts cannot be accurate, given the way migrations are
       implemented, because the "found" count refers to the number of rows
       that exist in the database, not the number that still need the
       migration applied.

    3) In the case where no migrations are successful, but some raise
       exceptions, the command was exiting with status zero, which usually
       indicates "success". This can cause issues that cause migration
       failures to go unnoticed, especially when automated.

    4) When exceptions do occur, a minimally useful message is output, and
       no detail about the exception is available to the user. The exception
       detail should be logged.

    5) Inaccuracies in the documentation - "--max_number" should be
       "--max-count", and stale references to the "--ignore_state" option,
       which was removed in [1]

    The solution for (3) introduces a new exit status, 2. See release note
    for details.

    These changes are aligned with equivalents [2][3] for the nova-manage
    command, except for the calculation of "Total Needed" - nova seems to
    interpret the "found" count differently/inconsistently.

    [1] https://review.openstack.org/510201
    [2] https://review.openstack.org/605828
    [3] https://review.openstack.org/608091

    Change-Id: I878480eb2359625cde839b073230844acc645cba
    Closes-Bug: #1794364
    Closes-Bug: #1796192
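
Items 1, 3 and 4 in the commit message above can be illustrated with a small, self-contained sketch. This is not the cinder-manage implementation. It assumes each migration is a function that takes a batch size and returns a `(found, done)` tuple, and the names `run_one_batch` and `run_all` are hypothetical. Only the completed count is accumulated, since item 2 notes that "found" cannot be turned into an accurate "remaining" figure.

```python
import logging

LOG = logging.getLogger(__name__)


def run_one_batch(migrations, max_count):
    """Run each migration once; return completed counts and a failure flag."""
    done_counts, failed = {}, False
    for migrate in migrations:
        try:
            _found, done = migrate(max_count)
        except Exception:
            # Log the traceback instead of hiding it behind a generic message.
            LOG.exception("Error attempting to run %s", migrate.__name__)
            done, failed = 0, True
        done_counts[migrate.__name__] = done
    return done_counts, failed


def run_all(migrations, batch_size=50):
    """Loop over batches, accumulating completed counts across every batch."""
    totals = {m.__name__: 0 for m in migrations}
    while True:
        done_counts, failed = run_one_batch(migrations, batch_size)
        for name, done in done_counts.items():
            totals[name] += done
        if not any(done_counts.values()):
            # Last batch did no work: 2 if something is still failing, else 0.
            return totals, (2 if failed else 0)
```

Because the totals are summed over all batches rather than taken from the final one, the summary no longer reports zero when the loop ends on an empty batch.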

Changed in cinder:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (stable/rocky)

Fix proposed to branch: stable/rocky
Review: https://review.openstack.org/614617

OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/rocky)

Reviewed: https://review.openstack.org/611701
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=dd8354efc1113ae9f35e404ef5ece00a78c378b4
Submitter: Zuul
Branch: stable/rocky

commit dd8354efc1113ae9f35e404ef5ece00a78c378b4
Author: imacdonn <email address hidden>
Date: Thu Oct 4 21:27:18 2018 +0000

    Handle online_data_migrations exceptions

    When online_data_migrations raise exceptions, nova/cinder-manage catches
    the exceptions, prints fairly useless "something didn't work" messages,
    and moves on. Two issues:

    1) The user(/admin) has no way to see what actually failed (exception
       detail is not logged)

    2) The command returns exit status 0, as if all possible migrations have
       been completed successfully - this can cause failures to get missed,
       especially if automated

    This change adds logging of the exceptions, and introduces a new exit
    status of 2, which indicates that no updates took effect in the last
    batch attempt, but some are (still) failing, which requires intervention.

    Change-Id: Ib684091af0b19e62396f6becc78c656c49a60504
    Closes-Bug: #1796192
    (cherry picked from commit 3eea37b85b1abc72786a2b24baf01141b4d95f08)

tags: added: in-stable-rocky
Matt Riedemann (mriedem)
Changed in nova:
assignee: Matt Riedemann (mriedem) → iain MacDonnell (imacdonn)
OpenStack Infra (hudson-openstack) wrote : Fix merged to cinder (stable/rocky)

Reviewed: https://review.openstack.org/614617
Committed: https://git.openstack.org/cgit/openstack/cinder/commit/?id=74fd810ad1a8561ba150925d19feb8cbe598fe84
Submitter: Zuul
Branch: stable/rocky

commit 74fd810ad1a8561ba150925d19feb8cbe598fe84
Author: imacdonn <email address hidden>
Date: Wed Oct 17 22:48:18 2018 +0000

    cinder-manage online_data_migrations fixes

    Addresses some issues with this command:

    1) When used without the --max-count option, the summary table will
       always show zero migrations run, because it only accounts for the
       last batch, and the loop only exits when the last batch does no work.

    2) "remaining" counts cannot be accurate, given the way migrations are
       implemented, because the "found" count refers to the number of rows
       that exist in the database, not the number that still need the
       migration applied.

    3) In the case where no migrations are successful, but some raise
       exceptions, the command was exiting with status zero, which usually
       indicates "success". This can cause issues that cause migration
       failures to go unnoticed, especially when automated.

    4) When exceptions do occur, a minimally useful message is output, and
       no detail about the exception is available to the user. The exception
       detail should be logged.

    5) Inaccuracies in the documentation - "--max_number" should be
       "--max-count", and stale references to the "--ignore_state" option,
       which was removed in [1]

    The solution for (3) introduces a new exit status, 2. See release note
    for details.

    These changes are aligned with equivalents [2][3] for the nova-manage
    command, except for the calculation of "Total Needed" - nova seems to
    interpret the "found" count differently/inconsistently.

    [1] https://review.openstack.org/510201
    [2] https://review.openstack.org/605828
    [3] https://review.openstack.org/608091

    Change-Id: I878480eb2359625cde839b073230844acc645cba
    Closes-Bug: #1794364
    Closes-Bug: #1796192
    (cherry picked from commit d47486d317e8cdb1cdab50d73f6484289ab082d4)

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/cinder 13.0.2

This issue was fixed in the openstack/cinder 13.0.2 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 18.1.0

This issue was fixed in the openstack/nova 18.1.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/cinder 14.0.0.0rc1

This issue was fixed in the openstack/cinder 14.0.0.0rc1 release candidate.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 19.0.0.0rc1

This issue was fixed in the openstack/nova 19.0.0.0rc1 release candidate.
