[octane] upgrade on controllers stopped on mysql reason

Bug #1599837 reported by Sergey Abramov
Affects             Status         Importance  Assigned to     Milestone
Fuel for OpenStack  Fix Committed  High        Sergey Abramov
8.0.x               Fix Released   High        Sergey Abramov
Mitaka              Fix Released   High        Sergey Abramov

Bug Description

Detailed bug description:
Upgrade from 7 to 8.
One controller was upgraded successfully.
After upgrading the DB, Ceph and control, the attempt to upgrade the other 2 controllers fails.

possible reason is:
seq 1 3 | xargs -I {} ssh node-{} ps -ax | grep mysql
Warning: Permanently added 'node-1' (ECDSA) to the list of known hosts.
29953 ? S 0:00 /bin/sh /usr/bin/mysqld_safe --pid-file=/var/run/resource-agents/mysql-wss/mysql-wss.pid --socket=/var/run/mysqld/mysqld.sock --datadir=/var/lib/mysql --user=mysql --wsrep-new-cluster
31017 ? Sl 2:15 /usr/sbin/mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib/mysql/plugin --user=mysql --wsrep-new-cluster --open-files-limit=102400 --pid-file=/var/run/resource-agents/mysql-wss/mysql-wss.pid --socket=/var/run/mysqld/mysqld.sock --port=3307 --wsrep_start_position=e64df968-43a2-11e6-93a0-abac947a6ff9:17
31018 ? S 0:00 logger -t mysqld -p daemon.error
Warning: Permanently added 'node-2' (ECDSA) to the list of known hosts.
 1003 ? Sl 5:09 /usr/sbin/mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib/mysql/plugin --user=mysql --init-file=/tmp/wsrep-init-file --wsrep-new-cluster --log-error=/var/log/mysql/error.log --open-files-limit=102400 --pid-file=/var/run/resource-agents/mysql-wss/mysql-wss.pid --socket=/var/run/mysqld/mysqld.sock --port=3307 --wsrep_start_position=00000000-0000-0000-0000-000000000000:-1
31785 ? S 0:00 /bin/sh /usr/bin/mysqld_safe --pid-file=/var/run/resource-agents/mysql-wss/mysql-wss.pid --socket=/var/run/mysqld/mysqld.sock --datadir=/var/lib/mysql --user=mysql --init-file=/tmp/wsrep-init-file --wsrep-new-cluster
Warning: Permanently added 'node-3' (ECDSA) to the list of known hosts.
23012 ? S 0:00 /bin/sh /usr/bin/mysqld_safe --pid-file=/var/run/resource-agents/mysql-wss/mysql-wss.pid --socket=/var/run/mysqld/mysqld.sock --datadir=/var/lib/mysql --user=mysql --init-file=/tmp/wsrep-init-file
24679 ? Sl 2:55 /usr/sbin/mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib/mysql/plugin --user=mysql --init-file=/tmp/wsrep-init-file --log-error=/var/log/mysql/error.log --open-files-limit=102400 --pid-file=/var/run/resource-agents/mysql-wss/mysql-wss.pid --socket=/var/run/mysqld/mysqld.sock --port=3307 --wsrep_start_position=00000000-0000-0000-0000-000000000000:-1

On node-1 and node-2, mysqld was started with the --wsrep-new-cluster option.
MySQL should be started with this option only on the primary controller.
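
The check above can be automated. The following is a minimal sketch, not part of octane: it assumes the `ps -ax` output has already been collected per node (for example over ssh) and reports the nodes where the mysqld process itself, not the mysqld_safe wrapper, carries --wsrep-new-cluster. The function name and the sample data, which is shortened from the listing above, are illustrative only.

```python
def nodes_bootstrapping_cluster(ps_output_by_node):
    """Return sorted node names whose mysqld runs with --wsrep-new-cluster."""
    flagged = []
    for node, output in ps_output_by_node.items():
        for line in output.splitlines():
            fields = line.split()
            # `ps -ax` columns: PID TTY STAT TIME COMMAND [ARGS...]
            if len(fields) < 5:
                continue
            command = fields[4:]
            # Match the server binary only; mysqld_safe lines start with /bin/sh.
            if command[0].endswith("/mysqld") and "--wsrep-new-cluster" in command:
                flagged.append(node)
                break
    return sorted(flagged)


# Sample input shortened from the listing in this report (illustrative only).
SAMPLE_PS_OUTPUT = {
    "node-1": (
        "29953 ? S 0:00 /bin/sh /usr/bin/mysqld_safe --user=mysql --wsrep-new-cluster\n"
        "31017 ? Sl 2:15 /usr/sbin/mysqld --basedir=/usr --user=mysql --wsrep-new-cluster --port=3307\n"
    ),
    "node-2": (
        " 1003 ? Sl 5:09 /usr/sbin/mysqld --basedir=/usr --init-file=/tmp/wsrep-init-file --wsrep-new-cluster --port=3307\n"
    ),
    "node-3": (
        "23012 ? S 0:00 /bin/sh /usr/bin/mysqld_safe --user=mysql --init-file=/tmp/wsrep-init-file\n"
        "24679 ? Sl 2:55 /usr/sbin/mysqld --basedir=/usr --init-file=/tmp/wsrep-init-file --port=3307\n"
    ),
}

if __name__ == "__main__":
    flagged = nodes_bootstrapping_cluster(SAMPLE_PS_OUTPUT)
    print("mysqld started with --wsrep-new-cluster on:", ", ".join(flagged))
    if len(flagged) > 1:
        print("More than one node is bootstrapping a new cluster.")
```

With the sample data this flags node-1 and node-2, which matches the observation in this report.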

Expected results:
all controllers in ready state

Actual result:

upgrading controllers in error state

Workaround:

 Choose a node with the primary controller role as the first controller for a seed environment during the upgrade procedure.
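
The workaround amounts to an ordering rule, which could be sketched as follows. This is an illustration only: the node dictionaries and their "name" and "primary_roles" keys are hypothetical and do not reflect the actual Fuel API output.

```python
def upgrade_order(nodes):
    """Return nodes with the one holding the primary controller role first.

    `nodes` is a list of dicts with hypothetical keys "name" and
    "primary_roles". sorted() is stable, so the remaining nodes keep
    their original relative order.
    """
    # False sorts before True, so nodes that hold the role come first.
    return sorted(nodes, key=lambda n: "controller" not in n["primary_roles"])


# Hypothetical example data.
EXAMPLE_NODES = [
    {"name": "node-2", "primary_roles": []},
    {"name": "node-1", "primary_roles": ["controller"]},
    {"name": "node-3", "primary_roles": []},
]

if __name__ == "__main__":
    for node in upgrade_order(EXAMPLE_NODES):
        print(node["name"])
```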

Ilya Kharin (akscram) wrote :

If a node has primary roles set, they should be cleared during its reassignment. Something like `cls.update_primary_roles(instance, [])` should be added here [1].

[1] https://github.com/openstack/fuel-web/blob/stable/8.0/nailgun/nailgun/objects/node.py#L625
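
To illustrate why clearing matters, here is a simplified, self-contained model of the idea. It is not the nailgun code: the `Node` class, `reassign`, and `primary_holders` are hypothetical stand-ins, and only the suggestion to reset primary roles on reassignment comes from the comment above.

```python
class Node:
    """Hypothetical stand-in for a node record; not nailgun's model."""

    def __init__(self, name, cluster_id, roles, primary_roles=()):
        self.name = name
        self.cluster_id = cluster_id
        self.roles = list(roles)
        self.primary_roles = list(primary_roles)


def reassign(node, seed_cluster_id, clear_primary=True):
    """Move a node to the seed cluster, optionally dropping its primary roles."""
    node.cluster_id = seed_cluster_id
    if clear_primary:
        # Plays the part of the suggested cls.update_primary_roles(instance, []).
        node.primary_roles = []


def primary_holders(nodes, cluster_id, role):
    """Names of nodes in the given cluster that hold `role` as a primary role."""
    return [
        n.name
        for n in nodes
        if n.cluster_id == cluster_id and role in n.primary_roles
    ]


if __name__ == "__main__":
    # The seed cluster (id 2) already has a primary controller.
    seed = Node("node-2", 2, ["controller"], ["controller"])
    # The old cluster's primary controller is about to be reassigned.
    old_primary = Node("node-1", 1, ["controller"], ["controller"])

    reassign(old_primary, 2)
    print(primary_holders([seed, old_primary], 2, "controller"))
```

With `clear_primary=False` the seed cluster would end up with two nodes holding the primary controller role, which is the situation described in this report.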

Changed in fuel:
milestone: none → 9.1
assignee: nobody → Sergey Abramov (sabramov)
description: updated
Changed in fuel:
importance: Undecided → High
status: New → Confirmed
Ilya Kharin (akscram)
Changed in fuel:
milestone: 9.1 → next
no longer affects: fuel/8.0.x
OpenStack Infra (hudson-openstack) wrote : Related fix merged to fuel-web (master)

Reviewed: https://review.openstack.org/339005
Committed: https://git.openstack.org/cgit/openstack/fuel-web/commit/?id=aca06b5296b3c5dc7f3bac3b1d61e3d3ec76c686
Submitter: Jenkins
Branch: master

commit aca06b5296b3c5dc7f3bac3b1d61e3d3ec76c686
Author: Sergey Abramov <email address hidden>
Date: Thu Jul 7 16:22:04 2016 +0300

    Clear primary roles during upgrade of nodes

    This patch clears primary_roles of a node during its reassignment as a
    part of an upgrade. Without this fix it is possible to have several
    primary roles in a seed cluster during the upgrade. This race condition
    leads to some failed deployments during the upgrade. As a result primary
    roles have to be cleared during the reassignment of nodes. That allows
    to assign primary roles on appropriate nodes rely on the internal logic.

    Related-bug: 1599837
    Change-Id: Iae5f3090cfbfd1e6033b94720694a94477562328

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to fuel-web (stable/mitaka)

Related fix proposed to branch: stable/mitaka
Review: https://review.openstack.org/342684

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to fuel-web (stable/8.0)

Related fix proposed to branch: stable/8.0
Review: https://review.openstack.org/342686

OpenStack Infra (hudson-openstack) wrote : Related fix merged to fuel-web (stable/mitaka)

Reviewed: https://review.openstack.org/342684
Committed: https://git.openstack.org/cgit/openstack/fuel-web/commit/?id=69551303f3dbd43a45b8a3641fe0c1c2f752ff72
Submitter: Jenkins
Branch: stable/mitaka

commit 69551303f3dbd43a45b8a3641fe0c1c2f752ff72
Author: Sergey Abramov <email address hidden>
Date: Thu Jul 7 16:22:04 2016 +0300

    Clear primary roles during upgrade of nodes

    This patch clears primary_roles of a node during its reassignment as a
    part of an upgrade. Without this fix it is possible to have several
    primary roles in a seed cluster during the upgrade. This race condition
    leads to some failed deployments during the upgrade. As a result primary
    roles have to be cleared during the reassignment of nodes. That allows
    to assign primary roles on appropriate nodes rely on the internal logic.

    Related-bug: 1599837
    Change-Id: Iae5f3090cfbfd1e6033b94720694a94477562328
    (cherry picked from commit aca06b5296b3c5dc7f3bac3b1d61e3d3ec76c686)

tags: added: in-stable-mitaka
OpenStack Infra (hudson-openstack) wrote : Related fix merged to fuel-web (stable/8.0)

Reviewed: https://review.openstack.org/342686
Committed: https://git.openstack.org/cgit/openstack/fuel-web/commit/?id=1311fda210490b507d5ebd9031332a90e1c6a5da
Submitter: Jenkins
Branch: stable/8.0

commit 1311fda210490b507d5ebd9031332a90e1c6a5da
Author: Sergey Abramov <email address hidden>
Date: Thu Jul 7 16:22:04 2016 +0300

    Clear primary roles during upgrade of nodes

    This patch clears primary_roles of a node during its reassignment as a
    part of an upgrade. Without this fix it is possible to have several
    primary roles in a seed cluster during the upgrade. This race condition
    leads to some failed deployments during the upgrade. As a result primary
    roles have to be cleared during the reassignment of nodes. That allows
    to assign primary roles on appropriate nodes rely on the internal logic.

    Related-bug: 1599837
    Change-Id: Iae5f3090cfbfd1e6033b94720694a94477562328
    (cherry picked from commit aca06b5296b3c5dc7f3bac3b1d61e3d3ec76c686)

Ilya Kharin (akscram)
Changed in fuel:
status: Confirmed → Fix Committed
Dmitry (dtsapikov) wrote :

Verified on MU+3

tags: added: on-verification
tags: removed: on-verification
Vladimir Khlyunev (vkhlyunev) wrote :

Fixed by changing the documentation: upgrading the primary controller now requires the primary controller from the old cluster. Verified (the snapshot does not matter). Cleaning up primary roles is useful and does not break anything (9.1 snapshot 298).
