Fuel for OpenStack

[octane] upgrade on controllers stopped on mysql reason

Bug #1599837 reported by Sergey Abramov on 2016-07-07

This bug affects 1 person

	Status	Importance	Assigned to	Milestone
Fuel for OpenStack	Fix Committed	High	Sergey Abramov	Fuel for OpenStack next
8.0.x	Fix Released	High	Sergey Abramov	Fuel for OpenStack 8.0-mu-3
Mitaka	Fix Released	High	Sergey Abramov	Fuel for OpenStack 9.1

Bug Description

Detailed bug description:
Upgrade from 7 -> 8.
One controller was upgraded successfully.
After the db, ceph and control upgrade steps, the attempt to upgrade the other 2 controllers fails.

Possible reason:
seq 1 3 | xargs -I {} ssh node-{} ps -ax | grep mysql
Warning: Permanently added 'node-1' (ECDSA) to the list of known hosts.
29953 ? S 0:00 /bin/sh /usr/bin/mysqld_safe --pid-file=/var/run/resource-agents/mysql-wss/mysql-wss.pid --socket=/var/run/mysqld/mysqld.sock --datadir=/var/lib/mysql --user=mysql --wsrep-new-cluster
31017 ? Sl 2:15 /usr/sbin/mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib/mysql/plugin --user=mysql --wsrep-new-cluster --open-files-limit=102400 --pid-file=/var/run/resource-agents/mysql-wss/mysql-wss.pid --socket=/var/run/mysqld/mysqld.sock --port=3307 --wsrep_start_position=e64df968-43a2-11e6-93a0-abac947a6ff9:17
31018 ? S 0:00 logger -t mysqld -p daemon.error
Warning: Permanently added 'node-2' (ECDSA) to the list of known hosts.
1003 ? Sl 5:09 /usr/sbin/mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib/mysql/plugin --user=mysql --init-file=/tmp/wsrep-init-file --wsrep-new-cluster --log-error=/var/log/mysql/error.log --open-files-limit=102400 --pid-file=/var/run/resource-agents/mysql-wss/mysql-wss.pid --socket=/var/run/mysqld/mysqld.sock --port=3307 --wsrep_start_position=00000000-0000-0000-0000-000000000000:-1
31785 ? S 0:00 /bin/sh /usr/bin/mysqld_safe --pid-file=/var/run/resource-agents/mysql-wss/mysql-wss.pid --socket=/var/run/mysqld/mysqld.sock --datadir=/var/lib/mysql --user=mysql --init-file=/tmp/wsrep-init-file --wsrep-new-cluster
Warning: Permanently added 'node-3' (ECDSA) to the list of known hosts.
23012 ? S 0:00 /bin/sh /usr/bin/mysqld_safe --pid-file=/var/run/resource-agents/mysql-wss/mysql-wss.pid --socket=/var/run/mysqld/mysqld.sock --datadir=/var/lib/mysql --user=mysql --init-file=/tmp/wsrep-init-file
24679 ? Sl 2:55 /usr/sbin/mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib/mysql/plugin --user=mysql --init-file=/tmp/wsrep-init-file --log-error=/var/log/mysql/error.log --open-files-limit=102400 --pid-file=/var/run/resource-agents/mysql-wss/mysql-wss.pid --socket=/var/run/mysqld/mysqld.sock --port=3307 --wsrep_start_position=00000000-0000-0000-0000-000000000000:-1

On node-1 and node-2, mysqld was started with the --wsrep-new-cluster option.
MySQL should be started with this option only on the primary controller, because the option bootstraps a new Galera cluster.
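
A minimal sketch of the same check in Python, assuming the ps output has already been collected per node. The function name and the abbreviated sample data are illustrative only and are not part of octane:

```python
from typing import Dict, List

BOOTSTRAP_FLAG = "--wsrep-new-cluster"


def nodes_bootstrapping(ps_by_node: Dict[str, str]) -> List[str]:
    """Return the nodes whose mysqld command line carries the bootstrap flag."""
    flagged = []
    for node, ps_output in sorted(ps_by_node.items()):
        for line in ps_output.splitlines():
            fields = line.split()
            # Look only at the server binary itself, not mysqld_safe or logger.
            is_server = any(f.endswith("/mysqld") for f in fields)
            if is_server and BOOTSTRAP_FLAG in fields:
                flagged.append(node)
                break
    return flagged


# Command lines abbreviated from the output above.
sample = {
    "node-1": "31017 ? Sl 2:15 /usr/sbin/mysqld --user=mysql --wsrep-new-cluster --port=3307",
    "node-2": "1003 ? Sl 5:09 /usr/sbin/mysqld --user=mysql --init-file=/tmp/wsrep-init-file --wsrep-new-cluster --port=3307",
    "node-3": "24679 ? Sl 2:55 /usr/sbin/mysqld --user=mysql --init-file=/tmp/wsrep-init-file --port=3307",
}

flagged = nodes_bootstrapping(sample)
print(flagged)
if len(flagged) > 1:
    print("more than one node is bootstrapping a new cluster")
```

With the sample above, node-1 and node-2 are reported, which matches the observation in this report.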

Expected results:
all controllers in ready state

Actual result:
the controllers being upgraded are in error state

Workaround:

Choose a node with the primary controller role as the first controller for a seed environment during the upgrade procedure.

Ilya Kharin (akscram) wrote on 2016-07-07:

Primary roles of a node should be cleared during its reassignment if they are set. Something like `cls.update_primary_roles(instance, [])` should be added here [1].

[1] https://github.com/openstack/fuel-web/blob/stable/8.0/nailgun/nailgun/objects/node.py#L625
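
The suggestion above can be sketched roughly as follows. This is a toy model, not nailgun code: the `Node` dataclass, the `reassign` function and the simplified `update_primary_roles` are assumptions made for illustration. Only the name `update_primary_roles` and the idea of passing an empty list come from the comment.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Toy stand-in for a node record; not the real nailgun model."""
    name: str
    cluster_id: int
    roles: List[str] = field(default_factory=list)
    primary_roles: List[str] = field(default_factory=list)


def update_primary_roles(node: Node, new_primary_roles: List[str]) -> None:
    # Hypothetical simplification of the method named in the comment.
    node.primary_roles = list(new_primary_roles)


def reassign(node: Node, seed_cluster_id: int) -> None:
    """Move a node to the seed cluster, dropping stale primary roles first."""
    update_primary_roles(node, [])
    node.cluster_id = seed_cluster_id


# A controller that was primary in the original cluster (id 1).
node = Node("node-2", cluster_id=1, roles=["controller"],
            primary_roles=["controller"])
reassign(node, seed_cluster_id=2)
print(node.primary_roles, node.cluster_id)
```

After reassignment the node keeps its regular roles but no longer claims to be primary, so the seed cluster can pick its own primary node.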

Changed in fuel:
milestone:	none → 9.1
assignee:	nobody → Sergey Abramov (sabramov)
description:	updated
Changed in fuel:
importance:	Undecided → High
status:	New → Confirmed

Ilya Kharin (akscram) on 2016-07-07

Changed in fuel:
milestone:	9.1 → next
no longer affects:	fuel/8.0.x

OpenStack Infra (hudson-openstack) wrote on 2016-07-15: Related fix merged to fuel-web (master)

Reviewed: https://review.openstack.org/339005
Committed: https://git.openstack.org/cgit/openstack/fuel-web/commit/?id=aca06b5296b3c5dc7f3bac3b1d61e3d3ec76c686
Submitter: Jenkins
Branch: master

commit aca06b5296b3c5dc7f3bac3b1d61e3d3ec76c686
Author: Sergey Abramov <email address hidden>
Date: Thu Jul 7 16:22:04 2016 +0300

Clear primary roles during upgrade of nodes

    This patch clears primary_roles of a node during its reassignment as a
    part of an upgrade. Without this fix it is possible to have several
    primary roles in a seed cluster during the upgrade. This race condition
    leads to some failed deployments during the upgrade. As a result primary
    roles have to be cleared during the reassignment of nodes. That allows
    primary roles to be assigned to appropriate nodes by the internal logic.

Related-bug: 1599837
Change-Id: Iae5f3090cfbfd1e6033b94720694a94477562328

OpenStack Infra (hudson-openstack) wrote on 2016-07-15: Related fix proposed to fuel-web (stable/mitaka)

Related fix proposed to branch: stable/mitaka
Review: https://review.openstack.org/342684

OpenStack Infra (hudson-openstack) wrote on 2016-07-15: Related fix proposed to fuel-web (stable/8.0)

Related fix proposed to branch: stable/8.0
Review: https://review.openstack.org/342686

OpenStack Infra (hudson-openstack) wrote on 2016-07-26: Related fix merged to fuel-web (stable/mitaka)

Reviewed: https://review.openstack.org/342684
Committed: https://git.openstack.org/cgit/openstack/fuel-web/commit/?id=69551303f3dbd43a45b8a3641fe0c1c2f752ff72
Submitter: Jenkins
Branch: stable/mitaka

commit 69551303f3dbd43a45b8a3641fe0c1c2f752ff72
Author: Sergey Abramov <email address hidden>
Date: Thu Jul 7 16:22:04 2016 +0300

Clear primary roles during upgrade of nodes

    Related-bug: 1599837
    Change-Id: Iae5f3090cfbfd1e6033b94720694a94477562328
    (cherry picked from commit aca06b5296b3c5dc7f3bac3b1d61e3d3ec76c686)

tags:	added: in-stable-mitaka

OpenStack Infra (hudson-openstack) wrote on 2016-08-15: Related fix merged to fuel-web (stable/8.0)

Reviewed: https://review.openstack.org/342686
Committed: https://git.openstack.org/cgit/openstack/fuel-web/commit/?id=1311fda210490b507d5ebd9031332a90e1c6a5da
Submitter: Jenkins
Branch: stable/8.0

commit 1311fda210490b507d5ebd9031332a90e1c6a5da
Author: Sergey Abramov <email address hidden>
Date: Thu Jul 7 16:22:04 2016 +0300

Clear primary roles during upgrade of nodes

    Related-bug: 1599837
    Change-Id: Iae5f3090cfbfd1e6033b94720694a94477562328
    (cherry picked from commit aca06b5296b3c5dc7f3bac3b1d61e3d3ec76c686)

Ilya Kharin (akscram) on 2016-08-29

Changed in fuel:
status:	Confirmed → Fix Committed

Dmitry (dtsapikov) wrote on 2016-09-08:

Verified on MU+3

tags:	added: on-verification
tags:	removed: on-verification

Vladimir Khlyunev (vkhlyunev) wrote on 2016-09-23:

Fixed by changing the documentation: upgrading the primary controller now requires the primary controller from the old cluster; verified (the snapshot does not matter). Cleaning up primary roles is useful and does not break anything - 9.1 snapshot 298
