OpenStack DBaaS (Trove)

Allow for invalid packet sequence in keepalive

Bug #1621702 reported by Amrith Kumar on 2016-09-09

This bug affects 1 person

Affects	Status	Importance	Assigned to	Milestone
OpenStack DBaaS (Trove)	Fix Released	High	Peter Stachowski	OpenStack DBaaS (Trove) next

Bug Description

In the SQLAlchemy keep_alive class, connections to MariaDB fail because pymysql reports an invalid packet sequence. MariaDB seems to time out the client differently from MySQL and PXC, and that timeout surfaces as the invalid packet sequence. The error is now handled as a special-case exception.
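
A minimal sketch of the idea, not the merged Trove code: a keepalive check pings the connection and, when the driver reports the packet-sequence error, signals a disconnect so the pool can replace the connection instead of failing. To keep the example self-contained, `DisconnectionError` and `InternalError` are local stand-ins for `sqlalchemy.exc.DisconnectionError` and `pymysql.err.InternalError`. The names `ping_or_disconnect` and `FakeConnection` are hypothetical, and the matched message text is an assumption about what the driver emits.

```python
class DisconnectionError(Exception):
    """Local stand-in for sqlalchemy.exc.DisconnectionError."""


class InternalError(Exception):
    """Local stand-in for pymysql.err.InternalError."""


# Assumed fragment of the driver's error message; verify it against the
# pymysql version actually in use.
STALE_CONNECTION_MARKERS = ("Packet sequence number wrong",)


def ping_or_disconnect(dbapi_con):
    """Ping the connection; turn a stale-connection error into a disconnect.

    Any other InternalError is re-raised unchanged, so genuine failures
    are not hidden by the special case.
    """
    try:
        dbapi_con.ping()
    except InternalError as ex:
        if any(marker in str(ex) for marker in STALE_CONNECTION_MARKERS):
            raise DisconnectionError(str(ex)) from ex
        raise


class FakeConnection:
    """Hypothetical test double that optionally fails on ping()."""

    def __init__(self, error=None):
        self.error = error

    def ping(self):
        if self.error is not None:
            raise self.error
```

Matching on the message text is fragile, which is presumably why the report calls this a special case: only the known timeout symptom is reinterpreted as a dropped connection.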

OpenStack Infra (hudson-openstack) on 2016-09-09

Changed in trove:
assignee:	Peter Stachowski (peterstac) → Amrith (amrith)
status:	New → In Progress

Amrith Kumar (amrith) on 2016-09-09

Changed in trove:
milestone:	none → next
importance:	Undecided → High
status:	In Progress → Confirmed

OpenStack Infra (hudson-openstack) on 2016-09-11

Changed in trove:
status:	Confirmed → In Progress

OpenStack Infra (hudson-openstack) on 2016-09-12

Changed in trove:
assignee:	Amrith (amrith) → Peter Stachowski (peterstac)


OpenStack Infra (hudson-openstack) wrote on 2016-09-14: Fix merged to trove (master)

Reviewed: https://review.openstack.org/362347
Committed: https://git.openstack.org/cgit/openstack/trove/commit/?id=bd761989eead77eead58c91ccb30fcb53d7a5c5d
Submitter: Jenkins
Branch: master

commit bd761989eead77eead58c91ccb30fcb53d7a5c5d
Author: Peter Stachowski <email address hidden>
Date: Mon Aug 29 19:47:47 2016 +0000

Allow for invalid packet sequence in keepalive

    In the SQLAlchemy keep_alive class, MariaDB is failing
    as pymysql reports an invalid packet sequence.
    MariaDB seems to timeout the client in a different
    way than MySQL and PXC, which manifests itself as the
    aforementioned invalid sequence. It is now handled
    as a special-case exception.

    With this fix, the MariaDB scenario tests now pass.

    The scenario tests were also tweaked a bit, which aided
    in the testing of the fix. 'group=instance' was created,
    plus instance_error properly interleaved with
    instance_create. _has_status now calls get_instance with
    the admin client so that any faults are accompanied by
    a relevant stack trace. Cases where the result code
    was being checked out-of-sequence were removed, and explicit
    calls to check the http code for the right client were added.

    The replication error messages for promote and eject were
    enhanced as well to attempt to debug spurious failures.
    One of those failures was 'Replication is not on after 60 seconds.'
    This was fixed by setting 'MASTER_CONNECT_RETRY' in the mariadb
    gtid replication strategy as was done in:
    https://review.openstack.org/#/c/188933

    Finally, backup_incremental was added to MariaDB supported
    groups and cleaned up elsewhere.

Closes-Bug: #1621702
Change-Id: Id6bde5a34e1d79eece3084f761dcd153c38ccbad
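
For readers unfamiliar with the replication setting named in the commit message, the following is a rough sketch of how a MariaDB GTID `CHANGE MASTER TO` statement could carry `MASTER_CONNECT_RETRY`. It is not taken from the Trove source. The function name `build_change_master_cmd`, its parameters, and the 15-second retry value are all illustrative assumptions; `MASTER_CONNECT_RETRY` and `MASTER_USE_GTID` themselves are standard MariaDB options.

```python
def build_change_master_cmd(host, port, user, password, connect_retry=15):
    """Build a MariaDB CHANGE MASTER TO statement for GTID replication.

    connect_retry is the number of seconds the replica waits between
    reconnection attempts; 15 is an illustrative value, not a confirmed
    Trove default. Values are interpolated without escaping, so this is
    only suitable for trusted input.
    """
    return (
        "CHANGE MASTER TO "
        "MASTER_HOST='%(host)s', "
        "MASTER_PORT=%(port)d, "
        "MASTER_USER='%(user)s', "
        "MASTER_PASSWORD='%(password)s', "
        "MASTER_CONNECT_RETRY=%(retry)d, "
        "MASTER_USE_GTID=slave_pos"
    ) % {
        "host": host,
        "port": port,
        "user": user,
        "password": password,
        "retry": connect_retry,
    }


# Example with placeholder values.
cmd = build_change_master_cmd("10.0.0.5", 3306, "repl_user", "secret")
```

A shorter retry interval lets the replica reconnect sooner after a failed attempt, which is consistent with the 'Replication is not on after 60 seconds.' symptom the commit message describes.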

Changed in trove:
status:	In Progress → Fix Released
