OpenStack HA: cmon monitoring script is unable to start MySQL when MySQL goes down after a few days

Bug #1408756 reported by venu kolli
This bug affects 1 person
Affects            Status         Importance  Assigned to    Milestone
Juniper Openstack  Fix Committed  High        Ranjeet R
R2.0               Fix Committed  High        Sanju Abraham
R2.1               Fix Committed  High        Sanju Abraham

Bug Description

In OpenStack HA, the cmon monitoring script is unable to start MySQL when MySQL goes down after a few days.

Issue is observed on R2.0 build 22

More info to be added by ranjeet.

Tags: ha
venu kolli (vkolli) wrote :

Assigning to ranjeet

Changed in juniperopenstack:
assignee: nobody → Ranjeet R (rranjeet-n)
milestone: none → r2.1-fcs
OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/6272
Committed: http://github.org/Juniper/contrail-provisioning/commit/da68ca883b64a60ce1ab0fbfcf669edddb321587
Submitter: Zuul
Branch: R2.1

commit da68ca883b64a60ce1ab0fbfcf669edddb321587
Author: Ranjeet R <email address hidden>
Date: Fri Jan 16 10:28:29 2015 -0800

Fixes: OpenStack HA, cmon monitoring script is unable to start MySQL when MySQL goes down after a few days

Token cleanup and CMON log cleanup are scheduled to run as a
cron job at midnight on all three OpenStack controllers.
If the database is huge, this leads to WSREP lock issues and, in turn,
data inconsistency. WSREP kills MySQL when it detects data inconsistency.

To fix this, we will space out the cleanup job to run every hour.

Change-Id: I8088f18a06959eb2ef53416beb5b0bc29b44da00
Closes-Bug:1408756
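
A rough illustration of why spacing out the cleanup reduces the work per run, assuming expired rows accumulate at a steady rate. The rate used here is made up for illustration; the bug report gives no measured numbers, and the function name is hypothetical.

```python
def rows_per_cleanup(expired_per_hour: int, interval_hours: int) -> int:
    """Rows a single cleanup run must delete if expired rows
    accumulate at a steady rate between runs."""
    return expired_per_hour * interval_hours


# Hypothetical rate, not taken from the affected cluster.
rate = 5000
nightly = rows_per_cleanup(rate, 24)  # one run per day
hourly = rows_per_cleanup(rate, 1)    # one run per hour
print(nightly, hourly)
```

Under this assumption each hourly run deletes 1/24 of what a nightly run would, so each DELETE that the cluster has to replicate is correspondingly smaller.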

Changed in juniperopenstack:
milestone: r2.1-fcs → none
Ranjeet R (rranjeet-n)
Changed in juniperopenstack:
importance: Undecided → High
status: New → Fix Committed
venu kolli (vkolli) wrote :

The fix needs to be revisited; marked as a blocker for 2.1.

Sanju Abraham (asanju) wrote :

Fixed an issue in the way cmon stats, logs and config are captured, stored and purged.

cmon logs all state in the DB, including disk, CPU, memory, table, transaction and monitoring events and logs. These logs grow in size across the cluster as transactions happen on the MySQL database.

The long-term fix addresses this by increasing the collection interval, reducing the purge duration, and truncating the unreported statistics, alarms and logs that are not consumed by any application and not notified.

Fix is committed as part of https://review.opencontrail.org/#/c/7398
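
A minimal sketch of an age-based purge of the kind described above. It is not taken from the committed change: the table name `cmon_log_demo`, the column `created_at`, and the helper `purge_older_than` are all hypothetical, and an in-memory SQLite database stands in for MySQL so the example is self-contained.

```python
import sqlite3


def purge_older_than(conn: sqlite3.Connection, table: str, cutoff_ts: int) -> int:
    """Delete rows whose created_at is older than cutoff_ts.

    The table name is interpolated into the SQL, so only pass trusted names.
    Returns the number of rows deleted.
    """
    cur = conn.execute(f"DELETE FROM {table} WHERE created_at < ?", (cutoff_ts,))
    conn.commit()
    return cur.rowcount


# Hypothetical stand-in for a cmon log table; not the real cmon schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cmon_log_demo (id INTEGER PRIMARY KEY, created_at INTEGER)"
)
conn.executemany(
    "INSERT INTO cmon_log_demo (created_at) VALUES (?)",
    [(t,) for t in (100, 200, 300, 400)],
)

deleted = purge_older_than(conn, "cmon_log_demo", 300)
remaining = conn.execute("SELECT COUNT(*) FROM cmon_log_demo").fetchone()[0]
print(deleted, remaining)
```

Moving the cutoff closer to the present keeps fewer rows in the table, which is one way to read "reducing the purge duration" in the comment above.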

OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/7493
Committed: http://github.org/Juniper/contrail-provisioning/commit/94be5bdd4c4f7c384b7168ae784f71f25d2a08dd
Submitter: Zuul
Branch: R2.0

commit 94be5bdd4c4f7c384b7168ae784f71f25d2a08dd
Author: Ranjeet R <email address hidden>
Date: Mon Feb 16 12:15:15 2015 -0800

Fixes: This change addresses the keystone token and cmon purge issue during scale and failure conditions

Galera cluster was having issues when DELETE calls were made to large tables like CMON.

This fix slows down the log collection and runs the command on only one node.

Change-Id: I77afc8854b8c6c284daa1b812666a57328dc7eca
Close-Bug:1408756
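
One possible way to run a purge on only one node, sketched under assumptions: the commit message does not say how the node is chosen, so the rule used here (the host whose name sorts first) and the function `is_purge_node` are hypothetical, as are the host names.

```python
from typing import Sequence


def is_purge_node(local_host: str, cluster_hosts: Sequence[str]) -> bool:
    """Return True on exactly one cluster member: the one whose
    name sorts first. Every node evaluates the same rule locally,
    so no coordination between nodes is needed."""
    return local_host == min(cluster_hosts)


# Hypothetical controller names.
hosts = ["ctrl2", "ctrl1", "ctrl3"]
for host in hosts:
    if is_purge_node(host, hosts):
        print(f"{host}: run purge")
    else:
        print(f"{host}: skip purge")
```

A static rule like this does not fail over: if the chosen node is down, no purge runs until it returns or the host list is updated.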

OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/7398
Committed: http://github.org/Juniper/contrail-provisioning/commit/c1563d8c9217943b0bed7e97d22d139e79040270
Submitter: Zuul
Branch: R2.1

commit c1563d8c9217943b0bed7e97d22d139e79040270
Author: Sanju Abraham <email address hidden>
Date: Thu Feb 12 19:39:51 2015 -0800

Close-Bug:1408756. This change addresses the keystone token and cmon purge issue during scale and failure conditions

Change-Id: I9429bd75315b165f8b28592723de60da92a137ea

information type: Proprietary → Public