Occasional failure to start mysqd seen in buildbot

Bug #685809 reported by Kristian Nielsen
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
MariaDB
Fix Released
Medium
Kristian Nielsen

Bug Description

Very occasionally, we see the following failure in the "install" tests for
.deb package builds in Buildbot:

Setting up mariadb-server-5.1 (5.1.51-mariadb94) ...
 * Stopping MariaDB database server mysqld
   ...done.
101123 12:02:59 [Note] Plugin 'PBXT' is disabled.
101123 12:02:59 [Note] Plugin 'InnoDB' is disabled.
 * Starting MariaDB database server mysqld
   ...fail!
invoke-rc.d: initscript mysql, action "start" failed.
dpkg: error processing mariadb-server-5.1 (--configure):
 subprocess installed post-installation script returned error exit status 1

It occurs quite rarely, maybe around 1% of runs or less, so quite difficult to
reproduce. Seen in both 5.1 and 5.2.

To proceed further understanding the problem, it would be very useful to have
access to log files after a test run, in particular /var/log/daemon.log in
this case.

One way to do this would be to archive the virtual image after each build or
test run (the image is currently discarded). This requires some disk space,
but should be otherwise reasonable simple to implement.

Revision history for this message
Kristian Nielsen (knielsen) wrote :

One example is here, though this particular link will eventually be purged from the history:

http://buildbot.askmonty.org/buildbot/builders/kvm-deb-karmic-amd64/builds/602/steps/test/logs/stdio

Changed in maria:
importance: Undecided → Medium
milestone: none → 5.1
status: New → Confirmed
Revision history for this message
Kristian Nielsen (knielsen) wrote :

I now managed to repeat it locally (once), by repeatedly re-running the test thousand of times for >12 hours...

Will check logs and see what I come up with.

Changed in maria:
assignee: nobody → Kristian Nielsen (knielsen)
Revision history for this message
Kristian Nielsen (knielsen) wrote :

I found the problem. The cause is high load on the test server causing the
mysqld server to take more than 14 seconds to start.

(This is not unexpected, as eg. innodb has to initialise all its log files and
table space, and the host is running 3+ busy virtual machines, so I/O can
easily be slow at times. It could also happen eg. if innodb would need to do
recovery; not something to be recommended during upgrade, but could
happen. Also if an existing config file with large innodb logs/tablespaces
exists, again initialisation could take a long time.)

The /etc/init.d/mysql script starts the server and waits for 14 seconds for it
to become available. If it does not within this timeout, it gives up with the
given error, causing install to fail.

I am considering two options for solving this:

1. Increase the timeout to much higher, maybe 60 or 120 seconds. Then after 14
seconds we could output a warning that mysql is slow to start, and that we are
still waiting 46 (/106/whatever) seconds for it to come up.

2. Keep the existing timeout by default, add an environment variable
STARTUP_TIMEOUT that the tests can set to get a higher timeout.

A disadvantage of (1) is that it is not uncommon for mysqld to fail to start
at all. Quite common for example is a mis-match of actual innodb log file size
vs. configured size. This would cause the user to sit and wait for 60/120
seconds before getting a reported failure.

Need to think about what to do, however I lean mostly towards (2) I think. It
seems quite rare in practise that mysqld would need more than 14 seconds to
start up.

Revision history for this message
Kristian Nielsen (knielsen) wrote :

Fix pushed, will be in next MariaDB 5.1 and 5.2 release.

Changed in maria:
status: Confirmed → Fix Committed
description: updated
Changed in maria:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.