wsrep never stops when mysqld fails to start
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
MySQL patches by Codership |
Fix Released
|
Medium
|
Alex Yurchenko | ||
codership-maria |
Confirmed
|
Medium
|
Alex Yurchenko |
Bug Description
It seems when mysqld fails to start with ERROR, wsrep can't stop properly.
short version:
libexec/mysqld --defaults-
110607 23:22:38 [ERROR] Can't find messagefile '/usr/local/
....
110607 23:22:44 Percona XtraDB (http://
110607 23:22:44 [ERROR] Aborting
110607 23:22:44 [Note] waiting for 17 wsrep threads to finish
it seems mysqld stays in this state forever without any activity and never exits,
I have to kill `pidof mysqld` to stop process.
Full log:
libexec/mysqld --defaults-
110607 23:22:38 [ERROR] Can't find messagefile '/usr/local/
110607 23:22:38 [Note] WSREP: wsrep_load(): loading provider library '/data/
110607 23:22:38 [Note] WSREP: wsrep_load(): Galera 0.8pre by Codership Oy <email address hidden> loaded succesfully.
110607 23:22:38 [Note] WSREP: wsrep_sst_grab()
110607 23:22:38 [Note] WSREP: Start replication
110607 23:22:38 [Note] WSREP: Passing config to GCS: 'gcs.fc_debug' = '0'
'gcs.fc_factor' = '0.5'
'gcs.fc_limit' = '16'
'gcs.fc_
'gcs.max_
'gcs.max_throttle' = '0.25'
'gcs.recv_
'gcs.recv_
'replicator.
110607 23:22:38 [Warning] WSREP: state file not found: grastate.dat
110607 23:22:38 [Note] WSREP: Setting initial position to 00000000-
110607 23:22:38 [Note] WSREP: protonet asio version 0
110607 23:22:38 [Note] WSREP: backend: asio
110607 23:22:38 [Note] WSREP: GMCast version 0
110607 23:22:38 [Note] WSREP: (b53c588c-
110607 23:22:38 [Note] WSREP: (b53c588c-
110607 23:22:38 [Note] WSREP: EVS version 0
110607 23:22:38 [Note] WSREP: PC version 0
110607 23:22:38 [Note] WSREP: gcomm: connecting to group 'big_monkey', peer ''
110607 23:22:38 [Note] WSREP: GMCast:
} joined {
} left {
} partitioned {
})
110607 23:22:39 [Note] WSREP: gcomm: connected
110607 23:22:39 [Note] WSREP: Changing maximum message size 0 -> 32636
110607 23:22:39 [Note] WSREP: Shifting CLOSED -> OPEN (TO: 0)
110607 23:22:39 [Note] WSREP: Opened channel 'big_monkey'
110607 23:22:39 [Note] WSREP: Waiting for SST to complete.
110607 23:22:39 [Note] WSREP: New COMPONENT: primary = yes, my_idx = 0, memb_num = 1
110607 23:22:39 [Note] WSREP: Starting new group from scratch: b589380a-
110607 23:22:39 [Note] WSREP: STATE_EXCHANGE: sent state UUID: b589538a-
110607 23:22:39 [Note] WSREP: STATE EXCHANGE: sent state msg: b589538a-
110607 23:22:39 [Note] WSREP: STATE EXCHANGE: got state msg: b589538a-
110607 23:22:39 [Note] WSREP: Quorum results:
PRIMARY,
act_id = 0,
conf_id = 0,
last_appl. = -1,
protocol = 0,
group UUID = b589380a-
110607 23:22:39 [Note] WSREP: Flow-control interval: [8, 16]
110607 23:22:39 [Note] WSREP: Restored state OPEN -> JOINED (0)
110607 23:22:39 [Note] WSREP: Member 0 (localhost.
110607 23:22:39 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 0)
110607 23:22:39 [Note] WSREP: New cluster view: group UUID: b589380a-
110607 23:22:39 [Note] WSREP: SST complete, status: 0
InnoDB: The InnoDB memory heap is disabled
InnoDB: Mutexes and rw_locks use GCC atomic builtins
InnoDB: Compressed tables use zlib 1.2.3
110607 23:22:39 InnoDB: Initializing buffer pool, size = 24.0G
110607 23:22:42 InnoDB: Completed initialization of buffer pool
110607 23:22:42 InnoDB: highest supported file format is Barracuda.
InnoDB: The log sequence number in ibdata files does not match
InnoDB: the log sequence number in the ib_logfiles!
110607 23:22:42 InnoDB: Database was not shut down normally!
InnoDB: Starting crash recovery.
InnoDB: Reading tablespace information from the .ibd files...
InnoDB: Restoring possible half-written data pages from the doublewrite
InnoDB: buffer...
110607 23:22:44 Percona XtraDB (http://
110607 23:22:44 [ERROR] Aborting
110607 23:22:44 [Note] waiting for 17 wsrep threads to finish
Changed in codership-maria: | |
status: | New → Confirmed |
affects: | codership-maria → codership-mysql |
Changed in codership-mysql: | |
importance: | Undecided → Medium |
assignee: | nobody → Alex Yurchenko (ayurchen) |
milestone: | none → 0.8.1 |
Changed in codership-maria: | |
status: | New → Confirmed |
importance: | Undecided → Medium |
assignee: | nobody → Alex Yurchenko (ayurchen) |
Changed in codership-mysql: | |
status: | Confirmed → Fix Committed |
Changed in codership-mysql: | |
status: | Fix Committed → Fix Released |
Still the mystery remains. These are the first two lines in the log:
110705 9:08:41 [ERROR] Can't find messagefile '/tmp/galera0/ mysql/share/ mysql/english/ errmsg. sys' galera/ lib/libgalera_ smm.so'
110705 9:08:41 [Note] WSREP: wsrep_load(): loading provider library '/tmp/galera0/
The first comes from mysqld.cc:4942:
if (init_common_ variables( MYSQL_CONFIG_ NAME,
argc, argv, load_default_ groups) ) abort(1) ;
unireg_
and should result in immediate abort. The second, however, comes from mysqld.cc:5040:
if (init_server_ components( )) abort(1) ;
unireg_
which seems to complete successfully:
110705 9:08:42 InnoDB: Initializing buffer pool, size = 595.0M
110705 9:08:42 InnoDB: Completed initialization of buffer pool
110705 9:08:43 InnoDB: Started; log sequence number 6 2033279309
but instead is followed by unireg_abort():
110705 9:08:43 [ERROR] Aborting
What is going on there? What we should be seeing is:
110705 9:08:41 [ERROR] Can't find messagefile '/tmp/galera0/ mysql/share/ mysql/english/ errmsg. sys'
110705 9:08:43 [ERROR] Aborting