Delete table/readd table can cause corruption, was: Testing TPCC-MYSQL get odd db/index coruption
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
WaffleGrid |
In Progress
|
Critical
|
MadMatt |
Bug Description
Everything loads fine...
Memcached Blocks missed 96096 : Total Time: 45469 (ms) : Avg Time : 47 (us)
Memcahced Blocks set norm 1 1001 : Total Time: 508614 (us): Avg Time : 508 (us)
Memcached Blocks set norm 1 96096 : Total Time: 500398 (ms): Avg Time : 520 (us)
090619 9:18:41 InnoDB: Error: space id and page n:o stored in the page
InnoDB: read in are 27:0, should be 27:12618!
InnoDB: Database page corruption on disk or a failed
InnoDB: file read of page 12618.
InnoDB: You may have to recover from a backup.
090619 9:18:41 InnoDB: Page dump in ascii and hex (16384 bytes):
...
Not sure if this is the first get or something else.
Changed in wafflegrid: | |
assignee: | nobody → MadMatt (yonkovim) |
importance: | Undecided → Critical |
status: | New → Confirmed |
MadMatt (yonkovim) wrote : | #1 |
MadMatt (yonkovim) wrote : | #2 |
Verified after a restart that following these steps does not produce the error:
mysql -e"set @@global.
mysql -e"set @@global.
mysql -e"set @@global.
./warmcache.sh
mysql tpcc < count.sql
mysql
./warmcache.sh
---------
MEMCACHED
---------
Memcached puts 170217
Memcached hits 73936
Memcached misses 170228
Memcached Prefix: 27049
Memcached Get Total Lat 85 (us)
Memcached Get Recent Lat 76 (us)
Memcached Miss Total Lat 45 (us)
Memcached Miss Recent Lat 35 (us)
Memcached Set Total Lat 528 (us)
Memcached Set Recent Lat 484 (us)
--------------
So memcached pages are going back and forth...
But as soon as a tpcc:
nohup ./tpcc_start localhost tpcc tpcc tpcc 20 16 20 900 > tpcc_waffle.out 2>&1 &
[2] 27659
root@frankenmat
090619 09:38:47 mysqld_safe mysqld restarted
[2]+ Segmentation fault nohup ./tpcc_start localhost tpcc tpcc tpcc 20 16 20 900 > tpcc_waffle.out 2>&1
root@frankenmat
error:
Memcached Blocks read 71071 : Total Time: 60609 (ms) : Avg Time : 85 (us)
Memcached Blocks read 1001 : Total Time: 82484 (us) : Avg Time : 82 (us)
Memcached Blocks read 72072 : Total Time: 61434 (ms) : Avg Time : 85 (us)
Memcached Blocks read 1001 : Total Time: 76644 (us) : Avg Time : 76 (us)
Memcached Blocks read 73073 : Total Time: 62200 (ms) : Avg Time : 85 (us)
[0x7a5025]
[0x4ce335]
[0x656ac0]
/lib/libc.
[0x403a06b0]
090619 9:38:46 - mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.
key_buffer_
read_buffer_
max_used_
max_threads=151
threads_
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
thd: 0x7fef7805beb0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
MadMatt (yonkovim) wrote : | #3 |
Strange...
I wiped the tpcc database, shutdown mysql, started mysql, reloaded the tpcc db and ran through the same steps... and it works. No error so far.
MadMatt (yonkovim) wrote : | #4 |
<TpmC>
hmmmmm...
Just to be sure this is not something to do with dropping tables:
Turn waffle on...
Run TPCC
Drop Database
load dump
Run tpcc...
root@frankenmat
[2] 29865
root@frankenmat
090619 10:41:59 mysqld_safe mysqld restarted
[2]+ Segmentation fault nohup ./tpcc_start localhost tpcc tpcc tpcc 20 16 20 300 > tpcc_waffle2.out 2>&1
crashed it with:
Memcached (write) set: Block: 6e06:1b:0 : Thread : 1089898832 time:129
Memcached (write) set: Block: 6e06:1b:1 : Thread : 1089898832 time:132
Memcached (write) set: Block: 6e06:1b:2 : Thread : 1089898832 time:126
Memcached (write) set: Block: 6e06:1b:3 : Thread : 1089898832 time:128
[0x7a5025]
[0x4ce335]
[0x656ac0]
[0x42af7d]
[0x50dc21]
[0x4d8b6d]
[0x557966]
[0x55956a]
[0x4e4aae]
[0x4e5e2e]
[0x4d84bd]
[0x65311d]
[0x8796d9]
Memcached get: Block: 6e06:17:3 : Thread : 1080105296 time:394
Memcached get: Block: 6e06:17:1 : Thread : 1080105296 time:128
Memcached get: Block: 6e06:17:13e : Thread : 1080105296 time:125
Memcached get: Block: 6e06:18:3 : Thread : 1080105296 time:141
Memcached get: Block: 6e06:18:1 : Thread : 1080105296 time:118
Memcached get: Block: 6e06:13:3 : Thread : 1080105296 time:147
Memcached get: Block: 6e06:13:1 : Thread : 1080105296 time:120
Memcached get: Block: 6e06:13:3e : Thread : 1080105296 time:118
Memcached get: Block: 6e06:13:5558 : Thread : 1080105296 time:121
Memcached get: Block: 6e06:13:4001 : Thread : 1080105296 time:159
090619 10:41:59 - mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.
key_buffer_
read_buffer_
max_used_
max_threads=151
threads_
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
thd: 0x7f4da80bc450
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x4102af70 thread_stack 0x40000
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort...
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort...
thd->query at 0x7f4da827bec0 is an invalid pointer
thd->thread_id=46
thd->killed=
The manual page at http://
information that should help you find out what is causing the crash.
summary: |
- Testing TPCC-MYSQL get odd db/index coruption + Delete table/readd table can cause corruption, was: Testing TPCC-MYSQL + get odd db/index coruption |
MadMatt (yonkovim) wrote : | #5 |
Ahhhh, this is a problem with a delete and re-add.
To repeat the steps,
Create 2 databases whose combined total fit into memcached, but which only 1 can fit into the internal BP at any one time. Make sure all the data is in memcached. Drop one of the databases. Then load the other database into the internal buffer pool. Then re-add the other database. This will work fine... I can even select data out of the database perfectly fine... But when I startup a dbt2/tpcc it crashes.
[0x7a5025]
[0x4ce335]
[0x656ac0]
/lib/libc.
[0x442ced30]
090619 11:28:42 - mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.
key_buffer_
read_buffer_
max_used_
max_threads=151
threads_
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
thd: 0x3023540
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x442d7f70 thread_stack 0x40000
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort...
thd->query at 0x302e950 = SELECT c_id
FROM customer
WHERE c_w_id = 1
AND c_d_id = 5
AND c_last = 'BARABLEABLE'
ORDER BY c_first ASC
thd->thread_id=121
thd->killed=
MadMatt (yonkovim) wrote : | #6 |
Both LRU and NONLRU Exhibit this issue.
Here is the error from the LRU version:
Memcached get: Block: 26b3:38:32 : Thread : 1150749008 time:325
Memcached set: Block: 26b3:a:6a30 : Thread : 1147554128 time:283
Memcached get: Block: 26b3:3c:1 : Thread : 1147554128 time:198
090619 18:20:45 - mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.
key_buffer_
read_buffer_
max_used_
max_threads=151
threads_
It is poss[0x7a5025]
[0x4ce335]
[0x656ac0]
[0x401214]
[0x557c5d]
[0x55956a]
[0x4e4aae]
[0x4e5e2e]
[0x4d84bd]
[0x65311d]
[0x8796d9]
ible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
thd: 0x7ffa1452f1b0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x447e9f70 thread_stack 0x40000
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort...
thd->query at 0x7ffa103df4d0 is an invalid pointer
thd->thread_id=196
thd->killed=
The manual page at http://
information that should help you find out what is causing the crash.
Memcached set: Block: 26b3:a:6a31 : Thread : 1147554128 time:144
**************
The quick solution here is actually to force file_per_table and then at ever space init increment a prefix to be used with memcached.
This is of course a complete hack.
The real solution is to clear memcached on a table drop... the question is how to effectively do this?
MadMatt (yonkovim) wrote : | #7 |
Looking for an easy place to include or track deleted/reused space id's... Got all the way down to the file level here:
+++++++
/* The tablespace memory cache; also the totality of logs = the log data space,
is stored here; below we talk about tablespaces, but also the ib_logfiles
form a 'space' and it is handled here */
typedef struct fil_system_struct fil_system_t;
struct fil_system_struct {
mutex_t mutex; /* The mutex protecting the cache */
hash_table_t* spaces; /* The hash table of spaces in the
system; they are hashed on the space
id */
hash_table_t* name_hash; /* hash table based on the space
name */
+++++++
Being a un-programer, I was curious as to if the hash table implementation in inno was dynamic ( built on a linked list )...
So I found where this was populated.
+++++++
system->spaces = hash_create(
system->name_hash = hash_create(
+++++++
So they are not. But looking at this made me scratch my head a bit. This means if I am reading this 100% correctly innodb has a fixed number of tables, and the memory footprint is getting pre-allocated to track these. So we could create a hash or an array to tracks tables then right? So lets look at the size
+++++++
fil_init(
/*=====*/
ulint max_n_open) /* in: max number of open files */
{
ulint hash_size;
ut_a(fil_system == NULL);
if (srv_file_
hash_size = 50000;
} else {
hash_size = 5000;
}
fil_system = fil_system_
}
+++++++
So with file per table turned on the hash table is allocated 50K nodes/records. Is an array more efficient then a hash table? I would think so as the array is more primitive, but I am not 100% sure.
I thought about adding the prefix hash/array to the fil_system_struct, however this is not accessed during the normal LRU/get process so we would have to add in more stpes to get it to work. But it seems like its the place it should go. Thoughts? Should we just make this a hash of 50K? and keep consistent? or should we make this an array outside the struct?
NOTE both the LRU and NONLRU code crash after table deletes. It's less likely in the LRU mode, but not all together eliminated ( 33% ).
MadMatt (yonkovim) wrote : | #8 |
I think alter table is also going to be an issue here as well...
This works via:
ha_innobase:
I am trying to trace back from the table create where the spaceid is assigned, but I can not find it:(
MadMatt (yonkovim) wrote : | #9 |
Spaceid's do not appear to be reused! I am verifying now.
MadMatt (yonkovim) wrote : | #10 |
Actually think the problem has to do with this error:
[0x7a5135]
[0x4ce335]
[0x656ac0]
/lib/libc.
[0x44d49cf0]
MadMatt (yonkovim) wrote : | #11 |
ok reproduced the error differently...
imported tpcc...
enabled memcached...
kicked off 1 tpcc run:
./tpcc_start localhost tpcc tpcc tpcc 20 64 10 300
Server: localhost (11211)
pid: 21526
uptime: 2846
time: 1246225919
version: 1.4.0-rc1
pointer_size: 64
rusage_user: 6.24376
rusage_system: 19.433214
curr_items: 246988
total_items: 247112
bytes: 4068732716
curr_connections: 70
total_
connection_
cmd_get: 290855
cmd_set: 247112
get_hits: 147393
get_misses: 143462
evictions: 0
bytes_read: 4068732716
bytes_written: 4068732716
limit_maxbytes: 4294967296
threads: 5
Kill that process
restarted it :
./tpcc_start localhost tpcc tpcc tpcc 20 64 10 300
in new single space create!!! #1 passed in id :1095197368
in new single space create!!! #2 passed in id :1095197368
in new single space create!!! #3 passed in id :1095197368
in space create!!! passed in id :18
in space create!!! Middle in the create id now :18
in space create!!! Lower in the create id now :18
in new single space create!!! #4 end passed in id :1095197368
[0x7a5135]
[0x4ce335]
[0x656ac0]
/lib/libc.
[0x446516b0]
090628 17:52:10 - mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.
key_buffer_
read_buffer_
max_used_
max_threads=151
threads_
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
thd: 0x4391490
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x44654f70 thread_stack 0x40000
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort...
thd->query at 0x4a577a0 = UPDATE customer SET c_balance = c_balance + ? , c_delivery_cnt = c_delivery_cnt + 1 WHERE c_id = ? AND c_d_id = ? AND c_w_id = ?
thd->thread_id=186
thd->killed=
The manual page at http://
information that should help you find out what is causing the crash.
090628 17:52:10 mysqld_safe Number of processes running now: 0
090628 17:52:10 mysqld_safe mysqld restarted
InnoDB: The InnoDB memory heap has been disabled.
InnoDB: Mutex and rw_lock use atomics.
InnoDB: ios_per_array 256 read threads 1 write threads 1
in space create!!! passed in id :0
in space create!!! Middle in the create id now :0
in space create!!! Lower in the create id now :0
in space create!!! passed in id :4294967280
in space create!!! Middle in the create id now :4294967280
in space create!!! Lower in the create id now :4294967280
Need to upgrade t...
MadMatt (yonkovim) wrote : | #12 |
MadMatt (yonkovim) wrote : | #13 |
Have been trying to reproduce with dbt2, to no avail. Also tpcc works without any crash if waffle is disabled. This has me nervous.
MadMatt (yonkovim) wrote : | #14 |
I can confirm sysbench fails now.
in space create!!! Lower in the create id now :24
in new single space create!!! #4 end passed in id :1079176184
[0x7a5135]
[0x4ce335]
[0x656ac0]
/lib/libc.
[0x4021edc0]
090628 20:45:29 - mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.
key_buffer_
read_buffer_
max_used_
max_threads=151
threads_
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_
MadMatt (yonkovim) wrote : | #15 |
Something is really really amiss here....
Could this be a stack problem? using libmemcached 0.30 + memcached 1.2.8 or 1.4.0-rc... mysql 5.4...
for a single threaded oltp test connecting to localhost:
---------
MEMCACHED
---------
Memcached puts 86595
Memcached hits 19595
Memcached misses 86579
Memcached Prefix: 18126
Memcached Get Total Lat 303 (us)
Memcached Get Recent Lat 202 (us)
Memcached Miss Total Lat 115 (us)
Memcached Miss Recent Lat 58 (us)
Memcached Set Total Lat 38043 (us)
Memcached Set Recent Lat 4917 (us)
--------------
5 second set latency... something is terribly terribly wrong here!!!
MadMatt (yonkovim) wrote : | #16 |
Never mind... this was my bug... the set time included the retrieval from disk ... doooh....
MadMatt (yonkovim) wrote : | #17 |
Attaching and error.log with extra debuging... i was trying to see if one of the open connections to memcached caused the issue... does not look like it... here is the code used to generate this:
buf_read_page_low(
/*==============*/
/* out: 1 if a read request was queued, 0 if the page
already resided in buf_pool, or if the page is in
the doublewrite buffer blocks in which case it is never
read into the pool, or if the tablespace does not
exist or is being dropped */
ulint* err, /* out: DB_SUCCESS or DB_TABLESPACE_
trying to read from a non-existent tablespace, or a
tablespace which is just now being dropped */
ibool sync, /* in: TRUE if synchronous aio is desired */
ulint mode, /* in: BUF_READ_
ORed to OS_AIO_
at read-ahead functions) */
ulint space, /* in: space id */
ib_longlong tablespace_version, /* in: if the space memory object has
this timestamp different from what we are giving here,
treat the tablespace as dropped; this is a timestamp we
use to stop dangling page reads from a tablespace
which we have DISCARDed + IMPORTed back */
ulint offset) /* in: page number */
{
buf_block_t* block;
ulint wake_later;
char tmpkey[40];
/* Variables for WaffleGrid, the innodb_buffer_pool layer 2 cache */
MEMC_CONN_T *memct;
ulint sec,fsec,
*err = DB_SUCCESS;
wake_later = mode & OS_AIO_
mode = mode & ~OS_AIO_
if (trx_doublewrite && space == TRX_SYS_SPACE
&& ( (offset >= trx_doublewrite
&& offset < trx_doublewrite
+ TRX_SYS_
|| (offset >= trx_doublewrite
&& offset < trx_doublewrite
+ TRX_SYS_
ut_print_
fprintf(stderr,
" InnoDB: Warning: trying to read"
" doublewrite buffer page %lu\n",
(ulong) offset);
return(0);
}
if (ibuf_bitmap_
/* Trx sys header is so low in the latching order that we play
safe and do not leave the i/o-completion to an asynchronous
i/o-thread. Ibuf bitmap pages must always be read with
syncronous i/o, to make sure they do not get involved in
thread deadlocks. */
sync = TRUE;
}
/* The following call will also check if the tablespace does not exist
or is being dropped; if we succeed in initing the page in the buffer
pool for read, then DISCARD cannot proceed until the read has
completed */
block = buf_page_
offset);
if (block == NULL) {
return(0);
}
#ifdef UNIV_DEBUG
if (buf_debug_prints) {
fprintf(stderr,
"Posting read request for page %lu, sync %lu\n",
(ulong) offset,
(ulong) sync);
}
#endif
ut_a(block->state == BUF_BLOCK_
/* set the block in memcached */
memct = (MEMC_CONN_T*) thr_local_
if (srv_memcached_
Yves Trudeau (y-trudeau) wrote : | #18 |
Bug is always related to page offset 1. At some point, there is a "set" for offset set followed by a get a few operations later that causes the crash.
**Memcached set: Block: 2763:4:1 : Thread : 2973268880 time:326
Memcached set: Block: 2763:4:29e : Thread : 2973268880 time:213
Memcached set: Block: 2763:4:2a0 : Thread : 2973268880 time:226
Memcached get: Block: 2763:4:385 : Thread : 2973268880 time:254
**Memcached get: Block: 2763:4:1 : Thread : 2973268880 time:223
[0x83f4821]
[0x810b62f]
[0xb7ff8420]
...
from buf_page_print, the page seems to be related to the insert buffer.
InnoDB: Page may be an insert buffer bitmap page
Yves Trudeau (y-trudeau) wrote : | #19 |
Easy way to reproduce... With a 8 MB buffer_pool (hacked 5.4 to allow smaller BP), load 100000 rows for the sysbench oltp test. Shutdown and restart the database and try a mysqldump > /dev/null.
Yves Trudeau (y-trudeau) wrote : | #20 |
Excluding insert buffer bitmap pages does not solve the bug. I found a faulty if statement in buf0flu.c:
if (srv_memcached_
it has been replaced by:
if (srv_memcached_
in function buf_flush_
Yves Trudeau (y-trudeau) wrote : | #21 |
Since the following procedure cause corruption....
1. start mysql
2. sysbench prepare
3. restart mysql
4. mysqldump
That point to the fact that the ibdata file is corrupted since so reads are done from memcached. The recent hack added code in buf0flu which handles writes to ibdata. I will focus on buf0flu for now.
Yves Trudeau (y-trudeau) wrote : | #22 |
The LRU code is working from version 2864 of branch lp:~y-trudeau/wafflegrid/cream-5.4-tree
Yves Trudeau (y-trudeau) wrote : | #23 |
The non-LRU code is working with 2864 of branch lp:~y-trudeau/wafflegrid/cream-5.4-tree if mysqld is started with MALLOC_CHECK_=0. Something, somewhere is freed twice, need to find out.
Yves Trudeau (y-trudeau) wrote : | #24 |
The bug is resolved in lp:~y-trudeau/wafflegrid/cream-5.4-tree revision 2865.
Yves Trudeau (y-trudeau) wrote : | #25 |
in lp:~y-trudeau/wafflegrid/cream-5.4-tree revision 2865
Changed in wafflegrid: | |
status: | Confirmed → Fix Released |
MadMatt (yonkovim) wrote : | #26 |
THIS is still not fixed yet. Verified with your most recent revision....
-rwxr-xr-x 1 root root 7.9M 2009-07-05 18:46 /mysql_
090705 18:48:37 [Note] /mysql_
Version: '5.4.2-beta' socket: '/var/run/
[0x7a4d55]
[0x4ce335]
[0x656ac0]
/lib/libc.
/lib/libc.
[0x406d9950]
090705 18:49:10 - mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.
key_buffer_
read_buffer_
max_used_
max_threads=151
threads_
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
thd: 0x43d9940
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x406d8f70 thread_stack 0x40000
Changed in wafflegrid: | |
status: | Fix Released → In Progress |
Yves Trudeau (y-trudeau) wrote : | #27 |
On Matt server, I clean and recreated the datadir and sysbench now runs fine. Maybe a corruption of the ibdata file.
MadMatt (yonkovim) wrote : | #28 |
Yves, still getting crashes...
InnoDB master thread running with io_capacity 800
090707 21:29:43 [Note] Event Scheduler: Loaded 0 events
090707 21:29:43 [Note] /mysql_
Version: '5.4.2-beta' socket: '/var/run/
[0x7a4d55]
[0x4ce335]
[0x656ac0]
/lib/libc.
[0x41697cc0]
090707 21:31:23 - mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.
key_buffer_
read_buffer_
max_used_
max_threads=151
threads_
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
thd: 0x442cae0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x416a0f70 thread_stack 0x40000
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort...
thd->query at 0x4910660 = SELECT count(c_id) FROM customer WHERE c_w_id = ? AND c_d_id = ? AND c_last = ?
thd->thread_id=7
thd->killed=
MadMatt (yonkovim) wrote : | #29 |
Crash happened after this:
Clean datadir...
Set Waffle off
Load tpcc data...
Run tpcc with no waffle
Waffle Enable ...
Run tpcc with waffle...
Waffle Times Unacceptable, 30sec + set times!
Kill tpcc
stop mysql, start mysql with waffle enabled
Runn Tpcc crash!!!! core dump.
[2]+ Segmentation fault (core dumped) ./tpcc_start localhost tpcc tpcc tpcc 20 16 10 600 > waffle_test.out
MadMatt (yonkovim) wrote : | #30 |
Tried this:
Clean datadir...
Set waffle = 0
restart mysql
Load tpcc
shutdown mysql
set waffle = 1
startup mysql
Run tpcc
Crash! same core dump
MadMatt (yonkovim) wrote : | #31 |
Sysbench fails still as well:
shutdown mysql
clean data dir
startup
load sysbench tables
root@frankenmat
sysbench 0.4.12: multi-threaded system evaluation benchmark
No DB drivers specified, using mysql
Running the test with following options:
Number of threads: 16
Doing OLTP test.
Running mixed OLTP test
Doing read-only test
Using Special distribution (12 iterations, 1 pct of values are returned in 75 pct cases)
Using "BEGIN" for starting transactions
Using auto_inc on the id column
Maximum number of requests for OLTP test is limited to 100
Threads started!
ALERT: failed to execute mysql_stmt_
FATAL: database error, exiting...
(last message repeated 1 times)
ALERT: failed to execute mysql_stmt_
ALERT: failed to execute mysql_stmt_
FATAL: database error, exiting...
(last message repeated 2 times)
FATAL: database error, exiting...
(last message repeated 1 times)
ALERT: failed to execute mysql_stmt_
FATAL: database error, exiting...
(last message repeated 1 times)
ALERT: failed to execute mysql_stmt_
(last message repeated 1 times)
FATAL: database error, exiting...
ALERT: failed to execute mysql_stmt_
FATAL: database error, exiting...
ALERT: failed to execute mysql_stmt_
FATAL: database error, exiting...
ALERT: failed to execute mysql_stmt_
(last message repeated 1 times)
FATAL: database error, exiting...
(last message repeated 1 times)
ALERT: failed to execute MySQL query: `BEGIN`:
ALERT: Error 2013 Lost connection to MySQL server during query
FATAL: database error, exiting...
ALERT: failed to execute mysql_stmt_
FATAL: database error, exiting...
(last message repeated 1 times)
ALERT: failed to execute mysql_stmt_
FATAL: database error, exiting...
(last message repeated 1 times)
Done.
seg fault....
This error did occur after I ran a full waffle test without memcached...
I then dropped the TPCC database
I then reimported the tpcc database
root@frankenmat t:/benchmarks/ tpcc-mysql# mysql -e"set @@global. innodb_ memcached_ print_stats= 1" t:/benchmarks/ tpcc-mysql# mysql -e"set @@global. innodb_ memcached_ servers= 'localhost' " t:/benchmarks/ tpcc-mysql# mysql -e"set @@global. innodb_ memcached_ enable= 1"
root@frankenmat
root@frankenmat
I then ran the count.sql to help warm the cache
I then started the tpcc benchmark.
Tpcc crashed mysql right away
This was cream revision 7.