Deadlock when trying to wake up transactions

Bug #310184 reported by Philip Stoev on 2008-12-21
Affects: PBXT
Importance: Undecided
Assigned to: Vladimir Kolesnikov

Bug Description

When executing a concurrent workload, PBXT deadlocked in or around xt_xn_wakeup_transactions and related functions.

To reproduce

bzr branch lp:~randgen/randgen/main

and then execute:

$ perl runall.pl \
   --basedir=/build/mysql-5.1.30 \
   --mysqld=--plugin-dir=/build/pbxt-1.0.06-beta/src/.libs/ \
   --mysqld=--plugin-load=PBXT=libpbxt.so \
   --engine=PBXT \
   --grammar=conf/transactions.yy \
   --gendata=conf/transactions.zz \
   --reporters=Deadlock,Backtrace


Philip Stoev (pstoev) wrote :

Philip,

As far as I understand, you're talking about the engine freezing during test execution. Can you try to reproduce this problem on the latest version? I remember we fixed a similar issue after 1.0.06. I tried the test on the latest lp:pbxt and it works.

Thanks,

Changed in pbxt:
status: New → Incomplete
Philip Stoev (pstoev) wrote :

Yes, you are right: the original deadlock (0% CPU usage) appears to be gone. However, running the test now causes 100% CPU usage on one core, with the rest of the threads blocked. The backtrace of the offending thread is:

#0 0x00000000001673ac in XTTabCache::xt_tc_read_4 (this=0x28e25d0, file=0x7f8ca00008c0, ref_id=218, value=0x7f8cac343618, thread=0x29b7b40)
    at tabcache_xt.cc:278
#1 0x0000000000167e18 in xt_tab_get_row (ot=<value optimized out>, row_id=1, var_rec_id=0x29090f0) at table_xt.cc:3678
#2 0x00000000001685f9 in tab_visible (ot=0x7f8ca0006b50, rec_head=<value optimized out>, new_rec_id=0x7f8cac3436a4) at table_xt.cc:2810
#3 0x000000000016a6ab in xt_tab_seq_next (ot=0x7f8ca0006b50, buffer=0x7f8ca0012e00 "Ы\226", eof=0x7f8cac3436ec) at table_xt.cc:4852
#4 0x000000000014b001 in ha_pbxt::rnd_next (this=0x7f8ca0000ee0, buf=0xd <Address 0xd out of bounds>) at ha_pbxt.cc:3099
#5 0x00000000007027b6 in rr_sequential (info=0x7f8cac3438e0) at records.cc:381
#6 0x00000000006a8113 in mysql_update (thd=0x7f8ca80b4140, table_list=0x7f8ca0001ef0, fields=@0x7f8ca80b60e0, values=@0x7f8ca80b6520, conds=0x0,
    order_num=<value optimized out>, order=0x0, limit=18446744073709551517, handle_duplicates=DUP_ERROR, ignore=false) at sql_update.cc:571
#7 0x00000000006231e5 in mysql_execute_command (thd=0x7f8ca80b4140) at sql_parse.cc:2959
#8 0x00000000006295aa in mysql_parse (thd=0x7f8ca80b4140,
    inBuf=0x7f8ca0006620 "UPDATE `table10_pbxt_int_autoinc` SET `int` = `int` + 30, `int_key` = `int_key` - 30", length=84, found_semicolon=0x7f8cac344fb8)
    at sql_parse.cc:5787
#9 0x000000000062a6a8 in dispatch_command (command=COM_QUERY, thd=0x7f8ca80b4140,
    packet=0x7f8ca80b6b41 " UPDATE `table10_pbxt_int_autoinc` SET `int` = `int` + 30, `int_key` = `int_key` - 30 ", packet_length=<value optimized out>)
    at sql_parse.cc:1200
#10 0x000000000062b4e4 in do_command (thd=0x7f8ca80b4140) at sql_parse.cc:857
#11 0x000000000061c536 in handle_one_connection (arg=<value optimized out>) at sql_connect.cc:1115
#12 0x000000315b0073da in start_thread () from /lib64/libpthread.so.0
#13 0x000000315a4e627d in clone () from /lib64/libc.so.6

Does the test complete successfully for you?

Philip Stoev (pstoev) wrote :

The thread appears to be stuck in this loop forever:

2815 while (var_rec_id != ot->ot_curr_rec_id) {
(gdb)
2816 if (!var_rec_id) {
(gdb)
2822 if (!xt_tab_get_rec_data(ot, var_rec_id, sizeof(XTTabRecHeadDRec), (xtWord1 *) &var_head))
(gdb)
2829 if (XT_REC_IS_CLEAN(var_head.tr_rec_type_1)) {
(gdb)
2835 if (XT_REC_IS_FREE(var_head.tr_rec_type_1)) {
(gdb)
2840 if (invalid_rec != var_rec_id) {
(gdb)
2841 var_rec_id = invalid_rec;
(gdb)
2842 goto retry_3;
(gdb)
2810 if (!(xt_tab_get_row(ot, row_id, &var_rec_id)))
(gdb)
2815 while (var_rec_id != ot->ot_curr_rec_id) {
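
The trace above shows a retry path (line 2842 jumping back to 2810) that never makes progress: the row keeps pointing at a record that is reported as free, so the same steps repeat. As an illustration only, here is a minimal single-threaded sketch of that shape with a bounded retry count. All names here (`Table`, `RecState`, `check_visible`, `max_retries`) are hypothetical and are not PBXT's. This is not the fix in the branch mentioned later in this report, just one common way to keep such a loop from spinning forever.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified model of a table; not PBXT's data structures.
enum class RecState { Clean, Free, Dirty };

struct Table {
    std::vector<RecState> records;  // state of each record, indexed by record id
    uint32_t row_head;              // record id the row currently points at
};

enum class Visibility { Visible, NotVisible, GaveUp };

// Re-reads the row's head record on every attempt. A free head record is
// treated as a stale pointer and triggers a retry, but at most `max_retries`
// times, so a row that keeps pointing at a freed record cannot spin forever.
Visibility check_visible(const Table& t, uint32_t curr_rec_id, int max_retries) {
    assert(max_retries >= 0);
    for (int attempt = 0; attempt <= max_retries; ++attempt) {
        uint32_t rec_id = t.row_head;  // re-read the row pointer
        if (rec_id == curr_rec_id)
            return Visibility::Visible;
        assert(rec_id < t.records.size());
        if (t.records[rec_id] != RecState::Free)
            return Visibility::NotVisible;
        // Head record is free: fall through and retry.
    }
    return Visibility::GaveUp;  // caller can report an error instead of hanging
}
```

In this model, a table whose `row_head` points at a `Free` record returns `GaveUp` after the retry budget is used up, where an unbounded `goto`-style retry would loop indefinitely. In a real engine another thread might repair the row pointer between attempts, which is why retrying a few times can still be reasonable.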

Philip,
On the first machine where I tried it, the test finished successfully, but I have now tried it on another machine and I can see the freeze with 100% CPU utilization.
Thanks for the report.

Changed in pbxt:
assignee: nobody → vkolesnikov
status: Incomplete → Confirmed
Changed in pbxt:
status: Confirmed → In Progress

At the moment, the fix is available at lp:~vkolesnikov/pbxt/pbxt-bug-310184. I will post an update when it is merged into the trunk.

Changed in pbxt:
status: In Progress → Fix Committed