V3 Attempting to rename the ORIGID attribute on a OQGRAPH engine table to 'parent' will cause mysqld to segfault

Bug #1134355 reported by Andrew McDonnell
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
OQGRAPH
Fix Committed
Undecided
OQgraph developers

Bug Description

This segfault is in a different place from #1134338

Given

CREATE TABLE tol_tree (
    latch SMALLINT UNSIGNED NULL,
    origid BIGINT UNSIGNED NULL,
    destid BIGINT UNSIGNED NULL,
    weight DOUBLE NULL,
    seq BIGINT UNSIGNED NULL,
    linkid BIGINT UNSIGNED NULL,
    KEY (latch, origid, destid) USING HASH,
    KEY (latch, destid, origid) USING HASH
  ) ENGINE=OQGRAPH data_table='tol' origid='id' destid='id';

then

alter table tol_tree ORIGID='parent';

will segfault mysqld.

Note, if tol has another not null integer other than parent added to it, the following error is reported instead (and no crash)
ERROR 1036 (HY000): Table '#sql-36e_1' is read only

The only difference I can see is:
| parent | int(10) unsigned | YES | | NULL |
| someother | int(11) | NO | | NULL |

But then adding
| another | int(10) unsigned | YES | | NULL |

Allows the segfault to trigger, so it appears that it only occurs if the type itself matches.

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff7f64700 (LWP 32564)]
0x000000000075c6ca in merge_engine_table_options (first=0x296c888, second=0x28868d0, root=0x7fffd41d2b58) at /home/andrew/develop/maria/repo/andrew-dev/sql/create_options.cc:609
609 for (end= first; end->next; end= end->next) /* no-op */;
(gdb) bt
#0 0x000000000075c6ca in merge_engine_table_options (first=0x296c888, second=0x28868d0, root=0x7fffd41d2b58) at /home/andrew/develop/maria/repo/andrew-dev/sql/create_options.cc:609
#1 0x00000000006bbcc5 in mysql_prepare_alter_table (thd=0x7fffd41cee30, table=0x296e840, create_info=0x7ffff7f62d10, alter_info=0x7ffff7f62e10) at /home/andrew/develop/maria/repo/andrew-dev/sql/sql_table.cc:5640
#2 0x00000000006be42e in mysql_alter_table (thd=0x7fffd41cee30, new_db=0x28868c0 "test", new_name=0x28862a8 "tol_tree", create_info=0x7ffff7f62d10, table_list=0x28862f8, alter_info=0x7ffff7f62e10, order_num=0, order=0x0, ignore=false, require_online=false) at /home/andrew/develop/maria/repo/andrew-dev/sql/sql_table.cc:6503
#3 0x000000000096f863 in Alter_table_statement::execute (this=0x2886900, thd=0x7fffd41cee30) at /home/andrew/develop/maria/repo/andrew-dev/sql/sql_alter.cc:106
#4 0x000000000061c840 in mysql_execute_command (thd=0x7fffd41cee30) at /home/andrew/develop/maria/repo/andrew-dev/sql/sql_parse.cc:4834
#5 0x000000000061f9ec in mysql_parse (thd=0x7fffd41cee30, rawbuf=0x28861e8 "alter table tol_tree ORIGID='parent'", length=36, parser_state=0x7ffff7f63670) at /home/andrew/develop/maria/repo/andrew-dev/sql/sql_parse.cc:6124
#6 0x00000000006129ba in dispatch_command (command=COM_QUERY, thd=0x7fffd41cee30, packet=0x7fffd41d40d1 "alter table tol_tree ORIGID='parent'", packet_length=36) at /home/andrew/develop/maria/repo/andrew-dev/sql/sql_parse.cc:1266
#7 0x0000000000611b35 in do_command (thd=0x7fffd41cee30) at /home/andrew/develop/maria/repo/andrew-dev/sql/sql_parse.cc:982
#8 0x0000000000721b31 in do_handle_one_connection (thd_arg=0x7fffd41cee30) at /home/andrew/develop/maria/repo/andrew-dev/sql/sql_connect.cc:1267
#9 0x00000000007215d9 in handle_one_connection (arg=0x7fffd41cee30) at /home/andrew/develop/maria/repo/andrew-dev/sql/sql_connect.cc:1181
#10 0x00007ffff7bc98ca in start_thread (arg=<value optimized out>) at pthread_create.c:300
#11 0x00007ffff706b92d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#12 0x0000000000000000 in ?? ()

Revision history for this message
Andrew McDonnell (andymc73) wrote :

Update - added a regression / repetition test case to test suite, and this instead now reports an error with junk table names:

mysqltest: At line 39: query 'alter table graph ORIGID = 'something_else'' failed: 1296: Got error -1 'Invalid OQGRAPH backing store ('/#sql-4ad5_2'.origid attribute not set to a valid column of 'graph_b' from OQGRAPH

Maybe hints toward result

summary: - Attempting to rename the ORIGID attribute on a OQGRAPH engine table to
- 'parent' will cause mysqld to segfault
+ V3 Attempting to rename the ORIGID attribute on a OQGRAPH engine table
+ to 'parent' will cause mysqld to segfault
Revision history for this message
Andrew McDonnell (andymc73) wrote :

Crash can still happen. I inserted an alternative statement before the triggering statement and crash happens again.

Extra ino thanks to valgrind:

==28676== Thread 4:
==28676== Invalid read of size 8
==28676== at 0x75C742: merge_engine_table_options(engine_option_value*, engine_option_value*, st_mem_root*) (create_options.cc:609)
==28676== by 0x6BBCC4: mysql_prepare_alter_table(THD*, TABLE*, st_ha_create_information*, Alter_info*) (sql_table.cc:5640)
==28676== by 0x6BE42D: mysql_alter_table(THD*, char*, char*, st_ha_create_information*, TABLE_LIST*, Alter_info*, unsigned int, st_order*, bool, bool) (sql_table.cc:6503)
==28676== by 0x96F8DA: Alter_table_statement::execute(THD*) (sql_alter.cc:106)
==28676== by 0x61C83F: mysql_execute_command(THD*) (sql_parse.cc:4834)
==28676== by 0x61F9EB: mysql_parse(THD*, char*, unsigned int, Parser_state*) (sql_parse.cc:6124)
==28676== by 0x6129B9: dispatch_command(enum_server_command, THD*, char*, unsigned int) (sql_parse.cc:1266)
==28676== by 0x611B34: do_command(THD*) (sql_parse.cc:982)
==28676== by 0x721BA8: do_handle_one_connection(THD*) (sql_connect.cc:1267)
==28676== by 0x721650: handle_one_connection (sql_connect.cc:1181)
==28676== by 0x9C5B45: pfs_spawn_thread (pfs.cc:1800)
==28676== by 0x4E2F8C9: start_thread (pthread_create.c:300)
==28676== by 0x59D992C: clone (clone.S:112)
==28676== Address 0x8f8f8f8f8f8f8faf is not stack'd, malloc'd or (recently) free'd
==28676==

Odd - table->s->option_list is used in the crash area, but we (maybe?) make a list
static const ha_create_table_option oqgraph_table_option_list[]=
{
  HA_TOPTION_STRING("data_table", table_name),
  HA_TOPTION_STRING("origid", origid),
  HA_TOPTION_STRING("destid", destid),
  HA_TOPTION_STRING("weight", weight),
  HA_TOPTION_END
};
...
  hton->table_options= oqgraph_table_option_list;

But later access as a struct ?!? How does that even work?

  oqgraph_table_option_struct *options=
    reinterpret_cast<oqgraph_table_option_struct*>(table->s->option_struct);

SQL core code, crash zone:

  create_info->option_list= merge_engine_table_options(table->s->option_list,
                                        create_info->option_list, thd->mem_root);

-->

  engine_option_value *end, *opt;
  DBUG_ENTER("merge_engine_table_options");
  LINT_INIT(end);

  /* find last element */
  if (first && second)
    for (end= first; end->next; end= end->next) /* no-op */;

Revision history for this message
Andrew McDonnell (andymc73) wrote :

I wondered if it was related to weight column not specified, but no dice

Revision history for this message
Andrew McDonnell (andymc73) wrote :
Download full text (3.5 KiB)

List memory during create table: is as expected

(gdb) print *table_arg->s->option_list
$9 = {<Sql_alloc> = {<No data fields>}, name = {str = 0x226e760 "DATA_TABLE", length = 10}, value = {str = 0x226e770 "graph_base", length = 10}, next = 0x226ffc8, parsed = true, quoted_value = true}
(gdb) print *table_arg->s->option_list->next
$10 = {<Sql_alloc> = {<No data fields>}, name = {str = 0x226e780 "ORIGID", length = 6}, value = {str = 0x226e788 "from_id", length = 7}, next = 0x2270008, parsed = true, quoted_value = true}
(gdb) print *table_arg->s->option_list->next->next
$11 = {<Sql_alloc> = {<No data fields>}, name = {str = 0x226fff8 "DESTID", length = 6}, value = {str = 0x2270000 "to_id", length = 5}, next = 0x2270048, parsed = true, quoted_value = true}
(gdb) print *table_arg->s->option_list->next->next->next
$12 = {<Sql_alloc> = {<No data fields>}, name = {str = 0x2270038 "WEIGHT", length = 6}, value = {str = 0x2270040 "w", length = 1}, next = 0x0, parsed = true, quoted_value = true}
(gdb) print *table_arg->s->option_list->next->next->next->next
Cannot access memory at address 0x0

At first alter table sql_table.cc:5639 - still as expected

(gdb) print *table->s->option_list
$15 = {<Sql_alloc> = {<No data fields>}, name = {str = 0x2275e90 "DATA_TABLE", length = 10}, value = {str = 0x22777a8 "graph_base", length = 10}, next = 0x2277948, parsed = true, quoted_value = true}
(gdb) print *table->s->option_list->next
$16 = {<Sql_alloc> = {<No data fields>}, name = {str = 0x2277938 "ORIGID", length = 6}, value = {str = 0x2277940 "from_id", length = 7}, next = 0x2277988, parsed = true, quoted_value = true}
(gdb) print *table->s->option_list->next->next
$17 = {<Sql_alloc> = {<No data fields>}, name = {str = 0x2277978 "DESTID", length = 6}, value = {str = 0x2277980 "to_id", length = 5}, next = 0x22779c8, parsed = true, quoted_value = true}
(gdb) print *table->s->option_list->next->next->next
$18 = {<Sql_alloc> = {<No data fields>}, name = {str = 0x22779b8 "WEIGHT", length = 6}, value = {str = 0x22779c0 "w", length = 1}, next = 0x0, parsed = true, quoted_value = true}
(gdb) print *table->s->option_list->next->next->next->next
Cannot access memory at address 0x0

At second alter table sql_table.cc:5639 - bogus END!
(gdb) print *table->s->option_list
$23 = {<Sql_alloc> = {<No data fields>}, name = {str = 0x2275e90 "DATA_TABLE", length = 10}, value = {str = 0x22777a8 "graph_base", length = 10}, next = 0x2277948, parsed = true, quoted_value = true}
(gdb) print *table->s->option_list->next
$24 = {<Sql_alloc> = {<No data fields>}, name = {str = 0x2277938 "ORIGID", length = 6}, value = {str = 0x0, length = 7}, next = 0x2277988, parsed = true, quoted_value = true}
(gdb) print *table->s->option_list->next->next
$25 = {<Sql_alloc> = {<No data fields>}, name = {str = 0x2277978 "DESTID", length = 6}, value = {str = 0x2277980 "to_id", length = 5}, next = 0x22779c8, parsed = true, quoted_value = true}
(gdb) print *table->s->option_list->next->next->next
$26 = {<Sql_alloc> = {<No data fields>}, name = {str = 0x22779b8 "WEIGHT", length = 6}, value = {str = 0x22779c0 "w", length = 1}, next = 0x2382cf8, parsed = true, quoted_value = true}
(gdb) print *table->s->option...

Read more...

Revision history for this message
Andrew McDonnell (andymc73) wrote :

More rigorous debugging indicates that address of table->s->option_list is unchanged between calls, so what is corrupting the end list node?

Using watchpoints, I worked out that merge_engine_table_options() adds an extra item to the end of the linked list being used with the replacement but invalid first alter table; then the second alter table that next pointer to the new item points to the same memory which is now filled with junk, so the create_info->option_list passed to merge_engine_table_options() got freed without being removed from the list (!) (it seems)

Revision history for this message
Andrew McDonnell (andymc73) wrote :

(which appears to have been cloned using in -place new, so something clobbered that

Revision history for this message
Andrew McDonnell (andymc73) wrote :

Watchpoint: Memory being clobbered at
#0 memset () at ../sysdeps/x86_64/memset.S:619
#1 0x0000000000daa23a in free_root (root=0x2332158, MyFlags=1) at /home/andrew/develop/maria/repo/oqgraph-varchar/mysys/my_alloc.c:398
#2 0x0000000000613fb9 in dispatch_command (command=COM_QUERY, thd=0x232e430, packet=0x223afc1 "", packet_length=39) at /home/andrew/develop/maria/repo/oqgraph-varchar/sql/sql_parse.cc:1686
#3 0x0000000000611b35 in do_command (thd=0x232e430) at /home/andrew/develop/maria/repo/oqgraph-varchar/sql/sql_parse.cc:982
#4 0x0000000000721ba9 in do_handle_one_connection (thd_arg=0x232e430) at /home/andrew/develop/maria/repo/oqgraph-varchar/sql/sql_connect.cc:1267
#5 0x0000000000721651 in handle_one_connection (arg=0x232e430) at /home/andrew/develop/maria/repo/oqgraph-varchar/sql/sql_connect.cc:1181
#6 0x00000000009c5b46 in pfs_spawn_thread (arg=0x2148b30) at /home/andrew/develop/maria/repo/oqgraph-varchar/storage/perfschema/pfs.cc:1800
#7 0x00007ffff7bc98ca in start_thread (arg=<optimized out>) at pthread_create.c:300
#8 0x00007ffff706b92d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112

Revision history for this message
Andrew McDonnell (andymc73) wrote :

My current synopsis:

During alter table, when altering an option, an extra option record is added to the end of the current option list.
This list node is created in something called 'root' (temporary) memory which is freed on exit from dispatch_command() in sql_parse.cc (which is back up the call stack into the crash point)

If alter table succeeds, I assume the options get merged and the list rejigged and the _old_ option in the list is properly removed and its ancestor next fixed correctly, and the new item moved out of (root) ( or some behaviour similar to this that doesnt cause memory problems)

If alter table fails when the alter consists of changing the value of an option, the merged option list doesnt get rejigged to remove the 'new' option, so when the temp memory is freed the last item in the list next is left pointing to junk on exit form dispatch_command() ...

Revision history for this message
Andrew McDonnell (andymc73) wrote :

First (expected) alter failure happens inside line 6896 (now 6907)
error= copy_data_between_tables(thd, table, new_table,
                                    alter_info->create_list, ignore,
                                    order_num, order, &copied, &deleted,
                                    alter_info->keys_onoff,
                                    alter_info->error_if_not_empty);

Looks like we nee to work out the end of the option list before calling mysql_prepare_alter_table so we can delink on error

Revision history for this message
Andrew McDonnell (andymc73) wrote :

Ultimately fixed, by disallowing alter capability.

Should revisit and file bug on MYSQL core

Changed in oqgraph:
status: New → Fix Committed
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.