node hanging in shutdown

Bug #490749 reported by Seppo Jaakola
This bug affects 1 person
Affects                     Status        Importance  Assigned to    Milestone
MySQL patches by Codership  Fix Released  Low         Seppo Jaakola

Bug Description

3 nodes, testing with RC build (mysql-5.1.39-galera-0.7-x86_64.tgz).
After a 33 node sysbench run, the cluster was shut down one node at a time, and the last node
remained hanging with the following lines in the log:

091201 10:18:26 [Note] Event Scheduler: Purging the queue. 0 events
091201 10:18:26 [Note] wsrep closing connection to cluster
091201 10:18:26 [Note] WSREP: terminating thread
091201 10:18:26 [Note] WSREP: joining thread
091201 10:18:26 [Note] WSREP: closing backend
091201 10:18:29 [Note] WSREP: New COMPONENT: primary = no, my_idx = 0, memb_num = 1
091201 10:18:29 [Note] WSREP: Flow-control interval: [0, 0]
091201 10:18:29 [Note] WSREP: Received NON-PRIMARY.
091201 10:18:29 [Note] WSREP: RECV thread exiting -77: File descriptor in bad state
091201 10:18:29 [Note] WSREP: recv_thread() joined.
091201 10:18:29 [Note] WSREP: Closing slave action queue.
091201 10:18:29 [Note] WSREP: New NON-PRIMARY configuration: -1, seqno: 9774994, group UUID: f9fccb43-de02-11de-0800-2e2658db1d20, members: 1, my idx: 0
091201 10:18:29 [Warning] WSREP: Waiting for 1 items to be fetched.
091201 10:18:29 [Note] WSREP: New cluster view: group UUID: f9fccb43-de02-11de-0800-2e2658db1d20, conf# -1: non-Primary, number of nodes: 1, my index: 0, first seqno: 9774995
091201 10:18:32 [Note] WSREP: gcs_recv() returned 0 (Success)
091201 10:18:32 [Note] wsrep recv thread exiting (code:5)
091201 10:18:32 [Note] wsrep starting shutdown
091201 10:18:32 [Note] WSREP: Closed GCS connection

At this point the wsrep system threads (rollbacker and slave applier) were still running.

Changed in codership-mysql:
status: New → Confirmed
importance: Undecided → Low
milestone: none → 0.7.1
Seppo Jaakola (seppo-jaakola) wrote :

The following stack trace was observed while the shutdown was hanging:

(gdb) t 2
[Switching to thread 2 (Thread 0x7f91a2bf3950 (LWP 5013))]#0 0x00007f91c61fa2e9 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
(gdb) bt
#0 0x00007f91c61fa2e9 in pthread_cond_wait@@GLIBC_2.3.2 ()
   from /lib/libpthread.so.0
#1 0x00007f91a36f97de in gu_fifo_close (q=0x35de520) at gu_fifo.c:177
#2 0x00007f91a3b6f0db in gcs_close (conn=0x3571208) at gcs.c:844
#3 0x00007f91a3fa3c46 in mm_galera_disconnect (gh=<value optimized out>,
    app_uuid=0x0, app_seqno=0x0) at mm_galera.c:443
#4 0x000000000061bd56 in close_connections () at mysqld.cc:982
#5 0x000000000061c2d3 in kill_server (sig_ptr=0x0) at mysqld.cc:1202
#6 0x000000000061c31f in kill_server_thread (arg=0x7f91a41ed108)
    at mysqld.cc:1233
#7 0x00007f91c61f63ba in start_thread () from /lib/libpthread.so.0
#8 0x00007f91c5687fcd in clone () from /lib/libc.so.6
#9 0x0000000000000000 in ?? ()
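
The backtrace shows the shutdown thread blocked in a condition wait inside gu_fifo_close, and the log reports "Waiting for 1 items to be fetched". The following is a minimal sketch of that general pattern, assuming a queue whose close() waits until consumers have drained it. It is not the Galera gu_fifo code: the class DrainingQueue, its methods, and the timeout used to stand in for the hang are all hypothetical.

```python
import threading
from collections import deque


class DrainingQueue:
    """Hypothetical queue whose close() waits until every item is fetched."""

    def __init__(self):
        self._items = deque()
        self._cond = threading.Condition()
        self._closed = False

    def put(self, item):
        with self._cond:
            if self._closed:
                raise RuntimeError("queue is closed")
            self._items.append(item)
            self._cond.notify_all()

    def get(self):
        with self._cond:
            # Block until there is an item or the queue has been closed.
            while not self._items and not self._closed:
                self._cond.wait()
            if not self._items:
                return None  # closed and empty
            item = self._items.popleft()
            self._cond.notify_all()  # wake a close() that waits for the drain
            return item

    def close(self, timeout=None):
        """Return True once drained, False if the timeout expires first."""
        with self._cond:
            self._closed = True
            self._cond.notify_all()
            return self._cond.wait_for(lambda: not self._items, timeout)


def demo():
    # No consumer is left: close() cannot finish. The timeout is only here
    # so that the sketch terminates; without it the call would block forever.
    stuck = DrainingQueue()
    stuck.put("action")
    hung = not stuck.close(timeout=0.2)

    # A live consumer fetches the item, so close() returns.
    healthy = DrainingQueue()
    healthy.put("action")
    consumer = threading.Thread(target=healthy.get)
    consumer.start()
    drained = healthy.close(timeout=5.0)
    consumer.join()
    return hung, drained
```

In this sketch, a close() without a timeout depends entirely on some consumer still fetching items, which matches the symptom of one unfetched item keeping the shutdown thread asleep.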

Changed in codership-mysql:
status: Confirmed → In Progress
assignee: nobody → Seppo Jaakola (seppo-jaakola)
Seppo Jaakola (seppo-jaakola) wrote :

This issue seems to be a symptom of lp:498798: the threads list was emptied by mistake,
so the wsrep system threads were no longer accessible during shutdown processing.
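
To illustrate the failure mode described in this comment, here is a rough sketch assuming that shutdown stops threads by walking a shared list. It does not reproduce the MySQL thread list or the fix for lp:498798. ThreadRegistry and all of its methods are hypothetical names, and the worker threads simply wait on a stop event.

```python
import threading


class ThreadRegistry:
    """Hypothetical stand-in for a server-wide list of running threads."""

    def __init__(self):
        self._lock = threading.Lock()
        self._entries = []  # (thread, stop_event) pairs

    def spawn(self, name):
        stop = threading.Event()
        # The worker does nothing except wait to be told to stop.
        thread = threading.Thread(target=stop.wait, name=name)
        thread.start()
        with self._lock:
            self._entries.append((thread, stop))
        return thread, stop

    def clear(self):
        # Models the list being emptied by mistake.
        with self._lock:
            self._entries.clear()

    def shutdown(self, timeout):
        """Signal every thread still listed; return the names that stopped."""
        with self._lock:
            entries = list(self._entries)
        stopped = []
        for thread, stop in entries:
            stop.set()
            thread.join(timeout)
            if not thread.is_alive():
                stopped.append(thread.name)
        return stopped


def demo():
    registry = ThreadRegistry()
    handles = [registry.spawn("rollbacker"), registry.spawn("applier")]
    registry.clear()  # the mistake: shutdown can no longer find the threads
    stopped = registry.shutdown(timeout=0.2)
    still_running = sorted(t.name for t, _ in handles if t.is_alive())
    # Clean up so the sketch itself does not hang on exit.
    for thread, stop in handles:
        stop.set()
        thread.join()
    return stopped, still_running
```

Once the list is empty, shutdown() has nothing to signal, so both workers keep running even though shutdown reports no error.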

Changed in codership-mysql:
status: In Progress → Fix Committed
Changed in codership-mysql:
status: Fix Committed → Fix Released