Comment 9 for bug 1269842

Ales Perme (ales-perme) wrote :

I can successfully transfer 50 million records, but I hit a wall at 60 million with this message in the mysql.err log:

2014-01-20 08:57:04 195615 [ERROR] WSREP: Maximum wirteset size exceeded by 129676357: 90 (Message too long)
         at galera/src/write_set_ng.hpp:check_size():652
2014-01-20 08:57:04 195615 [ERROR] WSREP: unknown connection failure

The good news is that the server does not crash.

The thread, however, hangs on the master server even after I kill it:
mysql> INSERT INTO docStatsDetail_2013 SELECT * FROM docStatsDetail WHERE date>='2013-01-01' AND date <='2013-12-31';

There seems to be no I/O on disk, so the query looks like it is in a "hanging" state:
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r b swpd free buff cache si so bi bo in cs us sy id wa
 3 0 19716 1235680 184652 8389168 0 0 0 33 2168 1605 5 0 95 0
 1 0 19716 1235132 184652 8389220 0 0 32 1 1665 994 6 0 94 0
 2 0 19716 1234988 184660 8389236 0 0 16 25 2240 1245 7 0 93 0
 1 0 19716 1234956 184660 8389292 0 0 68 5 2231 1324 6 0 94 0
 2 0 19716 1233892 184660 8389312 0 0 16 84 2520 1839 6 0 94 0
 0 0 19716 1233312 184660 8389352 0 0 184 13 1474 930 4 0 96 0
 2 0 19716 1232740 184660 8389556 0 0 0 1 1905 1043 5 0 95 0

I will leave it "hanging" for, say, 4 hours and see what happens.

The table, however, looks "operational" (i.e. not locked):
mysql> show processlist;
+------+-------------+-----------+------------+---------+-------+---------------------------+------------------------------------------------------------------------------------------------------+-----------+---------------+
| Id | User | Host | db | Command | Time | State | Info | Rows_sent | Rows_examined |
+------+-------------+-----------+------------+---------+-------+---------------------------+------------------------------------------------------------------------------------------------------+-----------+---------------+
| 1 | system user | | NULL | Sleep | 43605 | wsrep aborter idle | NULL | 0 | 0 |
| 2 | system user | | NULL | Sleep | 2609 | committed 4951387 | NULL | 0 | 0 |
| 3 | system user | | NULL | Sleep | 2609 | committed 4951394 | NULL | 0 | 0 |
| 4 | system user | | NULL | Sleep | 2609 | committed 4951389 | NULL | 0 | 0 |
| 5 | system user | | NULL | Sleep | 2609 | committed 4951393 | NULL | 0 | 0 |
| 6 | system user | | NULL | Sleep | 2609 | committed 4951392 | NULL | 0 | 0 |
| 7 | system user | | NULL | Sleep | 2609 | committed 4951390 | NULL | 0 | 0 |
| 8 | system user | | NULL | Sleep | 2609 | committed 4951388 | NULL | 0 | 0 |
| 9 | system user | | NULL | Sleep | 2609 | committed 4951391 | NULL | 0 | 0 |
| 2854 | root | localhost | cAnalytics | Query | 11794 | wsrep in pre-commit stage | INSERT INTO docStatsDetail_2013 SELECT * FROM docStatsDetail WHERE date>='2013-01-01' AND date <='20 | 0 | 60440137 |
| 3936 | root | localhost | cAnalytics | Query | 18 | Sending data | select count(*) from docStatsDetail_2013 | 0 | 0 |
| 3937 | root | localhost | cAnalytics | Query | 0 | init | show processlist | 0 | 0 |
+------+-------------+-----------+------------+---------+-------+---------------------------+------------------------------------------------------------------------------------------------------+-----------+---------------+

The count of that empty table (docStatsDetail_2013) takes 4 minutes. Perhaps PXC is performing a rollback. Let's see.

mysql> select count(*) from docStatsDetail_2013;
+----------+
| count(*) |
+----------+
| 0 |
+----------+
1 row in set (4 min 2.73 sec)

At least the server is still running and the cluster is up, which is a great improvement. Will keep you posted.
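
One possible way to stay under the writeset limit, untested here, would be to split the single INSERT ... SELECT into smaller date ranges so that each statement commits as its own, smaller transaction. The sketch below only generates the statements, one per calendar month. The helper name monthly_inserts is hypothetical, it assumes autocommit is on, and whether one month of rows actually fits under the limit depends on the data. It mirrors the date comparison used in the original statement, so it assumes a DATE column.

```python
import calendar


def monthly_inserts(year, src="docStatsDetail", dst="docStatsDetail_2013", col="date"):
    """Return one INSERT ... SELECT statement per calendar month of `year`.

    Hypothetical helper: table and column names default to the ones used in
    this report; nothing here talks to the database.
    """
    statements = []
    for month in range(1, 13):
        # monthrange() returns (weekday of first day, number of days in month)
        last_day = calendar.monthrange(year, month)[1]
        start = f"{year}-{month:02d}-01"
        end = f"{year}-{month:02d}-{last_day:02d}"
        statements.append(
            f"INSERT INTO {dst} SELECT * FROM {src} "
            f"WHERE {col}>='{start}' AND {col}<='{end}';"
        )
    return statements


if __name__ == "__main__":
    # Print the statements so they can be reviewed before being run by hand.
    for stmt in monthly_inserts(2013):
        print(stmt)
```

The printed statements could then be pasted into the mysql client one at a time, checking the error log after each.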