Percona Server 5.7 gives many Valgrind bugs on startup

Bug #1641511 reported by Roel Van de Paar
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Percona Server moved to https://jira.percona.com/projects/PS
Status tracked in 5.7
5.6
New
Undecided
Unassigned
5.7
Incomplete
High
Unassigned

Bug Description

Many errors, including these which I have not seen before yet;

/git/PS-5.7.16_val/storage/innobase/mem/mem0mem.cc:448: 0x1db0ea18[312] unwritable at 8
/git/PS-5.7.16_val/storage/innobase/mem/mem0mem.cc:448: 0x8c27a18[216] unwritable at 8
/git/PS-5.7.16_val/storage/innobase/mem/mem0mem.cc:448: 0x8c27818[216] unwritable at 8
/git/PS-5.7.16_val/storage/innobase/mem/mem0mem.cc:448: 0x8c27718[216] unwritable at 8
/git/PS-5.7.16_val/storage/innobase/data/data0data.cc:248: 0x97ffe08[13] undefined at 8
/git/PS-5.7.16_val/storage/innobase/data/data0data.cc:248: 0x97ffe08[13] undefined at 8
/git/PS-5.7.16_val/storage/innobase/mem/mem0mem.cc:448: 0x1db0e598[328] unwritable at 8

[...]

==22131== More than 1000 different errors detected. I'm not reporting any more.
==22131== Final error counts will be inaccurate. Go fix your program!
==22131== Rerun with --error-limit=no to disable this cutoff. Note
==22131== that errors may occur in your program without prior warning from
==22131== Valgrind, because errors are no longer being displayed.

2016-11-14T06:26:28.592037Z 0 [Note] /sda/PS141116-percona-server-5.7.16-10-linux-x86_64-val/bin/mysqld: Shutdown complete

==22131==
==22131== HEAP SUMMARY:
==22131== in use at exit: 176 bytes in 2 blocks
==22131== total heap usage: 10,512 allocs, 10,510 frees, 37,206,339 bytes allocated
==22131==
==22131== LEAK SUMMARY:
==22131== definitely lost: 0 bytes in 0 blocks
==22131== indirectly lost: 0 bytes in 0 blocks
==22131== possibly lost: 0 bytes in 0 blocks
==22131== still reachable: 176 bytes in 2 blocks
==22131== suppressed: 0 bytes in 0 blocks
==22131== Rerun with --leak-check=full to see details of leaked memory
==22131==
==22131== For counts of detected and suppressed errors, rerun with: -v
==22131== ERROR SUMMARY: 13517 errors from 1000 contexts (suppressed: 0 from 0)

Revision history for this message
Roel Van de Paar (roel11) wrote :

To reproduce;

cd ~
git clone https://github.com/Percona-QA/percona-qa.git
cd /tmp
git clone --recursive https://github.com/percona/percona-server.git --depth=1 -b release-5.7.16-10 PS-5.7.16
cd PS-5.7.16
~/percona-qa/build_5.x_valgrid.sh
mv from_location to_location
cd to_location
~/percona-qa/startup.sh
./start_valgrind
./cl # try few times till you get into client (Valgrind is slow)
./stop
# wait for Valgrind to finish (ps -ef | grep valgrind)
vi log/master.err

Revision history for this message
Roel Van de Paar (roel11) wrote :
Changed in percona-server:
status: New → Confirmed
importance: Undecided → High
assignee: nobody → Laurynas Biveinis (laurynas-biveinis)
Revision history for this message
Laurynas Biveinis (laurynas-biveinis) wrote :

Unable to reproduce. If I understand the provided reproduction scenario, the warnings show up on server startup with specific server args. I am unable to reproduce with such args.

One thing which would help is reducing the reproduction steps from "git clone ... ".

Another is system details: is jemalloc used? Which version?

Finally, while the bug is logged for some errors which are not the first ones in the log, have you seen the very first errors before?

Revision history for this message
Shahriyar Rzayev (rzayev-sehriyar) wrote :
Download full text (3.3 KiB)

Hi dear all,
OS -> Ubuntu 16.04

After replaying steps provided from error log it is clear that the command used to start server was:

1. git clone --recursive https://github.com/percona/percona-server.git --depth=1 -b release-5.7.16-10 PS-5.7.16

2. sh@sh-ubuntu:~/REPOS/PS-5.7.16$ ~/REPOS/percona-qa/build_5.7_valgrind.sh .
CPack: - package: /home/sh/REPOS/PS-5.7.16_val/percona-server-5.7.16-10-linux-x86_64.tar.gz generated.
Done! Now run;
mv ../.-percona-server-5.7.16-10-linux-x86_64-val /sda

3. sh@sh-ubuntu:~/REPOS/PS-5.7.16$ sudo mv ../.-percona-server-5.7.16-10-linux-x86_64-val /sda

4. sh@sh-ubuntu:/sda$ ~/REPOS/percona-qa/startup.sh 0

5. sh@sh-ubuntu:/sda$ ./start_valgrind
Server socket: /sda/socket.sock with datadir: /sda/data

/sda/bin/mysqld --no-defaults --core-file --innodb_buffer_pool_size=2147483648 --basedir=/sda --tmpdir=/sda/data --datadir=/sda/data --plugin-load=tokudb=ha_tokudb.so --socket=/sda/socket.sock --port=16094 --log-error=/sda/log/master.err

At the very end:

==27666== More than 1000 different errors detected. I'm not reporting any more.
==27666== Final error counts will be inaccurate. Go fix your program!
==27666== Rerun with --error-limit=no to disable this cutoff. Note
==27666== that errors may occur in your program without prior warning from
==27666== Valgrind, because errors are no longer being displayed.
==27666==
2016-11-15T13:29:13.108995Z 0 [Note] Recovering after a crash using tc.log
2016-11-15T13:29:13.114812Z 0 [Note] Starting crash recovery...
2016-11-15T13:29:13.121831Z 0 [Note] Crash recovery finished.
/home/sh/REPOS/PS-5.7.16_val/storage/innobase/mem/mem0mem.cc:448: 0xb87d8018[16136] unwritable at 8
2016-11-15T13:29:13.657643Z 0 [Note] Found ca.pem, server-cert.pem and server-key.pem in data directory. Trying to enable SSL support using them.
2016-11-15T13:29:13.658415Z 0 [Note] Skipping generation of SSL certificates as certificate files are present in data directory.
2016-11-15T13:29:13.862998Z 0 [Warning] CA certificate ca.pem is self signed.
2016-11-15T13:29:13.877278Z 0 [Note] Skipping generation of RSA key pair as key files are present in data directory.
2016-11-15T13:29:13.896760Z 0 [Note] Server hostname (bind-address): '*'; port: 16094
2016-11-15T13:29:13.905115Z 0 [Note] IPv6 is available.
2016-11-15T13:29:13.983818Z 0 [Note] - '::' resolves to '::';
2016-11-15T13:29:13.999497Z 0 [Note] Server socket created on IP: '::'.
/home/sh/REPOS/PS-5.7.16_val/storage/innobase/mem/mem0mem.cc:448: 0xb87f0a18[312] unwritable at 8
/home/sh/REPOS/PS-5.7.16_val/storage/innobase/mem/mem0mem.cc:448: 0x7839e18[216] unwritable at 8
/home/sh/REPOS/PS-5.7.16_val/storage/innobase/mem/mem0mem.cc:448: 0x7839c18[216] unwritable at 8
/home/sh/REPOS/PS-5.7.16_val/storage/innobase/mem/mem0mem.cc:448: 0x7839b18[216] unwritable at 8
/home/sh/REPOS/PS-5.7.16_val/storage/innobase/data/data0data.cc:248: 0xcfffd18[13] undefined at 8
/home/sh/REPOS/PS-5.7.16_val/storage/innobase/data/data0data.cc:248: 0xcfffd18[13] undefined at 8
/home/sh/REPOS/PS-5.7.16_val/storage/innobase/mem/mem0mem.cc:448: 0xb87f0598[328] unwritable at 8
2016-11-15T13:29:17.297196Z 0 [Note] Event Scheduler: Loaded 0 events
2016-11-15T13:29:17.32...

Read more...

Revision history for this message
Roel Van de Paar (roel11) wrote :

@Laurynas the issue is now reproduced on Centos 7 as well as Ubuntu 16.04. Should be easy to replicate.

Revision history for this message
Laurynas Biveinis (laurynas-biveinis) wrote :

Jemalloc status?

Have you seen the *first* Valgrind errors in the error log before?

Revision history for this message
Laurynas Biveinis (laurynas-biveinis) wrote :

Can you also please provide a log of your ./startup.sh run? (most interestingly ./init and ./tokutek_init)

Revision history for this message
Roel Van de Paar (roel11) wrote :

Discussed with Laurynas on slack

Revision history for this message
Roel Van de Paar (roel11) wrote :
Revision history for this message
Laurynas Biveinis (laurynas-biveinis) wrote :

Unable to reproduce. Since the errors are shared with bug 1542234, please confirm jemalloc use and version at the systems you reproduced this on (ideally on all the different ones).

Here is the outline of my reproduction steps, based on your scripts
1. git checkout release-5.7.16.10
2. out-of-source build with cmake ../mysql-server/ -DWITH_ZLIB=system -DCMAKE_BUILD_TYPE=Debug
-DBUILD_CONFIG=mysql_release -DFEATURE_SET=community
-DDEBUG_EXTNAME=OFF -DWITH_EMBEDDED_SERVER=OFF -DENABLE_DOWNLOADS=1
-DDOWNLOAD_BOOST=1 -DWITH_BOOST=~/mysql-boost -DWITH_SSL=system
-DWITH_PAM=ON -DWITH_VALGRIND=ON
3. mkdir /tmp/valgrind && cd /tmp/valgrind
4. ~/obj-percona-5.7-valgrind/sql/mysqld --no-defaults --initialize-insecure --basedir=. --datadir=./data
5. ~/obj-percona-5.7-valgrind/sql/mysqld --core-file --general_log=1 --general_log_file=/tmp/valgrind/general.log --basedir=/tmp/valgrind --tmpdir=/tmp/valgrind/data --datadir=/tmp/valgrind/data --plugin-dir=/home/laurynas/obj-percona-5.7-valgrind/storage/tokudb --plugin-load-add=tokudb=ha_tokudb.so --tokudb-check-jemalloc=0 --socket=/tmp/valgrind/socket.sock --port=16264 --log-error=/tmp/valgrind/error.log
6. ~/obj-percona-5.7-valgrind/client/mysql -u root -S/tmp/valgrind/socket.sock
mysql> INSTALL PLUGIN tokudb_file_map SONAME 'ha_tokudb.so';
Query OK, 0 rows affected (0.26 sec)

mysql> INSTALL PLUGIN tokudb_fractal_tree_info SONAME 'ha_tokudb.so';
Query OK, 0 rows affected (0.07 sec)

mysql> INSTALL PLUGIN tokudb_fractal_block_map SONAME 'ha_tokudb.so';
ERROR 1127 (HY000):
mysql> INSTALL PLUGIN tokudb_fractal_tree_block_map SONAME 'ha_tokudb.so';
Query OK, 0 rows affected (0.01 sec)

mysql> INSTALL PLUGIN tokudb_trx SONAME 'ha_tokudb.so';
Query OK, 0 rows affected (0.02 sec)

mysql> INSTALL PLUGIN tokudb_locks SONAME 'ha_tokudb.so';
Query OK, 0 rows affected (0.02 sec)

mysql> INSTALL PLUGIN tokudb_lock_waits SONAME 'ha_tokudb.so';
Query OK, 0 rows affected (0.02 sec)

mysql> exit
7. ~/obj-percona-5.7-valgrind/client/mysqladmin shutdown -uroot -S/tmp/valgrind/socket.sock shutdown
8. valgrind ~/obj-percona-5.7-valgrind/sql/mysqld --core-file --basedir=/tmp/valgrind --tmpdir=/tmp/valgrind/data --datadir=/tmp/valgrind/data --plugin-dir=/home/laurynas/obj-percona-5.7-valgrind/storage/tokudb --plugin-load-add=tokudb=ha_tokudb.so --tokudb-check-jemalloc=0 --socket=/tmp/valgrind/socket.sock --port=16264 --log-error=/tmp/valgrind/error.log

Your valgrind errors should happen at step eight, but I don't see them. I don't use jemalloc, will retry once you confirm the version.

Revision history for this message
Shahriyar Rzayev (rzayev-sehriyar) wrote :

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PS-1029

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.