garbd crashes on CentOS 6

Bug #1283100 reported by Alan Ivey on 2014-02-21
| Affects | Status | Importance | Assigned to | Milestone |
|---|---|---|---|---|
| Galera (status tracked in 3.x) | | | | |
| 2.x | | Medium | Alex Yurchenko | |
| 3.x | | Medium | Alex Yurchenko | |
| Percona XtraDB Cluster (moved to https://jira.percona.com/projects/PXC; status tracked in 5.6) | | | | |
| 5.5 | Fix Released | Undecided | Unassigned | |
| 5.6 | Fix Released | Undecided | Unassigned | |

Bug Description

I've tried this on both Digital Ocean and Storm on Demand.

```
$ yum install http://mirrors.ptd.net/epel/6/i386/epel-release-6-8.noarch.rpm
$ yum install http://www.percona.com/downloads/percona-release/percona-release-0.0-1.x86_64.rpm
$ yum install Percona-XtraDB-Cluster-galera-56
Installed:
  Percona-XtraDB-Cluster-galera-3.x86_64 0:3.3-1.207.rhel6

Dependency Installed:
  nc.x86_64 0:1.84-22.el6

$ garbd --address=gcomm://10.38.2.132:4567 --group=galera_cluster
2014-02-21 10:20:20.135 INFO: CRC-32C: using "slicing-by-8" algorithm.
2014-02-21 10:20:20.135 INFO: Read config:
 daemon: 0
 name: garb
 address: gcomm://10.38.2.132:4567
 group: galera_cluster
 sst: trivial
 donor:
 options: gcs.fc_limit=9999999; gcs.fc_factor=1.0; gcs.fc_master_slave=yes
 cfg:
 log:

2014-02-21 10:20:20.136 INFO: protonet asio version 0
2014-02-21 10:20:20.136 INFO: Using CRC-32C (optimized) for message checksums.
2014-02-21 10:20:20.137 INFO: backend: asio
2014-02-21 10:20:20.138 INFO: GMCast version 0
terminate called after throwing an instance of 'gu::NotSet'
```

From what I was able to find, 'gu::NotSet' may have to do with glibc, libc, or libstdc++. I have the following installed and the operating system is fully updated.

```
$ rpm -qa|grep -E '(libstd|libc)'|sort
glibc-2.12-1.132.el6.x86_64
glibc-common-2.12-1.132.el6.x86_64
glibc-devel-2.12-1.132.el6.x86_64
glibc-headers-2.12-1.132.el6.x86_64
libcap-2.16-5.5.el6.x86_64
libcap-ng-0.6.4-3.el6_0.1.x86_64
libcom_err-1.41.12-18.el6.x86_64
libcurl-7.19.7-37.el6_4.x86_64
libstdc++-4.4.7-4.el6.x86_64
$ yum -y update
Loaded plugins: downloadonly, fastestmirror
Loading mirror speeds from cached hostfile
 * base: centos.mirror.nac.net
 * epel: mirror.es.its.nyu.edu
 * extras: centos.someimage.com
 * updates: centos.mirror.constant.com
Setting up Update Process
No Packages marked for Update
```

I have not tried other platforms as we only run CentOS 6.

Alan Ivey (alanivey) wrote :

Bug 1281956 might be related, fwiw.

Rick Pizzi (pizzi) wrote :

I found the reason. It has NOTHING to do with glibc.

Using gdb I traced the uncaught exception back to gcomm/src/gmcast.cpp, line 146:

    catch (gu::NotSet&)
    {
        // if no listen port is set for listen address in the options,
        // see if base port was configured
        try
        {
            // this call fails and throws the NotSet exception
            port = conf_.get(BASE_PORT_KEY);
        }
        catch (gu::NotFound&)
        {
            // if no base port configured, try port from the connection address
            try { port = uri_.get_port(); } catch (gu::NotSet&) {}
        }

        listen_addr_ += ":" + port;
    }

To make it work, simply specify the listen address in the options, e.g.:

garbd -a gcomm://a.b.c.d:4567 -g clustername -o gmcast.listen_addr=tcp://10.10.10.10:5674

Backtrace:

#0 0x00007ffff67b6389 in raise () from /usr/lib/libc.so.6
#1 0x00007ffff67b7788 in abort () from /usr/lib/libc.so.6
#2 0x00007ffff70a0625 in __gnu_cxx::__verbose_terminate_handler() () from /usr/lib/libstdc++.so.6
#3 0x00007ffff709e786 in ?? () from /usr/lib/libstdc++.so.6
#4 0x00007ffff709e7b3 in std::terminate() () from /usr/lib/libstdc++.so.6
#5 0x00007ffff709e9f2 in __cxa_throw () from /usr/lib/libstdc++.so.6
#6 0x0000000000439e72 in gu::Config::get (this=<optimized out>, key=...) at galerautils/src/gu_config.hpp:126
#7 0x000000000044df18 in gcomm::GMCast::GMCast (this=0x7cbeb0, net=..., uri=...) at gcomm/src/gmcast.cpp:142
#8 0x0000000000468a49 in gcomm::PC::PC (this=0x7c8000, net=..., uri=...) at gcomm/src/pc.cpp:216
#9 0x000000000043d27e in gcomm::Transport::create (pnet=..., uri=...) at gcomm/src/transport.cpp:72
#10 0x000000000042dbc7 in GCommConn::connect (this=this@entry=0x7c61b0, channel=..., bootstrap=bootstrap@entry=false) at gcs/src/gcs_gcomm.cpp:219
#11 0x0000000000428b1b in gcomm_open (backend=<optimized out>, channel=<optimized out>, bootstrap=false) at gcs/src/gcs_gcomm.cpp:702
#12 0x0000000000420364 in gcs_core_open (core=0x78b1e0, channel=channel@entry=0x7889c8 "galera_cluster", url=url@entry=0x788ad8 "gcomm://127.0.0.1:4567", bstrap=bstrap@entry=false) at gcs/src/gcs_core.c:196
#13 0x000000000041a396 in gcs_open (conn=0x78b000, channel=0x7889c8 "galera_cluster", url=0x788ad8 "gcomm://127.0.0.1:4567", bootstrap=<optimized out>) at gcs/src/gcs.c:1272
#14 0x000000000040bf42 in garb::Gcs::Gcs (this=0x7fffffffb9b0, gconf=..., address=..., group=...) at garb/garb_gcs.cpp:26
#15 0x0000000000410670 in garb::RecvLoop::RecvLoop (this=0x7fffffffb970, config=...) at garb/garb_recv_loop.cpp:27
#16 0x0000000000409886 in garb::main (argc=<optimized out>, argv=<optimized out>) at garb/garb_main.cpp:88
#17 0x000000000040997d in main (argc=<optimized out>, argv=<optimized out>) at garb/garb_main.cpp:100

It actually crashes here:

    try
    {
        listen_addr_ = uri_.get_option (Conf::GMCastListenAddr);
    }

(the line numbers are a bit misleading due to C++ constructors)

Alan Ivey (alanivey) wrote :

Adding gmcast.listen_addr works great, including in the Red Hat/CentOS /etc/sysconfig/garb. Thank you!
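
For reference, a minimal sketch of what that could look like in /etc/sysconfig/garb. It assumes the variable names GALERA_NODES, GALERA_GROUP and GALERA_OPTIONS from the packaged file (check the comments in your installed copy before relying on them). The addresses are the example values from this report and from the workaround above, not values to copy verbatim.

```shell
# /etc/sysconfig/garb (sketch; variable names are assumed from the packaged file)

# Cluster node(s) to connect to, as address[:port]
GALERA_NODES="10.38.2.132:4567"

# Must match the cluster name used by the other nodes
GALERA_GROUP="galera_cluster"

# Setting the listen address explicitly avoids the gu::NotSet abort.
# 10.10.10.10:5674 is only the example address from the workaround above;
# use an address and port that belong to the garbd host.
GALERA_OPTIONS="gmcast.listen_addr=tcp://10.10.10.10:5674"
```

After editing the file, garbd has to be restarted for the option to take effect (on CentOS 6 this is presumably done through the packaged init script).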

Percona now uses JIRA for bug reports, so this bug report has been migrated to: https://jira.percona.com/browse/PXC-1628
