unit: galera-cluster/1: hook failed: "cluster-relation-changed" - _mysql_exceptions.OperationalError: (2002, "Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (111)"

Bug #1501947 reported by Larry Michel
12
This bug affects 1 person
Affects Status Importance Assigned to Milestone
galera-cluster (Juju Charms Collection)
New
Undecided
Unassigned

Bug Description

We are attempting to deploy galera-cluster with 3 units and were are seeing cluster-relation-changed hook failure:

This is from console:

2015-10-01 14:16:09,841 [DEBUG] deployer.env: Delta machine: 6 change:pending
2015-10-01 14:16:14,844 [DEBUG] deployer.env: Delta machine: 6 change:pending
2015-10-01 14:16:14,853 [DEBUG] deployer.env: Delta machine: 6 change:started
2015-10-01 14:16:19,855 [DEBUG] deployer.env: Delta unit: galera-cluster/1 change:pending
2015-10-01 14:16:19,861 [DEBUG] deployer.env: Delta unit: cinder/0 change:pending
2015-10-01 14:16:19,875 [DEBUG] deployer.env: Delta unit: galera-cluster/1 change:pending
2015-10-01 14:16:24,873 [DEBUG] deployer.env: Delta unit: galera-cluster/1 change:pending
2015-10-01 14:16:24,887 [DEBUG] deployer.env: Delta unit: galera-cluster/1 change:started
2015-10-01 14:16:29,899 [DEBUG] deployer.env: Delta unit: galera-cluster/1 change:started
2015-10-01 14:16:34,910 [DEBUG] deployer.env: Delta service: cinder change:{u'Err': None, u'Since': u'2015-10-01T14:16:32.781987295Z', u'Current': u'maintenance', u'Version': u'', u'Message': u'installing charm software', u'Data': {}}
2015-10-01 14:16:34,910 [DEBUG] deployer.env: Delta unit: cinder/0 change:pending
2015-10-01 14:16:39,903 [DEBUG] deployer.env: Delta unit: cinder/0 change:pending
2015-10-01 14:16:44,897 [DEBUG] deployer.env: Delta unit: galera-cluster/2 change:pending
2015-10-01 14:16:44,909 [DEBUG] deployer.env: Delta unit: galera-cluster/2 change:pending
2015-10-01 14:16:44,912 [DEBUG] deployer.env: Delta unit: galera-cluster/0 change:pending
2015-10-01 14:16:49,925 [DEBUG] deployer.env: Delta unit: galera-cluster/2 change:pending
2015-10-01 14:16:49,959 [DEBUG] deployer.env: Delta unit: galera-cluster/2 change:started
2015-10-01 14:16:49,959 [DEBUG] deployer.env: Delta unit: galera-cluster/0 change:pending
2015-10-01 14:16:54,967 [DEBUG] deployer.env: Delta service: galera-cluster change:{u'Err': None, u'Since': u'2015-10-01T14:16:21.044218775Z', u'Current': u'unknown', u'Version': u'', u'Message': u'', u'Data': {}}
2015-10-01 14:16:54,968 [DEBUG] deployer.env: Delta unit: galera-cluster/0 change:started
2015-10-01 14:16:59,985 [DEBUG] deployer.env: Delta unit: galera-cluster/1 change:started
2015-10-01 14:17:04,980 [DEBUG] deployer.env: Delta machine: 4 change:pending
2015-10-01 14:17:05,006 [DEBUG] deployer.env: Delta machine: 4 change:started
2015-10-01 14:17:05,007 [DEBUG] deployer.env: Delta unit: galera-cluster/1 change:started
2015-10-01 14:17:10,029 [DEBUG] deployer.env: Delta unit: galera-cluster/1 change:started
2015-10-01 14:17:10,036 [DEBUG] deployer.env: Delta unit: nova-cloud-controller/0 change:pending
2015-10-01 14:17:20,038 [DEBUG] deployer.env: Delta service: galera-cluster change:{u'Err': None, u'Since': u'2015-10-01T14:17:18.375558508Z', u'Current': u'error', u'Version': u'', u'Message': u'hook failed: "cluster-relation-changed"', u'Data': {u'hook': u'cluster-relation-changed', u'remote-unit': u'galera-cluster/0', u'relation-id': 0}}
2015-10-01 14:17:20,039 [DEBUG] deployer.env: Delta unit: galera-cluster/1 change:error
2015-10-01 14:17:20,040 [ERROR] deployer.env: The following units had errors:
   unit: galera-cluster/1: machine: 2 agent-state: error details: hook failed: "cluster-relation-changed"
2015-10-01 14:17:20,045 [ERROR] oil_ci.deploy.oil_deployer: Deployment failed:

This is from all machines log:

unit-galera-cluster-1[22017]: 2015-10-01 14:17:17 INFO worker.uniter.jujuc server.go:158 running hook tool "juju-log" ["-l" "DEBUG" "Using existing password file '/var/lib/charm/galera-cluster/mysql.passwd'"]
unit-galera-cluster-1[22017]: 2015-10-01 14:17:17 DEBUG worker.uniter.jujuc server.go:159 hook context id "galera-cluster/1-cluster-relation-changed-7631812583627038628"; dir "/var/lib/juju/agents/unit-galera-cluster-1/charm"
unit-galera-cluster-1[22017]: 2015-10-01 14:17:17 DEBUG unit.galera-cluster/1.juju-log server.go:254 cluster:0: Using existing password file '/var/lib/charm/galera-cluster/mysql.passwd'
unit-galera-cluster-1[22017]: 2015-10-01 14:17:17 INFO worker.uniter.jujuc server.go:158 running hook tool "juju-log" ["-l" "DEBUG" "Opening db connection for root@localhost"]
unit-galera-cluster-1[22017]: 2015-10-01 14:17:17 DEBUG worker.uniter.jujuc server.go:159 hook context id "galera-cluster/1-cluster-relation-changed-7631812583627038628"; dir "/var/lib/juju/agents/unit-galera-cluster-1/charm"
unit-galera-cluster-1[22017]: 2015-10-01 14:17:17 DEBUG unit.galera-cluster/1.juju-log server.go:254 cluster:0: Opening db connection for root@localhost
unit-galera-cluster-1[22017]: 2015-10-01 14:17:17 INFO unit.galera-cluster/1.cluster-relation-changed logger.go:40 Traceback (most recent call last):
unit-galera-cluster-1[22017]: 2015-10-01 14:17:17 INFO unit.galera-cluster/1.cluster-relation-changed logger.go:40 File "/var/lib/juju/agents/unit-galera-cluster-1/charm/hooks/cluster-relation-changed", line 455, in <module>
unit-galera-cluster-1[22017]: 2015-10-01 14:17:17 INFO unit.galera-cluster/1.cluster-relation-changed logger.go:40 main()
unit-galera-cluster-1[22017]: 2015-10-01 14:17:17 INFO unit.galera-cluster/1.cluster-relation-changed logger.go:40 File "/var/lib/juju/agents/unit-galera-cluster-1/charm/hooks/cluster-relation-changed", line 449, in main
unit-galera-cluster-1[22017]: 2015-10-01 14:17:17 INFO unit.galera-cluster/1.cluster-relation-changed logger.go:40 hooks.execute(sys.argv)
unit-galera-cluster-1[22017]: 2015-10-01 14:17:17 INFO unit.galera-cluster/1.cluster-relation-changed logger.go:40 File "/var/lib/juju/agents/unit-galera-cluster-1/charm/hooks/charmhelpers/core/hookenv.py", line 544, in execute
unit-galera-cluster-1[22017]: 2015-10-01 14:17:17 INFO unit.galera-cluster/1.cluster-relation-changed logger.go:40 self._hooks[hook_name]()
unit-galera-cluster-1[22017]: 2015-10-01 14:17:17 INFO unit.galera-cluster/1.cluster-relation-changed logger.go:40 File "/var/lib/juju/agents/unit-galera-cluster-1/charm/hooks/cluster-relation-changed", line 204, in cluster_changed
unit-galera-cluster-1[22017]: 2015-10-01 14:17:17 INFO unit.galera-cluster/1.cluster-relation-changed logger.go:40 config_changed()
unit-galera-cluster-1[22017]: 2015-10-01 14:17:17 INFO unit.galera-cluster/1.cluster-relation-changed logger.go:40 File "/var/lib/juju/agents/unit-galera-cluster-1/charm/hooks/cluster-relation-changed", line 172, in config_changed
unit-galera-cluster-1[22017]: 2015-10-01 14:17:17 INFO unit.galera-cluster/1.cluster-relation-changed logger.go:40 change_sst_password(config('sst-password'))
unit-galera-cluster-1[22017]: 2015-10-01 14:17:17 INFO unit.galera-cluster/1.cluster-relation-changed logger.go:40 File "/var/lib/juju/agents/unit-galera-cluster-1/charm/hooks/galera_utils.py", line 171, in change_sst_password
unit-galera-cluster-1[22017]: 2015-10-01 14:17:17 INFO unit.galera-cluster/1.cluster-relation-changed logger.go:40 m_helper.connect(password=m_helper.get_mysql_root_password())
unit-galera-cluster-1[22017]: 2015-10-01 14:17:17 INFO unit.galera-cluster/1.cluster-relation-changed logger.go:40 File "/var/lib/juju/agents/unit-galera-cluster-1/charm/hooks/charmhelpers/contrib/database/mysql.py", line 63, in connect
unit-galera-cluster-1[22017]: 2015-10-01 14:17:17 INFO unit.galera-cluster/1.cluster-relation-changed logger.go:40 passwd=password)
unit-galera-cluster-1[22017]: 2015-10-01 14:17:17 INFO unit.galera-cluster/1.cluster-relation-changed logger.go:40 File "/usr/lib/python2.7/dist-packages/MySQLdb/__init__.py", line 81, in Connect
unit-galera-cluster-1[22017]: 2015-10-01 14:17:17 INFO unit.galera-cluster/1.cluster-relation-changed logger.go:40 return Connection(*args, **kwargs)
unit-galera-cluster-1[22017]: 2015-10-01 14:17:17 INFO unit.galera-cluster/1.cluster-relation-changed logger.go:40 File "/usr/lib/python2.7/dist-packages/MySQLdb/connections.py", line 187, in __init__
unit-galera-cluster-1[22017]: 2015-10-01 14:17:17 INFO unit.galera-cluster/1.cluster-relation-changed logger.go:40 super(Connection, self).__init__(*args, **kwargs2)
unit-galera-cluster-1[22017]: 2015-10-01 14:17:17 INFO unit.galera-cluster/1.cluster-relation-changed logger.go:40 _mysql_exceptions.OperationalError: (2002, "Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (111)")
unit-galera-cluster-1[22017]: 2015-10-01 14:17:17 INFO juju.worker.uniter.context context.go:543 handling reboot
unit-galera-cluster-1[22017]: 2015-10-01 14:17:17 ERROR juju.worker.uniter.operation runhook.go:103 hook "cluster-relation-changed" failed: exit status 1
unit-galera-cluster-1[22017]: 2015-10-01 14:17:17 DEBUG juju.worker.uniter modes.go:31 [AGENT-STATUS] failed: run relation-changed (0; galera-cluster/0) hook
unit-nova-compute-0[22013]: 2015-10-01 14:17:17 INFO unit.nova-compute/0.install logger.go:40 Selecting previously unselected package libxfixes3:amd64.
unit-nova-compute-0[22013]: 2015-10-01 14:17:17 INFO unit.nova-compute/0.install logger.go:40 Preparing to unpack .../libxfixes3_1%3a5.0.1-1ubuntu1.1_amd64.deb ...
unit-nova-compute-0[22013]: 2015-10-01 14:17:17 INFO unit.nova-compute/0.install logger.go:40 Unpacking libxfixes3:amd64 (1:5.0.1-1ubuntu1.1) ...
unit-galera-cluster-1[22017]: 2015-10-01 14:17:18 INFO juju.worker.uniter modes.go:543 ModeAbide exiting
unit-galera-cluster-1[22017]: 2015-10-01 14:17:18 INFO juju.worker.uniter modes.go:541 ModeHookError starting

and from juju_status.yaml:

  galera-cluster:
    charm: local:trusty/galera-cluster-45
    exposed: false
    relations:
      cluster:
      - galera-cluster
    service-status:
      current: error
      message: 'hook failed: "cluster-relation-changed"'
      since: 01 Oct 2015 14:17:18Z
    units:
      galera-cluster/0:
        agent-state: started
        agent-status:
          current: executing
          message: running start hook
          since: 01 Oct 2015 14:16:49Z
          version: 1.24.6
        agent-version: 1.24.6
        machine: '1'
        open-ports:
        - 3306/tcp
        public-address: flaglervm04.oilstaging
        workload-status:
          current: unknown
          since: 01 Oct 2015 14:16:49Z
      galera-cluster/1:
        agent-state: error
        agent-state-info: 'hook failed: "cluster-relation-changed"'
        agent-status:
          current: idle
          since: 01 Oct 2015 14:17:18Z
          version: 1.24.6
        agent-version: 1.24.6
        machine: '2'
        open-ports:
        - 3306/tcp
        public-address: flaglervm02.oilstaging
        workload-status:
          current: error
          message: 'hook failed: "cluster-relation-changed"'
          since: 01 Oct 2015 14:17:18Z
      galera-cluster/2:
        agent-state: started
        agent-status:
          current: executing
          message: running cluster-relation-changed hook
          since: 01 Oct 2015 14:17:31Z
          version: 1.24.6
        agent-version: 1.24.6
        machine: '3'
        open-ports:
        - 3306/tcp
        public-address: flaglervm01.oilstaging
        workload-status:
          current: unknown
          since: 01 Oct 2015 14:16:46Z

Tags: oil
Revision history for this message
Larry Michel (lmic) wrote :
summary: - unit: galera-cluster/1: machine: 2 agent-state: error details: hook
- failed: "cluster-relation-changed"
+ unit: galera-cluster/1: hook failed: "cluster-relation-changed" -
+ _mysql_exceptions.OperationalError: (2002, "Can't connect to local MySQL
+ server through socket '/var/run/mysqld/mysqld.sock' (111)"
Revision history for this message
Larry Michel (lmic) wrote :

Attaching bundle file.

Revision history for this message
Philip Stoev (philip-stoev-f) wrote :

The server on unit-galera-cluster-1 failed to start correctly:

unit-galera-cluster-1[22017]: 2015-10-01 14:17:12 INFO unit.galera-cluster/1.cluster-relation-changed logger.go:40 Starting MySQL
unit-galera-cluster-1[22017]: 2015-10-01 14:17:17 INFO unit.galera-cluster/1.cluster-relation-changed logger.go:40 ..... * The server quit without updating PID file (/var/lib/mysql/flaglervm02.pid).
unit-galera-cluster-1[22017]: 2015-10-01 14:17:17 INFO unit.galera-cluster/1.cluster-relation-changed logger.go:40 * Failed to restart server.

Will it be possible to obtain the mysqld.err log file from that machine and attach it?

Larry Michel (lmic)
Changed in galera-cluster (Juju Charms Collection):
status: New → Incomplete
Revision history for this message
Larry Michel (lmic) wrote :

Sorry for the delay. Attempting to recreate manually and will attach log as soon as it's available.

Revision history for this message
Larry Michel (lmic) wrote :

We were not able to recreate with physical servers and my environment with VMs was down. I am not going back to that environment and I will re-attempt to recreate.

Revision history for this message
Larry Michel (lmic) wrote :
Download full text (5.4 KiB)

I was able to recreate with a set physical servers and here are the logs. For the mysqld logs, I found that they go to the syslog so I've extracted these.

From mysqld logs, there's this exception:

Nov 18 06:36:12 prunes mysqld: 2015-11-18 06:36:12 75623 [ERROR] WSREP: caught exception in PC, state dump to stderr follows:
Nov 18 06:36:12 prunes mysqld: pc::Proto{uuid=a8c811b1,start_prim=0,npvo=0,ignore_sb=0,ignore_quorum=0,state=1,last_sent_seq=0,checksum=0,instances=
Nov 18 06:36:12 prunes mysqld: #0119ad0acfb,prim=1,un=0,last_seq=3,last_prim=view_id(PRIM,9ad0acfb,3),to_seq=18,weight=1,segment=0
Nov 18 06:36:12 prunes mysqld: #0119ca2cdf9,prim=1,un=0,last_seq=8,last_prim=view_id(PRIM,9ca2cdf9,1),to_seq=7,weight=1,segment=0
Nov 18 06:36:12 prunes mysqld: #011a8c811b1,prim=0,un=0,last_seq=4294967295,last_prim=view_id(NON_PRIM,00000000,0),to_seq=-1,weight=1,segment=0
Nov 18 06:36:12 prunes mysqld: ,state_msgs=
Nov 18 06:36:12 prunes mysqld: #0119ad0acfb,pcmsg{ type=STATE, seq=0, flags= 0, node_map {#0119ad0acfb,prim=1,un=0,last_seq=3,last_prim=view_id(PRIM,9ad0acfb,3),to_seq=18,weight=1,segment=0
Nov 18 06:36:12 prunes mysqld: }}
Nov 18 06:36:12 prunes mysqld: #0119ca2cdf9,pcmsg{ type=STATE, seq=0, flags= 0, node_map {#0119ca2cdf9,prim=1,un=0,last_seq=8,last_prim=view_id(PRIM,9ca2cdf9,1),to_seq=7,weight=1,segment=0
Nov 18 06:36:12 prunes mysqld: }}
Nov 18 06:36:12 prunes mysqld: #011a8c811b1,pcmsg{ type=STATE, seq=0, flags= 0, node_map {#011a8c811b1,prim=0,un=0,last_seq=4294967295,last_prim=view_id(NON_PRIM,00000000,0),to_seq=-1,weight=1,segment=0
Nov 18 06:36:12 prunes mysqld: }}
Nov 18 06:36:12 prunes mysqld: ,current_view=view(view_id(REG,9ad0acfb,4) memb {
Nov 18 06:36:12 prunes mysqld: #0119ad0acfb,0
Nov 18 06:36:12 prunes mysqld: #0119ca2cdf9,0
Nov 18 06:36:12 prunes mysqld: #011a8c811b1,0
Nov 18 06:36:12 prunes mysqld: } joined {
Nov 18 06:36:12 prunes mysqld: #0119ad0acfb,0
Nov 18 06:36:12 prunes mysqld: #0119ca2cdf9,0
Nov 18 06:36:12 prunes mysqld: } left {
Nov 18 06:36:12 prunes mysqld: } partitioned {
Nov 18 06:36:12 prunes mysqld: }),pc_view=view((empty)),mtu=2147483647}
Nov 18 06:36:12 prunes mysqld: 2015-11-18 06:36:12 75623 [Note] WSREP: {v=0,t=1,ut=255,o=4,s=0,sr=0,as=-1,f=0,src=a8c811b1,srcvid=view_id(REG,9ad0acfb,4),insvid=view_id(UNKNOWN,00000000,0),ru=00000000,r=[-1,-1],fs=8,nl=(
Nov 18 06:36:12 prunes mysqld: )
Nov 18 06:36:12 prunes mysqld: } 64
Nov 18 06:36:12 prunes mysqld: 2015-11-18 06:36:12 75623 [ERROR] WSREP: exception caused by message: {v=0,t=3,ut=255,o=1,s=0,sr=-1,as=0,f=4,src=9ca2cdf9,srcvid=view_id(REG,9ad0acfb,4),insvid=view_id(UNKNOWN,00000000,0),ru=00000000,r=[-1,-1],fs=42,nl=(
Nov 18 06:36:12 prunes mysqld: )
Nov 18 06:36:12 prunes mysqld: }
Nov 18 06:36:12 prunes mysqld: state after handling message: evs::proto(evs::proto(a8c811b1, OPERATIONAL, view_id(REG,9ad0acfb,4)), OPERATIONAL) {
Nov 18 06:36:12 prunes mysqld: current_view=view(view_id(REG,9ad0acfb,4) memb {
Nov 18 06:36:12 prunes mysqld: #0119ad0acfb,0
Nov 18 06:36:12 prunes mysqld: #0119ca2cdf9,0
Nov 18 06:36:12 prunes mysqld: #011a8c811b1,0
Nov 18 06:36:12 prunes mysqld: } joined {
Nov 18 06:36:12 prunes mysqld: } left {
Nov 18 06:36:12 prunes m...

Read more...

Changed in galera-cluster (Juju Charms Collection):
status: Incomplete → New
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.