InfluxDB crash while scaling up from 1 to 2 nodes

Bug #1552191 reported by Ivan Lozgachev
This bug affects 1 person
Affects     Status   Importance  Assigned to                 Milestone
StackLight  Opinion  Medium      LMA-Toolchain Fuel Plugins

Bug Description

This bug was created to track the upstream issue https://github.com/influxdata/influxdb/issues/5667

Environment:
3 controllers
174 compute + 20 ceph nodes
1 elasticsearch node
1 influxdb node

How to reproduce:
1. Go to Fuel dashboard and add new InfluxDB node
2. Deploy changes
3. Check status of influxd daemon on newly created node
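
Step 3 can also be checked over the network instead of from the node's shell. The following is a minimal sketch, assuming the InfluxDB HTTP API listens on port 8086 (the port listed for the data nodes in this report) and that its /ping endpoint answers 204 No Content while the daemon is up. The function name influxd_is_up and the timeout value are illustrative and not part of the plugin.

```python
import http.client
import urllib.error
import urllib.request


def influxd_is_up(host, port=8086, timeout=2.0):
    """Return True if the InfluxDB HTTP API answers /ping with 204."""
    url = "http://%s:%d/ping" % (host, port)
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 204
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeout, or an HTTP error: treat as "not running".
        return False
```

For example, influxd_is_up("node-95") would probe the newly added node from this report. A single probe only shows the state at that instant, so a daemon that crashes shortly after start may need repeated checks.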

Expected result:
influxd is running

Actual result:
influxd has failed

Tags: influxdb scale
Revision history for this message
Ivan Lozgachev (ilozgachev) wrote :
Swann Croiset (swann-w) wrote :

> SHOW SERVERS
name: data_nodes
----------------
id  http_addr      tcp_addr
1   node-193:8086  node-193:8088
3   node-95:8086   node-95:8088

name: meta_nodes
----------------
id  http_addr      tcp_addr
1   node-193:8091  node-193:8088
2   node-95:8091   node-95:8088

> SHOW DIAGNOSTICS
name: build
-----------
Branch  Build Time                  Commit                                    Version
HEAD    2016-02-18T20:44:27.807242  df902a4b077bb270984303b8e4f8a320e3954b40  0.10.1

name: hh
--------
node  active  last modified  head  tail

name: network
-------------
hostname
node-193.domain.tld

name: runtime
-------------
GOARCH  GOMAXPROCS  GOOS   version
amd64   12          linux  go1.4.3

name: system
------------
PID    currentTime                     started                         uptime
22708  2016-03-02T11:38:25.531950129Z  2016-03-02T00:13:29.483042159Z  11h24m56.048908142s
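
In the SHOW SERVERS output above, node-95 appears with a different id in data_nodes than in meta_nodes. The sketch below shows one way to extract that comparison from the CLI text. It assumes whitespace-separated columns without embedded spaces, so it would not handle a header such as "last modified" in the hh table, and the helper name parse_cli_tables is illustrative. Whether the id difference is related to the crash is not established in this report.

```python
def parse_cli_tables(text):
    """Split influx CLI output into {section name: [row dict, ...]}."""
    tables = {}
    name, header = None, None
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blanks, dashed rulers, and the "> COMMAND" prompt lines.
        if not line or set(line) == {"-"} or line.startswith(">"):
            continue
        if line.startswith("name:"):
            name = line.split(":", 1)[1].strip()
            tables[name] = []
            header = None
        elif name is not None and header is None:
            header = line.split()
        elif name is not None:
            tables[name].append(dict(zip(header, line.split())))
    return tables


# Sample text copied from the SHOW SERVERS output in this report.
SHOW_SERVERS = """\
name: data_nodes
----------------
id http_addr tcp_addr
1 node-193:8086 node-193:8088
3 node-95:8086 node-95:8088

name: meta_nodes
----------------
id http_addr tcp_addr
1 node-193:8091 node-193:8088
2 node-95:8091 node-95:8088
"""

tables = parse_cli_tables(SHOW_SERVERS)
# Key each id by host name (the part of http_addr before the port).
data_ids = {r["http_addr"].split(":")[0]: r["id"] for r in tables["data_nodes"]}
meta_ids = {r["http_addr"].split(":")[0]: r["id"] for r in tables["meta_nodes"]}
for host in sorted(data_ids):
    print(host, "data id:", data_ids[host], "meta id:", meta_ids.get(host))
```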

Changed in lma-toolchain:
status: New → Confirmed
assignee: nobody → LMA-Toolchain Fuel Plugins (mos-lma-toolchain)
importance: Undecided → Medium
Swann Croiset (swann-w) wrote :

There is no data at all on the new node:

root@node-95:~# find /var/lib/influxdb/ -ls
     2 4 drwxr-xr-x 6 influxdb influxdb 4096 Mar 2 10:42 /var/lib/influxdb/
4587521 4 drwx------ 2 influxdb influxdb 4096 Mar 2 10:42 /var/lib/influxdb/hh
3014657 4 drwxr-xr-x 3 influxdb influxdb 4096 Mar 2 10:42 /var/lib/influxdb/meta
3014661 4 -rw-r--r-- 1 influxdb influxdb 56 Mar 2 10:42 /var/lib/influxdb/meta/node.json
3014660 4 drwxr-xr-x 2 influxdb influxdb 4096 Mar 2 11:34 /var/lib/influxdb/meta/snapshots
3014659 32 -rw------- 1 influxdb influxdb 65536 Mar 2 11:34 /var/lib/influxdb/meta/raft.db
3014658 4 -rwxr-xr-x 1 influxdb influxdb 33 Mar 2 11:34 /var/lib/influxdb/meta/peers.json
1179649 4 drwxr-xr-x 2 influxdb influxdb 4096 Mar 2 10:42 /var/lib/influxdb/data
    11 16 drwx------ 2 influxdb influxdb 16384 Mar 2 10:10 /var/lib/influxdb/lost+found

root@node-193:/var/log/influxdb# find /var/lib/influxdb/ -ls
     2 4 drwxr-xr-x 7 influxdb influxdb 4096 Mar 2 00:13 /var/lib/influxdb/
    11 16 drwx------ 2 influxdb influxdb 16384 Mar 1 22:06 /var/lib/influxdb/lost+found
2883585 4 drwxr-xr-x 3 influxdb influxdb 4096 Mar 2 10:42 /var/lib/influxdb/meta
2883588 4 drwxr-xr-x 2 influxdb influxdb 4096 Mar 2 00:13 /var/lib/influxdb/meta/snapshots
2883590 4 -rw-r--r-- 1 influxdb influxdb 56 Mar 2 10:42 /var/lib/influxdb/meta/node.json
2883587 32 -rw------- 1 influxdb influxdb 65536 Mar 2 11:34 /var/lib/influxdb/meta/raft.db
2883586 4 -rwxr-xr-x 1 influxdb influxdb 33 Mar 2 11:34 /var/lib/influxdb/meta/peers.json
3145729 4 drwx------ 4 influxdb influxdb 4096 Mar 2 00:15 /var/lib/influxdb/wal
3145730 4 drwx------ 3 influxdb influxdb 4096 Mar 2 00:13 /var/lib/influxdb/wal/_internal
3145731 4 drwx------ 3 influxdb influxdb 4096 Mar 2 00:13 /var/lib/influxdb/wal/_internal/monitor
3145732 4 drwx------ 2 influxdb influxdb 4096 Mar 2 00:13 /var/lib/influxdb/wal/_internal/monitor/1
3145733 4504 -rw-r--r-- 1 influxdb influxdb 4607939 Mar 2 11:41 /var/lib/influxdb/wal/_internal/monitor/1/_00001.wal
3145734 4 drwx------ 3 influxdb influxdb 4096 Mar 2 00:15 /var/lib/influxdb/wal/lma
3145735 4 drwx------ 4 influxdb influxdb 4096 Mar 2 00:19 /var/lib/influxdb/wal/lma/default
3145739 4 drwx------ 2 influxdb influxdb 4096 Mar 2 01:19 /var/lib/influxdb/wal/lma/default/3
3145742 0 -rw-r--r-- 1 influxdb influxdb 0 Mar 2 01:19 /var/lib/influxdb/wal/lma/default/3/_00002.wal
3145736 4 drwx------ 2 influxdb influxdb 4096 Mar 2 11:41 /var/lib/influxdb/wal/lma/default/2
3145738 10248 -rw-r--r-- 1 influxdb influxdb 10486017 Mar 2 11:41 /var/lib/influxdb/wal/lma/default/2/_01192.wal
3145740 2508 -rw-r--r-- 1 influxdb influxdb 2563523 Mar 2 11:41 /var/lib/influxdb/wal/lma/default/2/_01193.wal
3145737 10248 -rw-r--r-- 1 influxdb influxdb 10486390 Mar 2 11:40 /var/lib/influxdb/wal/lma/default/2/_01191.wal
3145744 10248 -rw-r--r-- 1 influxdb influxdb 10487090 Mar 2 11:40 /var/lib/influxdb/wal/lma/default/2/_01190.wal
406323...
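
The two listings above differ mainly in which top-level directories hold files: node-95 has an empty data directory and no wal directory, while node-193 has WAL files. A minimal sketch of the same comparison in script form, assuming the default /var/lib/influxdb location seen in the listings; the function name summarize_influx_dir is illustrative.

```python
import os


def summarize_influx_dir(root="/var/lib/influxdb"):
    """Map each top-level directory under root to its recursive file count."""
    summary = {}
    for entry in sorted(os.listdir(root)):
        path = os.path.join(root, entry)
        if not os.path.isdir(path):
            continue
        # os.walk silently skips unreadable subdirectories by default.
        summary[entry] = sum(len(files) for _, _, files in os.walk(path))
    return summary
```

Running it on both nodes and comparing the returned dictionaries gives a quick view of which node is missing wal or data content; a count of 0 marks an empty directory tree.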


summary: - InfluxDB crash while scaling up
+ InfluxDB crash while scaling up from 1 to 2 nodes
Swann Croiset (swann-w)
description: updated
Swann Croiset (swann-w) wrote :

On the first node:

> show shards
name: _internal
---------------
id  database   retention_policy  shard_group  start_time            end_time              expiry_time           owners
1   _internal  monitor           1            2016-03-02T00:00:00Z  2016-03-03T00:00:00Z  2016-03-10T00:00:00Z  1

name: lma
---------
id  database  retention_policy  shard_group  start_time            end_time              expiry_time           owners
3   lma       default           3            2016-03-01T00:00:00Z  2016-03-02T00:00:00Z  2016-04-01T00:00:00Z  1
2   lma       default           2            2016-03-02T00:00:00Z  2016-03-03T00:00:00Z  2016-04-02T00:00:00Z  1

Swann Croiset (swann-w)
tags: added: scale
tags: added: nagios
Swann Croiset (swann-w) wrote :

Scaling from 1 to 2 nodes is not supported.

tags: added: influxdb
removed: nagios
Changed in lma-toolchain:
status: Confirmed → Opinion