Fuel for OpenStack

Lost data in Ceph during failure adding new Ceph nodes.

Bug #1445296 reported by Denis Ipatov on 2015-04-17

This bug report is a duplicate of: Bug #1430845: Ceph HIGH IO load when add new OSDs. Edit Remove

This bug affects 1 person

Affects		Status	Importance	Assigned to	Milestone
	Fuel for OpenStack	Fix Committed	Critical	Stanislav Makar	Fuel for OpenStack 6.1
	6.0.x	In Progress	Critical	Stanislav Makar	Fuel for OpenStack 6.0-updates

Bug Description

How to reproduce it:

1. Create new cluster. All data need to be located in Ceph.
2. Add some data in the working cloud. For example add several images.
3. Add new Ceph OSD nodes. Ceph starts repalancing after installing the OS and Ceph but before finish deployment.
4. If we had any deployment's error the cluster marks as 'Error'
5. Delete Ceph nodes witch were marked as "Error"
6. Number of lost number depend on how much data was rebalanced.

How to avoid it:
1. Execute the command `ceph osd set noout` to stop rebalancing data before adding Ceph OSD nodes
and `ceph osd unset noout` after succesful end of deployment.

This is affect all version of MOS.

See original description

Tags:

Denis Ipatov (dipatov) on 2015-04-17

description:	updated
summary:	- Lost data in Ceph during fail adding new Ceph nodes. + Lost data in Ceph during failure adding new Ceph nodes.
tags:	added: customer-found

Denis Ipatov (dipatov) on 2015-04-17

description:

updated

Nastya Urlapova (aurlapova) on 2015-04-17

Changed in fuel:
milestone:	none → 6.1
assignee:	nobody → Fuel Library Team (fuel-library)
importance:	Undecided → High

Revision history for this message

Stanislav Makar (smakar) wrote on 2015-04-17:

It would be fine to know ISO version ?
We have already merged the patch https://github.com/stackforge/fuel-library/commit/c52d4fc377efe1134e8be81a18560c0a6e0138c3 which should help with it

Changed in fuel:
assignee:	Fuel Library Team (fuel-library) → Stanislav Makar (smakar)

Revision history for this message

Bogdan Dobrelya (bogdando) wrote on 2015-04-17:

Potential data loss should has a critical priority

Changed in fuel:
importance:	High → Critical

Revision history for this message

Stanislav Makar (smakar) wrote on 2015-04-20:

We have to backport to 6.0 version

Changed in fuel:
status:	New → Incomplete
status:	Incomplete → Fix Committed

Stanislaw Bogatkin (sbogatkin) on 2015-04-20

Changed in fuel:
status:	Fix Committed → Incomplete

Revision history for this message

Stanislav Makar (smakar) wrote on 2015-04-20:

Fix proposed to branch: stable/6.0
Review: https://review.openstack.org/175364

Revision history for this message

Denis Ipatov (dipatov) wrote on 2015-04-20:

I think this fix doesn't fix the bug.

Revision history for this message

Denis Ipatov (dipatov) wrote on 2015-04-20:

osd max backfills

Description: The maximum number of backfills allowed to or from a single OSD.
Type: 64-bit Unsigned Integer
Default: 10

osd recovery max active

Description: The number of active recovery requests per OSD at one time. More requests will accelerate recovery, but the requests places an increased load on the cluster.
Type: 32-bit Integer
Default: 15

These values only decrease speed of process of replication, but we must stop the replication until nodes are successfully added.

Revision history for this message

Tomasz 'Zen' Napierala (tzn) wrote on 2015-04-21:

Stas,

It should be worked on in master, not 6.0, also Incomplete status for 6.1 is not correct.

Changed in fuel:
status:	Incomplete → Confirmed

Revision history for this message

Stanislav Makar (smakar) wrote on 2015-04-21:

This patch is connected with problem when you add a lot of OSDs, cluster starts rebalancing very hard and in such case some osd nodes are lost due to big load
And it is the root cause why data was lost

Changed in fuel:
status:	Confirmed → Fix Committed

Revision history for this message

Stanislav Makar (smakar) wrote on 2015-04-21:

Data will be save but will last longer

Revision history for this message

Stanislav Makar (smakar) wrote on 2015-05-18:

#11

here is patch
https://review.openstack.org/#/c/163019/

Przemyslaw Kaminski (pkaminski) on 2015-05-18

tags:

added: on-verification

Przemyslaw Kaminski (pkaminski) on 2015-05-20

tags:

removed: on-verification

Revision history for this message

Kyrylo Romanenko (kromanenko) wrote on 2015-05-21:

#12

On verification

Revision history for this message

Kyrylo Romanenko (kromanenko) wrote on 2015-05-21:

#13

I performed deployment of ceph-enabled environment,
then added some images to glance
after that i have added one more Ceph OSD Node to deployment without errors.

How should i check data integrity in Ceph storage after rebalancing?

Revision history for this message

Kyrylo Romanenko (kromanenko) wrote on 2015-05-21:

#14

Ubuntu, 1 Controller+Ceph, 1 Compute+Ceph.
VERSION:
  feature_groups:
    - mirantis
  production: "docker"
  release: "6.1"
  openstack_version: "2014.2.2-6.1"
  api: "1.0"
  build_number: "437"
  build_id: "2015-05-19_10-05-51"
  nailgun_sha: "593c99f2b46cf52b2be6c7c6e182b6ba9f2232cd"
  python-fuelclient_sha: "e19f1b65792f84c4a18b5a9473f85ef3ba172fce"
  astute_sha: "96801c5bccb14aa3f2a0d7f27f4a4b6dd2b4a548"
  fuel-library_sha: "2814c51668f487e97e1449b078bad1942421e6b9"
  fuel-ostf_sha: "9ce1800749081780b8b2a4a7eab6586583ffaf33"
  fuelmain_sha: "68796aeaa7b669e68bc0976ffd616709c937187a"

Revision history for this message

Kyrylo Romanenko (kromanenko) wrote on 2015-05-21:

#15

My attempt to verify this fix caused new bug: https://bugs.launchpad.net/fuel/+bug/1457487

I`ll try to re-verify without exceeding disc space.

Revision history for this message

Stanislav Makar (smakar) wrote on 2015-05-21:

#16

go to any controller and run "ceph -s" before deployment new osd and after
Output should be looks like

ceph -s
    cluster a04e9f78-693f-4b1b-b73f-92af242d002b
     health HEALTH_OK
     monmap e3: 3 mons at {node-1=192.168.0.4:6789/0,node-2=192.168.0.5:6789/0,node-3=192.168.0.6:6789/0}, election epoch 6, quorum 0,1,2 node-1,node-2,node-3
     osdmap e43: 8 osds: 8 up, 8 in
      pgmap v90: 4800 pgs, 12 pools, 13696 kB data, 48 objects
            16749 MB used, 378 GB / 395 GB avail
                4800 active+clean

Could be HEALTH_WARN but not HEALTH_ERR

Revision history for this message

Denis Ipatov (dipatov) wrote on 2015-05-21:

#17

go to any controller and run "ceph -s" before deployment new osd and after

Output should be looks like before deployment:

After deleting the new node it should be (in the example are added 2 disk)
ceph -s
    cluster a04e9f78-693f-4b1b-b73f-92af242d002b
     health HEALTH_OK
     monmap e3: 3 mons at {node-1=192.168.0.4:6789/0,node-2=192.168.0.5:6789/0,node-3=192.168.0.6:6789/0}, election epoch 6, quorum 0,1,2 node-1,node-2,node-3
     osdmap e43: 10 osds: 8 up, 10 in
      pgmap v90: 4800 pgs, 12 pools, 13696 kB data, 48 objects
            16749 MB used, 378 GB / 395 GB avail
                4800 active+clean

Could be HEALTH_WARN but not HEALTH_ERR

Revision history for this message

Denis Ipatov (dipatov) wrote on 2015-05-21:

#18

Short update:
To lose data, you should add OSDs more than your replication factor.
For example: you have replication factor 3 and you should add 3 nodes.
I not sure what "list" in MOS (node or disk).

You can slack me or send e-mail if you have question.

Revision history for this message

Kyrylo Romanenko (kromanenko) wrote on 2015-05-22:

#19

Now i performed following operations:

1) Created Env and added one 1GB image (Ubuntu 14 desktop iso) and 10GB volume based on image.
Ubuntu 14.04.1, Neutron VLAN
1 Controller + Ceph OSD
1 Compute + Ceph OSD
Replication Factor = 2

2) Checked Ceph status
root@node-7:~# ceph -s
    cluster ce0608f2-f66e-44e1-a359-d7f01e59e547
     health HEALTH_OK
     monmap e1: 1 mons at {node-7=192.168.0.3:6789/0}, election epoch 2, quorum 0 node-7
     osdmap e40: 4 osds: 4 up, 4 in
      pgmap v212: 2496 pgs, 12 pools, 2005 MB data, 427 objects
            14283 MB used, 239 GB / 253 GB avail
                2496 active+clean

3) Added one more Ceph OSD Node and redeployed.
Now we have 3 OSD nodes and RP=2.

4) Rechecked ceph status
# ceph -s
    cluster ce0608f2-f66e-44e1-a359-d7f01e59e547
     health HEALTH_OK
     monmap e1: 1 mons at {node-7=192.168.0.3:6789/0}, election epoch 2, quorum 0 node-7
     osdmap e52: 6 osds: 6 up, 6 in
      pgmap v297: 2496 pgs, 12 pools, 2005 MB data, 427 objects
            16583 MB used, 364 GB / 380 GB avail
                2496 active+clean

5) Performed storage-related OSTF tests, they passed successfully.
6) Launched instance from saved Image.

Revision history for this message

Kyrylo Romanenko (kromanenko) wrote on 2015-05-22:

#20

Verified on
VERSION:
  feature_groups:
    - mirantis
  production: "docker"
  release: "6.1"
  openstack_version: "2014.2.2-6.1"
  api: "1.0"
  build_number: "437"
  build_id: "2015-05-19_10-05-51"
  nailgun_sha: "593c99f2b46cf52b2be6c7c6e182b6ba9f2232cd"
  python-fuelclient_sha: "e19f1b65792f84c4a18b5a9473f85ef3ba172fce"
  astute_sha: "96801c5bccb14aa3f2a0d7f27f4a4b6dd2b4a548"
  fuel-library_sha: "2814c51668f487e97e1449b078bad1942421e6b9"
  fuel-ostf_sha: "9ce1800749081780b8b2a4a7eab6586583ffaf33"
  fuelmain_sha: "68796aeaa7b669e68bc0976ffd616709c937187a"

Kyrylo Romanenko (kromanenko) on 2015-05-22

Changed in fuel:
status:	Fix Committed → Fix Released

Revision history for this message

Denis Ipatov (dipatov) wrote on 2015-05-22:

#21

Your test does not show anything.
You need to have:
1. Working cluster with some data, not only one TestVM image. Best way is emulate production cluster with 40-50% using of space in the cluster.
2. Add several new OSD nodes. Count of the new nodes should be more than replica factor.
3. If you see an error you need to delete these nodes. (For test you can delete the nodes after installing ceph osd. After start rebalancing but before end: in middle of the proccess). More data in Ceph cluster allow find this error easier.

In our case a customer had replication's factor 3 and he added 10 nodes with 12 disks. During installation was an deployment's error. The customer deleted these nodes. He lost around 4-5% of data.

Denis Ipatov (dipatov) on 2015-05-22

Changed in fuel:
status:	Fix Released → In Progress

Revision history for this message

Vladimir Kuklin (vkuklin) wrote on 2015-05-22:

#22

Folks, we had a patch that actually should show a warning about such possible implications. Actually, the best option here is to document this behaviout and fix it in 7.0 if it retains. User can delete nodes waiting for ceph to rebalance after each deletion operation.

Revision history for this message

Stanislav Makar (smakar) wrote on 2015-05-25:

#23

Requirements to check this patch:
- hardware 10 osd nodes
- cluster should is at least - half full, a lot of VMs are running
- Add new ceph osd node >= replication's factor (as dipatov wrote)

For more details please read https://bugs.launchpad.net/fuel/+bug/1374969/comments/3

Changed in fuel:
status:	In Progress → Fix Committed

Report a bug

This report contains Public information

Everyone can see this information.

Duplicate of bug #1430845 Remove

You are

Subscribing...

Edit bug mail

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.