Bug #1462451 “deploy is failed. A lot of ceph OSDs are in down s... : Series 6.1.x : Bugs : Fuel for OpenStack

Nastya Urlapova (aurlapova) on 2015-06-05

Changed in fuel:
milestone:	none → 6.1
assignee:	nobody → Fuel Library Team (fuel-library)

Dmitry Ilyin (idv1985) on 2015-06-05

Changed in fuel:
status:	New → Confirmed
importance:	Undecided → High

Revision history for this message

Ryan Moe (rmoe) wrote on 2015-06-05:

#1

Could you please provide a diagnostic snapshot?

Revision history for this message

Dmitry Ilyin (idv1985) wrote on 2015-06-05:

#2

First, about one third of the Ceph nodes were down. After "service ceph restart" they went up.

Not this scripts dunps inactive PGs and find a lot of them.
> There are PGs which are not in active state!

Revision history for this message

Ryan Moe (rmoe) wrote on 2015-06-05:

#3

ceph-osd.log.gz Edit (533.0 KiB, application/octet-stream)

Revision history for this message

Alexei Sheplyakov (asheplyakov) wrote on 2015-06-06:

#4

> configuration: Baremetal

cat /proc/cpuinfo

sudo lspci -vvv

Leontii Istomin (listomin) on 2015-06-07

description:

updated

Leontii Istomin (listomin) on 2015-06-08

summary:

- deploy is failed. A lot of ceph OSDs is in down state.
+ deploy is failed. A lot of ceph OSDs are in down state.

Revision history for this message

Leontii Istomin (listomin) wrote on 2015-06-08:

#5

lspci_cpuinfo.tar.gz Edit (16.6 KiB, application/x-tar)

lspci and cpuinfo from two types of nodes in the env

Revision history for this message

Leontii Istomin (listomin) wrote on 2015-06-08:

#6

has been reproduced with 511 ISO and configuration:
Baremetal,Centos,IBP,HA,Neutron-gre,Ceph-all,Nova-debug,Nova-quotas,6.1_511
Controllers:3 Computes:200
Diagnostic Snapshot: http://mos-scale-share.mirantis.com/fuel-snapshot-2015-06-07_22-52-31.tar.xz

Dmytro Iurchenko (diurchenko) on 2015-06-08

Changed in fuel:
assignee:	Fuel Library Team (fuel-library) → Dmytro Iurchenko (diurchenko)

Revision history for this message

Dmytro Iurchenko (diurchenko) wrote on 2015-06-08:

#7

The attempt to reproduce the bug with significantly fewer number of placement groups is in progress.

Revision history for this message

Stanislav Makar (smakar) wrote on 2015-06-09:

#8

We have successfully deployed with decreased number of pg_num (128 per pool ).
Then we increased it manually -- all is working.

So as we see in our case ceph cluster can not handle PGs creation during adding new OSDs due to big number of pg_num and OSDs :). To investigate why we need this cluster and hence more time.

Now we have some quick solutions:
1. Mykola Golub says that we have incorrect calculation formula for pg_num and he already has blueprint(https://blueprints.launchpad.net/fuel/+spec/ceph-osd-pool-default-pg-num).
We would like to try to deploy with Mykolas formula and see whether it fixes the problem.
2. Also we have an option to postpone ceph pool creation to time when deployment of ceph OSDs is finished (post deployment task). It cant prevent ceph cluster from handling cases when OSD are added and PGs are greated.
We would like to try it too.

If we do it we will make decision and prepare the patch.

Revision history for this message

Mykola Golub (mgolub) wrote on 2015-06-09:

#9

The cluster hanged with many placement groups stuck in state creating, with the following errors in logs:

2015-06-07 22:57:48.238453 7fd1124fe700 0 log [WRN] : slow request 2571.770231 seconds old, received at 2015-06-07 22:14:56.468107: osd_pg_create(pg0.1c,.... pg6.1ed0,5; ) v2 currently wait for new map

The problem was not reproduced when we hardcoded osd_pool_default_pg_num and osd_pool_default_pgp_num to 128 instead of allowing fuel to calculate it based on number of OSDs (8192 for the cluster of this size).

Although the root cause of the hang is not found (it might be a limit/timeout we stepped in ceph or OS , when large amount of placement groups are being created), there are some improvements for fuel that should improve the situation when deploying large clusters:

1) The formula for calculating pg number should be changed, giving values 10 times lower than currently for large clusters. Apart this issue, the overestimated number of pgs causes other issues, while the pg number is impossible to decrease after the pool creation:

https://blueprints.launchpad.net/fuel/+spec/ceph-osd-pool-default-pg-num

2) Pools are created after controller nodes are deployed, but before OSDs are deployed. As a result if the pools have large pg num, huge number of PGs are in creating state, then OSD nodes are staring to be added, large number of PGs being created on the nodes that are deployed first, then new OSD appears and PGs should be moved. This process is not optimal and it is much less stressful for the cluster to create PGs after all OSDs are deployed and in IN and UP state, so no reballancing will ok and "early" ODS are not overloaded with placement groups.

Revision history for this message

Dmytro Iurchenko (diurchenko) wrote on 2015-06-10:

#10

Stanislav Makar is going to try out the second way (create pools after the OSD nodes are added).
If it won't work out, then the formula of PG num calculation will be altered as Mykola Golub said.

Revision history for this message

Leontii Istomin (listomin) wrote on 2015-06-10:

#11

Reproduced with Ubuntu
Baremetal,Ubuntu,IBP,HA,Neutron-gre,Ceph-all,Nova-debug,Nova-quotas,6.1_521
Controllers:3 Computes:200

root@node-37:~# ceph -s
    cluster 6292c17b-39a9-45c9-9a01-6161eb72f816
     health HEALTH_WARN 234 pgs peering; 32872 pgs stuck inactive; 32872 pgs stuck unclean
     monmap e3: 3 mons at {node-37=192.168.0.40:6789/0,node-42=192.168.0.45:6789/0,node-56=192.168.0.59:6789/0}, election epoch 6, quorum 0,1,2 node-37,node-42,node-56
     osdmap e68: 200 osds: 200 up, 200 in
      pgmap v461: 32960 pgs, 7 pools, 0 bytes data, 0 objects
            407 GB used, 181 TB / 181 TB avail
                  33 inactive
               32605 creating
                  66 peering
                 168 creating+peering
                  88 active+clean

Diagnostic Snapshot: http://mos-scale-share.mirantis.com/fuel-snapshot-2015-06-10_16-39-02.tar.xz

OpenStack Infra (hudson-openstack) on 2015-06-11

Changed in fuel:
assignee:	Dmytro Iurchenko (diurchenko) → Stanislav Makar (smakar)
status:	Confirmed → In Progress

Revision history for this message

OpenStack Infra (hudson-openstack) wrote on 2015-06-12: Fix proposed to fuel-library (stable/6.1)

#12

Fix proposed to branch: stable/6.1
Review: https://review.openstack.org/190953

Revision history for this message

Mykola Golub (mgolub) wrote on 2015-06-12:

#13

Moving pool creation to later stage (after osd are added) is a right change.

Still I think decreasing ceph default value for pool placement groups is also important. I filled a separate bug for this change:

https://bugs.launchpad.net/fuel/+bug/1464656

Nastya Urlapova (aurlapova) on 2015-06-15

tags:

added: 6.1rc2

Revision history for this message

OpenStack Infra (hudson-openstack) wrote on 2015-06-15: Fix merged to fuel-library (master)

#14

Reviewed: https://review.openstack.org/190189
Committed: https://git.openstack.org/cgit/stackforge/fuel-library/commit/?id=0b0d8d8b1182c97276a32d0fb80d2f382ed79a78
Submitter: Jenkins
Branch: master

commit 0b0d8d8b1182c97276a32d0fb80d2f382ed79a78
Author: Stanislav Makar <email address hidden>
Date: Wed Jun 10 13:15:16 2015 +0000

Fix the problem with ceph deployment on scale lab

    Postpone ceph pool creation to post deploy:
    * Add task for ceph pool creation and put it in post deploy
    * Change ceph/compute.pp and move to post deploy
    * Remove from ceph/manifests/init.pp the pool creation code

Closes-bug: #1462451
Change-Id: Iee72e5f8e59c3ced0ba0d7f971380e5932cbb0fc

Changed in fuel:
status:	In Progress → Fix Committed

Revision history for this message

OpenStack Infra (hudson-openstack) wrote on 2015-06-15: Fix merged to fuel-library (stable/6.1)

#15

Reviewed: https://review.openstack.org/190953
Committed: https://git.openstack.org/cgit/stackforge/fuel-library/commit/?id=43b25e4b200c5b994cde81439454d6e2e908a88f
Submitter: Jenkins
Branch: stable/6.1

commit 43b25e4b200c5b994cde81439454d6e2e908a88f
Author: Stanislav Makar <email address hidden>
Date: Wed Jun 10 13:15:16 2015 +0000

Fix the problem with ceph deployment on scale lab

    Postpone ceph pool creation to post deploy:
    * Add task for ceph pool creation and put it in post deploy
    * Change ceph/compute.pp and move to post deploy
    * Remove from ceph/manifests/init.pp the pool creation code

Closes-bug: #1462451
Change-Id: Iee72e5f8e59c3ced0ba0d7f971380e5932cbb0fc

Revision history for this message

Dmitry Borodaenko (angdraug) wrote on 2015-06-16:

#16

This should not have been merged to stable/6.1, the change is too disruptive for post-hard-code-freeze. Please revert and merge a fix for bug #1464656 instead (change pg_num calculation).

Updated the 7.0.x status to Fix Committed reflect the fact that this was merged to master branch, not just stable/6.1.

Revision history for this message

Stanislav Makar (smakar) wrote on 2015-06-16:

#17

@dborodaenko before try this fix we tried 1024 per pool and it did not help, it was the same
Then we tried this patch, it fixed the problem even with too high pg_num

QA folks can prove it

Revision history for this message

Dan Hata (dhata) wrote on 2015-06-16:

#18

For Eugene Bogdanov

Clear steps to reproduce and expected result vs actual result
Deployment of ceph nodes through Fuel with more than 200 drives will fail.

Rough estimate of the probability of user facing the issue
This works fine with 50 drives but fails with 200 drives. We have not tested the number in between. We do know that it is 100% reproducible.

What is the real user facing impact / severity and is there a workaround available?
IMPACT: The user will experience a failed Ceph deployment.
WORKAROUND: The user can manually configure the placement groups and deployment to get around this.

Can we deliver the fix later and apply it easy on running env?
Yes, we are first experimenting with deploying the minimal number of placement groups by dividing the number by 10. We are also exploring a more complex issue that will delay the creation of placement groups. This fix will take more testing and though.

Revision history for this message

Bogdan Dobrelya (bogdando) wrote on 2015-06-18:

#19

The patch https://review.openstack.org/192919/ reverts https://review.openstack.org/190953, hence returning original issue to the confirmed state again for the 6.1

Revision history for this message

OpenStack Infra (hudson-openstack) wrote on 2015-06-18: Fix proposed to fuel-library (stable/6.1)

#20

Fix proposed to branch: stable/6.1
Review: https://review.openstack.org/193076

Revision history for this message

Vitaly Sedelnik (vsedelnik) wrote on 2015-06-24:

#21

Targeted to 6.1-updates because 6.1 GA was rolled out yesterday. The fix should be accompanied with errata information and go through full patching process.

Eugene Bogdanov (ebogdanov) on 2015-06-29

tags:

added: 6.1-mu-1

Revision history for this message

Stanislav Makar (smakar) wrote on 2015-06-30:

#22

new patches are here
https://review.openstack.org/#/q/I05b53042e24da8cb1693049bd95e682c8903c812,n,z
waiting to be tested on scale lab

OpenStack Infra (hudson-openstack) on 2015-07-01

Changed in fuel:
assignee:	Leontiy Istomin (listomin) → Sergii Golovatiuk (sgolovatiuk)

Revision history for this message

Viktoria Efimova (vefimova) wrote on 2015-07-02:

#23

TESTED: Deployment with 200 Ceph nodes setting passed successfuly with applied patch.

Revision history for this message

Sergii Golovatiuk (sgolovatiuk) wrote on 2015-07-02:

#24

Viktoria, Last time we ran into issues on operation phase. Live migration didn't work. That was the reason why the patch was rejected from 6.1. Could you test functionality like live migration or ephemeral storage to ensure ceph works as expected. Thanks a lot.

Revision history for this message

Leontii Istomin (listomin) wrote on 2015-07-03:

#25

rally_report_and_rally_logs.tar.gz Edit (1.3 MiB, application/x-tar)

We performed boot_and_migrate_server and boot_server_from_volume_and_live_migrate rally scenarios with the fix. This tests have passed successfully. rally report and rally logs are attached

OpenStack Infra (hudson-openstack) on 2015-07-06

Changed in fuel:
assignee:	Sergii Golovatiuk (sgolovatiuk) → Stanislav Makar (smakar)

Revision history for this message

OpenStack Infra (hudson-openstack) wrote on 2015-07-08: Change abandoned on fuel-library (master)

#26

Change abandoned by Stanislav Makar (<email address hidden>) on branch: master
Review: https://review.openstack.org/198735
Reason: this patch is included into
https://review.openstack.org/#/c/195468/2

Revision history for this message

OpenStack Infra (hudson-openstack) wrote on 2015-07-09: Fix merged to fuel-library (master)

#27

Reviewed: https://review.openstack.org/195468
Committed: https://git.openstack.org/cgit/stackforge/fuel-library/commit/?id=bf5ec482cfdf6ec412a5c7685113936e750f582a
Submitter: Jenkins
Branch: master

commit bf5ec482cfdf6ec412a5c7685113936e750f582a
Author: Stanislav Makar <email address hidden>
Date: Wed Jun 10 13:15:16 2015 +0000

Fix the problem with ceph deployment on scale lab

    Postpone ceph pool creation to post deploy:
    * Add task for ceph pool creation and put it in post deploy
    * Change ceph/compute.pp and move to post deploy
    * Remove from ceph/manifests/init.pp the pool creation code
    * Add NOOP tests

Change-Id: I05b53042e24da8cb1693049bd95e682c8903c812
Closes-bug: #1462451

Changed in fuel:
status:	In Progress → Fix Committed

Revision history for this message

OpenStack Infra (hudson-openstack) wrote on 2015-08-18: Change abandoned on fuel-library (stable/6.1)

#28

Change abandoned by Fuel DevOps Robot (<email address hidden>) on branch: stable/6.1
Review: https://review.openstack.org/193076
Reason: This review is > 4 weeks without comment and currently blocked by a core reviewer with a -2. We are abandoning this for now. Feel free to reactivate the review by pressing the restore button and contacting the reviewer with the -2 on this review to ensure you address their concerns.

Alexander Arzhanov (aarzhanov) on 2015-09-04

tags:

added: on-verification

Revision history for this message

Alexander Arzhanov (aarzhanov) wrote on 2015-09-07:

#29

Download full text (3.3 KiB)

Verified on ISO#286:

api: '1.0'
astute_sha: 8283dc2932c24caab852ae9de15f94605cc350c6
auth_required: true
build_id: '286'
build_number: '286'
feature_groups:
- mirantis
fuel-agent_sha: 082a47bf014002e515001be05f99040437281a2d
fuel-library_sha: ff63a0bbc93a3a0fb78215c2fd0c77add8dfe589
fuel-nailgun-agent_sha: d7027952870a35db8dc52f185bb1158cdd3d1ebd
fuel-ostf_sha: 1f08e6e71021179b9881a824d9c999957fcc7045
fuelmain_sha: 9ab01caf960013dc882825dc9b0e11ccf0b81cb0
nailgun_sha: 5c33995a2e6d9b1b8cdddfa2630689da5084506f
openstack_version: 2015.1.0-7.0
production: docker
python-fuelclient_sha: 1ce8ecd8beb640f2f62f73435f4e18d1469979ac
release: '7.0'
release_versions:
  2015.1.0-7.0:
    VERSION:
      api: '1.0'
      astute_sha: 8283dc2932c24caab852ae9de15f94605cc350c6
      build_id: '286'
      build_number: '286'
      feature_groups:
      - mirantis
      fuel-agent_sha: 082a47bf014002e515001be05f99040437281a2d
      fuel-library_sha: ff63a0bbc93a3a0fb78215c2fd0c77add8dfe589
      fuel-nailgun-agent_sha: d7027952870a35db8dc52f185bb1158cdd3d1ebd
      fuel-ostf_sha: 1f08e6e71021179b9881a824d9c999957fcc7045
      fuelmain_sha: 9ab01caf960013dc882825dc9b0e11ccf0b81cb0
      nailgun_sha: 5c33995a2e6d9b1b8cdddfa2630689da5084506f
      openstack_version: 2015.1.0-7.0
      production: docker
      python-fuelclient_sha: 1ce8ecd8beb640f2f62f73435f4e18d1469979ac
      release: '7.0'

#########################################
id | status | name | cluster | ip | mac | roles | pending_roles | online | group_id
---|--------|------------------|---------|------------|-------------------|----------------------|---------------|--------|---------
4 | ready | Untitled (23:2b) | 1 | 10.109.0.7 | 64:98:a9:2b:23:2b | ceph-osd, compute | | True | 1
5 | ready | Untitled (55:72) | 1 | 10.109.0.5 | 64:c6:67:35:55:72 | ceph-osd, compute | | True | 1
2 | ready | Untitled (47:7a) | 1 | 10.109.0.6 | 64:73:7e:a0:47:7a | ceph-osd, controller | | True | 1
3 | ready | Untitled (5b:b0) | 1 | 10.109.0.3 | 64:6b:8a:5b:5b:b0 | ceph-osd, controller | | True | 1
1 | ready | Untitled (41:ce) | 1 | 10.109.0.4 | 64:d3:e7:8d:41:ce | ceph-osd, controller | | True | 1
#########################################

#########################################
root@node-1:~# ceph osd tree
# id weight type name up/down reweight
-1 0.3499 root default
-2 0.04999 host node-3
0 0.04999 osd.0 up 1
-3 0.04999 host node-2
1 0.04999 osd.1 up 1
-4 0.04999 host node-1
2 0.04999 osd.2 up 1
-5 0.09998 host node-5
3 0.04999 osd.3 up 1
5 0.04999 osd.5 up 1
-6 0.09998 host node-4
4 0.04999 osd.4 up 1
6 0.04999 osd.6 up 1
#########################################

#########################################
root@node-1:~# ceph -s
    cluster 6778a6a6-09f6-4e31-a7cd-33e80ea8a806
     health HEALTH_OK
     monmap e3: 3 mons at {node-1=10.109.2.4:6789/0,node-2=10.109.2.6:6789/0,node-3=10.109.2.5:6789/0}, election epoch 4, quorum 0,1,2 node-1,node-3,node-2
     osdmap e39: 7 osds: 7 up, 7 in...

	Status	Importance	Assigned to	Milestone
Fuel for OpenStack	Fix Released	High	Stanislav Makar	Fuel for OpenStack 6.1
6.1.x	In Progress	High	MOS Maintenance	Fuel for OpenStack 6.1-updates
7.0.x	Fix Released	High	Stanislav Makar	Fuel for OpenStack 7.0

Fuel for OpenStack

deploy is failed. A lot of ceph OSDs are in down state.

Bug Description

Other bug subscribers

Bug attachments

Remote bug watches