Ceph HIGH IO load when adding new OSDs

Bug #1430845 reported by Stanislav Makar
This bug affects 2 people
Affects                Status          Importance   Assigned to        Milestone
Fuel for OpenStack     Fix Committed   High         Stanislav Makar
5.1.x                  Won't Fix       High         MOS Maintenance
6.0.x                  Won't Fix       High         MOS Maintenance

Bug Description

When a big Ceph cluster is almost full and a lot of new OSDs are added to increase capacity, the rebalancing can lead to high I/O load and erratic behavior.

The following options are proposed to be added to ceph.conf:
osd max backfills = 1
osd recovery max active = 1
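
These two options throttle how many placement groups each OSD daemon backfills or recovers in parallel, so rebalancing competes less with client I/O. As a rough sketch (not the exact fuel-library change; the default config path and Upstart-managed OSDs on Ubuntu 14.04 are assumed), the options end up in the [osd] section of ceph.conf on every OSD node:

cat >> /etc/ceph/ceph.conf <<'EOF'
[osd]
osd_max_backfills = 1
osd_recovery_max_active = 1
EOF
# Restart the OSD daemons so the new values take effect
# (Upstart job name assumed; adjust for your init system).
restart ceph-osd-all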

Revision history for this message
Stanislav Makar (smakar) wrote :
Changed in fuel:
status: Confirmed → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-library (master)

Reviewed: https://review.openstack.org/163019
Committed: https://git.openstack.org/cgit/stackforge/fuel-library/commit/?id=c52d4fc377efe1134e8be81a18560c0a6e0138c3
Submitter: Jenkins
Branch: master

commit c52d4fc377efe1134e8be81a18560c0a6e0138c3
Author: Stanislav Makar <email address hidden>
Date: Tue Mar 10 14:23:48 2015 +0000

    Decrease I/O load when adding/removing OSD nodes

    * Set options osd_max_backfills and osd_recovery_max_active
    * Fix the options which use white space instead of underscore

    Closes-bug: #1430845
    Related-bug: #1374969
    Change-Id: I3cdcec6c5bd39e5cbc55ecf8a29b751c784851a0

Changed in fuel:
status: In Progress → Fix Committed
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-library (stable/6.0)

Fix proposed to branch: stable/6.0
Review: https://review.openstack.org/175364

Revision history for this message
Mykola Golub (mgolub) wrote :

Verified on

{
    "build_id": "2015-05-19_10-05-51",
    "build_number": "437",
    "auth_required": true,
    "fuel-ostf_sha": "9ce1800749081780b8b2a4a7eab6586583ffaf33",
    "fuel-library_sha": "2814c51668f487e97e1449b078bad1942421e6b9",
    "nailgun_sha": "593c99f2b46cf52b2be6c7c6e182b6ba9f2232cd",
    "openstack_version": "2014.2.2-6.1",
    "production": "docker",
    "api": "1.0",
    "python-fuelclient_sha": "e19f1b65792f84c4a18b5a9473f85ef3ba172fce",
    "astute_sha": "96801c5bccb14aa3f2a0d7f27f4a4b6dd2b4a548",
    "fuelmain_sha": "68796aeaa7b669e68bc0976ffd616709c937187a",
    "feature_groups": [
        "mirantis"
    ],
    "release": "6.1",
    "release_versions": {
        "2014.2.2-6.1": {
            "VERSION": {
                "build_id": "2015-05-19_10-05-51",
                "build_number": "437",
                "fuel-library_sha": "2814c51668f487e97e1449b078bad1942421e6b9",
                "nailgun_sha": "593c99f2b46cf52b2be6c7c6e182b6ba9f2232cd",
                "fuel-ostf_sha": "9ce1800749081780b8b2a4a7eab6586583ffaf33",
                "production": "docker",
                "api": "1.0",
                "python-fuelclient_sha": "e19f1b65792f84c4a18b5a9473f85ef3ba172fce",
                "astute_sha": "96801c5bccb14aa3f2a0d7f27f4a4b6dd2b4a548",
                "fuelmain_sha": "68796aeaa7b669e68bc0976ffd616709c937187a",
                "feature_groups": [
                    "mirantis"
                ],
                "release": "6.1",
                "openstack_version": "2014.2.2-6.1"
            }
        }
    }
}

using Juno on Ubuntu 14.04.1.

The settings are set correctly.
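
For reference, one way to confirm the running values on an OSD node is to query a daemon through its admin socket (a sketch, assuming osd.0 runs on the node you are logged into):

ceph daemon osd.0 config get osd_max_backfills        # should report 1
ceph daemon osd.0 config get osd_recovery_max_active  # should report 1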

Revision history for this message
Kyrylo Romanenko (kromanenko) wrote :

Does it work properly in your environment?

I tried to test a more complicated case; it looks like the image storage did not survive an interruption of the rebalancing.

Steps:
1. Created Environment:
Ubuntu 14.04.1, Neutron VLAN
1 Controller + Ceph OSD
1 Compute + Ceph OSD
Replication Factor = 2
3 nodes left not allocated.
All of this was launched on VirtualBox on my desktop.

2. Filled Glance with images (they took 70% of the Ceph storage):
# df -h
...
/dev/sdc3 64G 44G 21G 69% /var/lib/ceph/osd/ceph-0
/dev/sdb3 64G 45G 19G 71% /var/lib/ceph/osd/ceph-1

Also calculated and saved the MD5 sums of the uploaded images.

3. Then I added 3 new OSD nodes.
4. Deployment of additional nodes passed OK.
5. After the new OSD nodes were deployed, Ceph rebalancing started. It heavily loaded the host machine's hardware resources.
6. As a test, after some time of rebalancing I switched off the additional OSD nodes.
7. Checked the Ceph status:
# ceph -s
    cluster 1f5161b7-a500-4757-a486-0b6d44430a15
     health HEALTH_WARN 481 pgs degraded; 25 pgs down; 25 pgs peering; 734 pgs stale; 25 pgs stuck inactive; 734 pgs stuck stale; 506 pgs stuck unclean; recovery 1525/20738 objects degraded (7.354%)
     monmap e1: 1 mons at {node-1=192.168.0.3:6789/0}, election epoch 1, quorum 0 node-1
     osdmap e131: 10 osds: 4 up, 4 in
      pgmap v5033: 2496 pgs, 12 pools, 82212 MB data, 10369 objects
            133 GB used, 119 GB / 253 GB avail
            1525/20738 objects degraded (7.354%)
                 253 stale+active+clean
                1737 active+clean
                  25 down+peering
                 481 stale+active+degraded

8. Listing images works. Trying to download an image:
# glance image-download --file ~/dummy${i}.iso dummy${i}.iso
<html><body><h1>504 Gateway Time-out</h1>
The server didn't respond in time.
</body></html>
 (HTTP N/A)
Trying to list images one more time:
# glance image-list
<html><body><h1>503 Service Unavailable</h1>
No server is available to handle this request.
</body></html>
 (HTTP 503)
Horizon also fails to retrieve the image list.

VERSION:
  feature_groups:
    - mirantis
  production: "docker"
  release: "6.1"
  openstack_version: "2014.2.2-6.1"
  api: "1.0"
  build_number: "461"
  build_id: "2015-05-24_10-19-44"
  nailgun_sha: "76441596e4fe6420cc7819427662fa244e150177"
  python-fuelclient_sha: "e19f1b65792f84c4a18b5a9473f85ef3ba172fce"
  astute_sha: "0bd72c72369e743376864e8e8dabfe873d40450a"
  fuel-library_sha: "889c2534ceadf8afd5d1540c1cadbd913c0c8c14"
  fuel-ostf_sha: "9a5f55602c260d6c840c8333d8f32ec8cfa65c1f"
  fuelmain_sha: "5c8ebddf64ea93000af2de3ccdb4aa8bb766ce93"

Revision history for this message
Stanislav Makar (smakar) wrote :

"5. After the new OSD nodes were deployed, Ceph rebalancing started. It heavily loaded the host machine's hardware resources."

You got the above because you ran it in a virtual environment.
This patch needs to be verified on hardware nodes.

Revision history for this message
Mykola Golub (mgolub) wrote :

The failure is expected in your case. You had replication factor 2 (while 3 is recommended for production), and you ended up with a cluster that had 10 OSDs but only 4 up and in, so there was a high probability of both replicas being on the lost nodes.

It is not expected that, after adding a node, any other changes to the cluster are performed until the rebalancing is complete. It is also recommended to remove OSDs one by one.

The proposed ceph.conf options decrease the load during rebalancing and recovery, but you should not expect much from clusters like yours (several OSDs under virtual machines).
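
For a planned, short outage of OSD nodes the usual way to avoid triggering an immediate rebalance is to set the noout flag beforehand and to wait for the cluster to return to HEALTH_OK before making any further changes. A minimal sketch with the standard Ceph CLI (not Fuel-specific):

ceph osd set noout      # do not mark down OSDs "out", so no re-replication starts
# ...perform the maintenance and bring the nodes back up...
ceph osd unset noout    # restore normal behaviour
ceph -s                 # repeat until the cluster reports HEALTH_OK again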

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-library (stable/6.0)

Reviewed: https://review.openstack.org/175364
Committed: https://git.openstack.org/cgit/stackforge/fuel-library/commit/?id=cfa8b455e3801c417e5bea63d1e5bdc890f57df0
Submitter: Jenkins
Branch: stable/6.0

commit cfa8b455e3801c417e5bea63d1e5bdc890f57df0
Author: Stanislav Makar <email address hidden>
Date: Tue Mar 10 14:23:48 2015 +0000

    Decrease I/O load when adding/removing OSD nodes

    * Set options osd_max_backfills and osd_recovery_max_active
    * Fix the options which use white space instead of underscore

    Closes-bug: #1430845
    Related-bug: #1374969
    Change-Id: I3cdcec6c5bd39e5cbc55ecf8a29b751c784851a0

Revision history for this message
Miroslav Anashkin (manashkin) wrote :

This issue highly affects our existing customers with growing Ceph clusters.
However, the fix is trivial.

Re-targeted the backport from 6.0.1 to the next 6.0 and 5.1.1 maintenance updates.
Reassigned to the MOS-Sustaining team.

tags: added: customer-found
Revision history for this message
Vitaly Sedelnik (vsedelnik) wrote :

We don't support master node updates in 5.1/5.1.1 and 6.0 maintenance updates. Setting to Invalid for 5.1.1 and 6.0 updates.

Revision history for this message
Vitaly Sedelnik (vsedelnik) wrote :

As requested, below is a detailed explanation of why the fix merged to the stable/6.0 branch of fuel-library cannot be included in the maintenance updates for 6.0.

First, Puppet manifests in 6.0 are not packaged, so there is no way to deliver updated manifests to the master node via packages or update repositories.
Second, if we apply the proposed change to the Puppet manifests on the master node, it will affect all deployments and will be applied on every operation, such as adding or removing a controller or an OSD node. There is a high risk of unexpected rebalancing of Ceph nodes within a cluster after applying such updates, which is neither expected nor desired.

That is why I would recommend fixing the Ceph config files on the affected deployments and updating the Puppet manifests if needed, but not including this change in maintenance updates.
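
On a running cluster the recommended values can also be applied without touching Puppet at all, by injecting them into the live daemons. This is a sketch using the standard Ceph admin CLI; the change is lost on restart unless ceph.conf is updated as well:

# Takes effect immediately on all running OSDs; persist in ceph.conf separately.
ceph tell osd.* injectargs '--osd-max-backfills 1 --osd-recovery-max-active 1'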

Roman Rufanov (rrufanov)
tags: added: support