Ceph HIGH IO load when adding new OSDs

Bug #1430845 reported by Stanislav Makar
This bug affects 2 people
Affects                Status          Importance   Assigned to        Milestone
Fuel for OpenStack     Fix Committed   High         Stanislav Makar
5.1.x                  Won't Fix       High         MOS Maintenance
6.0.x                  Won't Fix       High         MOS Maintenance

Bug Description

When a big Ceph cluster is almost full and a lot of new OSDs are added to increase capacity, the rebalancing can lead to high I/O load and erratic behavior.

The following options are proposed to be added to ceph.conf:
osd max backfills = 1
osd recovery max active = 1
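
These two options throttle how many placement groups each OSD daemon backfills or recovers in parallel, so rebalancing competes less with client I/O. As a rough sketch (not the exact fuel-library change; the default config path and Upstart-managed OSDs on Ubuntu 14.04 are assumed), the options end up in the [osd] section of ceph.conf on every OSD node:

cat >> /etc/ceph/ceph.conf <<'EOF'
[osd]
osd_max_backfills = 1
osd_recovery_max_active = 1
EOF
# Restart the OSD daemons so the new values take effect
# (Upstart job name assumed; adjust for your init system).
restart ceph-osd-all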

Revision history for this message
Stanislav Makar (smakar) wrote :
Changed in fuel:
status: Confirmed → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-library (master)

Reviewed: https://review.openstack.org/163019
Committed: https://git.openstack.org/cgit/stackforge/fuel-library/commit/?id=c52d4fc377efe1134e8be81a18560c0a6e0138c3
Submitter: Jenkins
Branch: master

commit c52d4fc377efe1134e8be81a18560c0a6e0138c3
Author: Stanislav Makar <email address hidden>
Date: Tue Mar 10 14:23:48 2015 +0000

    Decrease I/O load when adding/removing OSD nodes

    * Set options osd_max_backfills and osd_recovery_max_active
    * Fix the options which use white space instead of underscore

    Closes-bug: #1430845
    Related-bug: #1374969
    Change-Id: I3cdcec6c5bd39e5cbc55ecf8a29b751c784851a0

Changed in fuel:
status: In Progress → Fix Committed
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-library (stable/6.0)

Fix proposed to branch: stable/6.0
Review: https://review.openstack.org/175364

Revision history for this message
Mykola Golub (mgolub) wrote :

Verified on

{
    "build_id": "2015-05-19_10-05-51",
    "build_number": "437",
    "auth_required": true,
    "fuel-ostf_sha": "9ce1800749081780b8b2a4a7eab6586583ffaf33",
    "fuel-library_sha": "2814c51668f487e97e1449b078bad1942421e6b9",
    "nailgun_sha": "593c99f2b46cf52b2be6c7c6e182b6ba9f2232cd",
    "openstack_version": "2014.2.2-6.1",
    "production": "docker",
    "api": "1.0",
    "python-fuelclient_sha": "e19f1b65792f84c4a18b5a9473f85ef3ba172fce",
    "astute_sha": "96801c5bccb14aa3f2a0d7f27f4a4b6dd2b4a548",
    "fuelmain_sha": "68796aeaa7b669e68bc0976ffd616709c937187a",
    "feature_groups": [
        "mirantis"
    ],
    "release": "6.1",
    "release_versions": {
        "2014.2.2-6.1": {
            "VERSION": {
                "build_id": "2015-05-19_10-05-51",
                "build_number": "437",
                "fuel-library_sha": "2814c51668f487e97e1449b078bad1942421e6b9",
                "nailgun_sha": "593c99f2b46cf52b2be6c7c6e182b6ba9f2232cd",
                "fuel-ostf_sha": "9ce1800749081780b8b2a4a7eab6586583ffaf33",
                "production": "docker",
                "api": "1.0",
                "python-fuelclient_sha": "e19f1b65792f84c4a18b5a9473f85ef3ba172fce",
                "astute_sha": "96801c5bccb14aa3f2a0d7f27f4a4b6dd2b4a548",
                "fuelmain_sha": "68796aeaa7b669e68bc0976ffd616709c937187a",
                "feature_groups": [
                    "mirantis"
                ],
                "release": "6.1",
                "openstack_version": "2014.2.2-6.1"
            }
        }
    }
}

using Juno on Ubuntu 14.04.1.

The settings are set correctly.
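
For reference, one way to confirm the running values on an OSD node is to query a daemon through its admin socket (a sketch, assuming osd.0 runs on the node you are logged into):

ceph daemon osd.0 config get osd_max_backfills        # should report 1
ceph daemon osd.0 config get osd_recovery_max_active  # should report 1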

Revision history for this message
Kyrylo Romanenko (kromanenko) wrote :

Does it work properly in your environment?

I tried to test a more complicated case; it looks like the image storage did not survive an interruption of the rebalancing.

Steps:
1. Created Environment:
Ubuntu 14.04.1, Neutron VLAN
1 Controller + Ceph OSD
1 Compute + Ceph OSD
Replication Factor = 2
3 nodes left not allocated.
All of this was launched on VirtualBox on my desktop.

2. Filled Glance with images (they took 70% of the Ceph storage):
# df -h
...
/dev/sdc3 64G 44G 21G 69% /var/lib/ceph/osd/ceph-0
/dev/sdb3 64G 45G 19G 71% /var/lib/ceph/osd/ceph-1

Also calculated and saved the MD5 sums of the uploaded images.

3. Then I added 3 new OSD nodes.
4. Deployment of additional nodes passed OK.
5. After the new OSD nodes were deployed, Ceph rebalancing started. It heavily loaded the host machine's hardware resources.
6. As a test, after some time of rebalancing I switched off the additional OSD nodes.
7. Checked the Ceph status:
# ceph -s
    cluster 1f5161b7-a500-4757-a486-0b6d44430a15
     health HEALTH_WARN 481 pgs degraded; 25 pgs down; 25 pgs peering; 734 pgs stale; 25 pgs stuck inactive; 734 pgs stuck stale; 506 pgs stuck unclean; recovery 1525/20738 objects degraded (7.354%)
     monmap e1: 1 mons at {node-1=192.168.0.3:6789/0}, election epoch 1, quorum 0 node-1
     osdmap e131: 10 osds: 4 up, 4 in
      pgmap v5033: 2496 pgs, 12 pools, 82212 MB data, 10369 objects
            133 GB used, 119 GB / 253 GB avail
            1525/20738 objects degraded (7.354%)
                 253 stale+active+clean
                1737 active+clean
                  25 down+peering
                 481 stale+active+degraded

8. Listing images works. Trying to download an image:
# glance image-download --file ~/dummy${i}.iso dummy${i}.iso
<html><body><h1>504 Gateway Time-out</h1>
The server didn't respond in time.
</body></html>
 (HTTP N/A)
Trying to list images one more time:
# glance image-list
<html><body><h1>503 Service Unavailable</h1>
No server is available to handle this request.
</body></html>
 (HTTP 503)
Horizon also fails to retrieve the image list.

VERSION:
  feature_groups:
    - mirantis
  production: "docker"
  release: "6.1"
  openstack_version: "2014.2.2-6.1"
  api: "1.0"
  build_number: "461"
  build_id: "2015-05-24_10-19-44"
  nailgun_sha: "76441596e4fe6420cc7819427662fa244e150177"
  python-fuelclient_sha: "e19f1b65792f84c4a18b5a9473f85ef3ba172fce"
  astute_sha: "0bd72c72369e743376864e8e8dabfe873d40450a"
  fuel-library_sha: "889c2534ceadf8afd5d1540c1cadbd913c0c8c14"
  fuel-ostf_sha: "9a5f55602c260d6c840c8333d8f32ec8cfa65c1f"
  fuelmain_sha: "5c8ebddf64ea93000af2de3ccdb4aa8bb766ce93"

Revision history for this message
Stanislav Makar (smakar) wrote :

"5. After the new OSD nodes were deployed, Ceph rebalancing started. It heavily loaded the host machine's hardware resources."

You got the above because you ran it in a virtual environment.
This patch needs to be verified on hardware nodes.

Revision history for this message
Mykola Golub (mgolub) wrote :

The failure is expected in your case. You had replication factor 2 (while 3 is recommended for production), and you ended up with a cluster that had 10 OSDs but only 4 up and in, so there was a high probability of both replicas being on the lost nodes.

It is not expected that, after adding a node, any other changes to the cluster are performed until the rebalancing is complete. It is also recommended to remove OSDs one by one.

The proposed ceph.conf options decrease the load during rebalancing and recovery, but you should not expect much from clusters like yours (several OSDs under virtual machines).
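
For a planned, short outage of OSD nodes the usual way to avoid triggering an immediate rebalance is to set the noout flag beforehand and to wait for the cluster to return to HEALTH_OK before making any further changes. A minimal sketch with the standard Ceph CLI (not Fuel-specific):

ceph osd set noout      # do not mark down OSDs "out", so no re-replication starts
# ...perform the maintenance and bring the nodes back up...
ceph osd unset noout    # restore normal behaviour
ceph -s                 # repeat until the cluster reports HEALTH_OK again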

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-library (stable/6.0)

Reviewed: https://review.openstack.org/175364
Committed: https://git.openstack.org/cgit/stackforge/fuel-library/commit/?id=cfa8b455e3801c417e5bea63d1e5bdc890f57df0
Submitter: Jenkins
Branch: stable/6.0

commit cfa8b455e3801c417e5bea63d1e5bdc890f57df0
Author: Stanislav Makar <email address hidden>
Date: Tue Mar 10 14:23:48 2015 +0000

    Decrease I/O load when adding/removing OSD nodes

    * Set options osd_max_backfills and osd_recovery_max_active
    * Fix the options which use white space instead of underscore

    Closes-bug: #1430845
    Related-bug: #1374969
    Change-Id: I3cdcec6c5bd39e5cbc55ecf8a29b751c784851a0

Revision history for this message
Miroslav Anashkin (manashkin) wrote :

This issue highly affects our existing customers with growing Ceph clusters.
However, the fix is trivial.

Re-targeted the backport from 6.0.1 to the next 6.0 and 5.1.1 maintenance updates.
Reassigned to the MOS-Sustaining team.

tags: added: customer-found
Revision history for this message
Vitaly Sedelnik (vsedelnik) wrote :

We don't support master node updates in 5.1/5.1.1 and 6.0 maintenance updates. Setting to Invalid for 5.1.1 and 6.0 updates.

Revision history for this message
Vitaly Sedelnik (vsedelnik) wrote :

As requested, below is a detailed explanation of why the fix merged to the stable/6.0 branch of fuel-library cannot be included in the maintenance updates for 6.0.

First, Puppet manifests in 6.0 are not packaged, so there is no way to deliver updated manifests to the master node via packages or update repositories.
Second, if we apply the proposed change to the Puppet manifests on the master node, it will affect all deployments and will be applied on every operation, such as adding or removing a controller or an OSD node. There is a high risk of unexpected rebalancing of Ceph nodes within a cluster after applying such updates, which is neither expected nor desired.

That is why I would recommend fixing the Ceph config files on the affected deployments and updating the Puppet manifests if needed, but not including this change in maintenance updates.
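
On a running cluster the recommended values can also be applied without touching Puppet at all, by injecting them into the live daemons. This is a sketch using the standard Ceph admin CLI; the change is lost on restart unless ceph.conf is updated as well:

# Takes effect immediately on all running OSDs; persist in ceph.conf separately.
ceph tell osd.* injectargs '--osd-max-backfills 1 --osd-recovery-max-active 1'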

Roman Rufanov (rrufanov)
tags: added: support