Increase kernel.pid_max parameter value to meet Ceph requirements

Bug #1536271 reported by Miroslav Anashkin
This bug affects 2 people
| Affects | Status | Importance | Assigned to | Milestone |
|---|---|---|---|---|
| Fuel for OpenStack | Fix Released | High | Maksim Malchuk | |
| 6.1.x | Fix Released | High | Maksim Malchuk | |
| 7.0.x | Fix Released | High | Maksim Malchuk | |
| 8.0.x | Fix Released | High | Maksim Malchuk | |

Bug Description

Fuel deploys the nodes with the default kernel.pid_max=32768 value.

However, a Ceph OSD node can require more than 1000 PIDs for each OSD running on it, even when idle. The number of PIDs required grows with the number of placement groups configured and with the load on the OSDs.

When the node runs out of available PIDs, all OSDs running on it restart. Such a restart can sometimes trigger cascading OSD node failures.
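
To make the numbers above concrete, here is a rough sketch that estimates how many OSDs fit under a given kernel.pid_max and counts the threads currently alive on a node. The figure of 1000 PIDs per OSD comes from this report; the 4000 figure for loaded OSDs is purely hypothetical, and the function names are illustrative. The estimate ignores every other process on the node, so real headroom is smaller.

```shell
#!/bin/sh
# Sketch only: rough PID budget arithmetic for a Ceph OSD node.

# max_osds PID_MAX PIDS_PER_OSD
# Prints how many OSDs fit under PID_MAX if each one needs PIDS_PER_OSD
# PIDs (integer division; all other processes are ignored).
max_osds() {
    echo $(( $1 / $2 ))
}

# count_threads
# Prints the number of threads currently alive, counted from the
# per-task directories under /proc (Linux only). Every thread
# consumes one PID from the kernel.pid_max range.
count_threads() {
    n=0
    for t in /proc/[0-9]*/task/[0-9]*; do
        [ -e "$t" ] && n=$((n + 1))
    done
    echo "$n"
}

max_osds 32768 1000   # default limit, idle OSDs: prints 32
max_osds 32768 4000   # hypothetical load of 4000 PIDs per OSD: prints 8
```

Comparing the output of `count_threads` with `cat /proc/sys/kernel/pid_max` on a live node gives a quick idea of how close the node is to the limit.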

Related Ceph issue: http://tracker.ceph.com/issues/10988 (after reading the description, skip to the last few posts)

I propose changing Fuel Library to set kernel.pid_max=4194303 out of the box on every deployed node.

The workaround is to raise the PID limit manually. This can be done on the fly with either of the following commands:

`sysctl -w kernel.pid_max=4194303`
`echo 4194303 > /proc/sys/kernel/pid_max`
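
Both commands change only the running kernel, so the value is lost on reboot. A minimal sketch of one way to persist it is a sysctl drop-in file. The file name `90-ceph-pid-max.conf` and the helper function are assumptions made for illustration; this is not how the Fuel Library fix itself applies the setting.

```shell
#!/bin/sh
# Sketch only: persist kernel.pid_max through a sysctl drop-in file.

# write_pid_max_conf DIR VALUE
# Writes a drop-in file setting kernel.pid_max to VALUE into DIR and
# prints the path of the file it wrote.
write_pid_max_conf() {
    conf="$1/90-ceph-pid-max.conf"   # hypothetical file name
    printf 'kernel.pid_max = %s\n' "$2" > "$conf"
    echo "$conf"
}

# On a real node, as root, the intended use would be:
#   conf=$(write_pid_max_conf /etc/sysctl.d 4194303)
#   sysctl -p "$conf"    # load the file into the running kernel
```

Taking the target directory as a parameter lets the function be tried against a scratch directory before touching /etc/sysctl.d.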

description: updated
Dmitry Pyzhov (dpyzhov)
tags: added: area-library
no longer affects: fuel/mitaka
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-library (master)

Fix proposed to branch: master
Review: https://review.openstack.org/270955

Changed in fuel:
status: Confirmed → In Progress
tags: added: team-bugfix
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-library (master)

Reviewed: https://review.openstack.org/270955
Committed: https://git.openstack.org/cgit/openstack/fuel-library/commit/?id=2cec77da86f2148a9e256bd46d58aaad2de874ea
Submitter: Jenkins
Branch: master

commit 2cec77da86f2148a9e256bd46d58aaad2de874ea
Author: Maksim Malchuk <email address hidden>
Date: Thu Jan 21 20:28:45 2016 +0300

    Increase kernel.pid_max value for ceph-osd nodes

    To meet Ceph requirements and eliminate accidental cascade
    OSD node failures this commit increases kernel.pid_max
    parameter during deploy ceph-osd nodes. Also this commit
    contain some puppet-lint cleanups and validations for this
    module.

    Change-Id: I70a64ff7b85229bf0760be266302d9b32039989a
    Closes-Bug: #1536271
    Closes-Bug: #1537084

Changed in fuel:
status: In Progress → Fix Committed
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-library (stable/8.0)

Fix proposed to branch: stable/8.0
Review: https://review.openstack.org/272019

OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-library (stable/7.0)

Fix proposed to branch: stable/7.0
Review: https://review.openstack.org/272056

Maksim Malchuk (mmalchuk) wrote :

Fix proposed to branch: stable/6.1
Review: https://review.openstack.org/272064

OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-library (stable/6.1)

Reviewed: https://review.openstack.org/272064
Committed: https://git.openstack.org/cgit/openstack/fuel-library/commit/?id=898c29dea9cb5dc0ccd2717158c6e9c238492bc2
Submitter: Jenkins
Branch: stable/6.1

commit 898c29dea9cb5dc0ccd2717158c6e9c238492bc2
Author: Maksim Malchuk <email address hidden>
Date: Thu Jan 21 20:28:45 2016 +0300

    Increase kernel.pid_max value for ceph-osd nodes

    To meet Ceph requirements and eliminate accidental cascade
    OSD node failures this commit increases kernel.pid_max
    parameter during deploy ceph-osd nodes. Also this commit
    contain some puppet-lint cleanups and validations for this
    module.

    Change-Id: I70a64ff7b85229bf0760be266302d9b32039989a
    (cherry-picked from commit 2cec77da86f2148a9e256bd46d58aaad2de874ea)
    Closes-Bug: #1536271
    Closes-Bug: #1537084

OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-library (stable/8.0)

Reviewed: https://review.openstack.org/272019
Committed: https://git.openstack.org/cgit/openstack/fuel-library/commit/?id=d4dbeb56ca8d8aa161f4491e1cc2cd94988ddce0
Submitter: Jenkins
Branch: stable/8.0

commit d4dbeb56ca8d8aa161f4491e1cc2cd94988ddce0
Author: Maksim Malchuk <email address hidden>
Date: Thu Jan 21 20:28:45 2016 +0300

    Increase kernel.pid_max value for ceph-osd nodes

    To meet Ceph requirements and eliminate accidental cascade
    OSD node failures this commit increases kernel.pid_max
    parameter during deploy ceph-osd nodes. Also this commit
    contain some puppet-lint cleanups and validations for this
    module.

    Change-Id: I70a64ff7b85229bf0760be266302d9b32039989a
    (cherry-picked from commit 2cec77da86f2148a9e256bd46d58aaad2de874ea)
    Closes-Bug: #1536271
    Closes-Bug: #1537084

Dmitry Belyaninov (dbelyaninov) wrote :

Verified on ISO #507

VERSION:
  feature_groups:
    - mirantis
  production: "docker"
  release: "8.0"
  api: "1.0"
  build_number: "507"
  build_id: "507"
  fuel-nailgun_sha: "8e954abd70ef0083109f34289de2553dcda544d4"
  python-fuelclient_sha: "4f234669cfe88a9406f4e438b1e1f74f1ef484a5"
  fuel-agent_sha: "658be72c4b42d3e1436b86ac4567ab914bfb451b"
  fuel-nailgun-agent_sha: "b2bb466fd5bd92da614cdbd819d6999c510ebfb1"
  astute_sha: "b81577a5b7857c4be8748492bae1dec2fa89b446"
  fuel-library_sha: "ec7e212972ead554f21b52b9e165156665f659df"
  fuel-ostf_sha: "ab5fd151fc6c1aa0b35bc2023631b1f4836ecd61"
  fuel-mirror_sha: "351d568fa3b3e4dd062054b91d766aa54d379867"
  fuelmenu_sha: "234cb4cbb30fbd2df00f388c28f31606d9cae15f"
  shotgun_sha: "63645dea384a37dde5c01d4f8905566978e5d906"
  network-checker_sha: "a43cf96cd9532f10794dce736350bf5bed350e9d"
  fuel-upgrade_sha: "616a7490ec7199f69759e97e42f9b97dfc87e85b"
  fuelmain_sha: "94507c5e4dad6d8cfbd8f5d41aa8389d5335990a"

The max PID value is updated on the Ceph node:

root@node-1:~# sysctl kernel.pid_max
kernel.pid_max = 4194303

tags: added: on-verification
Ekaterina Shutova (eshutova) wrote :

Verified on 6.1-mu-5.

root@node-10:~# sysctl kernel.pid_max
kernel.pid_max = 4194303

tags: removed: on-verification
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-library (stable/7.0)

Reviewed: https://review.openstack.org/272056
Committed: https://git.openstack.org/cgit/openstack/fuel-library/commit/?id=92130ddbccf34e81dc14e7c738500884dc13ac37
Submitter: Jenkins
Branch: stable/7.0

commit 92130ddbccf34e81dc14e7c738500884dc13ac37
Author: Maksim Malchuk <email address hidden>
Date: Thu Jan 21 20:28:45 2016 +0300

    Increase kernel.pid_max value for ceph-osd nodes

    To meet Ceph requirements and eliminate accidental cascade
    OSD node failures this commit increases kernel.pid_max
    parameter during deploy ceph-osd nodes. Also this commit
    contain some puppet-lint cleanups and validations for this
    module.

    Change-Id: I70a64ff7b85229bf0760be266302d9b32039989a
    (cherry-picked from commit 2cec77da86f2148a9e256bd46d58aaad2de874ea)
    Closes-Bug: #1536271
    Closes-Bug: #1537084

tags: added: on-verification
Ekaterina Shutova (eshutova) wrote :

Verified on MOS 7.0 mu3 updates.

On a node with Ceph:
root@node-8:~# sysctl kernel.pid_max
kernel.pid_max = 4194303

tags: removed: on-verification
Mikhail Samoylov (msamoylov) wrote :

Verified for ISO: fuel-9.0-200-2016-04-14_08-00-00.iso.torrent

root@node-2:~# sysctl kernel.pid_max
kernel.pid_max = 4194303
root@node-2:~# hiera roles
["ceph-osd", "compute"]
root@node-2:~#
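
The manual `sysctl kernel.pid_max` checks used for verification in this report could also be scripted. The sketch below is one possible shape for such a check; the function names and the "at least the expected value" rule are assumptions, not part of the Fuel test suite.

```shell
#!/bin/sh
# Sketch only: check that kernel.pid_max is at least an expected value.

# pid_max_ok ACTUAL EXPECTED
# Succeeds (exit status 0) when ACTUAL is greater than or equal to EXPECTED.
pid_max_ok() {
    [ "$1" -ge "$2" ]
}

# check_node EXPECTED
# Reads the live value from /proc (Linux only), prints a one-line
# report, and returns non-zero if the value is below EXPECTED.
check_node() {
    actual=$(cat /proc/sys/kernel/pid_max)
    if pid_max_ok "$actual" "$1"; then
        echo "OK: kernel.pid_max = $actual"
    else
        echo "LOW: kernel.pid_max = $actual (expected >= $1)"
        return 1
    fi
}
```

Running `check_node 4194303` on a ceph-osd node deployed with the fix should report OK.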

tags: added: on-verification
Changed in fuel:
status: Fix Committed → Fix Released
tags: removed: on-verification