Highly inefficient storage allocation on ceph-osd nodes

Bug #1585167 reported by Alexei Sheplyakov
This bug affects 1 person

Affects              Status     Importance  Assigned to        Milestone
Fuel for OpenStack   Won't Fix  High        Oleksiy Molchanov
  8.0.x              Won't Fix  High        MOS Maintenance
  Mitaka             Won't Fix  High        Oleksiy Molchanov

Bug Description

Fuel creates OSD journals within the main storage (on the rotating hard drive) even if an SSD is available.

The OSD journal is mostly write-only and gets fsync'ed very often, so it should reside on a fast
device (such as an SSD). Also, storing the journal in a filesystem (especially a journaling one)
introduces substantial overhead for no benefit, so raw block devices (partitions,
logical volumes, etc.) are more appropriate for OSD journals.
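For illustration, the desired layout can be expressed directly in ceph.conf; the partition label and size below are hypothetical examples, not what Fuel actually generates:

```
# ceph.conf fragment (sketch; device path and size are examples)
[osd.0]
    # Point the journal at a raw SSD partition instead of a file
    # inside the OSD's data filesystem on the spinning disk.
    osd journal = /dev/disk/by-partlabel/journal-osd0
    # Journal size in MB (used when Ceph creates/sizes the journal).
    osd journal size = 10240
```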

* Steps to reproduce:

  - Deploy a cluster on several nodes, each having an SSD and rotating hard drives; use Ceph for nova, cinder, and glance

* Expected results:

  - Fuel allocates OSD journal partitions on SSD on every OSD node

* Actual results:

  - Fuel allocates the OSD journal as an ordinary file residing in the main OSD storage (on the rotating hard drive)

* Impact:

  - IO to cinder volumes is extremely slow (roughly 2 -- 10 MB/s)
  - Creating or resizing glance images (of a reasonable size, < 10 GB) is extremely slow (up to a few minutes)
  - Performance and scale tests fail with a timeout, see https://bugs.launchpad.net/mos/+bug/1576669 and similar reports

* Workaround:

  - Manually repartition the SSD and move the OSD journal there. Not suitable for automated tests, though.
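The manual workaround can be sketched as follows. This is a hedged outline of the classic pre-BlueStore journal-move procedure; the OSD id (0) and target SSD partition (/dev/sdb1) are example values that must be adapted to the actual node layout, and it assumes an Upstart-managed host as deployed by Fuel/MOS of that era:

```shell
#!/bin/sh
# Sketch: move an existing OSD journal from a file on the data disk
# to a raw SSD partition. Example values -- adjust per node.
set -e

OSD_ID=0
SSD_PART=/dev/sdb1

stop ceph-osd id=$OSD_ID                      # use systemctl on systemd hosts
ceph-osd -i $OSD_ID --flush-journal           # flush pending writes from the old journal
rm -f /var/lib/ceph/osd/ceph-$OSD_ID/journal  # remove the old file-backed journal
ln -s $SSD_PART /var/lib/ceph/osd/ceph-$OSD_ID/journal  # point the OSD at the raw partition
ceph-osd -i $OSD_ID --mkjournal               # initialize the new journal on the SSD
start ceph-osd id=$OSD_ID
```

Repeating this for every OSD on every node is tedious and error-prone, which is exactly why the report asks Fuel to allocate the journal partitions automatically.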

Revision history for this message
Dmitry Pyzhov (dpyzhov) wrote :

Thank you for the report. We are aware of the issue; fixing it requires refactoring of the volume manager.

tags: added: area-python module-volumes need-bp
Revision history for this message
Alexei Sheplyakov (asheplyakov) wrote :

> we aware of the issue and it requires refactoring of volumes manager.

Nice. I'll redirect "slow glance/cinder" bug reports (like https://bugs.launchpad.net/mos/+bug/1576669) here, as well as clients experiencing the same problem.

Revision history for this message
Dmitry Pyzhov (dpyzhov) wrote :

Workaround: allocate storage manually.

Revision history for this message
Alexei Sheplyakov (asheplyakov) wrote :

> Workaround: allocate storage manually.

I thought we had higher software quality standards.

no longer affects: fuel/newton
Revision history for this message
Oleksiy Molchanov (omolchanov) wrote :

The Ceph journal can be placed on a separate partition/drive through the 'Disks configuration' menu for each node. I am not sure that we should decide how to use a client's storage drives.

Changed in fuel:
status: Confirmed → Won't Fix
Revision history for this message
Alexei Sheplyakov (asheplyakov) wrote :

> I am not sure that we must make a decision how to use client's storage drives

There's no way to avoid such a decision: a working OSD needs a journal.
The current default is to co-locate the journal with the data.
That decision is not very smart (to put it extremely mildly) and affects both clients *and*
our CI infrastructure.

> status: Confirmed → Won't Fix

No problem. The reports about listing glance images being so slow will be solved the same way.

Revision history for this message
Alexei Sheplyakov (asheplyakov) wrote :

> Ceph journal can be placed to separate partition/drive through 'Disks configuration' menu for each node.

Fuel is certainly the best automation tool on this planet.
