possible data corruption using ceph rbd with caching enabled

Bug #1627775 reported by Evgeny Kozhemyakin
This bug affects 1 person

Affects             Status        Importance  Assigned to       Milestone
Mirantis OpenStack  Fix Released  High        MOS Ceph
  7.0.x             Won't Fix     High        Alexey Stupnikov
  8.0.x             Fix Released  High        Alexey Stupnikov
  9.x               Fix Released  High        MOS Ceph

Bug Description

Detailed bug description: spurious page corruptions in SQL Server running on Windows 2012R2 instances. The instances use Ceph RBD storage with the cache enabled.
The issue is not reproducible on LVM/file-based storage.

Steps to reproduce: run SQL Server on Windows 2012R2, or SQLIOSim (a stress-test utility that emulates SQL Server I/O)

Expected results: no errors

Actual result:
Expected FileId: 0x0
Received FileId: 0x0
Expected PageId: 0xCB19C
Received PageId: 0xCB19A (does not match expected)
Received CheckSum: 0x9F444071
Calculated CheckSum: 0x89603EC9 (does not match expected)
Received Buffer Length: 0x2000

Reproducibility: reliably reproducible with SQLIOSim.
Was reproduced in:
MOS 6.0
MOS 7.0
MOS 8.0

Workaround: completely disable the RBD cache.
This is not acceptable, however, due to significant performance degradation:
SQL Server cannot keep up with the required transaction rate.
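
For reference, the RBD cache is a client-side librbd setting, so the workaround boils down to a one-line change; a minimal sketch, assuming it goes into ceph.conf on the compute nodes and that guests are restarted to pick it up:

[client]
# Workaround: disable the librbd writeback cache entirely; in the Hammer
# (0.94) series it is enabled by default.
rbd cache = false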

tags: added: customer-found
Changed in mos:
importance: Undecided → Critical
assignee: nobody → MOS Ceph (mos-ceph)
Revision history for this message
Alexei Sheplyakov (asheplyakov) wrote :

The problem looks like an application bug (a forgotten fsync(), or whatever its Windows equivalent is).
librbd/ceph is also responsible for writing the filesystem metadata, yet there are no (guest) filesystem metadata inconsistencies (no kernel panics/BSODs).

Please run a filesystem stress test, preferably a metadata-heavy one (for instance, writing a lot of small files in a single directory); see the sketch below.
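
A minimal sketch of such a test, assuming a Linux guest with Python and a hypothetical mount point /mnt/rbdtest on the RBD-backed disk (the same idea applies to a Windows guest):

import hashlib
import os

TARGET_DIR = "/mnt/rbdtest/smallfiles"  # hypothetical mount point
os.makedirs(TARGET_DIR, exist_ok=True)

# Write many small files into a single directory, fsync'ing each one so
# the flushes actually reach librbd.
digests = {}
for i in range(100000):
    path = os.path.join(TARGET_DIR, "f%06d" % i)
    data = os.urandom(512)
    with open(path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())
    digests[path] = hashlib.sha256(data).hexdigest()

# Re-read and verify. For a strict check, drop the guest page cache first
# (echo 3 > /proc/sys/vm/drop_caches) so the reads hit the virtual disk.
for path, expected in digests.items():
    with open(path, "rb") as f:
        actual = hashlib.sha256(f.read()).hexdigest()
    assert actual == expected, "corrupt data in %s" % path
print("no corruption detected")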

Revision history for this message
Alexei Sheplyakov (asheplyakov) wrote :

There's a workaround, so it's not critical at all.

Changed in mos:
importance: Critical → High
tags: added: support
Revision history for this message
Alexei Sheplyakov (asheplyakov) wrote :

For now it's not clear whether the app fails to flush the data properly or librbd corrupts the data (the former is more likely, since the OS does not complain about inconsistent filesystem metadata). Changed the bug title accordingly.

summary: - data corruption using ceph rbd with caching enabled
+ possible data corruption using ceph rbd with caching enabled
description: updated
Revision history for this message
Evgeny Kozhemyakin (ekozhemyakin) wrote :

I've changed the bug's description.

+The issue is not reproducible on LVM/file-based storage.

Please note that the workaround is not acceptable in the case of SQL transactions.
The restore rate from the SQL mirror is inadequate.

+This is not acceptable, however, due to significant performance degradation:
+SQL Server cannot keep up with the required transaction rate.

Revision history for this message
Alexei Sheplyakov (asheplyakov) wrote :

> Please note that the workaround is not acceptable in the case of SQL transactions.

Can you reproduce the bug with cache=directsync?

> The restore rate from the SQL mirror is inadequate.

What's the "SQL mirror"?

> SQL server cannot keep up with the required transaction rate.

Using Ceph as database storage is quite challenging and requires proper planning and tuning; see
https://www.youtube.com/watch?v=OqlC7S3cUKs
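
For context, cache=directsync is a QEMU disk cache mode (no host page cache, writes opened O_DIRECT+O_SYNC). On a Nova compute it can be switched via the disk_cachemodes option; a minimal sketch, assuming the network disk type covers rbd-backed disks in this Nova release and that instances are hard-rebooted afterwards:

[libvirt]
# Every acknowledged write goes to stable storage, at a latency cost.
disk_cachemodes = network=directsync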

Revision history for this message
Evgeny Kozhemyakin (ekozhemyakin) wrote :

Sorry for being unclear. Let me cite our customer.

"A few comments for the LP case to explain/comment:

The SQL server running in OpenStack is the passive slave node in an MSSQL cluster. What happens when we switch to directsync mode is that the OpenStack-hosted mirror node cannot keep up committing the stream of transactions received from the master. This cluster had, at the time, a very moderate transaction rate, and I guess it may have required 100-200 IOPS to get by.

Due to this we cannot say whether the corruptions would persist in SQL Server using cache=directsync, but as mentioned before, we cannot reproduce the bug when using the SQLIOSim tool in a much better-performing test environment.

We already know that we probably need to do work improving storage performance, but the storage being slow should not cause data to be corrupted (at least not silently), and the errors can be reproduced in SQLIOSim running solitary in pure SSD Ceph pools where performance is good."

Revision history for this message
Evgeny Kozhemyakin (ekozhemyakin) wrote :

Guys, could we please elevate the importance level? This is a really critical issue.

tags: added: area-ceph
Revision history for this message
Victor Denisov (vdenisov) wrote :

So far the theory is that we've encountered a race condition in the QEMU process. Due to the high latency of Ceph storage (compared to a local hard drive), MSSQL (or QEMU) supposedly forgets to flush some data.
It works fine with local hard drives, but inevitably leads to an issue with high-latency devices.
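
Worth noting against this theory: librbd has a guard for exactly such guests. A sketch of the relevant client-side ceph.conf knob, assuming the Hammer (0.94) series used here (verify the default in your build):

[client]
rbd cache = true
# Keep the cache in writethrough mode until the guest issues its first
# flush, so a guest that never flushes cannot lose acknowledged writes.
rbd cache writethrough until flush = true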

Revision history for this message
Evgeny Kozhemyakin (ekozhemyakin) wrote :
Revision history for this message
Alexei Sheplyakov (asheplyakov) wrote :
Revision history for this message
Rodion Tikunov (rtikunov) wrote :

Patch https://review.fuel-infra.org/#/c/25721/ has been merged, so the fix is committed.

tags: added: on-verification
Revision history for this message
TatyanaGladysheva (tgladysheva) wrote :

Verified on 9.2 snapshot #537.

Ceph was updated to version 0.94.9:
root@node-4:~# dpkg -l | grep ceph | grep 0.94.9
ii ceph 0.94.9-1~u14.04+mos1 amd64 distributed storage and file system
ii ceph-common 0.94.9-1~u14.04+mos1 amd64 common utilities to mount and interact with a ceph storage cluster
ii libcephfs1 0.94.9-1~u14.04+mos1 amd64 Ceph distributed file system client library
ii python-ceph 0.94.9-1~u14.04+mos1 all Meta-package for python libraries for the Ceph libraries
ii python-cephfs 0.94.9-1~u14.04+mos1 amd64 Python libraries for the Ceph libcephfs library

tags: removed: on-verification
Revision history for this message
Alexey Stupnikov (astupnikov) wrote :

I have spoken with Alexei Sheplyakov and Denis Meltsaykin, and we concluded that we shouldn't merge a fix to the stable/7.0 branch. The proposed patch contains a lot of changes and could break existing installations if existing nodes aren't updated properly, or if new nodes are deployed without updating the old ones.

If a Ceph cluster deployed with Fuel 7 is to be updated, the new packages can be downloaded from [1].

[1] http://perestroika-repo-tst.infra.mirantis.net/review/LP-1532882/mos-repos/ubuntu/7.0/
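
A minimal sketch of consuming that repo on an Ubuntu 14.04 node; the suite/component names here are hypothetical, so check the actual layout under [1] first:

echo "deb http://perestroika-repo-tst.infra.mirantis.net/review/LP-1532882/mos-repos/ubuntu/7.0 mos7.0 main" \
  > /etc/apt/sources.list.d/lp1627775-ceph.list
apt-get update
# Upgrade only the Ceph packages shipped from that repo.
apt-get install --only-upgrade ceph ceph-common libcephfs1 python-ceph python-cephfs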

Revision history for this message
Alexey Stupnikov (astupnikov) wrote :

It looks like the patch to be merged to stable/8.0 shouldn't break anything. I will nominate it for the next MU.

Revision history for this message
Alexey Stupnikov (astupnikov) wrote :

Changed the 7.0-updates status to Won't Fix, as updated Ceph packages will not be shipped with the next MU. The workaround is to install them properly from the temporary build system.

Revision history for this message
Alexey Stupnikov (astupnikov) wrote :

The MOS-linux team would like to review patches with a successful systest, so we need to troubleshoot what is wrong with https://packaging-ci.infra.mirantis.net/job/8.0-pkg-systest-ubuntu/3397/

Repo to check: fuel-infra/jenkins-jobs

Revision history for this message
Fuel Devops McRobotson (fuel-devops-robot) wrote : Fix merged to packages/trusty/ceph (8.0)

Reviewed: https://review.fuel-infra.org/27739
Submitter: Pkgs Jenkins <email address hidden>
Branch: 8.0

Commit: 35f0e943a0b7f4b57e3196031442f2808f304e22
Author: Alexei Sheplyakov <email address hidden>
Date: Thu Mar 23 08:29:29 2017

Fix possible rbd data corruption

Fixes http://tracker.ceph.com/issues/17545

Closes-bug: #1627775
Change-Id: Ia016914438da8ff649c86e0d1c46de728fa23707

tags: added: on-verification
Revision history for this message
TatyanaGladysheva (tgladysheva) wrote :

Verified on 8.0 + MU4 updates.

Ceph was updated to version 0.94.5-0u~u14.04+mos3+mos8.0+3:
root@node-17:~# dpkg -l | grep ceph | grep 0.94.5
ii ceph 0.94.5-0u~u14.04+mos3+mos8.0+3 amd64 distributed storage and file system
ii ceph-common 0.94.5-0u~u14.04+mos3+mos8.0+3 amd64 common utilities to mount and interact with a ceph storage cluster
ii libcephfs1 0.94.5-0u~u14.04+mos3+mos8.0+3 amd64 Ceph distributed file system client library
ii python-ceph 0.94.5-0u~u14.04+mos3+mos8.0+3 all Meta-package for python libraries for the Ceph libraries
ii python-cephfs 0.94.5-0u~u14.04+mos3+mos8.0+3 amd64 Python libraries for the Ceph libcephfs library

tags: removed: on-verification