[backport] Optional storage clear on delete

Bug #1102004 reported by Attila Fazekas on 2013-01-20
This bug affects 2 people
Affects  Status     Importance  Assigned to    Milestone
cinder   Won't Fix  High        John Griffith

Bug Description

The storage clear on delete has significant I/O impact

When deleting a large volume, you have to wait too long before you can use your storage space again.

On systems with limited I/O performance, such as the OpenStack gate (test) system, it has a significant impact on performance and reliability (test timing).

In private clouds where you consider all data in the volume group as your data, you do not need to sanitize the storage.

Please make this clearing step configurable (default ON), even if the option comes with plenty of warnings.

The device-mapper thin provisioning layer can sanitize chunks on first access, so in that case you do not need to "manually" sanitize on delete. It requires Linux kernel 3.2 or later, and you can use it with the dmsetup command.
Newer lvm2 packages also support it through the lvcreate command.

It is configurable in the master branch, but not in stable/folsom.

Yaniv Kaul (ykaul) wrote :

The default should probably still be ON.
What granularity can we have for this config option? A global setting makes little sense, but per-tenant may not be feasible either, as tenants may be sharing storage space, no?

Can you expand on the LVM capability?

(Also long term, the storage should do this for us. I hope there's a Cinder API for it).

description: updated
Robert Collins (lifeless) wrote :

Given this is a test-environment need, having a simple global option should be fine. Let's not overcomplicate things.

Robert Collins (lifeless) wrote :

@Attila - how many GB of data gets written during a tempest run? I wonder if there's some other performance bug we haven't identified (e.g. unnecessary fsyncs) at play as well?

Attila Fazekas (afazekas) wrote :

I just changed the description to default on, before I saw your comment.

Tenants share storage space, so if you have different customers who MUST NOT see each other's data, the sanitize MUST be ON. But if you are just using tenants to separate projects, and all data in the cluster is yours, you may not need to sanitize.

The kernel's thin-provisioning documentation at http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob;f=Documentation/device-mapper/thin-provisioning.txt;h=30b8b83bd333401a2cc1138d664d6086b4d47aef;hb=HEAD#l227 describes this feature.

"skip_block_zeroing: Skip the zeroing of newly-provisioned blocks."
If you don't turn it on, you get clean blocks, and the I/O only happens when it is needed.

The new man page describes how to use it with lvm2:

According to this message you need at least lvm2-2.02.89 to try it.
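For anyone who wants to try the lvm2 route, here is a rough sketch of the two lvcreate invocations involved, written as Python helpers that only build the argument lists. The flag spellings follow current lvm2 man pages and may differ on older releases such as 2.02.89; the helper names and the volume group, pool, and volume names are hypothetical placeholders, not anything Cinder itself uses.

```python
def thin_pool_create_cmd(vg, pool, size, zero_new_blocks=True):
    """Build an lvcreate command line for a thin pool (hypothetical helper).

    --zero y asks the pool to zero newly provisioned blocks, which is the
    opposite of the kernel target's skip_block_zeroing option.
    """
    return [
        "lvcreate",
        "--type", "thin-pool",
        "--name", pool,
        "--size", size,
        "--zero", "y" if zero_new_blocks else "n",
        vg,
    ]


def thin_volume_create_cmd(vg, pool, name, virtual_size):
    """Build an lvcreate command line for a thin volume inside that pool."""
    return [
        "lvcreate",
        "--type", "thin",
        "--name", name,
        "--virtualsize", virtual_size,
        "--thinpool", pool,
        vg,
    ]


if __name__ == "__main__":
    # Placeholder names; print the commands instead of running them.
    print(" ".join(thin_pool_create_cmd("test-vg", "pool0", "10G")))
    print(" ".join(thin_volume_create_cmd("test-vg", "pool0", "vol1", "1G")))
```

With zeroing left on in the pool, a newly created thin volume should read back as clean blocks without an explicit wipe of the old one.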

Attila Fazekas (afazekas) wrote :

Tempest creates a 1 GB volume 100 times.

The test systems run on a VM with 1 CPU and 4 GB of RAM, so we probably have ~1 GB for page cache. When it is full, it needs to flush the data anyway.

Depending on the VM's storage configuration and the underlying host system's load and capabilities, this can cause I/O wait in the test VM, during which the waiting CPU thread can't do anything else. It can be really bad on a one-core VM.
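To put rough numbers on this, a back-of-the-envelope sketch: 100 volumes of 1 GB each means about 100 GB of zeroes if every delete wipes the whole volume. The 50 MiB/s throughput below is purely an assumed figure for illustration, not a measurement from the gate.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def total_zeroed_bytes(volume_count, volume_size_gib):
    """Bytes written if every deleted volume is zeroed end to end."""
    return volume_count * volume_size_gib * GIB


def wipe_seconds(total_bytes, throughput_mib_per_s):
    """Time spent writing zeroes at a given sustained throughput."""
    return total_bytes / (throughput_mib_per_s * MIB)


if __name__ == "__main__":
    total = total_zeroed_bytes(volume_count=100, volume_size_gib=1)
    assumed_throughput = 50  # MiB/s, hypothetical value for a loaded test VM
    seconds = wipe_seconds(total, assumed_throughput)
    print(f"{total / GIB:.0f} GiB zeroed, about {seconds / 60:.0f} minutes "
          f"at {assumed_throughput} MiB/s")
```

The total is also roughly 100 times the ~1 GB available for page cache, so caching cannot absorb it whatever the write mode.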

Yaniv Kaul (ykaul) wrote :

Page cache is not going to help when clearing the block storage, which is supposed to use direct IO (or fsync).

Avishay Traeger (avishay-il) wrote :

Clearing the volume for LVM has been made a parameter. It is called "volume_clear" and valid options are "none", "zero", and "shred". I too would be interested if you could share the new LVM capabilities. Changing status of bug since it is fixed, and we may improve on it later.
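As an illustration of what the three values mean, here is a minimal sketch that maps a volume_clear setting to the wipe command one might run on delete. It is not the actual Cinder code: the function name, the dd/shred arguments, and the device path are assumptions chosen for the example.

```python
def build_clear_command(volume_clear, device_path, size_mib):
    """Return the argv for wiping device_path, or None if nothing should run.

    Hypothetical helper; only the option values "none", "zero" and "shred"
    come from the volume_clear setting described above.
    """
    if volume_clear == "none":
        # Skip the wipe entirely; only safe when all data on the VG is yours.
        return None
    if volume_clear == "zero":
        # Single pass of zeroes, bypassing the page cache.
        return [
            "dd", "if=/dev/zero", f"of={device_path}",
            "bs=1M", f"count={size_mib}", "oflag=direct",
        ]
    if volume_clear == "shred":
        # Multiple overwrite passes; much slower than zeroing.
        return ["shred", "--iterations=3", f"--size={size_mib}M", device_path]
    raise ValueError(f"unsupported volume_clear value: {volume_clear!r}")


if __name__ == "__main__":
    # Placeholder device path; print the commands instead of running them.
    for mode in ("none", "zero", "shred"):
        print(mode, build_clear_command(mode, "/dev/test-vg/volume-1", 1024))
```

Returning None for "none" keeps the decision in one place, so the delete path simply skips the wipe when no command comes back.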

Changed in cinder:
status: New → Fix Committed
John Griffith (john-griffith) wrote :

As Avishay points out, you can configure the method used via volume_clear. With respect to gate tests, including tempest: we used to *always* zero out the volumes on delete, but Ubuntu 12.04 introduced an issue with kernel hangs when trying to do the delete. Since then we've configured the gate tests NOT to perform the secure delete or wipe, and to my knowledge that's still the case.

Looking at recent gate jobs, delete time on volumes is approximately 2 seconds, so I don't see the issue from that perspective.

Changed in cinder:
status: Fix Committed → Invalid
Attila Fazekas (afazekas) wrote :

Thank you. Ohh, I am blind.

summary: - Optional storage clear on delete
+ [backport] Optional storage clear on delete
description: updated
Changed in cinder:
status: Invalid → New
Pavel Sedlák (psedlak) wrote :

Configurable clearing of volumes was introduced in https://github.com/openstack/cinder/commit/bb06ebd0f6a75a6ba55a7c022de96a91e3750d20 but I don't know whether any other commits/features are also required for this backport.

Changed in cinder:
assignee: nobody → John Griffith (john-griffith)
status: New → Confirmed
importance: Undecided → High
John Griffith (john-griffith) wrote :

Sorry, but Folsom was closed out and we couldn't get this in. There won't be any more Folsom releases, so we're out of luck; the only option here is to upgrade to the Grizzly release.

Changed in cinder:
status: Confirmed → Won't Fix