Spawn may fail when cache=none on block device with logical block size > 512
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack Compute (nova) | Fix Released | Medium | Dr. Jens Harbott |
Ocata | Fix Committed | Medium | Elod Illes |
Pike | Fix Committed | Medium | Dr. Jens Harbott |
Queens | Fix Committed | Medium | Dr. Jens Harbott |
Rocky | Fix Committed | Medium | Dr. Jens Harbott |
Bug Description
Description
===========
When we spawn instances with the cache disabled (cache='none') on a file system,
a check in the nova code tests whether the file system supports direct IO:
https:/
Because this test uses a 512b alignment size, it seems to fail on newer block devices that have a
logical block size > 512b, such as nvme:
parted /dev/nvme0n1 print | grep "Sector size"
Sector size (logical/physical): 4096B/4096B
The reason appears to be that the alignment size for direct io must be a multiple of the logical block size of the underlying device (not of the fs block size), as explained here:
http://
O_DIRECT
...
Under Linux 2.4, transfer sizes, and the alignment of the user buffer
and the file offset must all be multiples of the logical block size
of the filesystem. Since Linux 2.6.0, alignment to the logical block
size of the underlying storage (typically 512 bytes) suffices
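The kind of probe described above can be sketched as follows. This is a minimal illustration under stated assumptions, not nova's actual code: the function name `supports_direct_io` and its parameters are hypothetical. It writes one `align_size`-byte buffer through an `O_DIRECT` file descriptor and reports whether the kernel accepted it. The buffer comes from an anonymous `mmap`, so it is page-aligned in memory.

```python
import mmap
import os
import tempfile


def supports_direct_io(dirpath, align_size=512):
    """Return True if an O_DIRECT write of align_size bytes succeeds in dirpath.

    Hypothetical helper for illustration only.
    """
    if not hasattr(os, "O_DIRECT"):
        return False  # platform does not expose O_DIRECT
    fd = None
    path = None
    buf = None
    try:
        # Create a scratch file inside the directory under test.
        tmp_fd, path = tempfile.mkstemp(prefix=".directio.", dir=dirpath)
        os.close(tmp_fd)
        # Reopen it with O_DIRECT; this fails on file systems without support.
        fd = os.open(path, os.O_WRONLY | os.O_DIRECT)
        # Anonymous mmap gives a page-aligned buffer of align_size bytes.
        buf = mmap.mmap(-1, align_size)
        buf.write(b"x" * align_size)
        # The write fails with EINVAL when align_size is not a multiple of
        # the logical block size of the underlying device.
        os.write(fd, buf)
        return True
    except OSError:
        return False
    finally:
        if buf is not None:
            buf.close()
        if fd is not None:
            os.close(fd)
        if path is not None:
            os.unlink(path)
```

On a device with a 4096B logical block size, a call with `align_size=512` would be expected to return False while `align_size=4096` returns True, which matches the behaviour reported here.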
Because this test fails, the cache value falls back to "writethrough", which has the following consequences:
1) qemu runs without direct io even though the device/fs supports it, just with a larger block size
2) qemu fails to start because cache=writethrough may conflict with other device parameters like "io=native", with the following message:
2018-08-22 20:50:41.226 80512 ERROR oslo_messaging.
2018-08-22 20:50:41.226 80512 ERROR oslo_messaging.
2018-08-22 20:50:41.226 80512 ERROR oslo_messaging.
Steps to reproduce
==================
To reproduce the spawn issue:
have instances on a fs backed by a block device with logical block size > 512b (typically nvme with a 4096 or 8192 sector size)
nova.conf with:
images_type=raw
preallocate_
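To check whether a host meets the first condition without `parted`, the logical block size can also be read from sysfs. The sketch below assumes the standard Linux path `/sys/block/<device>/queue/logical_block_size`; the function names and the `sysfs_root` parameter are hypothetical and exist only to keep the example testable.

```python
import os


def logical_block_size(device, sysfs_root="/sys/block"):
    """Return the logical block size in bytes that the kernel reports for device."""
    path = os.path.join(sysfs_root, device, "queue", "logical_block_size")
    with open(path) as f:
        return int(f.read().strip())


def may_hit_alignment_issue(device, align_size=512, sysfs_root="/sys/block"):
    """True when align_size is not a multiple of the device's logical block size."""
    return align_size % logical_block_size(device, sysfs_root) != 0
```

For example, `may_hit_alignment_issue("nvme0n1")` would be expected to return True on the device shown in the `parted` output above.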
Solution
========
Can we consider increasing align_size from 512b to 8192b, as it will work in most cases?
Is there any other reason to keep 512b?
Setting it to 4096 or 8192 fixes the issue in my environment.
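The arithmetic behind this proposal is simple: an alignment size is acceptable for a device when it is a multiple of that device's logical block size. A small sketch (the helper name `alignment_ok` is hypothetical) shows which of the block sizes mentioned in this report each candidate value would cover:

```python
def alignment_ok(align_size, logical_block_size):
    """An O_DIRECT transfer size must be a multiple of the logical block size."""
    return align_size % logical_block_size == 0


block_sizes = (512, 4096, 8192)

# Map each candidate align_size to the logical block sizes it satisfies.
coverage = {
    align: [lbs for lbs in block_sizes if alignment_ok(align, lbs)]
    for align in (512, 4096, 8192)
}

for align, covered in coverage.items():
    print(align, covered)
# 512 [512]
# 4096 [512, 4096]
# 8192 [512, 4096, 8192]
```

A larger power-of-two value covers every smaller power-of-two block size, which is why 8192 works wherever 512 or 4096 would.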
Environment
===========
I hit the issue on newton, but the same check with 512b exists on master.
summary: |
- Spawn may failed when cache=none on block device with logical block size > 512
+ Spawn may fail when cache=none on block device with logical block size > 512
Changed in nova: | |
importance: | High → Medium |
Changed in nova: | |
assignee: | Dr. Jens Harbott (j-harbott) → melanie witt (melwitt) |
Changed in nova: | |
assignee: | melanie witt (melwitt) → Dr. Jens Harbott (j-harbott) |
We are affected by this on stable/pike and stable/queens now, after we had to enable preallocate_images=space in order to avoid some out-of-space issues. We have been running on nvme for local storage with block size 4096 for some time for performance reasons.
I haven't seen any disks with block size > 4096 yet, so setting the value to 4096 would be fine for us.