Mounting 4K block size Cinder Volume on instance fails

Bug #1195913 reported by John Griffith
Affects: OpenStack Compute (nova)
Status: Fix Released
Importance: High
Assigned to: Chet Burgess

Bug Description

A volume created in Cinder with a 4K block size attaches to the instance without problems, and partitions can be created and formatted without issue. However, attempting to mount the volume produces an I/O error and the mount fails.

This behavior has been seen on KVM/libvirt and happens on various instances (Ubuntu, Cirros, Fedora, etc.). I suspect something in the block-size parameter of the libvirt attach XML is either missing or incorrect. I'll try looking into this, but wanted to get this out in case somebody already knows what might be going on here.

John Griffith (john-griffith) wrote :

Looking at this again, I *believe* the issue is that we don't specify the blockio logical_block_size or physical_block_size, so the default is always passed in, and 4K fails?

I'm wondering if what's needed is an additional option to the attach that specifies these values, plus an entry in libvirt/config.py to set them and build the XML for them.
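For illustration, libvirt's domain XML accepts a `<blockio>` child on a `<disk>` element with `logical_block_size` and `physical_block_size` attributes. The sketch below builds such an element with the Python standard library. It is not the code in nova's libvirt/config.py; the function name, device path, and target name are made up for the example.

```python
import xml.etree.ElementTree as ET


def disk_xml_with_blockio(source_dev, target_dev, logical, physical):
    """Return a libvirt-style <disk> element (as a string) that carries
    explicit block sizes in a <blockio> child element."""
    disk = ET.Element("disk", {"type": "block", "device": "disk"})
    ET.SubElement(disk, "source", {"dev": source_dev})
    ET.SubElement(disk, "target", {"dev": target_dev, "bus": "virtio"})
    # <blockio> is what tells the guest the device uses 4K sectors
    # instead of the 512-byte default.
    ET.SubElement(disk, "blockio", {
        "logical_block_size": str(logical),
        "physical_block_size": str(physical),
    })
    return ET.tostring(disk, encoding="unicode")


# Hypothetical device names, for illustration only.
print(disk_xml_with_blockio("/dev/sdb", "vdb", 4096, 4096))
```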

Matt Riedemann (mriedem)
tags: added: libvirt volumes
Chet Burgess (cfb-n) wrote :

John,

I think you are correct. As a quick data point, what is the block size currently reported as in one of these affected VMs?

As an example:

[root@cfb-test-1 ]# cat /sys/block/sda/queue/logical_block_size
512
[root@cfb-test-1 ]# cat /sys/block/sda/queue/physical_block_size
512
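The same check can be scripted inside a guest. The sketch below reads the two sysfs queue files shown above. The helper name and the `sysfs_root` parameter are assumptions added so the function can be pointed at a different directory.

```python
from pathlib import Path


def read_block_sizes(device, sysfs_root="/sys/block"):
    """Return the logical and physical block sizes (in bytes) that the
    kernel reports for a block device such as 'sda'."""
    queue = Path(sysfs_root) / device / "queue"
    return {
        name: int((queue / name).read_text().strip())
        for name in ("logical_block_size", "physical_block_size")
    }


# Only query the device if it exists on this machine.
if (Path("/sys/block") / "sda" / "queue").is_dir():
    print(read_block_sizes("sda"))
```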

In order to implement it, we would need to get the logical and physical block size from cinder.

How does cinder store this today? In volume_metadata?

It looks like right now we simply call initialize_connection via the cinder client and use the info from that call to build the mapping. If it's in metadata with a fixed key name, we could update our cinder wrapper to get the metadata and add it to the connection info before passing it along.

John Griffith (john-griffith) wrote :

Hey Chet,

Yep, so this is indeed what's going on (the queue files show 512, by the way).

I was going to hack something into the config file and, more importantly, as you point out, the block_size in Cinder's provider location. Currently we don't pass anything here, so I'll need to create an item for Cinder and add the optional values:
    logical_block_size=4096
    physical_block_size=4096

John Griffith (john-griffith) wrote :

For reference I've added the Cinder bug: 1196248

Chet Burgess (cfb-n) wrote :

John,

Happy to help.

Seems like we need changes to cinder (probably generically and per driver?), some cinder client updates, and the nova updates for libvirt.

Divide and conquer and meet in the middle? I can start working on the nova pieces (doesn't look too difficult), and can help out on the cinder side at your direction?

Has this been observed as a problem on any other hypervisors?

John Griffith (john-griffith) wrote :

Chet,

Sounds great!! I'll start working on the cinder/cinder-client fixes tomorrow and ping you on IRC. My initial thought on the Cinder side is to add the columns mentioned to the volume reference object. This would be driver specific. I'm not certain how I want to pass that back up to the manager, but I'll think about that a bit and keep you updated.

I think it would be ideal if this was added similarly to the provider_location column we already use: it would be optional metadata that's set by the drivers on volume-create. It would be nice if by default we just don't fill anything in here, and only modify it for devices that request/use something other than 512.

Anyway, thanks a bunch! Divide and Conquer sounds great to me :)

Chet Burgess (cfb-n) wrote :

John,

Completely agree with the optional data approach. I will update the libvirt driver to only add the elements to the XML if we have them defined in the volume info. I will start working on the approach in the drivers and will assume that the data will be available via calls we make today. Once we iron out exactly how to get the info from cinder I can update if needed, but I can at least finish up the driver side as discussed.
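A minimal sketch of the optional-data approach described above: only emit block size attributes when the volume info actually defines them. The assumption here is that the values arrive under a `data` key of the connection info with the key names proposed earlier in the thread. The helper name is hypothetical and this is not the merged nova code.

```python
BLOCK_SIZE_KEYS = ("logical_block_size", "physical_block_size")


def blockio_attributes(connection_info):
    """Return the block size attributes found in connection_info.

    An empty dict means no custom block size was provided, so the
    caller should leave the libvirt defaults (512 bytes) untouched.
    """
    data = connection_info.get("data", {})
    attrs = {}
    for key in BLOCK_SIZE_KEYS:
        value = data.get(key)
        if value is not None:
            attrs[key] = str(value)
    return attrs


# A volume without custom sizes produces no attributes at all.
print(blockio_attributes({"data": {}}))
# A 4K volume produces both attributes.
print(blockio_attributes(
    {"data": {"logical_block_size": 4096, "physical_block_size": 4096}}))
```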

I'm cburgess on IRC and in the dev and nova channels regularly. I'll catch up with you via IRC on Monday.

Changed in nova:
assignee: nobody → Chet Burgess (cfb-n)
Changed in nova:
status: New → In Progress
Changed in nova:
assignee: Chet Burgess (cfb-n) → Dan Smith (danms)
Chet Burgess (cfb-n)
Changed in nova:
assignee: Dan Smith (danms) → Chet Burgess (cfb-n)
Dan Smith (danms)
Changed in nova:
importance: Undecided → High
milestone: none → havana-3
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/35214
Committed: http://github.com/openstack/nova/commit/b6eb64497836608a804e69b208bb4312fed0c049
Submitter: Jenkins
Branch: master

commit b6eb64497836608a804e69b208bb4312fed0c049
Author: Chet Burgess <email address hidden>
Date: Fri Jun 28 21:21:36 2013 +0000

    Support setting block size for block devices

    cinder now supports backends with different block sizes. Sometimes
    we need to set a specific block size or operations against the
    backend will result in I/O errors. Additionally it's a good idea
    to tune the block sizes to the correct size as it can improve I/O
    performance.

    The basic disk configuration for libvirt now supports setting the
    logical and physical block size of devices. The block size info
    is stored in connection_info of the block_device_mapping table
    for later use.

    By default it does not set the block sizes which defaults to
    512 bytes.

    DocImpact:
    This feature requires libvirt 0.10.2+ and the QEMU or KVM
    hypervisor. If a volume has a custom block size, cinder considers
    it required. As such the hypervisor and libvirt version are
    checked during the attach_volume call. If the hypervisor is not
    supported or the libvirt version is too old the attach is
    aborted and an exception is raised.

    Change-Id: I77adc96b340680218f82bbef8f9ec14711d26a7f
    Fixes Bug #1195913
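The attach-time check described in the DocImpact note could look roughly like the sketch below. It assumes the libvirt version is available as a (major, minor, release) tuple and that the block sizes live in the connection data under the key names used earlier in the thread. The exception class and function name are hypothetical, since the report does not show the real ones.

```python
MIN_LIBVIRT_VERSION = (0, 10, 2)
SUPPORTED_VIRT_TYPES = ("qemu", "kvm")
BLOCK_SIZE_KEYS = ("logical_block_size", "physical_block_size")


class UnsupportedBlockSize(Exception):
    """Hypothetical exception raised when a custom block size cannot be honored."""


def check_block_size_support(connection_data, virt_type, libvirt_version):
    """Abort the attach if the volume needs a custom block size that this
    hypervisor or libvirt version cannot provide."""
    if not any(key in connection_data for key in BLOCK_SIZE_KEYS):
        return  # No custom block size requested, nothing to verify.
    if virt_type not in SUPPORTED_VIRT_TYPES:
        raise UnsupportedBlockSize(
            "custom block sizes require QEMU or KVM, got %r" % (virt_type,))
    if tuple(libvirt_version) < MIN_LIBVIRT_VERSION:
        raise UnsupportedBlockSize(
            "custom block sizes require libvirt 0.10.2 or newer")


try:
    check_block_size_support({"logical_block_size": 4096}, "kvm", (0, 9, 8))
except UnsupportedBlockSize as exc:
    print("attach aborted:", exc)
```

Volumes that do not carry block size keys pass the check unconditionally, which matches the default behavior of leaving the sizes unset.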

Changed in nova:
status: In Progress → Fix Committed
Thierry Carrez (ttx)
Changed in nova:
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in nova:
milestone: havana-3 → 2013.2