Cinder volume from image can cause a full fs

Bug #1399427 reported by Ian Cordasco
This bug affects 1 person
| Affects | Status | Importance | Assigned to | Milestone |
|---|---|---|---|---|
| OpenStack-Ansible | Fix Released | Medium | Hugh Saunders | |
| Juno | Fix Released | Medium | Jesse Pretorius | |
| Trunk | Fix Released | Medium | Hugh Saunders | |

Bug Description

Opened by cloudnull on 2014-09-24 21:34:17+00:00 at https://github.com/rcbops/ansible-lxc-rpc/issues/166

------------------------------------------------------------

To keep cinder available at all times and stop its disk from filling to 100% when volumes are built from images, either the path /mnt/var/lib/cinder/conversion/ needs to be bind mounted to the host, or the cinder_volume container needs to be built with a larger default LV size.
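
As an illustration of the failure mode described above, here is a minimal sketch of a pre-flight check on the conversion directory. It assumes a converted image may need up to its full virtual size in temporary space; the function name and the `headroom` factor are hypothetical and are not Cinder settings.

```python
import shutil


def has_room_for_conversion(conversion_dir, image_virtual_size_bytes, headroom=1.1):
    """Return True if conversion_dir has room for a converted copy of the image.

    Assumption: the converted image can need up to its virtual size in
    temporary space. headroom is a hypothetical safety factor.
    """
    free_bytes = shutil.disk_usage(conversion_dir).free
    return free_bytes >= image_virtual_size_bytes * headroom


# Example call, using the path from the bug description:
#   has_room_for_conversion("/mnt/var/lib/cinder/conversion/", 10 * 1024 ** 3)
```

A check like this only reports the problem; the two remedies proposed in the report (a bind mount to the host, or a larger LV for the container) are what actually add space.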

Tags: Backport Potential, bug, in progress, prio:2

====================== COMMENTS ============================

Comment created by jameswthorne on 2014-10-24 16:30:01+00:00

Some additional info: https://gist.github.com/jameswthorne/78dc7fab6a83598aff85

------------------------------------------------------------

Comment created by b3rnard0 on 2014-11-13 15:53:23+00:00

We will need some documentation if this does not get fixed.

------------------------------------------------------------

Comment created by hughsaunders on 2014-11-20 16:55:21+00:00

From @jameswthorne's logs:

```
2014-10-24 12:26:22.901 23976 TRACE oslo.messaging.rpc.dispatcher Command: sudo cinder-rootwrap /etc/cinder/rootwrap.conf dd if=/dev/zero of=/dev/mapper/cinder--volumes-volume--d03e1fce--a2f4--466f--aa7b--a408f6c08c9f count=10240 bs=1M conv=fdatasync
2014-10-24 12:26:22.901 23976 TRACE oslo.messaging.rpc.dispatcher Exit code: 1
2014-10-24 12:26:22.901 23976 TRACE oslo.messaging.rpc.dispatcher Stdout: ''
2014-10-24 12:26:22.901 23976 TRACE oslo.messaging.rpc.dispatcher Stderr: "/bin/dd: error writing '/dev/mapper/cinder--volumes-volume--d03e1fce--a2f4--466f--aa7b--a408f6c08c9f': No space left on device\n2049+0 records in\n2048+0 records out\n2147483648 bytes (2.1 GB) copied, 2.57972 s, 832 MB/s\n"
```

This failed while dd was writing directly to the LV, so I'm not sure how adding space to /mnt/var/lib/cinder/conversion would help. The question is: why was dd only able to write just over 2GB to the device while lvs shows that the LV size is 10GB?
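
The size mismatch in the log can be checked with a little arithmetic. The sketch below parses the dd command line and its stderr summary as quoted in the trace; the helper names are made up for illustration, and it assumes GNU dd semantics where `bs=1M` means 1048576 bytes.

```python
import re

# Binary multipliers used by GNU dd for single-letter suffixes (assumption).
UNITS = {"": 1, "K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def requested_bytes(command):
    """Bytes dd was asked to write, from its count= and bs= arguments."""
    count = int(re.search(r"\bcount=(\d+)", command).group(1))
    size, unit = re.search(r"\bbs=(\d+)([KMG]?)", command).groups()
    return count * int(size) * UNITS[unit]


def written_bytes(stderr):
    """Bytes dd reports as copied in its summary line."""
    return int(re.search(r"(\d+) bytes", stderr).group(1))


device = "/dev/mapper/cinder--volumes-volume--d03e1fce--a2f4--466f--aa7b--a408f6c08c9f"
command = f"dd if=/dev/zero of={device} count=10240 bs=1M conv=fdatasync"
stderr = (
    f"/bin/dd: error writing '{device}': No space left on device\n"
    "2049+0 records in\n2048+0 records out\n"
    "2147483648 bytes (2.1 GB) copied, 2.57972 s, 832 MB/s\n"
)

requested = requested_bytes(command)
written = written_bytes(stderr)
print(requested // 1024 ** 3, "GiB requested")  # 10 GiB requested
print(written // 1024 ** 3, "GiB written")      # 2 GiB written
```

So dd stopped after exactly 2 GiB of a requested 10 GiB, which matches the 2048 complete records reported. The numbers confirm the mismatch but do not explain its cause.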

------------------------------------------------------------

Comment created by hughsaunders on 2014-11-21 10:10:46+00:00

I am going to discount the DD problem in @jameswthorne's post as I don't have enough information to determine the cause. This issue will now focus on adding space for cinder to convert images. James, please raise another issue if the DD problem persists.

------------------------------------------------------------

Comment created by cloudnull on 2014-11-22 21:55:37+00:00

@hughsaunders - the changes you have in your private repo at commit "22845a6" seem sensible. Do you think you could PR them into master for this issue, or do you think we should bind mount back to the host?

------------------------------------------------------------

Comment created by hughsaunders on 2014-11-22 22:52:58+00:00

@cloudnull my commit doesn't work due to https://github.com/ansible/ansible/issues/7844 so I need to refactor to avoid that issue :(

------------------------------------------------------------

Comment created by hughsaunders on 2014-11-25 10:11:18+00:00

Still working on this, but I bumped into issue https://github.com/ansible/ansible/issues/8705 (see: https://gist.github.com/hughsaunders/4082a3732b3bf807cec3#file-gistfile2-txt-L8). This was mostly due to me running ansible 1.7.2.

Changed in openstack-ansible:
importance: Undecided → Medium
milestone: none → next
Kevin Carter (kevin-carter) wrote :
Changed in openstack-ansible:
milestone: next → 10.1.0
status: New → Confirmed
assignee: nobody → Hugh Saunders (hughsaunders)
Changed in openstack-ansible:
status: Confirmed → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to os-ansible-deployment (master)

Reviewed: https://review.openstack.org/139241
Committed: https://git.openstack.org/cgit/stackforge/os-ansible-deployment/commit/?id=a769413d895cd48cec6d8bc0de4cf601e4aabd0c
Submitter: Jenkins
Branch: master

commit a769413d895cd48cec6d8bc0de4cf601e4aabd0c
Author: Hugh Saunders <email address hidden>
Date: Fri Nov 21 11:51:42 2014 +0000

    Enlarge Cinder-Volume container

    Cinder requires temporary working space to convert images. This patch
    exposes cinder_volume_lv_size_gb to the user config file, so the user
    can decide how large the cinder volumes container should be based on
    available space and the size of images that will need to be converted.

    cinder_volume_lv_size_gb is used to override container_lvm_fssize in
    group_vars/cinder_volume. Simple enough but doesn't work because
    templated variables (or indirect variables) are not expanded when
    accessed via hostvars[] see: ansible/ansible#7844. In order to work
    around that, I have eliminated hostvars[] usage from the container
    creation mechanism. This may have positive speed implications as the
    limit of container creation parallelism is now forks rather than number
    of hosts. However it does make this change larger than a small bug fix.

    Also note that this patch makes use of delegate_to, so specific ansible
    versions must be used to avoid ansible/ansible#8705. Our requirements
    file currently specifies a version before this bug was introduced.

    There are two commits in this PR as one is the actual bugfix, the other
    is infrastructure changes required for that bugfix to work. Also only
    the bugfix may be needed if the upstream bugs are fixed.

    Closes-Bug: #1399427
    Change-Id: I2b5c5e692d3d72b603fdd6298475cb76c52c66df
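
The indirection problem described in the commit message can be pictured with a toy model. This is not Ansible code: the variable names come from the commit message, while the value `20`, the exact `"{{ ... }}G"` form of `container_lvm_fssize`, and both lookup helpers are assumptions made for illustration.

```python
import re

# Hypothetical variable store for one container host.
raw_vars = {
    "cinder_volume_lv_size_gb": "20",
    "container_lvm_fssize": "{{ cinder_volume_lv_size_gb }}G",
}

_TEMPLATE = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def lookup_raw(variables, name):
    """Models the problematic path: the stored string is returned unexpanded."""
    return variables[name]


def lookup_expanded(variables, name):
    """Models the expected path: {{ var }} references are expanded recursively."""
    value = variables[name]
    return _TEMPLATE.sub(lambda m: lookup_expanded(variables, m.group(1)), value)


print(lookup_raw(raw_vars, "container_lvm_fssize"))       # {{ cinder_volume_lv_size_gb }}G
print(lookup_expanded(raw_vars, "container_lvm_fssize"))  # 20G
```

In this model, a consumer that receives the raw string gets a template instead of a size, which is the kind of failure the commit avoids by no longer reading the value through hostvars[].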

Changed in openstack-ansible:
status: In Progress → Fix Committed
OpenStack Infra (hudson-openstack) wrote :

Reviewed: https://review.openstack.org/139264
Committed: https://git.openstack.org/cgit/stackforge/os-ansible-deployment/commit/?id=fea671ec1692617d0c64c8b3d27df530bf58c004
Submitter: Jenkins
Branch: master

commit fea671ec1692617d0c64c8b3d27df530bf58c004
Author: Kevin Carter <email address hidden>
Date: Fri Nov 28 12:19:16 2014 -0600

    Changed the container interaction process

    This changes the way that containers are interacted with. With this
    change, container actions are delegated to the host instead of looping
    through the hacky mess that we were doing. This change will make it
    so that the entire container process is faster.

    This also removes the need for the "/openstack/monitoring" directory which
    was held over cruft from long ago. This should address the race condition
    when delegating to a host and the monitoring directory attempts to be created
    at the same time on the same host.

    Closes-Bug: #1399427
    Change-Id: Ifaa0fa5719f79180610b4a63d590ca8bc681f87d

OpenStack Infra (hudson-openstack) wrote :

Reviewed: https://review.openstack.org/139242
Committed: https://git.openstack.org/cgit/stackforge/os-ansible-deployment/commit/?id=96b9b494a46e1e81ffbb15238d84ac7dd646b18c
Submitter: Jenkins
Branch: master

commit 96b9b494a46e1e81ffbb15238d84ac7dd646b18c
Author: Hugh Saunders <email address hidden>
Date: Mon Nov 24 17:06:12 2014 +0000

    delegate container_create to physical hosts

    This eliminates the use of hostvars[] in container creation which allows
    us to use templated variables, necessary for the fix for #1399427 which
    has already been merged.

    Notes:
      * Creates containers in parallel.
      * Improves flexibility in container create as a specific group or container
        can be targeted on the CLI.
      * Syntax clean up.
      * Adds option to specify the container volume group name.
      * Removes the "ignored" failure note when running in an environment that does
        not use LVM and have an LXC volume group.

    Closes-Bug: #1399427
    Change-Id: I4270a9d11039b62d631f82d24de7cc87d5f142c9

Changed in openstack-ansible:
milestone: 10.1.0 → 10.1.2
Changed in openstack-ansible:
status: Fix Committed → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to os-ansible-deployment (juno)

Fix proposed to branch: juno
Review: https://review.openstack.org/150359

OpenStack Infra (hudson-openstack) wrote :

Fix proposed to branch: juno
Review: https://review.openstack.org/150374

OpenStack Infra (hudson-openstack) wrote : Fix merged to os-ansible-deployment (juno)

Reviewed: https://review.openstack.org/150374
Committed: https://git.openstack.org/cgit/stackforge/os-ansible-deployment/commit/?id=7125206b21a211464526570240f7a7d751d86de6
Submitter: Jenkins
Branch: juno

commit 7125206b21a211464526570240f7a7d751d86de6
Author: Kevin Carter <email address hidden>
Date: Fri Nov 28 12:19:16 2014 -0600

    Changed the container interaction process

    This changes the way that containers are interacted with. With this
    change, container actions are delegated to the host instead of looping
    through the hacky mess that we were doing. This change will make it
    so that the entire container process is faster.

    This also removes the need for the "/openstack/monitoring" directory which
    was held over cruft from long ago. This should address the race condition
    when delegating to a host and the monitoring directory attempts to be created
    at the same time on the same host.

    Closes-Bug: #1399427
    Change-Id: Ifaa0fa5719f79180610b4a63d590ca8bc681f87d
    (cherry picked from commit fea671ec1692617d0c64c8b3d27df530bf58c004)

OpenStack Infra (hudson-openstack) wrote :

Reviewed: https://review.openstack.org/150359
Committed: https://git.openstack.org/cgit/stackforge/os-ansible-deployment/commit/?id=61ccecfd9a1a8aa025713a13533f2f35b4085b99
Submitter: Jenkins
Branch: juno

commit 61ccecfd9a1a8aa025713a13533f2f35b4085b99
Author: Hugh Saunders <email address hidden>
Date: Mon Nov 24 17:06:12 2014 +0000

    delegate container_create to physical hosts

    This eliminates the use of hostvars[] in container creation which allows
    us to use templated variables, necessary for the fix for #1399427 which
    has already been merged.

    Notes:
      * Creates containers in parallel.
      * Improves flexibility in container create as a specific group or container
        can be targeted on the CLI.
      * Syntax clean up.
      * Adds option to specify the container volume group name.
      * Removes the "ignored" failure note when running in an environment that does
        not use LVM and have an LXC volume group.

    Closes-Bug: #1399427
    Change-Id: I4270a9d11039b62d631f82d24de7cc87d5f142c9
    (cherry picked from commit 96b9b494a46e1e81ffbb15238d84ac7dd646b18c)
