EBS volume time out: storage not cleaned up

Bug #1677240 reported by Curtis Hovey
This bug affects 1 person
Affects: juju-ci-tools
Status: Triaged
Importance: High
Assigned to: viswesuwara nathan

Bug Description

As seen at
    http://reports.vapour.ws/releases/issue/55badc4c749a561625b0391c

This is a spike, starting with build revision 5063 on March 28, in which tests running in AWS us-east-1 fail during bootstrap or deployment because of "EBS volume time out". The issue persisted for about 12 hours, but AWS did not report a problem. The Juju revision at the start of the problem has nothing to do with AWS or storage.

I looked at the volumes section in the aws us-east-1 console and discovered 287 "available" volumes dating from Feb 3. After deleting the "available" volumes, tests started to pass again.
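The manual cleanup described above could be scripted. The following is a minimal sketch, not part of juju-ci-tools: the function names and the six-hour age threshold are hypothetical, and `ec2` is assumed to be a boto3 EC2 client (for example `boto3.client("ec2", region_name="us-east-1")`). It defaults to a dry run so that nothing is deleted by accident.

```python
from datetime import datetime, timedelta, timezone


def stale_available_volumes(volumes, min_age, now=None):
    """Return IDs of unattached ("available") volumes at least min_age old.

    `volumes` is a list of dicts shaped like the "Volumes" entries in an
    EC2 describe_volumes response (VolumeId, State, CreateTime).
    """
    now = now or datetime.now(timezone.utc)
    return [
        v["VolumeId"]
        for v in volumes
        if v["State"] == "available" and now - v["CreateTime"] >= min_age
    ]


def delete_stale_volumes(ec2, min_age=timedelta(hours=6), dry_run=True):
    """Find stale available volumes and delete them unless dry_run is set.

    `ec2` is assumed to be a boto3 EC2 client; the threshold is a
    hypothetical default, not a value taken from the bug report.
    """
    volumes = []
    paginator = ec2.get_paginator("describe_volumes")
    pages = paginator.paginate(
        Filters=[{"Name": "status", "Values": ["available"]}])
    for page in pages:
        volumes.extend(page["Volumes"])
    stale = stale_available_volumes(volumes, min_age)
    for volume_id in stale:
        if not dry_run:
            ec2.delete_volume(VolumeId=volume_id)
    return stale
```

Running it with `dry_run=True` first and reviewing the returned IDs is the safer order, since an "available" volume may still be wanted by someone sharing the account.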

I suspect the issue is that the assess_storage.py test should be deleting the volumes when it is done. I don't think juju should be deleting these volumes automatically. Maybe the test needs to ask juju to delete the storage.

Torsten Baumann (torbaumann) wrote : Re: [Bug 1677240] [NEW] EBS volume time out: storage not cleaned up

You're off today, right? :-)

> On Mar 29, 2017, at 9:09 AM, Curtis Hovey <email address hidden> wrote:
>
> Public bug reported:
>
> As seen at
> http://reports.vapour.ws/releases/issue/55badc4c749a561625b0391c
>
> This is a spike, starting with build revision 5063 on March 28, in
> which tests running in AWS us-east-1 fail during bootstrap or deployment
> because of "EBS volume time out". The issue persisted for about 12
> hours, but AWS did not report a problem. The Juju revision at the start
> of the problem has nothing to do with AWS or storage.
>
> I looked at the volumes section in the aws us-east-1 console and
> discovered 287 "available" volumes dating from Feb 3. After deleting the
> "available" volumes, tests started to pass again.
>
> I suspect the issue is that the assess_storage.py test should be deleting
> the volumes when it is done. I don't think juju should be deleting these
> volumes automatically. Maybe the test needs to ask juju to delete the
> storage.
>
> ** Affects: juju-ci-tools
> Importance: High
> Status: Triaged

Curtis Hovey (sinzui)
Changed in juju-ci-tools:
assignee: nobody → viswesuwara nathan (viswesn)
viswesuwara nathan (viswesn) wrote :

I will start working on this bug. Thanks

viswesuwara nathan (viswesn) wrote :

Juju doesn't provide any command to destroy or delete storage that was created during charm/unit creation or added with the juju 'add-storage' command. Juju automatically deletes storage that was bound to the charm/unit when 'remove-unit', 'remove-application', or 'remove-machine' is run.

I have a couple of questions before going further on this issue.

Do you see any instances in AWS, or only storage volumes? I am asking because EBS volumes that do not have "Delete on Termination" set to true persist after their instance is terminated.
I don't think juju deploy has changed this property recently.
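One way to check the "Delete on Termination" question is to inspect the block device mappings of running instances. The sketch below is illustrative only: the function name is hypothetical, and it assumes input shaped like the "Reservations" list in a boto3 EC2 `describe_instances` response.

```python
def volumes_surviving_termination(reservations):
    """Return (instance_id, volume_id) pairs that will outlive the instance.

    A volume survives termination when its mapping does not have
    DeleteOnTermination set to true.
    """
    survivors = []
    for reservation in reservations:
        for instance in reservation.get("Instances", []):
            for mapping in instance.get("BlockDeviceMappings", []):
                ebs = mapping.get("Ebs")
                if ebs and not ebs.get("DeleteOnTermination", False):
                    survivors.append(
                        (instance["InstanceId"], ebs["VolumeId"]))
    return survivors
```

Any pair this returns for a test instance is a candidate source of leftover "available" volumes once that instance is terminated.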

>> After deleting the "available" volumes, tests started to pass again.
Why would the "EBS volume time out" error stop further instances from launching in the AWS region?
I don't think AWS has a limit on EBS volumes per region.

>> I suspect the issue assess_storage.py test should be deleting the volumes when it is done
Juju doesn't provide any command to remove, delete, or destroy storage volumes that were added using "add-storage" or "create-storage-pool".

All allocated resources are released when the controller is destroyed or immediately after exiting the bootstrap context manager. A quick look at assess_storage.py doesn't show any leak of these resources.
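Whether teardown really releases everything could be verified rather than read from the code. The following is a rough sketch of such a check, not code from assess_storage.py: `volume_leak_check` and its `list_volume_ids` callable are hypothetical, and the callable is assumed to return the volume IDs currently present in the region.

```python
from contextlib import contextmanager


@contextmanager
def volume_leak_check(list_volume_ids, leaked):
    """Append to `leaked` any volume IDs that appear while the body runs.

    The comparison happens in a finally clause, so it still runs when
    the wrapped test raises.
    """
    before = set(list_volume_ids())
    try:
        yield
    finally:
        leaked.extend(sorted(set(list_volume_ids()) - before))
```

Wrapping the whole bootstrap and teardown in this context manager would leave `leaked` empty if the controller cleaned up after itself. In a shared account, volumes created by other jobs during the run would also show up, so the result is a hint rather than proof.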

>> This is a spike starting with build revision 5063 starting on March 28
How can I see the changes that went into each revision? Please send me the link; I hope it also shows the code changes.
