Periodic jobs failing at tempest config while creating image (with Swift backend)

Bug #1746734 reported by yatin
This bug affects 2 people
Affects: tripleo
Status: Fix Released
Importance: Critical
Assigned to: Christian Schwede
wes hayutin (weshayutin)
tags: added: promotion-blocker
wes hayutin (weshayutin)
Changed in tripleo:
importance: Undecided → Critical
status: New → Triaged
milestone: none → queens-rc1
Revision history for this message
Alan Pevec (apevec) wrote :

Swift was unpinned on Jan 26: https://review.rdoproject.org/r/11248

Revision history for this message
Christian Schwede (cschwede) wrote :

Looks like Swift can't create data in the given directory (/srv/node/d1).

https://logs.rdoproject.org/openstack-periodic/periodic-tripleo-ci-centos-7-ovb-1ctlr_1comp-featureset002-master-upload/ea77cda/overcloud-controller-foo-0/var/log/swift/swift.log.txt.gz#_Feb__1_04_15_26

There were two required fixes for t-h-t and puppet-tripleo that merged on master ~ 3 weeks ago:

https://review.openstack.org/#/c/517374/
https://review.openstack.org/#/c/517373/

However, it seems they are already in?

Any chance to SSH into a deployment to check this directly? Or any other debugging possibility to check the /srv/node/d1 directory?

Revision history for this message
wes hayutin (weshayutin) wrote :
Revision history for this message
Bogdan Dobrelya (bogdando) wrote :

The logs show 'ERROR Insufficient Storage'

Revision history for this message
Alan Pevec (apevec) wrote :

Actually, proxy-server: ERROR Insufficient Storage 172.19.0.17:6002/d1 is the same error as in https://bugs.launchpad.net/tripleo/+bug/1729569, because of which Swift was pinned back in November.
Looks like THT fix https://review.openstack.org/517374 is not enough?
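
For anyone scanning the proxy logs for this error, here is a minimal sketch that pulls the affected host, port and device out of such lines. The pattern is based only on the single log line quoted above; the function name `failing_devices` is hypothetical, and other Swift versions or log formats may need a different expression.

```python
import re

# Pattern derived from the one log line quoted in this thread (an assumption,
# not a documented Swift log format).
INSUFFICIENT_STORAGE = re.compile(
    r"ERROR Insufficient Storage (?P<host>[\d.]+):(?P<port>\d+)/(?P<device>\S+)"
)


def failing_devices(lines):
    """Return sorted, unique (host, port, device) tuples that reported the error."""
    found = set()
    for line in lines:
        match = INSUFFICIENT_STORAGE.search(line)
        if match:
            found.add((match["host"], int(match["port"]), match["device"]))
    return sorted(found)
```

Feeding it the lines of a decompressed swift.log should list every storage node and device that returned the error, which helps tell a single bad device from a cluster-wide problem.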

Revision history for this message
Alan Pevec (apevec) wrote :

> Looks like THT fix https://review.openstack.org/517374 is not enough?

@Christian, that change touched puppet/services/swift-storage.yaml; shouldn't docker/services/swift-storage.yaml be adjusted as well?

Changed in tripleo:
assignee: nobody → Christian Schwede (cschwede)
Revision history for this message
Christian Schwede (cschwede) wrote :

> Looks like THT fix https://review.openstack.org/517374 is not enough?

> @Christian that changed puppet/services/swift-storage.yaml shouldn't also docker/services/swift-storage.yaml be adjusted?

Isn't this automatically used because of the imported SwiftStorageBase?
Even if not, the setting is true by default in puppet-tripleo, so this can't be the only reason.

I'm trying to reproduce this on rdo-cloud; if someone can check in the meantime whether /srv/node/d1 exists and is writable by the Swift containers, that would be helpful.
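
A minimal sketch of the requested check, assuming Python is available wherever it is run. The helper name `check_device_dir` is hypothetical. Note that it reports on the filesystem view of the process running it, so to answer the question above it would have to be run inside the Swift container (as the user the Swift services run as), not only on the host.

```python
import os


def check_device_dir(path):
    """Report whether `path` exists, is a directory, and is writable by the current user."""
    exists = os.path.exists(path)
    is_dir = os.path.isdir(path)
    # Creating files inside a directory needs both write and search (execute) permission.
    writable = is_dir and os.access(path, os.W_OK | os.X_OK)
    return {"path": path, "exists": exists, "is_dir": is_dir, "writable": writable}


if __name__ == "__main__":
    print(check_device_dir("/srv/node/d1"))
```

A result with `exists` false inside the container but true on the host would point at a missing bind mount rather than a permissions problem.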

Revision history for this message
Sagi (Sergey) Shnaidman (sshnaidm) wrote :

Is there any progress on this?
It's the main promotion-blocking bug at the moment.
Thanks

tags: added: alert
Revision history for this message
Christian Schwede (cschwede) wrote :

Using the reproducer script on RDO hasn't been successful so far - it ran into a timeout twice in a row, and the third attempt failed just now while deploying the overcloud. Now trying something different with OOOq.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-heat-templates (master)

Fix proposed to branch: master
Review: https://review.openstack.org/541662

Changed in tripleo:
status: Triaged → In Progress
Revision history for this message
Christian Schwede (cschwede) wrote :

So the /srv/node/d1 directory was missing. I'm still wondering why, because puppet-tripleo should create it automatically (see https://review.openstack.org/#/c/517374/). And in fact it does create it when running OOOq on master.

It's just not working when using the reproducer on RDO cloud - which seems strange to me.

I proposed an upstream fix, but this one doesn't work on the reproducer either: https://review.openstack.org/541662

Revision history for this message
Christian Schwede (cschwede) wrote :

Also interesting: puppet-swift seems to create the directory d1 on the reproducer, but it's not there once containers are running. I see this in /var/log/messages:

Notice: /Stage[main]/Tripleo::Profile::Base::Swift::Storage/File[/srv/node/d1]/ensure: created

I also see that the command from my posted fix is executed:

mkdir -p /srv/node/d1

Still, this directory does not exist once containers are up and running?

Revision history for this message
Bogdan Dobrelya (bogdando) wrote :

I think we want /srv/node/d1 to be created by host prep tasks and bind-mounted into the container.
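
The idea can be sketched as follows: create the directory on the host before the container starts, then hand the same path to the container as a bind mount so both sides see one directory. This is only an illustration in Python, not the actual tripleo-heat-templates change; `prepare_bind_mount` is a hypothetical helper, and the returned string uses the `host_path:container_path` form accepted by `docker run --volume`.

```python
import os


def prepare_bind_mount(host_dir, container_dir=None):
    """Create host_dir if needed (like `mkdir -p`) and return a host:container volume spec."""
    # exist_ok=True makes this idempotent, so repeated deployment runs are safe.
    os.makedirs(host_dir, exist_ok=True)
    return f"{host_dir}:{container_dir or host_dir}"
```

Because the directory is created on the host first, anything the container writes under the mount point lands in a directory that already exists and survives container restarts.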

Revision history for this message
Christian Schwede (cschwede) wrote :

Updated my patch - please have a look!

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-heat-templates (master)

Reviewed: https://review.openstack.org/541662
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=bc8618126f12d3a0d41aa59701c0575c70e17879
Submitter: Zuul
Branch: master

commit bc8618126f12d3a0d41aa59701c0575c70e17879
Author: Christian Schwede <email address hidden>
Date: Wed Feb 7 10:56:15 2018 +0100

    Fix missing Swift d1 directory

    The /srv/node/d1 directory was missing, thus creating it in advance.

    Note: there is a related change that merged earlier (f6108f5d) but
    for some reason didn't work as expected.

    Closes-Bug: 1746734
    Change-Id: Iabaa2033d065c9da653f7ba9e25430c3554a1169

Changed in tripleo:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-heat-templates 8.0.0.0rc1

This issue was fixed in the openstack/tripleo-heat-templates 8.0.0.0rc1 release candidate.
