manage-projects error on new project creation

Bug #1242569 reported by Timur Sufiev
This bug affects 12 people
Affects: OpenStack Core Infrastructure
Status: Fix Released
Importance: Critical
Assigned to: Monty Taylor

Bug Description

Please see commit https://review.openstack.org/#/c/52871/
Jenkins reports that the patchset should be rebased, but it is already rebased.

Revision history for this message
Timur Sufiev (tsufiev-x) wrote :

On 18.10.13 clarkb wrote in #openstack-infra that he had found the problem and that the solution is https://review.openstack.org/#/c/52689/, though it is not merged yet. Perhaps this is why our problem still persists?

Revision history for this message
Antoine "hashar" Musso (hashar) wrote :

That happens on other repositories as well, such as stackforge/libra, where LinuxJedi attempted to get https://review.openstack.org/#/c/52866/ merged.

"This change was unable to be automatically merged with the current state of the repository. Please rebase your change and upload a new patchset."

:/

Changed in openstack-ci:
status: New → Confirmed
summary: - No patchset can be merged into murano-repository repo
+ No patchset can be merged into any repository
Alvaro Lopez (aloga)
no longer affects: keystone
Revision history for this message
Anita Kuno (anteaya) wrote : Re: No patchset can be merged into any repository

2013-10-21T10:07:57 <jeblair> restarted zuul

Clark Boylan (cboylan)
Changed in openstack-ci:
importance: Undecided → High
milestone: none → icehouse
Revision history for this message
Clark Boylan (cboylan) wrote :

The root cause of this problem is in jeepyb's manage-projects. That script iterates over the list of projects. If a project isn't in Gerrit, it creates it in Gerrit as an empty repo, then attempts to create the project in github, and finally force pushes any preexisting content to Gerrit, which is replicated to github. If anything fails after the Gerrit project creation (which happens first), the script bails out for that particular project and moves on to the next one. This leaves an empty repo in Gerrit for that project.

Some time later a change is proposed to that repo, causing Zuul to clone the repo in Gerrit, which is empty. This creates an empty repo in the Zuul git repo cache. Eventually someone will get around to rerunning manage-projects for that project, which will do the github creation and force push content. Now everything but Zuul is happy. Zuul does do `git remote update`, so it knows about upstream changes, but none of them have been checked out into the working dir (they show up as deleted with git status), causing the merge failures.

The proper way to fix this cascading problem is to make both manage-projects and Zuul resilient to failures. There is a proposed change linked to earlier in this bug with a Zuul fix. But we haven't made manage-projects better at handling failure. It should probably create the project, do the force push of content, create the project in Github, then trigger a Gerrit replication. That way if anything fails it can bail out without negatively affecting the next steps.
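The ordering proposed above can be sketched roughly as follows. This is only an illustration, not jeepyb's actual code: the function name `provision_project` and the four step callables are hypothetical placeholders for "create in Gerrit", "force push content", "create in GitHub" and "trigger replication".

```python
import logging

log = logging.getLogger("manage_projects_sketch")  # hypothetical logger name


def provision_project(name, create_in_gerrit, push_content,
                      create_in_github, trigger_replication):
    """Run the steps in the proposed order and stop at the first failure.

    All four callables are hypothetical stand-ins that take the project
    name. Returns the labels of the steps that completed.
    """
    steps = [
        ("gerrit", create_in_gerrit),
        ("push", push_content),
        ("github", create_in_github),
        ("replicate", trigger_replication),
    ]
    done = []
    for label, step in steps:
        try:
            step(name)
        except Exception:
            # Bail out for this project only; the caller moves on to the next.
            log.exception("Step %r failed for %s, moving on.", label, name)
            break
        done.append(label)
    return done
```

With this ordering a GitHub API error can only happen after Gerrit already holds the content, so an interrupted run should not leave behind an empty Gerrit repo for Zuul to clone.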

James E. Blair (corvus)
summary: - No patchset can be merged into any repository
+ manage-projects error on new project creation
Changed in openstack-ci:
importance: High → Critical
status: Confirmed → Triaged
James E. Blair (corvus)
Changed in openstack-ci:
assignee: nobody → Monty Taylor (mordred)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to jeepyb (master)

Fix proposed to branch: master
Review: https://review.openstack.org/59107

Changed in openstack-ci:
status: Triaged → In Progress
Changed in openstack-ci:
assignee: Monty Taylor (mordred) → Clark Boylan (cboylan)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to jeepyb (master)

Reviewed: https://review.openstack.org/59107
Committed: http://github.com/openstack-infra/jeepyb/commit/dd9520792a324b40d2848abc72fd5891eb37bce6
Submitter: Jenkins
Branch: master

commit dd9520792a324b40d2848abc72fd5891eb37bce6
Author: Monty Taylor <email address hidden>
Date: Thu Nov 28 18:20:42 2013 -0500

    Do github last

    github project creation is really only a convenience, but API errors
    cause other things to get borked from time to time. Move the interaction
    with github to the very end, after we've done all of the things that
    exist in our own backyard. Additionally, if we create the project in
    github, it's possible we did it later on, so go ahead and trigger a
    replication

    Change-Id: I51572afe41f7ec9977ea7c17a90bd4df49b9a0f1
    Closes-bug: #1242569

Changed in openstack-ci:
status: In Progress → Fix Committed
Clark Boylan (cboylan)
Changed in openstack-ci:
status: Fix Committed → Fix Released
Revision history for this message
James E. Blair (corvus) wrote :

This still seems to be a problem; the stackforge/milk project was not created correctly. Possibly the code path for creating a new empty project is broken.

Changed in openstack-ci:
status: Fix Released → Triaged
assignee: Clark Boylan (cboylan) → nobody
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to jeepyb (master)

Fix proposed to branch: master
Review: https://review.openstack.org/63490

Changed in openstack-ci:
assignee: nobody → Clark Boylan (cboylan)
status: Triaged → In Progress
Revision history for this message
Stefano Maffulli (smaffulli) wrote :

What's the status of this bug now that https://review.openstack.org/63490 has merged? Is it still valid?

Revision history for this message
Jeremy Stanley (fungi) wrote :

I tested today with another multi-group project request, and saw similar symptoms: groups were not created in Gerrit and repositories on review.openstack.org and git0[1-4].openstack.org and zuul.openstack.org had no HEAD ref.

Nothing in the log (puppet still isn't preserving the stdout from the exec), but manually running manage-projects from a shell on review.openstack.org after deleting the broken repositories at /home/gerrit2/review_site/git, /var/lib/git and /var/lib/jeepyb resulted in no errors. After that the missing groups were automatically created as expected, projects cloned and remote git mirrors updated. I cleaned up the remaining broken repositories on zuul.openstack.org in /var/lib/zuul/git so that they won't get in the way of gating later.
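A check along these lines could help find repositories with no HEAD ref before they get in the way. This is a minimal sketch, not part of jeepyb or Zuul: the function names are hypothetical, and it assumes `git` is on the PATH and that `git rev-parse --verify --quiet HEAD` exits non-zero when HEAD does not resolve to a commit.

```python
import subprocess


def has_head_commit(repo_path, run=subprocess.run):
    """Return True if HEAD in the repository at repo_path resolves.

    `run` is injectable so the check can be exercised without git installed.
    """
    result = run(
        ["git", "-C", repo_path, "rev-parse", "--verify", "--quiet", "HEAD"],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def find_repos_without_head(repo_paths, run=subprocess.run):
    """Return the subset of repo_paths whose HEAD does not resolve."""
    return [path for path in repo_paths if not has_head_commit(path, run=run)]
```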

I need to try another one after locally tweaking manage_projects.py to be a complete no-op, then disable puppet, unbreak the script and re-run manage-projects --verbose to see if I can get a better idea as to where this is going off the rails.

Revision history for this message
Jeremy Stanley (fungi) wrote :

We temporarily disabled automated manage-projects runs in puppet last week and Monty tried manually running it for several new projects, but was unable to recreate the failure mode we see when it gets run from puppet. This probably means it's a contextual problem and we'll need better debug/error logging in place so we can see what is actually happening when it's run under puppet.

Revision history for this message
James E. Blair (corvus) wrote :

Yeah, I think the way to start here is to add proper debug logging to manage-projects. Regardless of what the ultimate solution ends up being, the fact that we have no idea what's going on is the reason that this bug report has been going in circles.
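As a rough idea of what such logging might look like, here is a minimal sketch using Python's standard `logging` and `argparse` modules. It is not the actual jeepyb change: the helper names `setup_logging` and `run_for_each`, the log format, and the `handle` callable are assumptions made for illustration.

```python
import argparse
import logging

log = logging.getLogger("manage_projects")


def setup_logging(argv=None):
    """Configure logging; --verbose switches from ERROR to DEBUG."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--verbose", action="store_true",
                        help="log at DEBUG level")
    args = parser.parse_args(argv)
    level = logging.DEBUG if args.verbose else logging.ERROR
    logging.basicConfig(
        level=level,
        format="%(asctime)s %(levelname)s:%(name)s:%(message)s",
    )
    return level


def run_for_each(projects, handle):
    """Call handle(project) for each project, logging a traceback on failure.

    `handle` is a hypothetical per-project callable. Returns the projects
    that failed so the caller can report them.
    """
    failed = []
    for project in projects:
        log.debug("Processing %s", project)
        try:
            handle(project)
        except Exception:
            log.exception("Problems creating %s, moving on.", project)
            failed.append(project)
    return failed
```

Because `log.exception` records the traceback, a failure under puppet would end up in whatever output puppet preserves, rather than being lost.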

Revision history for this message
Jeremy Stanley (fungi) wrote :

The refresh-triggered puppet exec for manage-projects got reenabled in https://review.openstack.org/72178 and then logoutput was turned on for it in https://review.openstack.org/72158 so we could test in context.

Next, I merged https://review.openstack.org/70761 and https://review.openstack.org/73036 to see what happened. As luck would have it, things mostly worked as designed... the only problem I encountered was that the gerrit replicate call happened before git03 and git04 ran create-cgitrepos, so the resultant repos on those servers were empty. Manually retriggering gerrit replicate for the two projects involved solved this immediately.

After that, I got bolder and merged https://review.openstack.org/66521 and https://review.openstack.org/71293 and that's when the real fun started. New project creation failed, reporting these tracebacks through puppet into the syslog:

ERROR:manage_projects:Exception creating stackforge/savanna-ci-config in Gerrit.
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/jeepyb/cmd/manage_projects.py", line 478, in create_gerrit_project
    gerrit.createProject(project)
  File "/usr/local/lib/python2.7/dist-packages/gerritlib/gerrit.py", line 131, in createProject
    out, err = self._ssh(cmd)
  File "/usr/local/lib/python2.7/dist-packages/gerritlib/gerrit.py", line 228, in _ssh
    raise Exception("Gerrit error executing %s" % command)
Exception: Gerrit error executing gerrit create-project --require-change-id --name stackforge/savanna-ci-config
ERROR:manage_projects:Problems creating stackforge/savanna-ci-config, moving on.
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/jeepyb/cmd/manage_projects.py", line 625, in main
    project, project_list, gerrit)
  File "/usr/local/lib/python2.7/dist-packages/jeepyb/cmd/manage_projects.py", line 478, in create_gerrit_project
    gerrit.createProject(project)
  File "/usr/local/lib/python2.7/dist-packages/gerritlib/gerrit.py", line 131, in createProject
    out, err = self._ssh(cmd)
  File "/usr/local/lib/python2.7/dist-packages/gerritlib/gerrit.py", line 228, in _ssh
    raise Exception("Gerrit error executing %s" % command)
Exception: Gerrit error executing gerrit create-project --require-change-id --name stackforge/savanna-ci-config
ERROR:manage_projects:Exception creating stackforge/savanna-guestagent in Gerrit.
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/jeepyb/cmd/manage_projects.py", line 478, in create_gerrit_project
    gerrit.createProject(project)
  File "/usr/local/lib/python2.7/dist-packages/gerritlib/gerrit.py", line 131, in createProject
    out, err = self._ssh(cmd)
  File "/usr/local/lib/python2.7/dist-packages/gerritlib/gerrit.py", line 228, in _ssh
    raise Exception("Gerrit error executing %s" % command)
Exception: Gerrit error executing gerrit create-project --require-change-id --name stackforge/savanna-guestagent
ERROR:manage_projects:Problems creating stackforge/savanna-guestagent, moving on.
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/jeepyb/cmd/manage_projects.py", line 625, in main
    pr...


Revision history for this message
Jeremy Stanley (fungi) wrote :

Based on an unfortunately unrepresentatively small sample size, it looks like we're probably fine on projects which import an existing repository (except for the create-cgitrepos race), but are having some issue with projects which start with no existing repository to import.

Revision history for this message
Doug Hellmann (doug-hellmann) wrote :

Given that we want most everyone using a cookiecutter template, maybe it's OK to require providing a repo with at least one commit for import?

Revision history for this message
James E. Blair (corvus) wrote :

I would rather that not be a requirement; I think we should fix the bug.

James E. Blair (corvus)
Changed in openstack-ci:
assignee: Clark Boylan (cboylan) → Monty Taylor (mordred)
Revision history for this message
James E. Blair (corvus) wrote :

This has now been fixed, due to a combination of jeepyb improvements and puppet run sequencing.

Changed in openstack-ci:
status: In Progress → Fix Released