File conflict when merge back packaging branch to upstream

Bug #530584 reported by Didier Roche-Tolomelli
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Bazaar
Confirmed
High
canonical-bazaar
Ubuntu Distributed Development
Triaged
Medium
Unassigned

Bug Description

typical case:
- you have your upstream branch with a "foo" file
 - this file isn't distributed into your tarball, so, at first merge-upstream, it will removing it from your packaging branch
 - then, with the recipe, on the merge back, no issue (deleted)
 … until the day, upstream change the content of "foo" file
 change in "foo" in trunk + merge of a deleted file from the packaging branch -> FAIL

-> upstream has to include all kind of crude packages like autogen.sh, .bzrignore to tarball. When we miss one, conflict occurs in hudson.
dx and desktop team loose approx one to two hours a week because of that, figuring out which files aren't distributed, relaunching hudson build and so on.

-------------

The scenario seems to be the following (taking HACKING file as a filename example):

have a branch A (containing a HACKING file),
create a branch A-packaging from A (containing a HACKING file too, consequentely)

then, use merge-upstream workflow, the A pristine tar not having a HACKING file., HACKING file is so removed from A-packaging.

If upstream changed the HACKING file, and merge back the packaging branch to their upstream branch (for instance, with hudson daily build), there will be a conflict in the HACKING file.

Revision history for this message
James Westby (james-w) wrote :

Hi,

Thinking about this I think bzr-builddeb is acting correctly, we shouldn't
arbitrarily include things in the packaging tree that aren't in the tarball, if
you want to add it as part of the packaging then you can, and if you
get the file-ids right then this would work.

Therefore I reassign this to UDD as I think we should do something to handle
those conflicts better when we merge back as part of bzr-builder.

Thanks,

James

Changed in bzr-builddeb:
status: New → Triaged
importance: Undecided → Medium
affects: bzr-builddeb → udd
James Westby (james-w)
tags: added: bzr
description: updated
Revision history for this message
Didier Roche-Tolomelli (didrocks) wrote :

As discussed the issue with james, the issue most of the time came from the fact that the packaging branch is branched from upstream ones, so autogen.sh is there for instance, and will get removed on first merge-upstream as not in the tarball…

until the day upstream change something in autogen.sh.

Revision history for this message
Robert Collins (lifeless) wrote : Re: [Bug 530584] Re: File conflict when merge back packaging branch to upstream

So, bzr's conflict resolver says 'change to a deleted file, oh noes';
for this specific case we want to 'know' that the tarballification
process deletes the file /name/ not the file at a specific version,
and as such we don't want conflicts on it.

Revision history for this message
Robert Collins (lifeless) wrote :

James, you've assigned this to udd, but udd is kind of an effort-gathering place, not a product per se. Do you think bzr-builder is where the code changes should happen?

description: updated
Revision history for this message
John A Meinel (jameinel) wrote :
Download full text (3.6 KiB)

I'd like to write out how I understand this bug, to make sure it is well defined, and then brainstorm some possible fixes.

1) Upstream is imported (as normal), and contains some files that don't end up in the final tarball. These are generally things that are used as templates (such as autogen.sh that I don't really understand, but I assume stuff like Makefile.am and/or Makefile.in also fall into this category.)

2) bzr-builder is used to combine the upstream code with a packaging branch. What I don't really understand here is what gets merged into what. Do you start with a copy of upstream, and call 'bzr merge-upstream' (doesn't make a lot of sense). It seems that you would start with the packaging import, and then call 'bzr merge-upstream'. However, the statement was:
  "- this file isn't distributed into your tarball, so, at first merge-upstream, it will removing it from your packaging branch"
If you started with a packaging branch, then you wouldn't *have* the file in your packaging branch, it would have only been in upstream, and then 'bzr merge-upstream' would presumably just ignore that autogen.sh wasn't present in your local branch.

Or is 'merge-upstream' actually mean 'bring in the latest deb import' and not 'bring in the latest upstream source code'. (yay for having multiple possible definitions of 'upstream' :(

3) At this point, you have a branch which is generally a combination of whatever packaging information, and whatever code has been imported from upstream vcs. You then see that more work has been done in upstream vcs and want to merge it into your packaging branch.

However, upstream vcs also modified the autogen.sh file, which has been deleted in the packaging branch. Which causes a conflict because a file was modified which is now deleted.

So far I think I have a handle on it, though some of the particulars of (2) are hazy.

Here are some more issues:

a) It seems possible to create a per-file merge hook that could say "deleted files supersede". So any time you went to merge content into a deleted file, it would just ignore it, and say "nope, I want that file gone, don't merge any new changes".

The only questions in my mind are how to configure it to know when it is actually active.

  i) always (seems risky if you are switching between projects)
  ii) By path/locations.conf (so whenever I'm working on these projects, deletes supersede, but over here they
      conflict as expected, so I don't accidentally lose bugfixes when code gets moved around.)
  iii) By explicit enumeration in something like .bzr-meta/delete-files
       Instead of running just 'bzr rm autogen.sh', you would then run 'bzr force-deleted-file autogen.sh'
       And bzr-builder could automatically inject that sort of thing if it wanted to.

b) If upstream changes autogen.sh, won't that chain down to something that *should* change in the packaging? Using what I know, if Makefile.am was changed to include a new source file, that would end up touching Makefile.in, and then finally Makefile. And while Makefile.am is not in the tarball, I think Makefile.in would be (which is then ./configure-ed into becoming the Makefile used to build the binary.)

It ...

Read more...

Revision history for this message
John A Meinel (jameinel) wrote :

Another point I don't quite understand:
  "If upstream changed the HACKING file, and merge back the packaging branch
   to their upstream branch (for instance, with hudson daily build), there will be a
   conflict in the HACKING file."

I would think it would be very ill advised to merge a packaging branch back into the upstream branch. Specifically given that it *deletes all of these files*. Regardless of whether the content changed. If upstream was using autogen.sh, which then gets deleted in the packaging branch, merging packaging back in will delete autogen.sh, which will mess up the upstream build chain, right?

Or is it just an ongoing "ok, now we have a new upstream code, merge the old packaging branch in to get a new packaging branch".

It still feels to me that whatever changed in autogen.sh is going to need to be evaluated to see if it can be safely ignored, or whether it actually changes how the project is built, etc, and that needs to get propagated into the rest of the content.

As a specific example, say upstream adds a new file (mynewcode.c), which then needs to be incorporated into the build process. Is it assumed that they have made a new tarball release which should have incorporated those changes? (I can admit that my lack of understanding is probably not understanding the full workflow.)

Revision history for this message
Didier Roche-Tolomelli (didrocks) wrote : Re: [Bug 530584] Re: File conflict when merge back packaging branch to upstream
Download full text (4.7 KiB)

Le mardi 29 juin 2010 à 16:16 +0000, John A Meinel a écrit :
> I'd like to write out how I understand this bug, to make sure it is well
> defined, and then brainstorm some possible fixes.
>
> 1) Upstream is imported (as normal), and contains some files that don't
> end up in the final tarball. These are generally things that are used as
> templates (such as autogen.sh that I don't really understand, but I
> assume stuff like Makefile.am and/or Makefile.in also fall into this
> category.)

Right, this is similar

>
> 2) bzr-builder is used to combine the upstream code with a packaging branch. What I don't really understand here is what gets merged into what. Do you start with a copy of upstream, and call 'bzr merge-upstream' (doesn't make a lot of sense). It seems that you would start with the packaging import, and then call 'bzr merge-upstream'. However, the statement was:
> "- this file isn't distributed into your tarball, so, at first merge-upstream, it will removing it from your packaging branch"

Well, packaging import isn't the first branch created for everything :)
Here, we are in case where dx team is upstream, so we need packaging
branch for cooking things a lot before reaching a state that enables us
to push it to maverick. That's why the way to do this and have upstream
file-ids is to start with bzr branch trunk/ packaging and then, bzr
merge-upstream. If you have better procedure, do not hesitate to state
them here.

> If you started with a packaging branch, then you wouldn't *have* the file in your packaging branch, it would have only been in upstream, and then 'bzr merge-upstream' would presumably just ignore that autogen.sh wasn't present in your local branch.
>
> Or is 'merge-upstream' actually mean 'bring in the latest deb import'
> and not 'bring in the latest upstream source code'. (yay for having
> multiple possible definitions of 'upstream' :(

all is related to previous workflow there, due to have upstream using
bzr and needing packaging branch for daily build before it reaches
ubuntu.

>
> 3) At this point, you have a branch which is generally a combination of
> whatever packaging information, and whatever code has been imported from
> upstream vcs. You then see that more work has been done in upstream vcs
> and want to merge it into your packaging branch.
>
> However, upstream vcs also modified the autogen.sh file, which has been
> deleted in the packaging branch. Which causes a conflict because a file
> was modified which is now deleted.
>
> So far I think I have a handle on it, though some of the particulars of
> (2) are hazy.
>
> Here are some more issues:
>
> a) It seems possible to create a per-file merge hook that could say
> "deleted files supersede". So any time you went to merge content into a
> deleted file, it would just ignore it, and say "nope, I want that file
> gone, don't merge any new changes".
>
> The only questions in my mind are how to configure it to know when it is
> actually active.
>
> i) always (seems risky if you are switching between projects)
> ii) By path/locations.conf (so whenever I'm working on these projects, deletes supersede, but over here they
> conflict as expe...

Read more...

Revision history for this message
Didier Roche-Tolomelli (didrocks) wrote :

Le mardi 29 juin 2010 à 16:21 +0000, John A Meinel a écrit :
> Another point I don't quite understand:
> "If upstream changed the HACKING file, and merge back the packaging branch
> to their upstream branch (for instance, with hudson daily build), there will be a
> conflict in the HACKING file."
>
> I would think it would be very ill advised to merge a packaging branch
> back into the upstream branch. Specifically given that it *deletes all
> of these files*. Regardless of whether the content changed. If upstream
> was using autogen.sh, which then gets deleted in the packaging branch,
> merging packaging back in will delete autogen.sh, which will mess up the
> upstream build chain, right?

This is what the recipe are doing for daily build and shipping in
hudson. I think the udd project is responsible of this.

>
> Or is it just an ongoing "ok, now we have a new upstream code, merge the
> old packaging branch in to get a new packaging branch".
>
> It still feels to me that whatever changed in autogen.sh is going to
> need to be evaluated to see if it can be safely ignored, or whether it
> actually changes how the project is built, etc, and that needs to get
> propagated into the rest of the content.
>
> As a specific example, say upstream adds a new file (mynewcode.c), which
> then needs to be incorporated into the build process. Is it assumed that
> they have made a new tarball release which should have incorporated
> those changes? (I can admit that my lack of understanding is probably
> not understanding the full workflow.)
>

Well, I'm not in favor of this merging back either, but that's how daily
build are working. And that's the root cause of all those issues.

Revision history for this message
James Westby (james-w) wrote :

Hopefully LP won't mess up the formatting of this comment

dx team -------------A-------C
                                \
Ubuntu B

The dx team has their branch at lp:foo, which we wish to package. We take this branch
and create a packaging branch, with revision B, based on revision A.

B only contains files that are in the tarball released from revision A, which is not a complete
snapshot of the tree. We do this because the source package that B represents, is built
on the tarball, and so won't contain those files.

Therefore there are files that are in A, but not in B.

Now we want to builds packages with a recipe that is

lp:foo
merge packaging lp....

which merges B in to C.

As A->B deletes some files, then there will be conflicts if any of those files were modified
in A->C.

However, we don't want conflicts. These deletions aren't important to us, and we want the
conflict to be automatically resolved in favour of C.

I don't think this applies across the board to any deletion/modification conflicts when using
bzr-builder, but it will be the common case.

Thanks,

James

Revision history for this message
James Westby (james-w) wrote :

Rob, I didn't add another task as I don't know where the problem should be fixed.

Thanks,

James

Martin Pool (mbp)
Changed in bzr:
status: New → Confirmed
importance: Undecided → High
tags: added: merge udd
Revision history for this message
Andrew Bennetts (spiv) wrote :

So a possible solution might be to have a 'merge-ignore-deletes' directive in a bzr-builder recipe? (I'm not proposing that exact syntax, just exploring ideas.) It could perform the merge and either via a merge_file_content hook or a post-merge introspection of the conflicts discard the conflicts due to a delete in the branch being merged from. What do you think James?

Revision history for this message
James Westby (james-w) wrote :

On Fri, 13 Aug 2010 04:41:19 -0000, Andrew Bennetts <email address hidden> wrote:
> So a possible solution might be to have a 'merge-ignore-deletes'
> directive in a bzr-builder recipe? (I'm not proposing that exact
> syntax, just exploring ideas.) It could perform the merge and either
> via a merge_file_content hook or a post-merge introspection of the
> conflicts discard the conflicts due to a delete in the branch being
> merged from. What do you think James?

That would solve this problem, yes.

Obviously it's not ideal that it is an explicit instruction, but aside
from that...

It does have some correctness issues: There are some deletes that would
be considered conflicts. It's rare to actually delete a file in
packaging, but it would still happen from time to time.

I'd obviously like to see a solution that worked from the right end
(somehow), as it's at import time when this files are "removed", but we
don't consider them deletions, just "missing" and welcome to come back
if they like. Even doing this automatically would be a little risky
though, as it's not clear whether the lack of a file in the tarball that
is in the branch is significant or not.

Thanks,

James

Martin Pool (mbp)
Changed in bzr:
assignee: nobody → canonical-bazaar (canonical-bazaar)
Revision history for this message
John A Meinel (jameinel) wrote :

This is somewhat related to the more generic bug #364336. That one is about allowing a user to say "this file is deleted, ignore changes from merge sources, rather than giving me conflicts". *This* bug is a bit more specific to how the package importing process is run. They aren't dupes because:
 1) If we solved bug #364336, then to solve this we would need to teach the importer how to mark those files as uninteresting.
 2) We could solve this one with very specific logic in the importer, wrt auto generated files, etc.

Jelmer Vernooij (jelmer)
tags: added: check-for-breezy
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.