#493646 git cvsimport -m: too eager to declare a merge

#493646#5
Date:
2008-08-03 20:12:18 UTC
From:
To:
When looking at the history with gitk, when there is a merge, it's not
clear at all to me what exactly got merged.

It's usually only the last commit on the development branch
that is merged to a stable branch.  But it could also be more than 1
commit.

The merge commit only shows the log message, which isn't very useful.
The colors don't seem to help either.  They just seem to change on the
first commit in that branch.


Kurt

#493646#10
Date:
2010-02-12 22:49:34 UTC
From:
To:
severity 493646 wishlist
thanks

Hi Kurt,

About a year ago, you wrote:

Could you say a little more about this?  Is the problem that you want
a list of patches representing the diff between a merge commit and its
first parent?

For the future, the “[merge] log = true” setting in .gitconfig can be
good for putting this in the log message.  gitk --topo-order can also
be helpful.  But maybe gitk and other tools could cope better in other
ways with a history in which the first parent always represents the
upstream.
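
A minimal sketch of the "[merge] log = true" setting mentioned above, run
in a throwaway repository.  The identity, branch names and commit
subjects are placeholders made up for the illustration.

```shell
# Throwaway repository; identity and names are placeholders.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.invalid
repo=$(mktemp -d) && cd "$repo" && git init -q
git symbolic-ref HEAD refs/heads/master   # do not depend on the default branch name
git config merge.log true                 # per-repository form of "[merge] log = true"

echo base > base.txt && git add base.txt && git commit -q -m "base"
git checkout -q -b topic
echo one > one.txt && git add one.txt && git commit -q -m "topic: first fix"
echo two > two.txt && git add two.txt && git commit -q -m "topic: second fix"
git checkout -q master
git merge -q --no-ff --no-edit topic      # --no-ff forces a real merge commit

# The merge message now also lists the subjects of the merged commits.
git log -1 --format=%B
```

Without merge.log the message would only say which branch was merged.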

For example, here is something I have sometimes wished for:

 - Make “gitk --first-parent” stop using --boundary, so I really
   can pretend there is a linear history when I look at the result
 - For each merge, have a widget I can toggle for “more detail”,
   which adds its remaining parents to the list of commits to
   traverse.

That way, I could quickly find the piece of history I am interested
in in a high-level overview, and then walk downstream until I
understand what happened in detail.
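
For comparison, the high-level overview part can already be approximated
on the command line with "git log --first-parent".  This is only a sketch
in a scratch repository with made-up names; it does not provide the
toggle described above.

```shell
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.invalid
repo=$(mktemp -d) && cd "$repo" && git init -q
git symbolic-ref HEAD refs/heads/master
# Helper (hypothetical): one commit per call, adding a file named after it.
commit() { echo "$1" > "$1.txt" && git add "$1.txt" && git commit -q -m "$1"; }

commit base
git checkout -q -b topic
commit topic-1
commit topic-2
git checkout -q master
commit mainline
git merge -q --no-ff --no-edit topic

# Follow only the first parent of each merge: the merge commit, "mainline"
# and "base" are listed; topic-1 and topic-2 are hidden.
git log --first-parent --format=%s master
```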

Thoughts?
Jonathan

#493646#17
Date:
2010-02-12 23:16:03 UTC
From:
To:
I think the main problem is that when I look at the history in git,
it's unclear which patches have actually been applied to a branch.

If you have a branch and you add a few patches to it, and then
merge only the last one into another branch, I think it looks
just the same as if you've merged all the patches.

I think it's just unclear what the parent is in each branch.


Kurt

#493646#22
Date:
2010-02-12 23:26:40 UTC
From:
To:
Kurt Roeckx wrote:

Hmm, this leaves me a bit confused. :(

Suppose (where time flows left to right), I have this history:

   E -- F -- G [topic]
  /
 A -- B -- C -- D [master]

If I am on branch master and use ‘git merge topic’, then all the
changes from A to G get applied to master to form the merge commit.

So it is not clear to me what it means to merge only the last patch on
a branch.
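
The history above can be reproduced in a scratch repository to check
this.  A sketch only; the commit helper and the file names are invented
for the illustration.

```shell
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.invalid
repo=$(mktemp -d) && cd "$repo" && git init -q
git symbolic-ref HEAD refs/heads/master
# Helper (hypothetical): one commit per call, adding a file named after it.
commit() { echo "$1" > "$1.txt" && git add "$1.txt" && git commit -q -m "$1"; }

commit A
git checkout -q -b topic
for i in E F G; do commit "$i"; done
git checkout -q master
for i in B C D; do commit "$i"; done

git merge -q --no-edit topic

# Everything reachable from the merge but not from its first parent (D):
# the merge commit itself plus E, F and G.
git log --format=%s HEAD^1..HEAD
ls    # A.txt through G.txt are all present on master now
```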

Jonathan

#493646#27
Date:
2010-02-13 13:16:58 UTC
From:
To:
I'm not sure how this is done in git, but as far as I know, I've
seen this done.  It's basically about backporting fixes from a
master branch to one of the release/stable branches.


Kurt

#493646#32
Date:
2010-02-13 21:02:46 UTC
From:
To:
Kurt Roeckx wrote:

Oh!  Thanks for clarifying.

In some projects, it makes sense to develop each fix on its own branch
based on a stable branch and then merge it into each integration
branch when it is ready.  See gitworkflows(7).

Suppose, however, that your project does not work that way, or you
have a fix that was never intended for a stable branch and you need to
backport it.  For example, you might have a history like this

    E -- F -- G --- H --- I --- ... [devel]
   /
  A -- B -- C -- D [stable]

and only the patch H is suitable for the stable branch; the rest of
the changes are not quite cooked yet.

Then while on the stable branch you use 'git cherry-pick H'.  This
produces a history like so:

    E -- F -- G --- H --- I --- ... [devel]
   /
  A -- B -- C -- D --- H' [stable]

where H' introduces the same change as H did.

In this case, git does not record for itself that this H' is based on
H.  I think Subversion would have recorded this information and used
it for later merges, but in git the merge algorithm is not so far from
a simple three-way merge, and in particular it has no way to use
details like this one.  If stable is merged to devel, that diff hunk
will be noticed to be the same for both branches, so unless something
else interesting happened nearby it will not produce a conflict.

Commands such as 'git rebase' that consider commits one by one, on the
other hand, will often notice that H and H' represent the same change,
even though that fact is not recorded.  This magic is implemented
by the 'git patch-id' command.

A human might have very good reasons to be interested in the fact that
H' is based on H.  The 'git cherry-pick -x' option is meant to record
this in the commit message and gitweb will treat the result as a link.
The 'git cherry' front-end to patch-id can also be helpful when
cherry-pick was used without -x.
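
A rough sketch of these commands against the devel/stable history drawn
above.  The helper function, the file names and the tags are assumptions
made for the example.

```shell
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.invalid
repo=$(mktemp -d) && cd "$repo" && git init -q
git symbolic-ref HEAD refs/heads/master
# Helper (hypothetical): commit a new file and tag the commit with its name.
commit() { echo "$1" > "$1.txt" && git add "$1.txt" && git commit -q -m "$1" && git tag "$1"; }

commit A
git checkout -q -b devel
for i in E F G H I; do commit "$i"; done
git checkout -q -b stable A
for i in B C D; do commit "$i"; done

git cherry-pick -x H >/dev/null   # creates H' on stable

# -x appended "(cherry picked from commit <id of H>)" to the message.
git log -1 --format=%B

# H and H' are different commits but share a patch ID (first column).
git show H | git patch-id
git show HEAD | git patch-id

# Commits on stable that are not in devel: "-" marks H', which has an
# equivalent patch in devel; "+" marks B, C and D, which do not.
git cherry -v devel stable
```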

Sorry for the information dump.  With all that out of the way, I
wonder:

 - Where do you think this kind of overview documentation could go
   to make it easier to find?

 - How should gitk help?  For example, if gitk converted commit object
   IDs to hyperlinks like gitweb does, would that help?

Jonathan

#493646#37
Date:
2010-02-14 02:08:03 UTC
From:
To:
Right, so it's called cherry picking.  I guess I'm still too used
to how CVS works, where this is just a merge.  And the difference
with CVS seems to be that CVS doesn't know anymore that it came
from the devel branch after that.

I have no suggestion for that.

A better example might be that both F and H were merged to the
stable branch, and that H fixes a bug introduced in G.

I guess what I want to be able to see is each patch as applied to
the branch.  If I'm currently at H', I want to see that F' is the
previous change in that branch.  But I think it's showing me only
H now, and that G is the previous patch.  The same goes for what
is the next patch obviously.

I think there used to be a way to only see the current branch
and not all branches that got merged in it?

Maybe it should at least show the branch name for each parent of
a commit?


Kurt

#493646#42
Date:
2010-02-14 09:16:50 UTC
From:
To:
Kurt Roeckx wrote:

I don’t know CVS, but if I understand you correctly then git is like
CVS in this respect (except for the name for the operation).  Git
doesn’t know any more that H' came from the devel branch.  Humans
might be able to tell from the commit message, and with effort git can
figure it out by comparing patches, but that information is not
explicitly stored anywhere.

Part of my confusion is that what you are talking about is creating
history, but gitk is mostly a tool for viewing history.  So I am
trying to imagine what series of commands created the history you are
talking about and failing.

The setup is presumably as before:

 test_commit() {
	: > "$1"
	git add "$1"
	git commit -m "$1"
	git tag "$1"
 }
 git init test-repo
 cd test-repo
 test_commit A
 git checkout -b devel
 for i in E F G H I J; do test_commit "$i"; done
 git checkout -b stable A
 for i in B C D; do test_commit "$i"; done

I am on the stable branch, and I cherry-pick the bug fix F:

 git cherry-pick -x F

Next I want to cherry-pick H.  Why?  H is a bug fix for G, so I guess
I wanted G as well.

 git cherry-pick -x G
 git cherry-pick -x H

And at this point, "gitk" and "git log" will show me a series of
commits with subjects H, G, F, D, C, B, A, although the commits
labelled H, G, and F on this branch have different commit IDs than the
commits tagged H, G, and F, which were made on the devel branch.

I guess this is where your wish comes in: maybe I wanted to cherry-pick
the changes from G and H with a single commit?  Let me back out the
last two changes:

 git reset --hard HEAD^^

Now I apply the changes from G and H without making a commit:

 git cherry-pick --no-commit G
 git cherry-pick --no-commit H

and make a commit

 git commit

being sure to write a message that describes the combined change.
Throughout, I might have a window open from running

 gitk --all &	# or just gitk, or gitk devel, or whatever

and hit Ctrl+F5 after each command to see its effect in gitk.
in the stable branch.

But suppose you want to see just those commits on the stable branch
that do not have anything analogous in devel.  (I have often wanted
something like this.)  The best way I know to do this is kind of
clunky:

 gitk --left-right --cherry-pick devel...stable

This will show the commits in devel but not in stable with a
left-pointing marker and those in stable but not in devel with a
right-pointing marker.

Note: a commit like the combined G+H described above would show up as
a commit in stable but not in devel.
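
The same options work with plain "git log", which makes the result easy
to check in a terminal.  A sketch in a scratch repository; the helper and
the file names are invented for the example.

```shell
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.invalid
repo=$(mktemp -d) && cd "$repo" && git init -q
git symbolic-ref HEAD refs/heads/master
# Helper (hypothetical): commit a new file and tag the commit with its name.
commit() { echo "$1" > "$1.txt" && git add "$1.txt" && git commit -q -m "$1" && git tag "$1"; }

commit A
git checkout -q -b devel
for i in E F G H I; do commit "$i"; done
git checkout -q -b stable A
for i in B C D; do commit "$i"; done
git cherry-pick -x H >/dev/null

# %m prints the side marker.  Lines starting with "<" are commits only in
# devel (E, F, G, I); lines starting with ">" are commits only in stable
# (B, C, D).  H and its cherry-picked copy are omitted as equivalent.
git log --left-right --cherry-pick --format='%m %s' devel...stable
```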

This sounds like a CVS-ism.  Commits aren’t attached to any particular
branch in git.

Hope that helps,
Jonathan

#493646#47
Date:
2010-02-14 14:57:05 UTC
From:
To:
I think they're mostly created using git cvsimport, which seems
to be doing something else and makes it show up as a merge.  I can't
currently show you a cvs repo which has that problem, nor do
I have a copy left of that.  I tried to re-import it but
cvsps seems to have a problem with the repo.
  git cherry-pick -x F
  git cherry-pick -x H

But that's clearly not the behaviour I was seeing, and with that it
looks obvious what the history is.  It also does not indicate any "merge".


Kurt

#493646#52
Date:
2010-02-15 06:17:17 UTC
From:
To:
retitle 493646 git cvsimport -m: too eager to declare a merge
reassign 493646 git-cvs
thanks

Kurt Roeckx wrote:

Okay, so the problem is that merges in the CVS world and git world are
completely different things, but git cvsimport is pretending that they
are not.  Here’s what the merge detection code (which is very old:
v0.99.5~15^2~6, 2005-08-16) says:

	[PATCH] Add merge detection to git-cvsimport

	Added -m and -M flags for git-cvsimport to detect merge
	commits in cvs.  While this trusts the commit message, in
	repositories where merge commits indicate 'merged from
	FOOBRANCH' the import works surprisingly well.

	Even if some merges from CVS are bogus or incomplete, the
	resulting branches are in better state to go forward (and
	merge) than without any merge detection.

It detects merges by looking at the commit message and is not smart
enough to tell when this was just a backport of a few patches from a
branch.
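
To illustrate why message-based detection cannot tell the two cases
apart, here is a toy version of the idea.  The pattern and the function
name are hypothetical and are not the regexes git cvsimport uses.

```shell
# Hypothetical pattern, loosely modelled on "merged from FOOBRANCH".
# This is NOT the regex that git cvsimport -m applies.
pattern='[Mm]erged? from ([A-Za-z0-9_-]+)'

guess_merge_source() {
    # Print the branch name captured from a commit message, if any.
    printf '%s\n' "$1" | sed -n -E "s/.*${pattern}.*/\1/p"
}

guess_merge_source "Merged from devel"                # prints: devel
guess_merge_source "Fix overflow (merge from devel)"  # prints: devel
guess_merge_source "Fix overflow in parser"           # prints nothing
```

The second message describes a single backported patch, yet it yields
the same source branch as a full merge would.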

I do not know how much information CVS actually records about a merge.
Is the information you want available in CVS?  Can it be determined
locally (from looking at ,v files) or remotely (with cvs log or rlog)?

Right.  Thanks for your explanations --- they have been very helpful.

Jonathan

#493646#63
Date:
2010-02-15 19:12:02 UTC
From:
To:
[...]

If it has the same commit message (and is on a different branch),
it's probably just that patch that is applied.

As far as I know, a cvs merge just takes a bunch of commits from 1
branch, turns that into a patch and applies that patch in another
branch.  And then you have to commit that patch, possibly after
resolving any conflict.

Cvs doesn't even track a commit over several files, and this is
basically guessed by looking at the commit message and timestamp
to see if this is 1 commit or not.
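
A toy sketch of that kind of guessing: per-file revisions with the same
author and message that lie close together in time are treated as one
commit.  The input format, the function name and the 300 second window
are assumptions for the example, not a description of how cvsps is
implemented.

```shell
fuzz=300   # seconds; an assumed window, chosen for illustration

group_revisions() {
    # stdin: "<epoch-seconds>|<author>|<file>|<message>", sorted by time
    awk -F'|' -v fuzz="$fuzz" '
    {
        key = $2 "|" $4                      # same author and same message
        if (!(key in last) || $1 - last[key] > fuzz)
            id[key] = ++n                    # too far apart: new changeset
        last[key] = $1
        print "changeset " id[key] ": " $3
    }'
}

printf '%s\n' \
    '1000|alice|a.c|Fix overflow' \
    '1002|alice|b.c|Fix overflow' \
    '1003|alice|README|Update docs' \
    '9000|alice|a.c|Fix overflow' | group_revisions
# changeset 1: a.c
# changeset 1: b.c
# changeset 2: README
# changeset 3: a.c
```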


Kurt