Why are git commits on a merged branch showing on my main branch?

Question

I currently have a git commit tree that looks something like this with pointers (?) in parentheses:

*   305f Merge branch 'develop' (HEAD->master, origin/master, origin/HEAD)
|\
| * d97b Some other commit on dev branch (develop) 
| * df14 Some commit on dev branch
|/
* 7a761b6 Initial commit

I've pushed master branch to remote (Gitlab, if it matters), and when I look at the commits on the master branch in the Gitlab UI, all 4 commits are present, where I would have expected only the "Merge branch 'develop'" and "Initial commit" commits to be present on the master branch.

My understanding is that master refers to the two commits I just listed, whereas develop refers to "Some other..." , "Some commit...", and possibly "Initial commit" as well, since it is an ancestor.

Where am I going wrong?

Tags: git, merge, branch, git-branch

Answer


In some version control systems, when you make commit C on branch B, commit C is on branch B forever. Anyone who obtains commit C acquires branch B. If they had their own branch B before, well, now their branch B has a new commit C in it.

Git does not do this. Commits are not permanently tacked to branches. However, commits are mostly permanent,[1] and permanently tacked to where they appear in the commit graph. To make this work, commits aren't really made on any branch. Instead, a branch name is merely a label. Multiple labels can all point to the same commit, as in:

*   305f Merge branch 'develop' (HEAD->master, origin/master, origin/HEAD)

Here master and origin/master[2] both identify commit 305f. If you create a new branch br2 now, that name will also point to commit 305f.

Commit 305f has two parents: 7a761b6 (its first parent) and d97b (its second parent). The full name of 305f is whatever its full hash ID is. This will never change; and that hash ID is forever reserved for that commit, which will always have the same two parents. That commit is frozen for all time and will never move.

The things in Git that do move are the branch names. Currently, master means 305f. A moment ago, however, master meant 7a761b6. Commits stay in place, forever, as found by their raw hash IDs. Branch names move about.
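As a rough illustration of "commits stay, names move", here is a toy model in Python. It is only a sketch: the dictionaries `commits` and `refs` are hypothetical stand-ins, not how Git stores anything, and the short hashes are the ones from the question.

```python
# Toy model (not Git's real storage): each commit maps to a tuple of parent hashes.
commits = {
    "7a761b6": (),
    "df14": ("7a761b6",),
    "d97b": ("df14",),
}

# Branch names are just labels that hold one hash each.
refs = {"master": "7a761b6", "develop": "d97b"}
before = dict(commits)

# The merge adds one new commit whose parents are the two current tips...
commits["305f"] = (refs["master"], refs["develop"])
# ...and then the label master moves to it.
refs["master"] = "305f"

# A freshly created name simply points at the same commit.
refs["br2"] = refs["master"]

# Every commit that existed before the merge is untouched.
unchanged = all(commits[h] == p for h, p in before.items())
print(refs)
print(unchanged)
```

The only things that changed are the values stored under the names; the older entries in `commits` are exactly as they were.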

The result of all of this is that the set of branches that contain some commit changes, dynamically, as we create and destroy branch names. At this time, the name master lets you find all four commits. If you allow Git to move the name master in the ways that Git prefers, those four commits will continue to be reachable, by starting at 305f and viewing both of its parents, then viewing d97b's parent df14 (and from df14 you get back to 7a761b6, which you already saw). Note that new commits only ever add to the graph. In general, each new commit will have some existing commit as its parent: its only parent if it's a typical commit, or one of two parents if it's a merge commit.[3]
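The walk described above can be sketched in a few lines of Python. This is a toy model under stated assumptions: `parents`, `reachable`, and `first_parent_chain` are hypothetical names introduced here, and the graph is the one from the question.

```python
# Toy model of the question's graph: hash -> tuple of parent hashes.
parents = {
    "7a761b6": (),
    "df14": ("7a761b6",),
    "d97b": ("df14",),
    "305f": ("7a761b6", "d97b"),  # merge: (first parent, second parent)
}


def reachable(parents, start):
    """Collect every commit found by following all parent links from start."""
    seen = set()
    stack = [start]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return seen


def first_parent_chain(parents, start):
    """Follow only the first parent of each commit, skipping merged-in history."""
    chain = [start]
    while parents[chain[-1]]:
        chain.append(parents[chain[-1]][0])
    return chain


on_master = reachable(parents, "305f")
on_develop = reachable(parents, "d97b")
mainline = first_parent_chain(parents, "305f")

print(sorted(on_master))
print(sorted(on_develop))
print(mainline)
```

Walking every parent from master finds all four commits, which is what the GitLab view shows. Walking only first parents finds just the merge and the initial commit, the two the question expected; `git log --first-parent master` applies that same restriction.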

If we draw these things sideways, it's a little easier. To make it even easier than that, we can use single uppercase letters to stand in for the incomprehensible hash IDs that Git uses:

        D--E   <-- br1
       /
A--B--C
       \
        F--G   <-- br2

Here, the name br1 lets us find commit E, which leads to D, then C, then B, then A (where the walk stops, because A is the first commit). The name br2 lets us find G, then F, then C, then B, then A (where it stops again). So the first three commits are on both branches, while D and E are only on br1 and F and G are only on br2.
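To make "on both branches" versus "on only one branch" concrete, here is a small sketch that treats each branch as the set of commits reachable from its tip. The helper `ancestors` and the `parents` table are illustrative assumptions, using the single-letter names from the drawing.

```python
# The sideways graph, written as commit -> list of parents.
parents = {
    "A": [], "B": ["A"], "C": ["B"],
    "D": ["C"], "E": ["D"],
    "F": ["C"], "G": ["F"],
}


def ancestors(tip):
    """Return the tip plus everything reachable through parent links."""
    seen = set()
    stack = [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return seen


on_br1 = ancestors("E")
on_br2 = ancestors("G")

shared = on_br1 & on_br2      # commits on both branches
only_br1 = on_br1 - on_br2    # commits found only through br1
only_br2 = on_br2 - on_br1    # commits found only through br2

print(sorted(shared), sorted(only_br1), sorted(only_br2))
```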

Erasing the name br1 causes commits D and E to become unfindable. Eventually Git will throw them out for real. If we instead add a new commit by running git checkout br1, making whatever changes we want, and then running git add and git commit, we get a new commit with hash ID H, and the name br1 advances to include it:

        D--E--H   <-- br1
       /
A--B--C
       \
        F--G   <-- br2

Now there are six commits reachable from br1, starting with H and working backwards. If we make a new name br3 to remember where E is, we'll want to redraw the graph a bit:

             H   <-- br1
            /
        D--E   <-- br3
       /
A--B--C
       \
        F--G   <-- br2

Note that none of the commits have actually changed: we just shoved H up to make room for the label br3.

If we remove the name br3 later, that's OK: there are no commits that are exclusively found through br3. Commit E won't go away because br1 finds H, which finds E.
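The sequence of adding H, creating br3, and deleting br3 again can be replayed on a toy model. This is a sketch only: `make_commit` and `reachable_from_refs` are hypothetical helpers, and real Git does far more than update two dictionaries.

```python
parents = {
    "A": [], "B": ["A"], "C": ["B"],
    "D": ["C"], "E": ["D"],
    "F": ["C"], "G": ["F"],
}
refs = {"br1": "E", "br2": "G"}


def reachable_from_refs(parents, refs):
    """Union of everything reachable from any name in refs."""
    seen = set()
    stack = list(refs.values())
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return seen


def make_commit(branch, new_id):
    """Add a commit whose parent is the branch tip, then advance the name."""
    parents[new_id] = [refs[branch]]
    refs[branch] = new_id


make_commit("br1", "H")      # br1 now points at H, whose parent is E
refs["br3"] = "E"            # a new label remembering where E is
with_br3 = reachable_from_refs(parents, refs)

del refs["br3"]              # removing the label loses nothing
without_br3 = reachable_from_refs(parents, refs)

print(with_br3 == without_br3)
```

Commit E survives the deletion because br1 still reaches it through H.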


[1] The "mostly" comes about through reachability. You find commits by starting from some branch or tag name, which supplies a raw hash ID. Then, having found a commit, you use its parent hash ID(s) to find its predecessor commit(s). Then you use those commits to find their parents, and so on.

By doing this process from every reference (see footnote 2), Git finds all reachable commits. Any commits that exist in the repository but are not reachable by this process eventually get garbage-collected and removed.
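A minimal sketch of that rule, assuming a toy graph and ignoring reflogs entirely: anything not reachable from some name is a candidate for removal. The function `unreachable` is a hypothetical helper, not a Git command.

```python
parents = {
    "A": [], "B": ["A"], "C": ["B"],
    "D": ["C"], "E": ["D"],
    "F": ["C"], "G": ["F"],
}
refs = {"br1": "E", "br2": "G"}


def unreachable(parents, refs):
    """Commits that no name in refs can reach (toy model, no reflogs)."""
    seen = set()
    stack = list(refs.values())
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return set(parents) - seen


before = unreachable(parents, refs)
del refs["br1"]                      # erase the only name that finds D and E
after = unreachable(parents, refs)

print(sorted(before), sorted(after))
```

In this model D and E become unreachable the moment br1 is erased; in a real repository they would linger for a while before being collected.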

[2] The name master is a branch name. Its full name is really refs/heads/master, and names whose full name starts with refs/heads/ are branch names. By contrast, origin/master is actually a remote-tracking name: its full name starts with refs/remotes/ and then goes on to say origin/master. Git sometimes strips only the refs/ part from this name, so that you see remotes/origin/master.

Tags, if you have any, live in refs/tags/. These things are collectively called refs or references. There are additional, hidden references kept in reflogs. A reflog is simply a log file keeping the values—the previous hash IDs—that were stored in a ref before you, or Git, updated it. These reflog entries eventually expire, which is how deliberately-abandoned commits—those you've replaced with new-and-improved versions that have new hash IDs, for instance—are eventually cleaned out.
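The naming scheme in this footnote amounts to a prefix check. The sketch below is illustrative only; `classify_ref` is a hypothetical function, and it covers just the three namespaces mentioned here.

```python
# Prefixes named in the footnote, mapped to the kind of name they denote.
PREFIXES = (
    ("refs/heads/", "branch"),
    ("refs/remotes/", "remote-tracking"),
    ("refs/tags/", "tag"),
)


def classify_ref(full_name):
    """Return (kind, short name) for a full ref name, based on its prefix."""
    for prefix, kind in PREFIXES:
        if full_name.startswith(prefix):
            return kind, full_name[len(prefix):]
    return "other", full_name


print(classify_ref("refs/heads/master"))
print(classify_ref("refs/remotes/origin/master"))
print(classify_ref("refs/tags/v1.0"))  # v1.0 is a made-up tag name
```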

[3] The technical definition of a merge commit is that it is one with at least two parents. You can therefore create a merge commit with three or more parents, too, but there's rarely a reason to do that. I should also mention that it's possible to create a new commit with no parent. Except for the very first commit (the one I label A in the sideways graph drawings), you won't do that in normal practice either.
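Put as code, the definition depends only on how many parents a commit has. This is a sketch with made-up labels ("root", "ordinary", "merge"); `commit_kind` is a hypothetical helper.

```python
def commit_kind(parent_ids):
    """Classify a commit by its number of parents."""
    count = len(parent_ids)
    if count == 0:
        return "root"       # no parent, like the very first commit
    if count == 1:
        return "ordinary"   # the typical case
    return "merge"          # two or more parents


print(commit_kind([]))
print(commit_kind(["7a761b6"]))
print(commit_kind(["7a761b6", "d97b"]))
```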

