Disclaimer: this page is a Chinese-English side-by-side translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/13040958/

Date: 2020-09-10 14:48:45  Source: igfitidea

Merge two Git repositories without breaking file history

Tags: git, git-subtree

Asked by Eric Lee

I need to merge two Git repositories into a brand new, third repository. I've found many descriptions of how to do this using a subtree merge (for example, Jakub Narębski's answer on "How do you merge two Git repositories?"), and following those instructions mostly works, except that when I commit the subtree merge, all of the files from the old repositories are recorded as newly added files. I can see the commit history from the old repositories when I do git log, but if I do git log <file> it shows only one commit for that file: the subtree merge. Judging from the comments on the above answer, I'm not alone in seeing this problem, but I've found no published solutions for it.

Is there any way to merge repositories and leave individual file history intact?

Answer by Eric Lee

It turns out that the answer is much simpler if you're simply trying to glue two repositories together and make it look like it was that way all along rather than manage an external dependency. You simply need to add remotes to your old repos, merge them to your new master, move the files and folders to a subdirectory, commit the move, and repeat for all additional repos. Submodules, subtree merges, and fancy rebases are intended to solve a slightly different problem and aren't suitable for what I was trying to do.

Here's an example Powershell script to glue two repositories together:

# Assume the current directory is where we want the new repository to be created
# Create the new repository
git init

# Before we do a merge, we have to have an initial commit, so we'll make a dummy commit
git commit --allow-empty -m "Initial dummy commit"

# Add a remote for and fetch the old repo
git remote add -f old_a <OldA repo URL>

# Merge the files from old_a/master into new/master
git merge old_a/master --allow-unrelated-histories

# Move the old_a repo files and folders into a subdirectory so they don't collide with the other repo coming later
mkdir old_a
dir -exclude old_a | %{git mv $_.Name old_a}

# Commit the move
git commit -m "Move old_a files into subdir"

# Do the same thing for old_b
git remote add -f old_b <OldB repo URL>
git merge old_b/master --allow-unrelated-histories
mkdir old_b
dir -exclude old_a,old_b | %{git mv $_.Name old_b}
git commit -m "Move old_b files into subdir"

Obviously you could instead merge old_b into old_a (which becomes the new combined repo) if you'd rather do that – modify the script to suit.
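
For readers who are not on PowerShell, the same flow can be sketched in POSIX shell. This is only a sketch under stated assumptions: the two source repos (src_a, src_b), the helper functions make_repo and glue, and the file names are all hypothetical stand-ins built in a temporary directory, and the branch is assumed to be called master as in the script above. It ends by counting the commits that a path-limited git log reports for a moved file with and without --follow, since a plain git log <file> stops at the move commit.

```shell
#!/bin/sh
set -eu

# Throwaway identity and workspace so the demo does not depend on local config
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)

# Hypothetical helper: build a tiny source repo with two commits to one file
make_repo() {  # usage: make_repo <dir> <file>
    git init -q "$1"
    git -C "$1" symbolic-ref HEAD refs/heads/master
    echo one > "$1/$2"
    git -C "$1" add "$2"
    git -C "$1" commit -q -m "$2: first"
    echo two >> "$1/$2"
    git -C "$1" commit -q -am "$2: second"
}
make_repo "$work/src_a" a.txt
make_repo "$work/src_b" b.txt

# New combined repository with an initial dummy commit
git init -q "$work/new"
cd "$work/new"
git symbolic-ref HEAD refs/heads/master
git commit -q --allow-empty -m "Initial dummy commit"

# Hypothetical helper: merge a remote, then move its top-level entries into <name>/
glue() {  # usage: glue <name> <url>
    git remote add -f "$1" "$2" >/dev/null 2>&1
    git merge -q --no-edit --allow-unrelated-histories "$1/master"
    mkdir "$1"
    # Move every tracked top-level entry except the already-glued subdirectories
    git ls-tree --name-only HEAD | grep -v -x -e old_a -e old_b |
        xargs -I{} git mv {} "$1/"
    git commit -q -m "Move $1 files into subdir"
}
glue old_a "$work/src_a"
glue old_b "$work/src_b"

# Count commits reported for the moved file, without and with rename following
plain=$(git log --oneline -- old_a/a.txt | wc -l | tr -d ' ')
followed=$(git log --oneline --follow -- old_a/a.txt | wc -l | tr -d ' ')
echo "plain=$plain followed=$followed"
```

With --follow, the log should continue past the move commit into the commits that came from the old repository, so the second count is expected to be larger than the first.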

If you want to bring over in-progress feature branches as well, use this:

# Bring over a feature branch from one of the old repos
git checkout -b feature-in-progress
git merge -s recursive -Xsubtree=old_a old_a/feature-in-progress

That's the only non-obvious part of the process - that's not a subtree merge, but rather an argument to the normal recursive merge that tells Git that we renamed the target and that helps Git line everything up correctly.
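
To see what the -Xsubtree option does here, the following sketch replays the situation on throwaway repositories. Everything in it is an assumption made for illustration: the source repo src_a, its file a.txt, and the temporary paths are hypothetical, and the branch names mirror the ones used above. The old repo's master is merged and moved under old_a/, then the feature branch, whose files still sit at the top level, is merged with -Xsubtree=old_a.

```shell
#!/bin/sh
set -eu

# Throwaway identity and workspace so the demo does not depend on local config
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)

# Hypothetical old repo: master plus a feature branch that edits a.txt
git init -q "$work/src_a"
cd "$work/src_a"
git symbolic-ref HEAD refs/heads/master
printf 'line 1\nline 2\n' > a.txt
git add a.txt
git commit -q -m "a.txt on master"
git checkout -q -b feature-in-progress
echo "feature work" >> a.txt
git commit -q -am "a.txt on feature branch"
git checkout -q master

# Combined repo: merge master, then move its files under old_a/
git init -q "$work/new"
cd "$work/new"
git symbolic-ref HEAD refs/heads/master
git commit -q --allow-empty -m "Initial dummy commit"
git remote add -f old_a "$work/src_a" >/dev/null 2>&1
git merge -q --no-edit --allow-unrelated-histories old_a/master
mkdir old_a
git mv a.txt old_a/
git commit -q -m "Move old_a files into subdir"

# Bring the feature branch over; -Xsubtree=old_a shifts its paths under old_a/
git checkout -q -b feature-in-progress
git merge -q --no-edit -s recursive -Xsubtree=old_a old_a/feature-in-progress

# List the tracked files after the merge
git ls-files
```

If the option does its job, the feature branch's change ends up in old_a/a.txt rather than in a second copy of a.txt at the top level.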

I wrote up a slightly more detailed explanation here.

Answer by Flimm

Here's a way that doesn't rewrite any history, so all commit IDs will remain valid. The end-result is that the second repo's files will end up in a subdirectory.

  1. Add the second repo as a remote:

    cd firstgitrepo/
    git remote add secondrepo username@servername:andsoon
    
  2. Make sure that you've downloaded all of the secondrepo's commits:

    git fetch secondrepo
    
  3. Create a local branch from the second repo's branch:

    git branch branchfromsecondrepo secondrepo/master
    
  4. Move all its files into a subdirectory:

    git checkout branchfromsecondrepo
    mkdir subdir/
    git ls-tree -z --name-only HEAD | xargs -0 -I {} git mv {} subdir/
    git commit -m "Moved files to subdir/"
    
  5. Merge the second branch into the first repo's master branch:

    git checkout master
    git merge --allow-unrelated-histories branchfromsecondrepo
    

Your repository will have more than one root commit, but that shouldn't pose a problem.
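
The five steps can be replayed on two throwaway repositories to check the root-commit claim. This is a minimal sketch, not part of the original answer: the repos first and second, the helper make_repo, and the file names are hypothetical, and a local path stands in for the username@servername remote. Root commits are listed with git rev-list --max-parents=0.

```shell
#!/bin/sh
set -eu

# Throwaway identity and workspace so the demo does not depend on local config
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)

# Hypothetical helper: a repo with a single commit on master
make_repo() {  # usage: make_repo <dir> <file>
    git init -q "$1"
    git -C "$1" symbolic-ref HEAD refs/heads/master
    echo content > "$1/$2"
    git -C "$1" add "$2"
    git -C "$1" commit -q -m "Add $2"
}
make_repo "$work/first" first.txt
make_repo "$work/second" second.txt

cd "$work/first"
# Steps 1 and 2: add the second repo as a remote and fetch it
git remote add secondrepo "$work/second"
git fetch -q secondrepo
# Step 3: local branch from the second repo's master
git branch -q branchfromsecondrepo secondrepo/master
# Step 4: move all of its files into a subdirectory
git checkout -q branchfromsecondrepo
mkdir subdir
git ls-tree -z --name-only HEAD | xargs -0 -I {} git mv {} subdir/
git commit -q -m "Moved files to subdir/"
# Step 5: merge into the first repo's master
git checkout -q master
git merge -q --no-edit --allow-unrelated-histories branchfromsecondrepo

# Count commits that have no parent (root commits) reachable from HEAD
roots=$(git rev-list --max-parents=0 HEAD | wc -l | tr -d ' ')
echo "root commits: $roots"
```

Each source repository contributes its own parentless first commit, which is why the merged history has one root per original repo.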

Answer by Fredrik Erlandsson

I turned the solution from @Flimm into a git alias like this (added to my ~/.gitconfig):

[alias]
 mergeRepo = "!mergeRepo() { \
  [ $# -ne 3 ] && echo \"Three parameters required, <remote URI> <new branch> <new dir>\" && exit 1; \
  git remote add newRepo $1; \
  git fetch newRepo; \
  git branch \"$2\" newRepo/master; \
  git checkout \"$2\"; \
  mkdir -vp \"${GIT_PREFIX}$3\"; \
  git ls-tree -z --name-only HEAD | xargs -0 -I {} git mv {} \"${GIT_PREFIX}$3\"/; \
  git commit -m \"Moved files to '${GIT_PREFIX}$3'\"; \
  git checkout master; git merge --allow-unrelated-histories --no-edit -s recursive -X no-renames \"$2\"; \
  git branch -D \"$2\"; git remote remove newRepo; \
}; \
mergeRepo"

Answer by Adam Dymitruk

Please have a look at using

git rebase --root --preserve-merges --onto

to link two histories early on in their lives (newer Git versions have removed --preserve-merges; --rebase-merges is its replacement).

If you have paths that overlap, fix them up with

git filter-branch --index-filter

When you use log, ensure you "find copies harder" with

git log -CC

That way you will find any movements of files in the path.

Answer by abautista

A few years have passed and there are well-established, up-voted solutions, but I want to share mine because it is a bit different: I wanted to merge two remote repositories into a new one without deleting the history from the previous repositories.

  1. Create a new repository on GitHub.

  2. Clone the newly created repo and add the old repository as a remote.

    git clone https://github.com/alexbr9007/Test.git
    cd Test
    git remote add OldRepo https://github.com/alexbr9007/Django-React.git
    git remote -v

  3. Fetch everything from the old repo so that its branches show up as remote-tracking branches.

    git fetch OldRepo
    git branch -a

  4. On the master branch, do a merge to combine the old repo with the newly created one.

    git merge remotes/OldRepo/master --allow-unrelated-histories

  5. Create a new folder to hold the content that came from OldRepo and move its files into this new folder.

  6. Lastly, you can upload the files from the combined repos and safely delete the OldRepo from GitHub.
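
Step 5 comes without commands in the answer. One way to do it, shown here as a sketch on a throwaway repository rather than as the author's method, is to let git mv do the move so that Git records it as a rename. The folder name OldRepoFiles and the files manage.py and frontend/index.js are hypothetical; if the new repository already has files of its own at the top level, they would need to be filtered out of the list before moving.

```shell
#!/bin/sh
set -eu

# Throwaway identity and workspace so the demo does not depend on local config
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)

# Stand-in for the clone right after step 4: OldRepo's files sit at the top level
git init -q "$work/Test"
cd "$work/Test"
echo "x" > manage.py
mkdir frontend
echo "y" > frontend/index.js
git add .
git commit -q -m "State after merging OldRepo"

# Step 5: create the folder and move every tracked top-level entry into it
mkdir OldRepoFiles
git ls-tree --name-only HEAD | xargs -I{} git mv {} OldRepoFiles/
git commit -q -m "Move OldRepo files into OldRepoFiles/"

# Show where the files ended up
git ls-files
```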

Hope this can be useful for anyone dealing with merging remote repositories.

Answer by Andrey Izman

This function clones a remote repo into a subdirectory of the local repo:

function git-add-repo
{
    # $1: URL of the repo to add, $2: target subdirectory (trailing slash stripped)
    repo="$1"
    dir="$(echo "$2" | sed 's/\/$//')"
    path="$(pwd)"

    tmp="$(mktemp -d)"
    remote="$(echo "$tmp" | sed 's/\///g'| sed 's/\./_/g')"

    git clone "$repo" "$tmp"
    cd "$tmp"

    # Rewrite every commit of the clone so that all paths live under $dir/
    git filter-branch --index-filter '
        git ls-files -s |
        sed "s,\t,&'"$dir"'/," |
        GIT_INDEX_FILE="$GIT_INDEX_FILE.new" git update-index --index-info &&
        mv "$GIT_INDEX_FILE.new" "$GIT_INDEX_FILE"
    ' HEAD

    cd "$path"
    # -f fetches from the temporary clone as soon as the remote is added
    git remote add -f "$remote" "file://$tmp/.git"
    git merge --allow-unrelated-histories -m "Merge repo $repo into master" --edit "$remote/master"
    git remote remove "$remote"
    rm -rf "$tmp"
}

How to use:

cd current/package
git-add-repo https://github.com/example/example dir/to/save

Note: this script rewrites commits but preserves all authors and dates. The rewritten commits therefore have different hashes, and if you push them to a remote server that already has the originals you will need a force push, which rewrites the commits on the server as well. So please make backups before you run it.

Profit!

Answer by AnoopGoudar

Follow these steps to embed one repo into another, ending up with a single git history by merging both histories.

  1. Clone both the repos you want to merge.

    git clone [email protected]:user/parent-repo.git
    git clone [email protected]:user/child-repo.git

  2. Go to the child repo.

    cd child-repo/

  3. Run the command below, replacing the path my/new/subdir (3 occurrences) with the directory structure where you want the child repo to live.

    git filter-branch --prune-empty --tree-filter '
    if [ ! -e my/new/subdir ]; then
        mkdir -p my/new/subdir
        git ls-tree --name-only $GIT_COMMIT | xargs -I files mv files my/new/subdir
    fi'

  4. Go to the parent repo.

    cd ../parent-repo/

  5. Add a remote to the parent repo, pointing to the child repo.

    git remote add child-remote ../child-repo/

  6. Fetch the child repo.

    git fetch child-remote

  7. Merge the histories.

    git merge --allow-unrelated-histories child-remote/master

If you check the git log in the parent repo now, it should have the child repo's commits merged in. You can also see the tag indicating the commit source.

The article below helped me embed one repo into another with a single git history:

http://ericlathrop.com/2014/01/combining-git-repositories/

Hope this helps. Happy Coding!

Answer by x-yuri

Say you want to merge repository a into b (I'm assuming they're located alongside one another):

cd b
git remote add a ../a
git fetch a
git merge --allow-unrelated-histories a/master
git remote remove a

In case you want to put a into a subdirectory, do the following before the commands above:

cd a
git filter-repo --to-subdirectory-filter a
cd ..

For this you need git-filter-repo installed (filter-branch is discouraged).

An example of merging 2 big repositories, putting one of them into a subdirectory: https://gist.github.com/x-yuri/9890ab1079cf4357d6f269d073fd9731

More on it here.