Combining multiple git repositories

Notice: this page is a Chinese-English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/277029/

Date: 2020-09-10 05:56:10  Source: igfitidea

git

Asked by Will Robertson

Let's say I've got a setup that looks something like

phd/code/
phd/figures/
phd/thesis/

For historical reasons, these all have their own git repositories. But I'd like to combine them into a single one to simplify things a little. For example, right now I might make two sets of changes and have to do something like

cd phd/code
git commit 
cd ../figures
git commit

It would (now) be nice to just perform

cd phd
git commit

There seem to be a couple of ways of doing this using submodules or pulling from my sub-repositories, but that's a little more complex than I'm looking for. At the very least, I'd be happy with

cd phd
git init
git add [[everything that's already in my other repositories]]

but that doesn't seem like a one-liner. Is there anything in git that can help me out?

Answered by MiniQuark

Here's a solution I gave here:

  1. First do a complete backup of your phd directory: I don't want to be held responsible for your losing years of hard work! ;-)

    $ cp -r phd phd-backup
    
  2. Move the content of phd/code to phd/code/code, and fix the history so that it looks like it has always been there (this uses git's filter-branch command):

    $ cd phd/code
    $ git filter-branch --index-filter \
        'git ls-files -s | sed "s#\t#&code/#" |
         GIT_INDEX_FILE=$GIT_INDEX_FILE.new \
         git update-index --index-info &&
         mv $GIT_INDEX_FILE.new $GIT_INDEX_FILE' HEAD
    
  3. Same for the content of phd/figures and phd/thesis (just replace code with figures and thesis).

    Now your directory structure should look like this:

    phd
      |_code
      |    |_.git
      |    |_code
      |         |_(your code...)
      |_figures
      |    |_.git
      |    |_figures
      |         |_(your figures...)
      |_thesis
           |_.git
           |_thesis
                |_(your thesis...)
    
  4. Then create a git repository in the root directory, pull everything into it and remove the old repositories:

    $ cd phd
    $ git init
    
    $ git pull code
    $ rm -rf code/code
    $ rm -rf code/.git
    
    $ git pull figures --allow-unrelated-histories
    $ rm -rf figures/figures
    $ rm -rf figures/.git
    
    $ git pull thesis --allow-unrelated-histories
    $ rm -rf thesis/thesis
    $ rm -rf thesis/.git
    

    Finally, you should now have what you wanted:

    phd
      |_.git
      |_code
      |    |_(your code...)
      |_figures
      |    |_(your figures...)
      |_thesis
           |_(your thesis...)
    
One nice side to this procedure is that it will leave non-versioned files and directories in place.

Hope this helps.


Just one word of warning though: if your code directory already has a code subdirectory or file, things might go very wrong (same for figures and thesis, of course). If that's the case, just rename that directory or file before going through this whole procedure:

$ cd phd/code
$ git mv code code-repository-migration
$ git commit -m "preparing the code directory for migration"

And when the procedure is finished, add this final step:

$ cd phd
$ git mv code/code-repository-migration code/code
$ git commit -m "final step for code directory migration"

Of course, if the code subdirectory or file is not versioned, just use mv instead of git mv, and forget about the git commits.
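
MiniQuark's steps can be rehearsed on throwaway repositories before running them on real data. The following is a minimal sketch, not part of the original answer: the scratch directory, the one-file toy repositories and the separate `combined` directory are assumptions made for illustration (step 4 above builds the combined repository in place instead), and a literal tab is handed to sed so the script does not depend on sed understanding `\t`.

```shell
set -eu
# Throwaway identity so commits work on a machine without git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
# Newer git versions print a warning and pause before filter-branch runs; this skips it.
export FILTER_BRANCH_SQUELCH_WARNING=1

work=$(mktemp -d)      # hypothetical scratch location
tab=$(printf '\t')     # literal tab character for the sed expression

# Toy stand-ins for phd/code, phd/figures and phd/thesis.
for name in code figures thesis; do
    mkdir -p "$work/phd/$name"
    cd "$work/phd/$name"
    git init -q
    echo "first $name file" > "$name.txt"
    git add "$name.txt"
    git commit -q -m "add $name.txt"
    # Steps 2 and 3: rewrite history so every path gains a "$name/" prefix.
    git filter-branch --index-filter \
        'git ls-files -s | sed "s#'"$tab"'#&'"$name"'/#" |
         GIT_INDEX_FILE=$GIT_INDEX_FILE.new git update-index --index-info &&
         mv "$GIT_INDEX_FILE.new" "$GIT_INDEX_FILE"' HEAD >/dev/null 2>&1
done

# Step 4, done in a separate directory so that nothing has to be deleted.
mkdir "$work/combined"
cd "$work/combined"
git init -q
for name in code figures thesis; do
    git pull -q --no-rebase --no-edit --allow-unrelated-histories "../phd/$name"
done

git ls-files           # lists the files of all three repositories
```

On recent git versions a plain `git pull` of an unrelated history may stop and ask how to reconcile the branches; `--no-rebase` asks for a merge explicitly, and `--no-edit` accepts the default merge message.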

Answered by Aristotle Pagaltzis

git-stitch-repo will process the output of git-fast-export --all --date-order on the git repositories given on the command line, and create a stream suitable for git-fast-import that will create a new repository containing all the commits in a new commit tree that respects the history of all the source repositories.

Answered by imz -- Ivan Zakharyaschev

Perhaps you could simply (similarly to the previous answer, but using simpler commands) make, in each of the separate old repositories, a commit that moves the content into a suitably named subdirectory, e.g.:

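$ cd phd/code
$ mkdir code
# A bare * would also match the new code/ subdir, so list the tracked top-level entries instead
# (a sketch that assumes ordinary file names and that no tracked entry is itself named "code"):
$ git ls-tree --name-only HEAD | while IFS= read -r f; do git mv "$f" code/; done
$ git commit -m "preparing the code directory for migration"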

and then merge the three separate repos into a new one, by doing something like:

$ cd ../..
$ mkdir phd.all
$ cd phd.all
$ git init
$ git pull ../phd/code
...

This way you'll keep your histories, but carry on with a single repo.
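
The approach above can be tried end to end on toy repositories. This is only a sketch under stated assumptions: the scratch directory and the one-file repositories are made up for illustration, the `phd.all` name is taken from the answer, and the `--no-rebase`, `--no-edit` and `--allow-unrelated-histories` flags are additions that recent git versions may need when pulling unrelated histories.

```shell
set -eu
# Throwaway identity so commits work on a machine without git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)      # hypothetical scratch location

for name in code figures thesis; do
    mkdir -p "$work/phd/$name"
    cd "$work/phd/$name"
    git init -q
    echo "first $name file" > "$name.txt"
    git add "$name.txt"
    git commit -q -m "add $name.txt"

    # Move every tracked top-level entry into a subdirectory named after the repo.
    mkdir "$name"
    git ls-tree --name-only HEAD | while IFS= read -r entry; do
        git mv "$entry" "$name/"
    done
    git commit -q -m "preparing the $name directory for migration"
done

# Merge the three prepared repositories into one new repository.
mkdir "$work/phd.all"
cd "$work/phd.all"
git init -q
for name in code figures thesis; do
    git pull -q --no-rebase --no-edit --allow-unrelated-histories "../phd/$name"
done

git ls-files           # lists the files of all three repositories
```

Because the move is an ordinary commit rather than a history rewrite, `git log` on a single file needs `--follow` to show changes made before the move.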

Answered by Leif Gruenwoldt

You could try the subtree merge strategy. It will let you merge repo B into repo A. The advantage over git-filter-branch is that it doesn't require you to rewrite the history of repo A (breaking SHA1 sums).
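
A minimal sketch of how such a merge might look, following the commonly documented recipe of an `ours` merge followed by `git read-tree --prefix`. The repository names A and B come from the answer; the scratch directory, the file names and the `B/` target directory are assumptions made for illustration.

```shell
set -eu
# Throwaway identity so commits work on a machine without git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)      # hypothetical scratch location

# Repo A (whose history must stay untouched) and repo B (to be merged in).
for name in A B; do
    mkdir "$work/$name"
    cd "$work/$name"
    git init -q
    echo "from $name" > "$name.txt"
    git add "$name.txt"
    git commit -q -m "initial commit of $name"
done

cd "$work/A"
before=$(git rev-parse HEAD)

git fetch -q ../B HEAD
b_tip=$(git rev-parse FETCH_HEAD)

# Start a merge with B's history but keep A's tree unchanged for now ...
git merge -q -s ours --no-commit --allow-unrelated-histories "$b_tip"
# ... then read B's tree into the subdirectory B/ and conclude the merge.
git read-tree --prefix=B/ -u "$b_tip"
git commit -q -m "Merge B as subdirectory B/"

git ls-files           # A.txt and B/B.txt
```

The existing commits of A keep their SHA-1s: the old tip of A is simply the first parent of the new merge commit.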

Answered by Gareth

The git-filter-branch solution works well, but note that if your git repo comes from an SVN import it may fail with a message like:

Rewrite 422a38a0e9d2c61098b98e6c56213ac83b7bacc2 (1/42)mv: cannot stat `/home/.../wikis/nodows/.git-rewrite/t/../index.new': No such file or directory

In this case you need to exclude the initial revision from the filter-branch, i.e. change the HEAD at the end to [SHA of 2nd revision]..HEAD. See:

http://www.git.code-experiments.com/blog/2010/03/merging-git-repositories.html

Answered by MichK

@MiniQuark's solution helped me a lot, but unfortunately it doesn't take into account the tags in the source repositories (at least in my case). Below is my improvement to @MiniQuark's answer.

  1. First create a directory that will contain the composed repo and the merged repos, and create a directory for each merged one.

    $ mkdir new_phd
    $ mkdir new_phd/code
    $ mkdir new_phd/figures
    $ mkdir new_phd/thesis

  2. Do a pull of each repository and fetch all tags. (Instructions are shown only for the code sub-directory.)

    $ cd new_phd/code
    $ git init
    $ git pull ../../original_phd/code master
    $ git fetch ../../original_phd/code refs/tags/*:refs/tags/*

  3. (This is an improvement to point 2 in MiniQuark's answer.) Move the content of new_phd/code to new_phd/code/code and add the code_ prefix before each tag:

    $ git filter-branch --index-filter 'git ls-files -s | sed "s-\t\"*-&code/-" | GIT_INDEX_FILE=$GIT_INDEX_FILE.new git update-index --index-info && mv $GIT_INDEX_FILE.new $GIT_INDEX_FILE' --tag-name-filter 'sed "s-.*-code_&-"' HEAD

  4. After doing so there will be twice as many tags as there were before running filter-branch. The old tags remain in the repo and new tags with the code_ prefix are added.

    $ git tag
    mytag1
    code_mytag1

    Remove old tags manually:

    $ ls .git/refs/tags/* | grep -v "/code_" | xargs rm

    Repeat points 2, 3 and 4 for the other subdirectories.

  5. Now we have the directory structure shown in point 3 of @MiniQuark's answer.

  6. Do as in point 4 of MiniQuark's answer, but after doing a pull and before removing the .git dir, fetch the tags:

    $ git fetch catalog refs/tags/*:refs/tags/*

    Continue..
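
Point 4 removes the old tags by deleting files under .git/refs/tags. Tags can also be stored in .git/packed-refs, where that `ls` would not see them, so one possible alternative (an assumption, not part of the steps above) is to let git delete the unprefixed tags itself. The sketch below uses a hypothetical scratch repository with the two tag names from point 4.

```shell
set -eu
# Throwaway identity so commits work on a machine without git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)      # hypothetical scratch location
cd "$work"
git init -q
echo demo > file.txt
git add file.txt
git commit -q -m "initial commit"

# Simulate the state after filter-branch: the old tag plus the prefixed copy.
git tag mytag1
git tag code_mytag1

# Delete every tag that lacks the code_ prefix, whether it is loose or packed.
git tag -l | grep -v '^code_' | xargs git tag -d

git tag -l             # only the prefixed tag remains
```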

This is just another solution. Hope it helps someone, it helped me :)

Answered by robinst

git-stitch-repo from Aristotle Pagaltzis' answer only works for repositories with simple, linear history.

MiniQuark's answer works for all repositories, but it does not handle tags and branches.

I created a program that works the same way as MiniQuark describes, but it uses one merge commit (with N parents) and also recreates all tags and branches to point to these merge commits.

See the git-merge-repos repository for examples of how to use it.

Answered by user3622356

Actually, git-stitch-repo now supports branches and tags, including annotated tags (I found a bug, reported it, and it got fixed). Where I found it useful is with tags. Tags are attached to commits, and some of the solutions (like Eric Lee's approach) fail to deal with them: if you create a branch off an imported tag, it undoes any git merges/moves and sends you back to a state where the consolidated repository looks nearly identical to the repository the tag came from. There are also issues if you use the same tag across multiple repositories that you merged/consolidated. For example, suppose repos A and B both have the tag rel_1.0, and you merge repo A and repo B into repo AB. Since the rel_1.0 tags are on two different commits (one for A and one for B), which tag will be visible in AB? Either the tag from the imported repo A or the one from the imported repo B, but not both.

git-stitch-repo helps to address that problem by creating rel_1.0-A and rel_1.0-B tags. You may not be able to check out the rel_1.0 tag and expect both, but at least you can see both, and theoretically you can merge them into a common local branch and then create a rel_1.0 tag on that merged branch (assuming you just merge and do not change source code). It's better to work with branches, as you can merge like branches from each repo into local branches (dev-a and dev-b can be merged into a local dev branch, which can then be pushed to origin).

Answered by Giuseppe Monteleone

I have created a tool that performs this task. The method used is similar (internally it does things like --filter-branch) but is friendlier. It is licensed under GPL 2.0.

http://github.com/geppo12/GitCombineRepo

Answered by Patrick_O

The sequence you suggested

git init
git add *
git commit -a -m "import everything"

will work, but you will lose your commit history.
