How do I synchronise two remote Git repositories?

Notice: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverFlow original: http://stackoverflow.com/questions/15056327/

Time: 2020-09-10 15:35:29  Source: igfitidea

How do I synchronise two remote Git repositories?

git version-control github dvcs

Asked by Danny Tuppeny

I have two repository urls, and I want to synchronise them such that they both contain the same thing. In Mercurial, what I'm trying to do would be:

hg pull {repo1}
hg pull {repo2}
hg push -f {repo1}
hg push -f {repo2}

This will result in two heads in both repos. (I know it's not common to have two heads, but I'm doing this for synchronisation and it needs to be non-interactive. The heads will be merged manually from one of the repos and then the sync run again.)

I'd like to do the same thing in Git. That is, with no user interaction, get all of the changes into both repos, with multiple branches/heads/whatever to be merged later. I'm trying to do this using URLs in the commands, rather than adding remotes(?), as there could be a number of repos involved, and having aliases for them all will just make my script more complicated.

I'm currently cloning the repo using git clone --bare {repo1}; however, I'm struggling to "update" it. I've tried git fetch {repo1} but that doesn't seem to pull my changes down; git log still doesn't show the changeset that has been added in repo1.

I also tried using --mirror in my push and clone, but that seemed to remove changesets from repo2 that didn't exist locally, whereas I need to keep changes from both repos :/

What's the best way to do this?

Edit: To make it a little clearer what I'm trying to do...

I have two repositories (e.g. BitBucket and GitHub) and want people to be able to push to either (ultimately, one will be Git, one will be Mercurial, but let's assume they're both Git for now to simplify things). I need to be able to run a script that will "sync" the two repos so that they both contain both sets of changes, which may require merging manually later.

Eventually, this means I can just interact with one of the repos (e.g. the Mercurial one), and my script will periodically pull in Git changes which I can merge in, and then they'll be pushed back.

In Mercurial this is trivial! I just pull from both repos, and push with -f/--force to allow pushing multiple heads. Then anybody can clone one of the repos, merge the heads, and push back. I want to know how to do the closest similar thing in Git. It must be 100% non-interactive, and must keep both repos in a state where the process can be repeated indefinitely (that means no rewriting history/changing changesets etc).

Answered by Eevee

Git branches do not have "heads" in the Mercurial sense. There is only one thing called HEAD, and it's effectively a symlink to the commit you currently have checked out. In the case of hosted repositories like GitHub, there is no commit checked out; there's just the repository history itself. (This is called a "bare" repo.)

The reason for this difference is that Git branch names are completely arbitrary; they don't have to match between copies of a repository, and you can create and destroy them on a whim.[1] Git branches are like Python variable names, which can be shuffled around and stuck to any value as you like; Mercurial branches are like C variables, which refer to fixed preallocated memory locations you then fill with data.

So when you pull in Mercurial, you have two histories for the same branch, because the branch name is a fixed meaningful thing in both repositories. The leaf of each history is a "head", and you'd normally merge them to create a single head.

But in Git, fetching a remote branch doesn't actually affect your branch at all. If you fetch the master branch from origin, it just goes into a branch called origin/master.[2] git pull origin master is just thin sugar for two steps: fetching the remote branch into origin/master, and then merging that other branch into your current branch. But they don't have to have the same name; your branch could be called development or trunk or whatever else. You can pull or merge any other branch into it, and you can push it to any other branch. Git doesn't care.
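
The two steps can be run separately to see what each one does. The following is only a sketch: throwaway local repositories stand in for a hosted remote, and the names alice, bob and trunk are made up for the illustration.

```shell
#!/bin/sh
# Sketch only: local throwaway repositories stand in for a hosted remote.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"

# A bare repository plays the role of the hosted remote.
git init -q --bare remote.git
git -C remote.git symbolic-ref HEAD refs/heads/master

# "alice" publishes the first commit on master.
git init -q alice
git -C alice symbolic-ref HEAD refs/heads/master
git -C alice commit -q --allow-empty -m "first"
git -C alice push -q "$work/remote.git" master

# "bob" clones it; his local branch is deliberately named trunk, not master.
git clone -q remote.git bob
git -C bob checkout -q -b trunk

# alice publishes a second commit.
git -C alice commit -q --allow-empty -m "second"
git -C alice push -q "$work/remote.git" master

# Step 1: fetch only moves the remote-tracking ref origin/master.
before=$(git -C bob rev-parse trunk)
git -C bob fetch -q origin master
after_fetch=$(git -C bob rev-parse trunk)
tracking=$(git -C bob rev-parse origin/master)

# Step 2: merging origin/master is what actually moves the local branch.
git -C bob merge -q origin/master
after_merge=$(git -C bob rev-parse trunk)

echo "trunk before fetch: $before"
echo "trunk after fetch:  $after_fetch"
echo "origin/master:      $tracking"
echo "trunk after merge:  $after_merge"
```

The local branch only changes in the second step, and its name never has to match the remote's.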

Which brings me back to your problem: you can't push a "second" branch head to a remote Git repository, because the concept doesn't exist. You could push to branches with mangled names (bitbucket_master?), but as far as I'm aware, you can't update a remote's remotes remotely.
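
For what it's worth, the mangled-name idea can be done with plain URLs and explicit refspecs, without configuring any remotes. This is a rough sketch under assumptions: the two bare repositories are local stand-ins for the hosted URLs, only master is handled, and github_master / bitbucket_master are hypothetical branch names.

```shell
#!/bin/sh
# Sketch only: exchange each side's master under a mangled branch name.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"

# Two bare repositories stand in for the hosted URLs; each gets its own commit.
for name in github bitbucket; do
  git init -q --bare "$name.git"
  git init -q "$name-dev"
  git -C "$name-dev" symbolic-ref HEAD refs/heads/master
  git -C "$name-dev" commit -q --allow-empty -m "work pushed to $name"
  git -C "$name-dev" push -q "$work/$name.git" master
done
GITHUB_URL=$work/github.git
BITBUCKET_URL=$work/bitbucket.git

# A bare staging repository that the sync script owns.
git init -q --bare sync.git
cd sync.git

# Fetch each master by URL into a branch named after its source.
git fetch -q "$GITHUB_URL" '+refs/heads/master:refs/heads/github_master'
git fetch -q "$BITBUCKET_URL" '+refs/heads/master:refs/heads/bitbucket_master'

# Push each copy to the other side; neither side's own master is touched.
git push -q "$GITHUB_URL" bitbucket_master:refs/heads/bitbucket_master
git push -q "$BITBUCKET_URL" github_master:refs/heads/github_master
```

Afterwards each repository carries the other's master as an ordinary branch that anyone can merge by hand. The pushes are not forced, so they keep working on later runs as long as each master only moves forward.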

I don't think your plan makes a lot of sense, though, since with unmerged branches exposed to both repositories, you'd either have to merge them both, or you'd merge one and then mirror it on top of the other... in which case you left the second repository in a useless state for no reason.

Is there a reason you can't just do this:

  1. Pick a repository to be canonical; I assume BitBucket. Clone it. It becomes origin.

  2. Add the other repository as a remote called, say, github.

  3. Have a simple script periodically fetch both remotes and attempt to merge the github branch(es) into the origin branches. If the merge fails, abort and send you an email or whatever. If the merge is trivial, push the result to both remotes.
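
The three steps above might look roughly like this. It is a sketch under assumptions, not a finished tool: local bare repositories stand in for the two hosts, only master is synced, sync_branch and add_commit are made-up helper names, and the "send an email" part is reduced to a message on stderr.

```shell
#!/bin/sh
# Sketch only: fetch both remotes, merge, push to both, abort on conflict.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"

# Stand-ins for the two hosted repositories, sharing one base commit.
git init -q --bare bitbucket.git
git -C bitbucket.git symbolic-ref HEAD refs/heads/master
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
echo base > seed/base.txt
git -C seed add base.txt
git -C seed commit -q -m "base"
git -C seed push -q "$work/bitbucket.git" master
git clone -q --bare bitbucket.git github.git

# Simulate someone pushing one commit to each side (different files).
add_commit() {  # usage: add_commit <repo-url> <file-name>
  rm -rf tmp
  git clone -q "$1" tmp
  echo "$2" > "tmp/$2"
  git -C tmp add "$2"
  git -C tmp commit -q -m "add $2"
  git -C tmp push -q origin master
}
add_commit "$work/bitbucket.git" from-bitbucket.txt
add_commit "$work/github.git" from-github.txt

# Steps 1 and 2: clone the canonical repo (origin) and add the other remote.
git clone -q bitbucket.git sync
cd sync
git remote add github "$work/github.git"

# Step 3: fetch both, merge github into the origin branch, push or abort.
sync_branch() {
  branch=$1
  git fetch -q origin
  git fetch -q github
  git checkout -q "$branch"
  git merge -q --ff-only "origin/$branch"
  if git merge -q --no-edit "github/$branch"; then
    git push -q origin "$branch"
    git push -q github "$branch"
  else
    git merge --abort
    echo "merge of github/$branch failed; manual merge needed" >&2
    return 1
  fi
}
sync_branch master
```

Because nothing is force-pushed and a conflicting merge is aborted before anything leaves the machine, the script can be run again after the conflict has been resolved by hand.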

Of course, if you just do all your work on feature branches, this all becomes much less of a problem. :)



[1] It gets even better: you can merge together branches from different repositories that have no history whatsoever in common. I've done this to consolidate projects that were started separately; they used different directory structures, so it works fine. GitHub uses a similar trick for its Pages feature: the history of your Pages is stored in a branch called gh-pages that lives in the same repository but has absolutely no history in common with the rest of your project.
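
A small sketch of such a merge follows. Note that Git 2.9 and later refuse to merge unrelated histories unless --allow-unrelated-histories is passed, so the flag is needed on any current Git. The repository and directory names (project-a, project-b) are made up.

```shell
#!/bin/sh
# Sketch only: merge two projects that share no history at all.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"

# Two independent projects, each with its own root commit and directory layout.
for name in project-a project-b; do
  git init -q "$name"
  git -C "$name" symbolic-ref HEAD refs/heads/master
  mkdir "$name/$name-src"
  echo "$name" > "$name/$name-src/README"
  git -C "$name" add .
  git -C "$name" commit -q -m "start $name"
done

# Pull project-b's history into project-a and merge it.
cd project-a
git fetch -q ../project-b master
git merge -q --no-edit --allow-unrelated-histories FETCH_HEAD

# The merged history now has two root commits.
roots=$(git rev-list --count --max-parents=0 HEAD)
echo "root commits: $roots"
```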

[2] This is a white lie. The branch is still called master, but it belongs to the remote called origin, and the slash is the syntax for referring to it. The distinction can matter because Git has no qualms about slashes in branch names, so you could have a local branch named origin/master, and that would shadow the remote branch.
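
The full ref names make the distinction visible. A sketch, again using a throwaway local repository in place of a real remote:

```shell
#!/bin/sh
# Sketch only: show where origin/master and a local "origin/master" branch live.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"

git init -q --bare remote.git
git -C remote.git symbolic-ref HEAD refs/heads/master
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
git -C seed commit -q --allow-empty -m "first"
git -C seed push -q "$work/remote.git" master

git clone -q remote.git clone
cd clone

# The remote-tracking branch and the local branch live in separate namespaces.
tracking=$(git rev-parse --symbolic-full-name origin/master)
local_branch=$(git rev-parse --symbolic-full-name master)
echo "$tracking"
echo "$local_branch"

# A local branch may itself contain a slash, so both refs can coexist.
git branch origin/master
refs=$(git for-each-ref --format='%(refname)' \
  refs/heads/origin/master refs/remotes/origin/master)
echo "$refs"
```

Once both refs exist, the short name origin/master is ambiguous, which is the shadowing the footnote warns about.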

Answered by Bobík

For something similar, I use this simple code, triggered by a webhook in both repositories, to sync the GitLab and Bitbucket master branches:

git pull origin master
git pull gitlab master
git push origin master
git push gitlab master

It probably isn't what the question needs, but it could be helpful for somebody else who needs to sync just one branch.

Answered by yorammi

Here's a tested solution for the issue: http://www.tikalk.com/devops/sync-remote-repositories/

The commands to run:

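#!/bin/bash

# Set these before running:
# REPO_NAME=<repo>.git
# ORIGIN_URL=git@<host>:<project>/$REPO_NAME
# REPO1_URL=git@<host>:<project>/$REPO_NAME

# Start from a fresh bare clone of origin; "git clone --bare" creates $REPO_NAME.
rm -rf "$REPO_NAME"
git clone --bare "$ORIGIN_URL"
cd "$REPO_NAME" || exit 1
# Register repo1 as a fetch mirror, so its refs are fetched directly into refs/.
git remote add --mirror=fetch repo1 "$REPO1_URL"
# Fetch from both repositories, including tags.
git fetch origin --tags ; git fetch repo1 --tags
# Push all branches and tags back to both repositories.
git push origin --all ; git push origin --tags
git push repo1 --all ; git push repo1 --tags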

Answered by derekv

You might not have seen that the fetch did in fact work when you used git clone --mirror --bare, because by default git does not list its remote branches. You can list them with git branch -a.

I don't quite have the syntax worked out for unnamed remotes, but you could automatically add remotes based on some scheme from the URL... In any case, it'll probably work best if you choose a unique and consistent name for each repo, so you can know which changes came from where.

However, you could try something like this:

git clone --bare --mirror --origin thing1 {repo1} repo.git
cd repo.git
git remote add thing2 {repo2}
git fetch thing2
git push thing1 --mirror
git push thing2 --mirror

After this was done, thing1 would have all of thing2's branches available to merge at any time, as remote branches. You can list the remote branches with git branch -a.

On GitHub or Bitbucket, you will not be able to see these remote branches via the web interfaces; however, you can see them if you clone with --mirror, so they do exist.

Answered by adamdunson

Try git reset --hard HEAD after git fetch. However, I'm not sure I understand exactly what your goal is. You will need to cd into the separate repository directories before running the fetch, reset, and push commands.

Answered by it3xl

Like others, I came here because the title of this SO question matches this particular problem exactly.
That was a few years ago. I studied the available answers, but none of them solved my situation.

I went a step further and created several solutions.
My final solution, posted here, is for Git repositories only (sorry, I do not use Mercurial right now, although the body of the question asks about it).

My starting conditions were rather more difficult than those of others here.

  • I can't use Git hooks, because I always have limited access to one or the other remote repository. I consider them a burdensome dependency.
  • The teams on both remote sides are big. They produce a huge number of commits, often on the same Git branches.
  • We need a fast, 24/7 synchronization solution. This drastically decreases the number of possible Git conflicts and converts them into simple local Git merges.
  • It was important that any member of either team could do Git merges from any remote repository, for any branch.
  • Sometimes one remote repository needs to be completely replaced by an empty new repository at another location, and I didn't want to do the initial repository filling manually.
  • CI/CD managing branches should be migrated quite arbitrarily.
  • Not all existing syncing methods handle subtle changes correctly, for example moving a branch back in history.
  • Some methods have a threatening tendency to occasionally delete branches or the entire repository.

And just like the author of this SO question, I also wanted:

  • no user interaction at all; that is, everything should be fully automated.
  • all changes in both repos, for all branches, to be migrated.

I quickly wrote a bash script and just as quickly realized that, with Git, this would not be enough.
The main problem is conflict resolution.
Long story short, you can only resolve conflicts by following some conventions or by asking for the conflicting merge (commit) to be repeated.

So, my script became a compiled application.

Later, more requirements arrived.

  • It was very desirable to keep some branches invisible to the repository on the other side.
  • We cannot allow arbitrary updates to all branches. Some branches should be updated only in a limited way on one side or the other.
  • Some Git servers like to create utility (garbage) branches and tags. No one wants them to be migrated.
  • More importantly, some Git servers, like GitLab or Bitbucket, like to completely block some branches and tags. This is a catastrophe.

The other thing is that people tend to forget that they have two synchronized remote Git repositories. They forget it really fast and do not want to remember any rules.
Some teams may have some level of employee rotation.
A big application requires more documentation and support, and so on.

Finally, I threw out some of the complicated conflict resolution and state analysis.
The remaining subset of logic was pretty simple, which allowed me to convert my app into a bunch of bash and gAWK scripts.

As a result of all this, my tool only synchronizes branches with defined prefixes.
You can say, for example, let's synchronize branches that begin with @.
Or branches that start with my-company- and your-company-.
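
To illustrate just the prefix idea (this is not how git-repo-sync itself is implemented), a sketch with plain Git commands might look like the following. It copies in one direction only, from a hypothetical repository a to a hypothetical repository b, and the branch names are made up.

```shell
#!/bin/sh
# Sketch only: copy branches with a given prefix from repo a to repo b.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"

# Repo a has three branches; repo b starts out empty.
git init -q --bare a.git
git init -q --bare b.git
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
git -C seed commit -q --allow-empty -m "base"
git -C seed branch my-company-feature
git -C seed branch scratch
git -C seed push -q "$work/a.git" master my-company-feature scratch

A_URL=$work/a.git
B_URL=$work/b.git
PREFIX=my-company-

# Staging repository: fetch everything from a as remote-tracking refs.
git init -q --bare sync.git
cd sync.git
git remote add a "$A_URL"
git fetch -q a

# Push only the branches whose names start with the prefix.
git for-each-ref --format='%(refname)' "refs/remotes/a/$PREFIX*" |
while read -r ref; do
  branch=${ref#refs/remotes/a/}
  git push -q "$B_URL" "$ref:refs/heads/$branch"
done
```

Branches without the prefix (master and scratch here) never reach the other side, which is the "invisible branches" behaviour described earlier in this answer.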

Of course, my tool has a small learning curve.
But it is quite mature and makes it possible to forget about some sync problems altogether.

Actually, I forgot about my tool and came back here two years later, because I've implemented a wish list.

My latest tool is here: git-repo-sync. I hope this will help someone else.