Notice: this page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/36862/

Date: 2020-09-10 05:43:56  Source: igfitidea

How do you organise multiple git repositories, so that all of them are backed up together?

git backup

Asked by dbr

With SVN, I had a single big repository I kept on a server and checked out on a few machines. This was a pretty good backup system, and allowed me to work easily on any of the machines. I could check out a specific project, commit, and it updated the 'master' project, or I could check out the entire thing.

Now, I have a bunch of git repositories, for various projects, several of which are on github. I also have the SVN repository I mentioned, imported via the git-svn command.

Basically, I like having all my code (not just projects, but random snippets and scripts, some things like my CV, articles I've written, websites I've made and so on) in one big repository I can easily clone onto remote machines, or memory-sticks/harddrives as backup.

The problem is that it's a private repository, and git doesn't allow checking out a specific folder (one that I could push to github as a separate project, while having the changes appear in both the master repo and the sub-repos).

I could use the git submodule system, but it doesn't act how I want it to (submodules are pointers to other repositories, and don't really contain the actual code, so it's useless for backup).

Currently I have a folder of git repos (for example, ~/code_projects/proj1/.git/ and ~/code_projects/proj2/.git/). After making changes to proj1 I do git push github, then I copy the files into ~/Documents/code/python/projects/proj1/ and do a single commit (instead of the numerous ones in the individual repos). Then I do git push backupdrive1, git push mymemorystick, etc.

So, the question: How do you organise your personal code and projects with git repositories, and keep them synced and backed up?

Accepted answer by Damien Diederen

I would strongly advise against putting unrelated data in a given Git repository. The overhead of creating new repositories is quite low, and that is a feature that makes it possible to keep different lineages completely separate.

Fighting that idea means ending up with unnecessarily tangled history, which renders administration more difficult and--more importantly--"archeology" tools less useful because of the resulting dilution. Also, as you mentioned, Git assumes that the "unit of cloning" is the repository, and practically has to do so because of its distributed nature.

One solution is to keep every project/package/etc. as its own bare repository (i.e., without working tree) under a blessed hierarchy, like:

/repos/a.git
/repos/b.git
/repos/c.git

Once a few conventions have been established, it becomes trivial to apply administrative operations (backup, packing, web publishing) to the complete hierarchy, which serves a role not entirely dissimilar to "monolithic" SVN repositories. Working with these repositories also becomes somewhat similar to SVN workflows, with the addition that one can use local commits and branches:

svn checkout   --> git clone
svn update     --> git pull
svn commit     --> git push
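
As a rough illustration of applying "administrative operations" to the complete hierarchy, here is a minimal sketch that repacks each bare repository and writes a single-file backup of it. The function name and the REPO_ROOT and BACKUP_DIR variables are made up for this example; git gc and git bundle are standard commands, but whether bundles suit your backup needs is an assumption.

```shell
#!/bin/sh
# Sketch: repack and back up every bare repository under one hierarchy.
# REPO_ROOT and BACKUP_DIR are illustrative defaults, not paths you must use.
REPO_ROOT="${REPO_ROOT:-/repos}"
BACKUP_DIR="${BACKUP_DIR:-/backup/repos}"

backup_all_repos() {
    mkdir -p "$BACKUP_DIR" || return 1
    for repo in "$REPO_ROOT"/*.git; do
        [ -d "$repo" ] || continue          # glob did not match anything
        name=$(basename "$repo" .git)
        git --git-dir="$repo" gc --quiet    # the "packing" step
        # A bundle is one file containing all refs and their history.
        git --git-dir="$repo" bundle create "$BACKUP_DIR/$name.bundle" --all \
            || echo "skipped $name (empty repository?)" >&2
    done
}
```

Calling backup_all_repos (for instance from a cron job) would then cover every repository that follows the /repos/NAME.git convention.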

You can have multiple remotes in each working clone, for the ease of synchronizing between the multiple parties:

$ cd ~/dev
$ git clone /repos/foo.git       # or the one from github, ...
$ cd foo
$ git remote add github ...
$ git remote add memorystick ...

You can then fetch/pull from each of the "sources", work and commit locally, and then push ("backup") to each of these remotes when you are ready, with something like (note how that pushes the same commits and history to each of the remotes!):

$ for remote in origin github memorystick; do git push $remote; done
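
If you would rather not hard-code the remote names, the same loop can be driven by whatever remotes the clone has configured. This is only a sketch: the function name is hypothetical, and pushing with --all (every local branch) is an assumption about what you want backed up.

```shell
#!/bin/sh
# Sketch: push all local branches of the current clone to every configured remote.
push_to_all_remotes() {
    rc=0
    for remote in $(git remote); do
        echo "pushing to $remote"
        # Keep going if one remote is unreachable, but remember the failure.
        git push --quiet "$remote" --all || rc=1
    done
    return $rc
}
```

Run it from inside a working clone; it returns non-zero if any push failed.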

The easiest way to turn an existing working repository ~/dev/foo into such a bare repository is probably:

$ cd ~/dev
$ git clone --bare foo /repos/foo.git
$ mv foo foo.old
$ git clone /repos/foo.git

which is mostly equivalent to an svn import, but does not throw the existing "local" history away.

Note: submodules are a mechanism to include shared related lineages, so I indeed wouldn't consider them an appropriate tool for the problem you are trying to solve.

Answer by imz -- Ivan Zakharyaschev

I want to add to Damien's answer, where he recommends:

$ for remote in origin github memorystick; do git push $remote; done

You can set up a special remote to push to all the individual real remotes with 1 command; I found it at http://marc.info/?l=git&m=116231242118202&w=2:

So for "git push" (where it makes sense to push the same branches multiple times), you can actually do what I do:

  • .git/config contains:

    [remote "all"]
    url = master.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux-2.6
    url = login.osdl.org:linux-2.6.git
    
  • and now git push all master will push the "master" branch to both
    of those remote repositories.
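
Instead of editing .git/config by hand, the same multi-URL remote can probably be built from the command line. A minimal sketch, assuming you run it inside a working clone; the helper name add_all_remote is invented here, while the remote name "all" follows the quoted mail.

```shell
#!/bin/sh
# Sketch: create an "all" remote whose URL list covers every real remote.
add_all_remote() {
    # Each argument is one URL that "git push all" should reach.
    for url in "$@"; do
        git config --add remote.all.url "$url"
    done
}
```

After add_all_remote URL1 URL2, a git push all master goes to each listed URL in turn.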

You can also save yourself typing the URLs twice by using the construction:

[url "<actual url base>"]
    insteadOf = <other url base>
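
To make the placeholders concrete, here is a small sketch that writes such a section with git config. The helper name and the example prefix are hypothetical; the setting is stored in the current repository's config (adding --global would be a choice, not a requirement).

```shell
#!/bin/sh
# Sketch: register a short prefix that git rewrites to a real URL base.
set_url_alias() {
    # $1 = actual URL base, $2 = the short prefix that stands in for it
    git config "url.$1.insteadOf" "$2"
}

# Example (made-up values), run inside a clone:
#   set_url_alias "/media/memorystick/repos/" "stick:"
#   git remote add memorystick stick:foo.git
```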

Answer by Danny G

I am also curious about suggested ways to handle this, and will describe the current setup that I use (with SVN). I have basically created a repository that contains a mini-filesystem hierarchy including its own bin and lib dirs. There is a script in the root of this tree that will set up your environment by adding these bin, lib, and other dirs to the proper environment variables. So the root directory essentially looks like:

./bin/            # prepended to $PATH
./lib/            # prepended to $LD_LIBRARY_PATH
./lib/python/     # prepended to $PYTHONPATH
./setup_env.bash  # sets up the environment

Now inside /bin and /lib there are the multiple projects and their corresponding libraries. I know this isn't a standard project, but it is very easy for someone else in my group to check out the repo, run the 'setup_env.bash' script and have the most up-to-date versions of all of the projects locally in their checkout. They don't have to worry about installing/updating /usr/bin or /usr/lib, and it keeps it simple to have multiple checkouts and a very localized environment per checkout. Someone can also just rm the entire repository and not worry about uninstalling any programs.
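
The actual setup_env.bash is not shown here, so the following is only a guess at its core: a function that prepends the checkout's directories to the three variables listed above. The function name and its argument are assumptions made for the sketch.

```shell
#!/bin/sh
# Sketch of what a setup_env script might do; source it, then call setup_env.
setup_env() {
    # $1 = root of the checkout (defaults to the current directory)
    env_root=$(cd "${1:-.}" && pwd) || return 1
    PATH="$env_root/bin:$PATH"
    LD_LIBRARY_PATH="$env_root/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
    PYTHONPATH="$env_root/lib/python${PYTHONPATH:+:$PYTHONPATH}"
    export PATH LD_LIBRARY_PATH PYTHONPATH
}
```

Because the variables must change in the caller's shell, the file has to be sourced rather than executed, e.g. `. ./setup_env.bash` followed by `setup_env /path/to/checkout`.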

This is working fine for us, and I'm not sure if we'll change it. The problem with this is that there are many projects in this one big repository. Is there a git/Hg/bzr standard way of creating an environment like this and breaking out the projects into their own repositories?

Answer by Spoike

I haven't tried nesting git repositories yet because I haven't run into a situation where I need to. As I've read on the #git channel, git seems to get confused by nesting the repositories, i.e. when you try to git-init inside a git repository. The only way to manage a nested git structure is to use either git-submodule or Android's repo utility.

As for that backup responsibility you're describing, I say delegate it... For me, I usually put the "origin" repository for each project on a network drive at work that is backed up regularly by the IT-techs using their backup strategy of choice. It is simple and I don't have to worry about it. ;)

Answer by imz -- Ivan Zakharyaschev

What about using mr for managing your multiple Git repos at once:

The mr(1) command can checkout, update, or perform other actions on a set of repositories as if they were one combined repository. It supports any combination of subversion, git, cvs, mercurial, bzr, darcs, cvs, vcsh, fossil and veracity repositories, and support for other revision control systems can easily be added. [...]

It is extremely configurable via simple shell scripting. Some examples of things it can do include:

[...]

  • When updating a git repository, pull from two different upstreams and merge the two together.
  • Run several repository updates in parallel, greatly speeding up the update process.
  • Remember actions that failed due to a laptop being offline, so they can be retried when it comes back online.

Answer by arxpoetica

There is another method for having nested git repos, but it doesn't solve the problem you're after. Still, for others who are looking for the solution I was:

In the top-level git repo, just list the folder containing the nested git repo in .gitignore so that it is hidden from the outer repo. This makes it easy to have two separate (but nested!) git repos.
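
A minimal sketch of that setup, assuming the outer repository already exists; the function name and the directory names in the usage line are placeholders.

```shell
#!/bin/sh
# Sketch: keep an independent repository inside another one by ignoring its folder.
nest_ignored_repo() {
    outer=$1    # path to the existing top-level repository
    inner=$2    # folder for the nested repository, relative to $outer
    git init --quiet "$outer/$inner" || return 1
    # The leading slash anchors the pattern to the top of the outer repo.
    echo "/$inner/" >> "$outer/.gitignore"
}
```

For example, nest_ignored_repo ~/code snippets would leave ~/code tracking everything except ~/code/snippets, which keeps its own history.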
