Notice: this page is derived from a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license and attribute it to the original authors (not me), citing the original StackOverflow URL: http://stackoverflow.com/questions/23304374/

Date: 2020-09-19 09:59:47  Source: igfitidea

What are the differences between git clone --shared and --reference?

git

Question

After reading the documentation, I still don't really understand what the differences are between --shared and --reference <repo>. They seem so similar.

  1. What are the differences between the --shared and --reference <repo> options?

  2. Can they be used to save drive space when making multiple local clones of another local clone?

  3. Can each local clone have a different branch checked out?

Note: I'm aware that I can use multiple shallow clones with truncated history by using git clone --depth <depth>, but each clone still has to duplicate at least some history in order to do that, so I was thinking that maybe it's not the most optimal way to save drive space (though it is better than nothing).

Background

Sometimes I like to have more than one checkout of my working copy in a repository, so I create multiple clones, where each clone has its own checkout.

However, I don't really need the whole history with each clone, just the most up-to-date versions of my branches, so I could possibly save a lot of drive space by having each clone use the tag, commit, tree, and blob objects from the original local clone (for example, via symlinks or something).

git clone documentation

I checked the git clone documentation to see if there's anything I can use.

--shared

I saw that there's a --shared option:

When the repository to clone is on the local machine, instead of using hard links, automatically setup .git/objects/info/alternates to share the objects with the source repository. The resulting repository starts out without any object of its own.

This looks like it might help me save drive space with multiple clones that have different checkouts, since each clone shares objects with the original local clone.

--reference <repository>

Then I also saw the --reference <repository> option:

If the reference repository is on the local machine, automatically setup .git/objects/info/alternates to obtain objects from the reference repository. Using an already existing repository as an alternate will require fewer objects to be copied from the repository being cloned, reducing network and local storage costs.

NOTE: see the NOTE for the --shared option.

This says that it will reduce local storage costs, so this might be useful as well.

Answer by DoubleWord

Both options update .git/objects/info/alternates to point to the source repository, which can be dangerous; hence the warning note on both options in the documentation.

The --shared option does not copy the objects into the clone. This is the main difference.

The --reference option takes an additional repository parameter. Using --reference still copies the objects into the destination during the clone; however, objects that are already available in the reference repository are copied from that existing source. This can reduce network time and I/O on the source repository when you pass --reference the path to a repository on a faster or local device.

See for yourself

Create a --shared clone and a --reference clone. Count the objects in each using git count-objects -v. You'll notice the shared clone has no objects, and the reference clone has the same number of objects as the source. Further, notice the size difference of each in your file system. If you were to move the source, and test git log in both shared and reference repositories, the log is unavailable in the shared clone, but works fine in the reference clone.
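
The experiment described above could be scripted roughly as follows. This is only a sketch under stated assumptions: the repository names (source, shared, reference), the throwaway temporary directory, and the demo committer identity are all made up for illustration. It prints the counts instead of assuming them, since the answers on this page report different numbers for the --reference clone.

```shell
#!/bin/sh
set -eu

# Work in a throwaway directory (hypothetical layout, removed on exit).
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work"

# Build a small source repository with one commit.
git init -q source
echo hello > source/file.txt
git -C source add file.txt
git -C source -c user.name=demo -c user.email=demo@example.com commit -q -m "initial commit"

# One clone per option, both taken from the same local source.
git clone -q --shared source shared
git clone -q --reference source source reference

for repo in source shared reference; do
    echo "== $repo =="
    # Counts only the objects stored in this repository itself.
    git -C "$repo" count-objects -v
    # Print the alternates file, if the repository has one.
    cat "$repo/.git/objects/info/alternates" 2>/dev/null || true
done
```

Comparing the three reports, and the alternates files of the two clones, is a quick way to check the claims above against your own Git version.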

Answer by Sam Brightman

The link in the comments to your question is really a clearer answer: --reference implies --shared. The point of --reference is to optimise network I/O during the initial clone of a remote repository.

Contrary to the answer above, I find that the --shared and --reference repositories (from the same source) have the same size and both have zero objects. Of course, if you use --reference for some other repository which is based off a common source, the size and objects will reflect the difference between the repositories. Note that in both cases we are not saving space in the work tree, only in .git/objects.

There is some nuance to maintaining this setup going forward - read the thread for more details. Essentially it sounds like the two should be treated as public repositories, with care around history re-writing in the presence of repacking/pruning/garbage collection.

The workflow around maintaining an optimal disk-space usage after the initial clone seems to be:

  1. pull source
  2. repack source
  3. pull secondary
  4. git gc in secondary

Probably best to read the discussion in that thread though.
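
As a rough illustration, the four-step workflow above might look like this in a script. Everything about the setup is an assumption made for the sketch: "upstream" is a local repository standing in for the remote, "source" and "secondary" are hypothetical clone names, and the repack flags (-a -d) are a guess, since the answer only says "repack".

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work"

# commit <repo> <message>: append one line to a file and commit it.
commit() {
    echo "$2" >> "$1/file.txt"
    git -C "$1" add file.txt
    git -C "$1" -c user.name=demo -c user.email=demo@example.com commit -q -m "$2"
}

# Hypothetical setup: "upstream" stands in for the remote, "source" is the
# primary local clone, and "secondary" borrows objects from "source".
git init -q upstream
commit upstream "first"
git clone -q upstream source
git clone -q --reference source upstream secondary

commit upstream "second"   # new history appears upstream

git -C source pull -q --ff-only       # 1. pull source
git -C source repack -a -d --quiet    # 2. repack source (flags are an assumption)
git -C secondary pull -q --ff-only    # 3. pull secondary
git -C secondary gc --quiet           # 4. git gc in secondary

# Report what the secondary stores itself after the cycle.
git -C secondary count-objects -v
```

The order matters in this sketch because the secondary can only shed objects that the source already holds when its own gc runs.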

You can add an alternate to an existing repository by putting the absolute path to the source's objects directory into secondary/.git/objects/info/alternates and running git gc (many people use git repack -a -d -l, which is done by git gc).
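
A minimal sketch of adding an alternate to an existing clone, assuming hypothetical repositories named source and secondary inside a temporary directory. It prints the resulting object counts rather than assuming them, because how much is freed may depend on how the source stores its objects.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work"

# Hypothetical setup: "secondary" starts as an ordinary, independent copy.
git init -q source
echo hello > source/file.txt
git -C source add file.txt
git -C source -c user.name=demo -c user.email=demo@example.com commit -q -m "initial commit"
git clone -q --no-hardlinks source secondary

# Register the source's objects directory (absolute path) as an alternate.
mkdir -p secondary/.git/objects/info
echo "$work/source/.git/objects" >> secondary/.git/objects/info/alternates

# Run the step the answer names for shedding objects the alternate provides.
git -C secondary gc --quiet

# Report what the secondary still stores itself.
git -C secondary count-objects -v
```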

You can remove an alternate by running git repack -a -d (no -l) in the secondary and then removing the line from the alternates file. As described in the thread, it is possible to have more than one alternate.
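
The removal procedure could be sketched like this, again with hypothetical repository names (source, secondary) in a temporary directory. The sketch assumes the secondary has exactly one alternate, so it deletes the whole alternates file; with several alternates you would remove only the relevant line.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work"

git init -q source
echo hello > source/file.txt
git -C source add file.txt
git -C source -c user.name=demo -c user.email=demo@example.com commit -q -m "initial commit"
git clone -q --shared source secondary   # secondary borrows every object

# Copy all borrowed objects into the secondary's own pack (note: no -l).
git -C secondary repack -a -d --quiet

# Only now drop the alternate (single-alternate assumption, see above).
rm secondary/.git/objects/info/alternates

# The source can disappear without breaking the secondary.
rm -rf source
git -C secondary fsck
git --no-pager -C secondary log --oneline
```

Doing the repack before touching the alternates file is the important part: removing the file first would leave the secondary without the objects it was borrowing.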

I've not used this much myself, so I don't know how error-prone it is to manage.

Answer by Paul Van Camp

The link in the comments to your question is now dead.

https://www.oreilly.com/library/view/git-pocket-guide/9781449327507/ch06.html has some great information on the subject. Here is some of what is there:

First, we make a bare clone of the remote repository, to be shared locally as a reference repository (hence named “refrep”):
$ git clone --bare http://foo/bar.git refrep

Then, we clone the remote again, but this time giving refrep as a reference:
$ git clone --reference refrep http://foo/bar.git

The key difference between this and the --shared option is that you are still tracking the remote repository, not the refrep clone. When you pull, you still contact http://foo/, but you don't need to wait for it to send any objects that are already stored locally in refrep; when you push, you are updating the branches and other refs of the foo repository directly.

Of course, as soon as you and others start pushing new commits, the reference repository will become out of date, and you'll start to lose some of the benefit. Periodically, you can run git fetch --all in refrep to pull in any new objects. A single reference repository can be a cache for the objects of any number of others; just add them as remotes in the reference:

$ git remote add zeus http://olympus/zeus.git
$ git fetch zeus
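
To try the reference-repository setup without a real server, something like the following sketch may help. It is not the book's example: a local repository reached through a file:// URL stands in for http://foo/bar.git, and the names bar, refrep, and bar-work are assumptions chosen for illustration.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work"

# Hypothetical stand-in for the remote: a local repository reached through
# a file:// URL, so the clone goes through the regular fetch machinery.
git init -q bar
echo hello > bar/file.txt
git -C bar add file.txt
git -C bar -c user.name=demo -c user.email=demo@example.com commit -q -m "initial commit"
remote="file://$work/bar"

# Bare clone that serves as the shared local reference repository.
git clone -q --bare "$remote" refrep

# Clone the remote again, borrowing objects from refrep.
git clone -q --reference refrep "$remote" bar-work

# The new clone tracks the remote, not refrep.
git -C bar-work remote get-url origin

# Refresh the cache from time to time.
git -C refrep fetch -q --all
```

The printed origin URL is the point of the exercise: pulls and pushes in bar-work go to the remote, while refrep only supplies objects.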
