What are the differences between git clone --shared and --reference?
Original question: http://stackoverflow.com/questions/23304374/
Note: this content comes from StackOverflow and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Question
After reading the documentation, I still don't really understand what the differences are between --shared and --reference <repo>. They seem so similar.
What are the differences between the --shared and --reference <repo> options?
Can they be used to save drive space when making multiple local clones of another local clone?
Can each local clone have a different branch checked out?
Note: I'm aware that I can use multiple shallow clones with truncated history by using git clone --depth <depth>, but each clone still has to duplicate at least some history in order to do that, so I was thinking that maybe it's not the best way to save drive space (though it is better than nothing).
Background
Sometimes I like to have more than one checkout of my working copy in a repository, so I create multiple clones, where each clone has its own checkout.
However, I don't really need the whole history with each clone, just the most up-to-date versions of my branches, so I could possibly save a lot of drive space by having each clone use the tag, commit, tree, and blob objects from the original local clone (for example, via symlinks or something).
git clone documentation
I checked the git clone documentation to see if there's anything I can use.
--shared
I saw that there's a --shared option:
When the repository to clone is on the local machine, instead of using hard links, automatically setup .git/objects/info/alternates to share the objects with the source repository. The resulting repository starts out without any object of its own.
This looks like it might be useful for saving drive space with multiple clones that have different checkouts, since each clone shares objects with the original local clone.
--reference <repository>
Then I also saw the --reference <repository> option:
If the reference repository is on the local machine, automatically setup .git/objects/info/alternates to obtain objects from the reference repository. Using an already existing repository as an alternate will require fewer objects to be copied from the repository being cloned, reducing network and local storage costs.
NOTE: see the NOTE for the --shared option.
This says that it will reduce local storage costs, so this might be useful as well.
Answer by DoubleWord
Both options update .git/objects/info/alternates to point to the source repository, which could be dangerous; hence the warning note on both options in the documentation.
The --shared option does not copy the objects into the clone. This is the main difference.
The --reference option takes an additional repository parameter. Using --reference still copies the objects into the destination during the clone; however, you are specifying that objects be copied from an existing source when they are already available in the reference repository. By passing --reference the path to a repository on a faster or local device, you can reduce network time and I/O on the source repository.
See for yourself
Create a --shared clone and a --reference clone. Count the objects in each using git count-objects -v. You'll notice the shared clone has no objects, and the reference clone has the same number of objects as the source. Further, notice the size difference of each in your file system. If you were to move the source and test git log in both the shared and reference repositories, the log is unavailable in the shared clone, but works fine in the reference clone.
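A minimal sketch of that experiment, assuming a throwaway directory and made-up repository names (src, shared-clone, reference-clone); the committer identity is a placeholder. The figures reported for the --reference clone may depend on the Git version and on whether the source is given as a local path or a URL, so no particular output is assumed here.

```shell
set -eu

# Placeholder identity so the commit works on a machine without git config.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

# Hypothetical scratch layout: everything lives under a temporary directory.
work=$(mktemp -d)
cd "$work"

# A tiny source repository with a single commit.
git init -q src
echo hello > src/file.txt
git -C src add file.txt
git -C src commit -q -m "initial commit"

# One clone of each kind, both drawing on src.
git clone -q --shared src shared-clone
git clone -q --reference "$work/src" src reference-clone

# Both clones record an alternate object directory.
cat shared-clone/.git/objects/info/alternates
cat reference-clone/.git/objects/info/alternates

# Compare object counts and on-disk size of each repository.
git -C src count-objects -v
git -C shared-clone count-objects -v
git -C reference-clone count-objects -v
du -sh src/.git shared-clone/.git reference-clone/.git

# Extract the shared clone's own loose and packed object counts.
shared_count=$(git -C shared-clone count-objects -v | sed -n 's/^count: //p')
shared_in_pack=$(git -C shared-clone count-objects -v | sed -n 's/^in-pack: //p')
echo "shared clone: $shared_count loose, $shared_in_pack packed"
```

To try the second half of the experiment, rename src afterwards and run git log in each clone.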
Answer by Sam Brightman
The link in the comments to your question is really a clearer answer: --reference implies --shared. The point of --reference is to optimise network I/O during the initial clone of a remote repository.
Contrary to the answer above, I find that the --shared and --reference repositories -- from the same source -- have the same size and both have zero objects. Of course, if you use --reference for some other repository which is based off a common source, the size and objects will reflect the difference between the repositories. Note that in both cases we are not saving space in the work tree, only in .git/objects.
There is some nuance to maintaining this setup going forward - read the thread for more details. Essentially it sounds like the two should be treated as public repositories, with care around history re-writing in the presence of repacking/pruning/garbage collection.
The workflow around maintaining an optimal disk-space usage after the initial clone seems to be:
- pull source
- repack source
- pull secondary
- git gc in secondary
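The four steps above might look like the following sketch. The repository names (upstream, source, secondary), the add_commit helper and the committer identity are all made up for illustration, and the repack flags are one plausible reading of "repack source", not something the thread prescribes.

```shell
set -eu

# Placeholder identity so commits work without any git config.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

work=$(mktemp -d)
cd "$work"

# Hypothetical helper: append a line to notes.txt in repo $1 and commit it.
add_commit() {
    echo "$2" >> "$1/notes.txt"
    git -C "$1" add notes.txt
    git -C "$1" commit -q -m "$2"
}

# upstream stands in for wherever the source repository pulls from.
git init -q upstream
add_commit upstream "first"
git clone -q upstream source
git clone -q --shared source secondary

# New work appears upstream after the initial clones.
add_commit upstream "second"

# 1. pull source
git -C source pull -q --ff-only
# 2. repack source
git -C source repack -a -d -q
# 3. pull secondary (its origin is source, whose objects it already borrows)
git -C secondary pull -q --ff-only
# 4. git gc in secondary
git -C secondary gc -q

upstream_head=$(git -C upstream rev-parse HEAD)
secondary_head=$(git -C secondary rev-parse HEAD)
git -C secondary count-objects -v
```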
Probably best to read the discussion in that thread though.
You can add an alternate to an existing repository by putting the absolute path to the source's objects directory into secondary/.git/objects/info/alternates and running git gc (many people use git repack -a -d -l, which is done by git gc).
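As a rough sketch of adding an alternate by hand, assuming two made-up repositories named source and secondary in a scratch directory (the identity is a placeholder). It uses the git repack -a -d -l form mentioned above; how much space is actually reclaimed is not asserted here.

```shell
set -eu

# Placeholder identity so the commit works without any git config.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

work=$(mktemp -d)
cd "$work"

git init -q source
echo data > source/file.txt
git -C source add file.txt
git -C source commit -q -m "initial commit"

# An ordinary clone that starts with its own copy of every object.
git clone -q --no-hardlinks source secondary

# Put the absolute path to the source's objects directory into alternates.
mkdir -p secondary/.git/objects/info
echo "$work/source/.git/objects" > secondary/.git/objects/info/alternates

# -l leaves objects that are borrowed from the alternate out of the local pack.
git -C secondary repack -a -d -l -q

# The alternate should now be listed, and history should still be readable.
git -C secondary count-objects -v
git -C secondary log --oneline
```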
You can remove an alternate by running git repack -a -d (no -l) in the secondary and then removing the line from the alternates file. As described in the thread, it is possible to have more than one alternate.
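A small sketch of removing an alternate, again with made-up repository names and a placeholder identity. Because this secondary has only one alternate, deleting the whole alternates file stands in for removing the line; with several alternates you would edit the file instead.

```shell
set -eu

# Placeholder identity so the commit works without any git config.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

work=$(mktemp -d)
cd "$work"

git init -q source
echo data > source/file.txt
git -C source add file.txt
git -C source commit -q -m "initial commit"

# secondary starts out borrowing every object from source.
git clone -q --shared source secondary

# Without -l, the repack copies borrowed objects into a local pack.
git -C secondary repack -a -d -q

# Only one alternate is listed here, so removing the file removes the line.
rm secondary/.git/objects/info/alternates

# secondary should no longer depend on source being where it was.
mv source source.moved
git -C secondary log --oneline
```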
I've not used this much myself, so I don't know how error-prone it is to manage.
Answer by Paul Van Camp
The link in the comments to your question is now dead.
https://www.oreilly.com/library/view/git-pocket-guide/9781449327507/ch06.html has some great information on the subject. Here is some of what is there:
first, we make a bare clone of the remote repository, to be shared locally as a reference repository (hence named “refrep”):
$ git clone --bare http://foo/bar.git refrep
Then, we clone the remote again, but this time giving refrep as a reference:
$ git clone --reference refrep http://foo/bar.git
The key difference between this and the --shared option is that you are still tracking the remote repository, not the refrep clone. When you pull, you still contact http://foo/, but you don't need to wait for it to send any objects that are already stored locally in refrep; when you push, you are updating the branches and other refs of the foo repository directly.
Of course, as soon as you and others start pushing new commits, the reference repository will become out of date, and you'll start to lose some of the benefit. Periodically, you can run git fetch --all in refrep to pull in any new objects. A single reference repository can be a cache for the objects of any number of others; just add them as remotes in the reference:
$ git remote add zeus http://olympus/zeus.git
$ git fetch --all zeus
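To see the "still tracking the remote" point without network access, here is a sketch in which a file:// URL stands in for http://foo/bar.git. The remote-src directory, the identity and the scratch layout are assumptions made purely for illustration.

```shell
set -eu

# Placeholder identity so the commit works without any git config.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

work=$(mktemp -d)
cd "$work"

# Stand-in for the remote repository (http://foo/bar.git in the quoted text).
git init -q remote-src
echo data > remote-src/file.txt
git -C remote-src add file.txt
git -C remote-src commit -q -m "initial commit"
remote_url="file://$work/remote-src"

# Bare clone to serve as the local reference repository.
git clone -q --bare "$remote_url" refrep

# Clone the remote again, borrowing objects from refrep.
git clone -q --reference "$work/refrep" "$remote_url" bar

# origin is the remote itself; refrep only shows up as an alternate.
origin_url=$(git -C bar config --get remote.origin.url)
echo "origin: $origin_url"
cat bar/.git/objects/info/alternates
git -C bar count-objects -v
```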