Original question: http://stackoverflow.com/questions/28720151/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
git gc --aggressive vs git repack
Asked by Ajith R Nayak
I'm looking for ways to reduce the size of a git repository. Searching leads me to git gc --aggressive most of the time. I have also read that this isn't the preferred approach.
Why? What should I be aware of if I'm running gc --aggressive?
git repack -a -d --depth=250 --window=250 is recommended over gc --aggressive. Why? How does repack reduce the size of a repository? Also, I'm not quite clear about the flags --depth and --window.
What should I choose between gc and repack? When should I use gc and repack?
Answer by Greg Bacon
Nowadays there is no difference: git gc --aggressive operates according to the suggestion Linus made in 2007; see below. As of version 2.11 (Q4 2016), git defaults to a depth of 50. A window of size 250 is good because it scans a larger section of each object, but a depth of 250 is bad because it makes every chain refer to very deep old objects, which slows down all future git operations for marginally lower disk usage.
Historical Background
Linus suggested (see below for the full mailing list post) using git gc --aggressive only when you have, in his words, “a really bad pack” or “really horribly bad deltas,” however “almost always, in other cases, it's actually a really bad thing to do.” The result may even leave your repository in worse condition than when you started!
The command he suggests for doing this properly after having imported “a long and involved history” is
git repack -a -d -f --depth=250 --window=250
But this assumes you have already removed unwanted gunk from your repository history and that you have followed the checklist for shrinking a repository found in the git filter-branch documentation.
git-filter-branch can be used to get rid of a subset of files, usually with some combination of --index-filter and --subdirectory-filter. People expect the resulting repository to be smaller than the original, but you need a few more steps to actually make it smaller, because Git tries hard not to lose your objects until you tell it to. First make sure that:
- You really removed all variants of a filename, if a blob was moved over its lifetime. git log --name-only --follow --all -- filename can help you find renames.
- You really filtered all refs: use --tag-name-filter cat -- --all when calling git filter-branch.
Then there are two ways to get a smaller repository. A safer way is to clone, that keeps your original intact.
- Clone it with git clone file:///path/to/repo. The clone will not have the removed objects. See git-clone. (Note that cloning with a plain path just hardlinks everything!)
If you really don't want to clone it, for whatever reasons, check the following points instead (in this order). This is a very destructive approach, so make a backup or go back to cloning it. You have been warned.
- Remove the original refs backed up by git-filter-branch: say git for-each-ref --format="%(refname)" refs/original/ | xargs -n 1 git update-ref -d
- Expire all reflogs with git reflog expire --expire=now --all.
- Garbage collect all unreferenced objects with git gc --prune=now (or if your git gc is not new enough to support arguments to --prune, use git repack -ad; git prune instead).
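The three destructive steps of the checklist can be sketched end to end on a throwaway repository. This is only an illustration under stated assumptions: the temporary path, the demo identity, and the hand-made refs/original/refs/heads/old ref (standing in for what filter-branch would leave behind) are all hypothetical, and a while-loop replaces xargs so that an empty ref list does nothing.

```shell
# Throwaway repository (hypothetical path) so nothing real is at risk.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git commit -q --allow-empty -m "kept history"

# Simulate what filter-branch leaves behind: a commit that is reachable
# only through a backup ref under refs/original/.
unwanted=$(git commit-tree "HEAD^{tree}" -m "unwanted history")
git update-ref refs/original/refs/heads/old "$unwanted"

# 1. Remove the backed-up refs (the loop simply does nothing on an empty list).
git for-each-ref --format='%(refname)' refs/original/ |
while read -r ref; do
    git update-ref -d "$ref"
done

# 2. Expire all reflogs.
git reflog expire --expire=now --all

# 3. Garbage collect everything that is now unreferenced.
git gc --quiet --prune=now

# Check whether the unwanted commit is still in the object store.
if git cat-file -e "$unwanted" 2>/dev/null; then
    status="still present"
else
    status="pruned"
fi
echo "unwanted commit: $status"
```

On a real repository, run the same three steps only after making a backup, as the checklist itself warns.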
Date: Wed, 5 Dec 2007 22:09:12 -0800 (PST)
From: Linus Torvalds <torvalds at linux-foundation dot org>
To: Daniel Berlin <dberlin at dberlin dot org>
cc: David Miller <davem at davemloft dot net>, ismail at pardus dot org dot tr, gcc at gcc dot gnu dot org, git at vger dot kernel dot org
Subject: Re: Git and GCC
On Thu, 6 Dec 2007, Daniel Berlin wrote:
> Actually, it turns out that git-gc --aggressive does this dumb thing to pack files sometimes regardless of whether you converted from an SVN repo or not.
Absolutely. git --aggressive is mostly dumb. It's really only useful for the case of “I know I have a really bad pack, and I want to throw away all the bad packing decisions I have done.”
To explain this, it's worth explaining (you are probably aware of it, but let me go through the basics anyway) how git delta-chains work, and how they are so different from most other systems.
In other SCMs, a delta-chain is generally fixed. It might be “forwards” or “backwards,” and it might evolve a bit as you work with the repository, but generally it's a chain of changes to a single file represented as some kind of single SCM entity. In CVS, it's obviously the *,v file, and a lot of other systems do rather similar things.
Git also does delta-chains, but it does them a lot more “loosely.” There is no fixed entity. Deltas are generated against any random other version that git deems to be a good delta candidate (with various fairly successful heuristics), and there are absolutely no hard grouping rules.
This is generally a very good thing. It's good for various conceptual reasons (i.e., git internally never really even needs to care about the whole revision chain - it doesn't really think in terms of deltas at all), but it's also great because getting rid of the inflexible delta rules means that git doesn't have any problems at all with merging two files together, for example - there simply are no arbitrary *,v “revision files” that have some hidden meaning.
It also means that the choice of deltas is a much more open-ended question. If you limit the delta chain to just one file, you really don't have a lot of choices on what to do about deltas, but in git, it really can be a totally different issue.
And this is where the really badly named --aggressive comes in. While git generally tries to re-use delta information (because it's a good idea, and it doesn't waste CPU time re-finding all the good deltas we found earlier), sometimes you want to say “let's start all over, with a blank slate, and ignore all the previous delta information, and try to generate a new set of deltas.”
So --aggressive is not really about being aggressive, but about wasting CPU time re-doing a decision we already did earlier!
Sometimes that is a good thing. Some import tools in particular could generate really horribly bad deltas. Anything that uses git fast-import, for example, likely doesn't have much of a great delta layout, so it might be worth saying “I want to start from a clean slate.”
But almost always, in other cases, it's actually a really bad thing to do. It's going to waste CPU time, and especially if you had actually done a good job at deltaing earlier, the end result isn't going to re-use all those good deltas you already found, so you'll actually end up with a much worse end result too!
I'll send a patch to Junio to just remove the git gc --aggressive documentation. It can be useful, but it generally is useful only when you really understand at a very deep level what it's doing, and that documentation doesn't help you do that.
Generally, doing incremental git gc is the right approach, and better than doing git gc --aggressive. It's going to re-use old deltas, and when those old deltas can't be found (the reason for doing incremental GC in the first place!) it's going to create new ones.
On the other hand, it's definitely true that an “initial import of a long and involved history” is a point where it can be worth spending a lot of time finding the really good deltas. Then, every user ever after (as long as they don't use git gc --aggressive to undo it!) will get the advantage of that one-time event. So especially for big projects with a long history, it's probably worth doing some extra work, telling the delta finding code to go wild.
So the equivalent of git gc --aggressive - but done properly - is to do (overnight) something like
git repack -a -d --depth=250 --window=250
where that depth thing is just about how deep the delta chains can be (make them longer for old history - it's worth the space overhead), and the window thing is about how big an object window we want each delta candidate to scan.
And here, you might well want to add the -f flag (which is the “drop all old deltas”), since you now are actually trying to make sure that this one actually finds good candidates.
And then it's going to take forever and a day (i.e., a “do it overnight” thing). But the end result is that everybody downstream from that repository will get much better packs, without having to spend any effort on it themselves.
Linus
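To get a feel for the command from the mail, it can be tried on a small scratch repository and the result inspected with git count-objects. This is a minimal sketch, not a benchmark: the temporary path, the demo identity, and the generated data.txt file are assumptions made for illustration, and on a repository this small the run finishes immediately rather than overnight.

```shell
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"

# Commit several revisions of one growing file so there is something to delta.
for i in 1 2 3 4 5; do
    seq 1 $((i * 200)) > data.txt
    git add data.txt
    git commit -q -m "revision $i"
done

# -a -d : write everything into one new pack and delete what it makes redundant
# -f    : drop existing deltas and search for new ones
# --depth / --window : the two knobs described in the mail
git repack -a -d -f -q --depth=250 --window=250

# Summarize the object store after repacking.
packs=$(git count-objects -v | awk '/^packs:/ {print $2}')
loose=$(git count-objects -v | awk '/^count:/ {print $2}')
git count-objects -vH
```

Comparing the size-pack figure before and after such a run on a copy of a real repository is a cheap way to see whether the extra CPU time buys anything.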
Answer by VonC
When should I use gc & repack?
As I mentioned in "Git Garbage collection doesn't seem to fully work", a git gc --aggressive is not sufficient on its own.
And, as I explain below, it is often not needed.
The most effective combination would be adding git repack, but also git prune:
git gc
git repack -Ad # kills in-pack garbage
git prune # kills loose garbage
Note: Git 2.11 (Q4 2016) sets the default gc aggressive depth to 50.
See commit 07e7dbf (11 Aug 2016) by Jeff King (peff).
(Merged by Junio C Hamano -- gitster -- in commit 0952ca8, 21 Sep 2016)
gc: default aggressive depth to 50
"git gc --aggressive" used to limit the delta-chain length to 250, which is way too deep for gaining additional space savings and is detrimental for runtime performance.
The limit has been reduced to 50.
The summary is: the current default of 250 doesn't save much space, and costs CPU. It's not a good tradeoff.
The "--aggressive" flag to git-gc does three things:
1. use "-f" to throw out existing deltas and recompute from scratch
2. use "--window=250" to look harder for deltas
3. use "--depth=250" to make longer delta chains
Items (1) and (2) are good matches for an "aggressive" repack.
They ask the repack to do more computation work in the hopes of getting a better pack. You pay the costs during the repack, and other operations see only the benefit.
Item (3) is not so clear.
Allowing longer chains means fewer restrictions on the deltas, which means potentially finding better ones and saving some space.
But it also means that operations which access the deltas have to follow longer chains, which affects their performance.
So it's a tradeoff, and it's not clear that the tradeoff is even a good one.
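As a rough illustration of the three items, the sketch below runs git gc --aggressive on a scratch repository and notes in comments which repack flag each item corresponds to. The temporary path, demo identity, and notes.txt file are made up for the example, and the flag mapping in the comments is a paraphrase of the commit message above rather than a trace of what gc executes internally.

```shell
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"

for i in 1 2 3; do
    echo "line $i" >> notes.txt
    git add notes.txt
    git commit -q -m "revision $i"
done

# Per the commit message, --aggressive asks the underlying repack to:
#   (1) -f          throw out existing deltas and recompute them
#   (2) --window=N  look harder for delta candidates
#   (3) --depth=N   allow delta chains up to N long (50 since the change)
git gc --aggressive --quiet

loose=$(git count-objects -v | awk '/^count:/ {print $2}')
in_pack=$(git count-objects -v | awk '/^in-pack:/ {print $2}')
echo "loose=$loose in-pack=$in_pack"
```

Items (1) and (2) only cost time during this one command; item (3) is the one that every later read of the pack pays for.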
(See commit for study)
You can see that the CPU savings for regular operations improve as we decrease the depth.
But we can also see that the space savings are not that great as the depth goes higher. Saving 5-10% between 10 and 50 is probably worth the CPU tradeoff. Saving 1% to go from 50 to 100, or another 0.5% to go from 100 to 250 is probably not.
Speaking of CPU saving, "git repack" learned to accept the --threads=<n> option and pass it to pack-objects.
See commit 40bcf31 (26 Apr 2017) by Junio C Hamano (gitster).
(Merged by Junio C Hamano -- gitster -- in commit 31fb6f4, 29 May 2017)
repack: accept --threads=<n> and pass it down to pack-objects
We already do so for --window=<n> and --depth=<n>; this will help when the user wants to force --threads=1 for reproducible testing without getting affected by racing multiple threads.
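A short sketch of the option in use, assuming a Git recent enough to include the commit above. The scratch repository, demo identity, and file name are hypothetical; the only point is that --threads sits alongside --window and --depth on the repack command line.

```shell
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"

for i in 1 2 3; do
    echo "entry $i" >> log.txt
    git add log.txt
    git commit -q -m "revision $i"
done

# Force a single-threaded delta search so that repeated runs are not
# influenced by how multiple threads happen to race each other.
git repack -a -d -f -q --threads=1 --window=250 --depth=50

packs=$(git count-objects -v | awk '/^packs:/ {print $2}')
echo "packs after repack: $packs"
```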
Answer by Sascha Wolf
The problem with git gc --aggressive is that the option name and documentation are misleading.
As Linus himself explains in this mail, what git gc --aggressive basically does is this:
While git generally tries to re-use delta information (because it's a good idea, and it doesn't waste CPU time re-finding all the good deltas we found earlier), sometimes you want to say "let's start all over, with a blank slate, and ignore all the previous delta information, and try to generate a new set of deltas".
Usually there is no need to recalculate deltas in git, since git determines these deltas very flexibly. It only makes sense if you know that you have really, really bad deltas. As Linus explains, mainly tools which make use of git fast-import fall into this category.
Most of the time git does a pretty good job at determining useful deltas, and using git gc --aggressive will leave you with deltas which are potentially even worse while wasting a lot of CPU time.
Linus ends his mail with the conclusion that git repack with a large --depth and --window is the better choice most of the time, especially after you have imported a large project and want to make sure that git finds good deltas.
So the equivalent of git gc --aggressive - but done properly - is to do (overnight) something like
git repack -a -d --depth=250 --window=250
where that depth thing is just about how deep the delta chains can be (make them longer for old history - it's worth the space overhead), and the window thing is about how big an object window we want each delta candidate to scan.
And here, you might well want to add the -f flag (which is the "drop all old deltas"), since you now are actually trying to make sure that this one actually finds good candidates.
Answer by Sage Pointer
Caution: do not run git gc --aggressive on a repository that is not synchronized with a remote if you have no backups.
This operation recreates deltas from scratch and could lead to data loss if it is interrupted ungracefully.
On my 8 GB computer, aggressive gc ran out of memory on a 1 GB repository with 10k small commits. When the OOM killer terminated the git process, it left me with an almost empty repository; only the working tree and a few deltas survived.
Of course, it was not the only copy of the repository, so I just recreated it and pulled from the remote (fetch did not work on the broken repo and deadlocked at the 'resolving deltas' step the few times I tried), but if your repo is a single-developer local repo without any remotes, back it up first.
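One possible way to take such a backup before an aggressive gc is a bundle file, which stores all refs and their objects in a single file. This is a sketch under assumptions, not the answer author's procedure: the scratch repository, demo identity, and the repo-backup.bundle name are invented for the example.

```shell
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git commit -q --allow-empty -m "work that exists only locally"

# Write every ref and the objects it needs into one file outside the repo.
backup=$(mktemp -d)/repo-backup.bundle
git bundle create "$backup" --all

# Confirm the bundle is well formed before touching the repository.
git bundle verify "$backup"
```

A bundle can later be cloned from like any other remote; git clone --mirror into a second directory is an alternative if a full working copy of the object store is preferred.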
Answer by VonC
Note: beware of using git gc --aggressive, as Git 2.22 (Q2 2019) clarifies.
See commit 0044f77, commit daecbf2, commit 7384504, commit 22d4e3b, commit 080a448, commit 54d56f5, commit d257e0f, commit b6a8d09 (07 Apr 2019), and commit fc559fb, commit cf9cd77, commit b11e856 (22 Mar 2019) by Ævar Arnfjörð Bjarmason (avar).
(Merged by Junio C Hamano -- gitster -- in commit ac70c53, 25 Apr 2019)
gc docs: downplay the usefulness of --aggressive
The existing "gc --aggressive" docs come just short of recommending to users that they run it regularly.
I've personally talked to many users who've taken these docs as advice to use this option, and have done so; usually it's (mostly) a waste of time.
So let's clarify what it really does, and let the user draw their own conclusions.
Let's also clarify the "The effects [...] are persistent" to paraphrase a brief version of Jeff King's explanation.
That means the git-gc documentation now includes:
AGGRESSIVE
When the --aggressive option is supplied, git-repack will be invoked with the -f flag, which in turn will pass --no-reuse-delta to git-pack-objects.
This will throw away any existing deltas and re-compute them, at the expense of spending much more time on the repacking.
The effects of this are mostly persistent, e.g. when packs and loose objects are coalesced into one another pack the existing deltas in that pack might get re-used, but there are also various cases where we might pick a sub-optimal delta from a newer pack instead.
Furthermore, supplying --aggressive will tweak the --depth and --window options passed to git-repack.
See the gc.aggressiveDepth and gc.aggressiveWindow settings below.
By using a larger window size we're more likely to find more optimal deltas.
It's probably not worth it to use this option on a given repository without running tailored performance benchmarks on it.
It takes a lot more time, and the resulting space/delta optimization may or may not be worth it. Not using this at all is the right trade-off for most users and their repositories.
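The two settings named in the documentation can be set per repository with git config. The snippet below is a small sketch on a scratch repository; the values are illustrative only, not a recommendation, and the temporary path is hypothetical.

```shell
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"

# Per-repository overrides picked up by "git gc --aggressive".
git config gc.aggressiveDepth 50
git config gc.aggressiveWindow 250

# Read the values back to confirm what an aggressive gc would use here.
depth=$(git config --get gc.aggressiveDepth)
window=$(git config --get gc.aggressiveWindow)
echo "aggressive depth=$depth window=$window"
```

Keeping these in the repository's own config, rather than on the command line, means any later git gc --aggressive in that repository uses the same trade-off.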
And (commit 080a448):
gc docs: note how --aggressive impacts --window & --depth
Since 07e7dbf (gc: default aggressive depth to 50, 2016-08-11, Git v2.10.1) we somewhat confusingly use the same depth under --aggressive as we do by default.
As noted in that commit, that makes sense: it was wrong to make more depth the default for "aggressive", and thus save disk space at the expense of runtime performance, which is usually the opposite of what someone who'd like "aggressive gc" wants.