git gc --aggressive vs git repack

Notice: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/28720151/

Retrieved: 2020-09-09 02:57:48. Source: igfitidea


Tags: git, github, version-control

Asked by Ajith R Nayak

I'm looking for ways to reduce the size of a git repository. Searching leads me to git gc --aggressive most of the time. I have also read that this isn't the preferred approach.

Why? What should I be aware of if I'm running gc --aggressive?

git repack -a -d --depth=250 --window=250 is recommended over gc --aggressive. Why? How does repack reduce the size of a repository? Also, I'm not quite clear about the flags --depth and --window.

What should I choose between gc and repack? When should I use gc and repack?

Answered by Greg Bacon

Nowadays there is no difference: git gc --aggressive operates according to the suggestion Linus made in 2007; see below. As of version 2.11 (Q4 2016), git gc --aggressive defaults to a depth of 50. A window of size 250 is good because each object is compared against more delta candidates, but a depth of 250 is bad because it makes every chain refer to very deep old objects, which slows down all future git operations for marginally lower disk usage.



Historical Background


Linus suggested (see below for the full mailing list post) using git gc --aggressive only when you have, in his words, “a really bad pack” or “really horribly bad deltas,” however “almost always, in other cases, it's actually a really bad thing to do.” The result may even leave your repository in worse condition than when you started!

The command he suggests for doing this properly after having imported “a long and involved history” is


git repack -a -d -f --depth=250 --window=250

But this assumes you have already removed unwanted gunk from your repository history and that you have followed the checklist for shrinking a repository found in the git filter-branch documentation.

git-filter-branch can be used to get rid of a subset of files, usually with some combination of --index-filter and --subdirectory-filter. People expect the resulting repository to be smaller than the original, but you need a few more steps to actually make it smaller, because Git tries hard not to lose your objects until you tell it to. First make sure that:

  • You really removed all variants of a filename, if a blob was moved over its lifetime. git log --name-only --follow --all -- filename can help you find renames.

  • You really filtered all refs: use --tag-name-filter cat -- --all when calling git filter-branch.

Then there are two ways to get a smaller repository. A safer way is to clone, which keeps your original intact.

  • Clone it with git clone file:///path/to/repo. The clone will not have the removed objects. See git-clone. (Note that cloning with a plain path just hardlinks everything!)
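A minimal sketch of the difference between the two clone forms, assuming a throwaway repository created under mktemp (the directory names, file names, and identity are illustrative, not from the original answer). It plants one unreachable loose object, standing in for something a filter removed from history, and checks whether each kind of clone carries it along.

```shell
#!/bin/sh
set -e
work=$(mktemp -d)
git init -q "$work/src"
cd "$work/src"
git config user.name "Example"
git config user.email "example@example.invalid"
echo "kept" > kept.txt
git add kept.txt
git commit -q -m "initial"

# Write a blob that no commit references (stands in for a filtered-out object).
orphan=$(echo "removed content" | git hash-object -w --stdin)

cd "$work"
git clone -q "file://$work/src" via_url    # goes through the normal transport
git clone -q "$work/src" via_path          # local clone: copies or hardlinks objects/

if git -C via_url cat-file -e "$orphan" 2>/dev/null; then in_url=present; else in_url=missing; fi
if git -C via_path cat-file -e "$orphan" 2>/dev/null; then in_path=present; else in_path=missing; fi
echo "file:// clone: $in_url, plain-path clone: $in_path"
```

The file:// form only transfers objects reachable from the refs, which is why it is the one that actually shrinks the copy.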

If you really don't want to clone it, for whatever reasons, check the following points instead (in this order). This is a very destructive approach, so make a backup or go back to cloning it. You have been warned.

  • Remove the original refs backed up by git-filter-branch: say

    git for-each-ref --format="%(refname)" refs/original/ |
      xargs -n 1 git update-ref -d

  • Expire all reflogs with git reflog expire --expire=now --all.

  • Garbage collect all unreferenced objects with git gc --prune=now (or if your git gc is not new enough to support arguments to --prune, use git repack -ad; git prune instead).
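The order of the in-place steps matters, and a small experiment may make the reason visible. The sketch below assumes a throwaway repository under mktemp; the branch name side and the file big.bin are hypothetical. It deletes a branch holding an unwanted file, then shows that gc alone keeps the blob (the HEAD reflog still references the commit) while expiring the reflogs first lets gc drop it.

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example"
git config user.email "example@example.invalid"

echo "keep" > keep.txt
git add keep.txt
git commit -q -m "first"
main=$(git symbolic-ref --short HEAD)

# Commit an unwanted file on a side branch, then delete the branch.
git checkout -q -b side
echo "unwanted payload" > big.bin
git add big.bin
git commit -q -m "add unwanted file"
blob=$(git rev-parse HEAD:big.bin)
git checkout -q "$main"
git branch -q -D side

exists() { if git cat-file -e "$1" 2>/dev/null; then echo present; else echo missing; fi; }

# gc alone keeps the blob: the HEAD reflog still points at the deleted commit.
git gc --quiet --prune=now
after_gc=$(exists "$blob")

# Expire the reflogs first, then garbage collect again.
git reflog expire --expire=now --all
git gc --quiet --prune=now
after_expire=$(exists "$blob")

echo "after gc only: $after_gc; after reflog expire + gc: $after_expire"
```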



Date: Wed, 5 Dec 2007 22:09:12 -0800 (PST)
From: Linus Torvalds <torvalds at linux-foundation dot org>
To: Daniel Berlin <dberlin at dberlin dot org>
cc: David Miller <davem at davemloft dot net>,
    ismail at pardus dot org dot tr,
    gcc at gcc dot gnu dot org,
    git at vger dot kernel dot org
Subject: Re: Git and GCC

On Thu, 6 Dec 2007, Daniel Berlin wrote:

Actually, it turns out that git-gc --aggressive does this dumb thing to pack files sometimes regardless of whether you converted from an SVN repo or not.

Absolutely. git --aggressive is mostly dumb. It's really only useful for the case of “I know I have a really bad pack, and I want to throw away all the bad packing decisions I have done.”

To explain this, it's worth explaining (you are probably aware of it, but let me go through the basics anyway) how git delta-chains work, and how they are so different from most other systems.

In other SCMs, a delta-chain is generally fixed. It might be “forwards” or “backwards,” and it might evolve a bit as you work with the repository, but generally it's a chain of changes to a single file represented as some kind of single SCM entity. In CVS, it's obviously the *,v file, and a lot of other systems do rather similar things.

Git also does delta-chains, but it does them a lot more “loosely.” There is no fixed entity. Deltas are generated against any random other version that git deems to be a good delta candidate (with various fairly successful heuristics), and there are absolutely no hard grouping rules.

This is generally a very good thing. It's good for various conceptual reasons (i.e., git internally never really even needs to care about the whole revision chain; it doesn't really think in terms of deltas at all), but it's also great because getting rid of the inflexible delta rules means that git doesn't have any problems at all with merging two files together, for example: there simply are no arbitrary *,v “revision files” that have some hidden meaning.

It also means that the choice of deltas is a much more open-ended question. If you limit the delta chain to just one file, you really don't have a lot of choices on what to do about deltas, but in git, it really can be a totally different issue.

And this is where the really badly named --aggressive comes in. While git generally tries to re-use delta information (because it's a good idea, and it doesn't waste CPU time re-finding all the good deltas we found earlier), sometimes you want to say “let's start all over, with a blank slate, and ignore all the previous delta information, and try to generate a new set of deltas.”

So --aggressive is not really about being aggressive, but about wasting CPU time re-doing a decision we already did earlier!

Sometimes that is a good thing. Some import tools in particular could generate really horribly bad deltas. Anything that uses git fast-import, for example, likely doesn't have much of a great delta layout, so it might be worth saying “I want to start from a clean slate.”

But almost always, in other cases, it's actually a really bad thing to do. It's going to waste CPU time, and especially if you had actually done a good job at deltaing earlier, the end result isn't going to re-use all those good deltas you already found, so you'll actually end up with a much worse end result too!

I'll send a patch to Junio to just remove the git gc --aggressive documentation. It can be useful, but it generally is useful only when you really understand at a very deep level what it's doing, and that documentation doesn't help you do that.

Generally, doing incremental git gc is the right approach, and better than doing git gc --aggressive. It's going to re-use old deltas, and when those old deltas can't be found (the reason for doing incremental GC in the first place!) it's going to create new ones.

On the other hand, it's definitely true that an “initial import of a long and involved history” is a point where it can be worth spending a lot of time finding the really good deltas. Then, every user ever after (as long as they don't use git gc --aggressive to undo it!) will get the advantage of that one-time event. So especially for big projects with a long history, it's probably worth doing some extra work, telling the delta finding code to go wild.

So the equivalent of git gc --aggressive (but done properly) is to do (overnight) something like

git repack -a -d --depth=250 --window=250

where that depth thing is just about how deep the delta chains can be (make them longer for old history — it's worth the space overhead), and the window thing is about how big an object window we want each delta candidate to scan.

And here, you might well want to add the -f flag (which is the “drop all old deltas” flag), since you now are actually trying to make sure that this one actually finds good candidates.

And then it's going to take forever and a day (i.e., a “do it overnight” thing). But the end result is that everybody downstream from that repository will get much better packs, without having to spend any effort on it themselves.

          Linus
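To see what the depth setting from the mail controls in practice, one option is to repack a small test repository with a low --depth and read the chain-length histogram that git verify-pack -v prints. This is only a sketch under stated assumptions: the repository, the file data.txt, the 30 revisions, and the depth of 3 are all made up for illustration, and tiny repositories will not show the trade-offs a large import would.

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example"
git config user.email "example@example.invalid"

# Build 30 revisions of one file so that pack-objects has something to delta.
awk 'BEGIN { for (i = 1; i <= 400; i++) print "line " i }' > data.txt
i=1
while [ "$i" -le 30 ]; do
    echo "revision $i" >> data.txt
    git add data.txt
    git commit -q -m "revision $i"
    i=$((i + 1))
done

depth=3
# -f drops existing deltas; --depth caps how long any delta chain may get.
git repack -a -d -f -q --depth="$depth" --window=50

# verify-pack -v ends with a histogram such as "chain length = 2: 7 objects".
max_chain=$(git verify-pack -v .git/objects/pack/pack-*.idx |
    awk '/^chain length/ { n = $4 + 0; if (n > max) max = n } END { print max + 0 }')
echo "requested depth: $depth, longest chain found: $max_chain"
```

Re-running the repack with a larger --depth and comparing the histogram is a cheap way to check how much extra chain length a given repository would really use.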

Answered by VonC

When should I use gc & repack?


As I mentioned in "Git Garbage collection doesn't seem to fully work", git gc --aggressive is not sufficient on its own.
And, as I explain below, it is often not needed.

The most effective combination would be adding git repack, but also git prune:


git gc
git repack -Ad      # kills in-pack garbage
git prune           # kills loose garbage


Note: Git 2.11 (Q4 2016) set the default gc aggressive depth to 50.

See commit 07e7dbf (11 Aug 2016) by Jeff King (peff).
(Merged by Junio C Hamano -- gitster -- in commit 0952ca8, 21 Sep 2016)

gc: default aggressive depth to 50

"git gc --aggressive" used to limit the delta-chain length to 250, which is way too deep for gaining additional space savings and is detrimental for runtime performance.
The limit has been reduced to 50.

The summary is: the current default of 250 doesn't save much space, and costs CPU. It's not a good tradeoff.

The "--aggressive" flag to git-gc does three things:

  1. use "-f" to throw out existing deltas and recompute from scratch
  2. use "--window=250" to look harder for deltas
  3. use "--depth=250" to make longer delta chains

Items (1) and (2) are good matches for an "aggressive" repack.
They ask the repack to do more computation work in the hopes of getting a better pack. You pay the costs during the repack, and other operations see only the benefit.

Item (3) is not so clear.
Allowing longer chains means fewer restrictions on the deltas, which means potentially finding better ones and saving some space.
But it also means that operations which access the deltas have to follow longer chains, which affects their performance.
So it's a tradeoff, and it's not clear that the tradeoff is even a good one.


(See commit for study)


You can see that the CPU savings for regular operations improve as we decrease the depth.
But we can also see that the space savings are not that great as the depth goes higher. Saving 5-10% between 10 and 50 is probably worth the CPU tradeoff. Saving 1% to go from 50 to 100, or another 0.5% to go from 100 to 250 is probably not.



Speaking of CPU saving, "git repack" learned to accept the --threads=<n> option and pass it to pack-objects.

See commit 40bcf31 (26 Apr 2017) by Junio C Hamano (gitster).
(Merged by Junio C Hamano -- gitster -- in commit 31fb6f4, 29 May 2017)

repack: accept --threads=<n> and pass it down to pack-objects

We already do so for --window=<n> and --depth=<n>; this will help when the user wants to force --threads=1 for reproducible testing without getting affected by racing multiple threads.
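As a rough illustration of the option described in that commit, the sketch below runs a full single-threaded repack in a scratch repository and then reads back the object counts. The repository contents and the window and depth values are assumptions chosen for the example, and --threads requires a Git release that includes the commit above.

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example"
git config user.email "example@example.invalid"
echo "hello" > a.txt
git add a.txt
git commit -q -m "initial"

# Single-threaded full repack; --threads is forwarded to git pack-objects.
git repack -a -d -f -q --threads=1 --window=250 --depth=50

packs=$(git count-objects -v | awk '/^packs:/ { print $2 }')
loose=$(git count-objects -v | awk '/^count:/ { print $2 }')
echo "packs: $packs, loose objects: $loose"
```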

Answered by Sascha Wolf

The problem with git gc --aggressive is that the option name and documentation are misleading.

As Linus himself explains in this mail, what git gc --aggressive basically does is this:

While git generally tries to re-use delta information (because it's a good idea, and it doesn't waste CPU time re-finding all the good deltas we found earlier), sometimes you want to say "let's start all over, with a blank slate, and ignore all the previous delta information, and try to generate a new set of deltas".


Usually there is no need to recalculate deltas in git, since git determines these deltas very flexibly. It only makes sense if you know that you have really, really bad deltas. As Linus explains, mainly tools which make use of git fast-import fall into this category.

Most of the time git does a pretty good job at determining useful deltas, and using git gc --aggressive will leave you with deltas which are potentially even worse while wasting a lot of CPU time.



Linus ends his mail with the conclusion that git repack with a large --depth and --window is the better choice most of the time, especially after you have imported a large project and want to make sure that git finds good deltas.

So the equivalent of git gc --aggressive - but done properly - is to do (overnight) something like

git repack -a -d --depth=250 --window=250

where that depth thing is just about how deep the delta chains can be (make them longer for old history - it's worth the space overhead), and the window thing is about how big an object window we want each delta candidate to scan.

And here, you might well want to add the -f flag (which is the "drop all old deltas" flag), since you now are actually trying to make sure that this one actually finds good candidates.
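Before committing to an overnight repack, it can help to measure what it actually buys. The following is a sketch, assuming a scratch repository with 40 synthetic revisions of a hypothetical data.txt; it uses git count-objects -v to compare the state before and after a full repack with -f. The depth is set to 50 rather than the 250 quoted above, following the Git 2.11 change discussed elsewhere in this thread; treat both numbers as tunables.

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example"
git config user.email "example@example.invalid"

# 40 revisions of a growing file, stored as loose objects at first.
awk 'BEGIN { for (i = 1; i <= 500; i++) print "line " i }' > data.txt
i=1
while [ "$i" -le 40 ]; do
    echo "revision $i" >> data.txt
    git add data.txt
    git commit -q -m "revision $i"
    i=$((i + 1))
done

# Pull one field (e.g. "count" or "size-pack") out of git count-objects -v.
stat_of() { git count-objects -v | awk -v key="$1:" '$1 == key { print $2 }'; }

loose_before=$(stat_of count)
git repack -a -d -f -q --depth=50 --window=250
loose_after=$(stat_of count)
packs_after=$(stat_of packs)
size_pack=$(stat_of size-pack)

echo "loose objects: $loose_before -> $loose_after"
echo "packs: $packs_after, size-pack: ${size_pack} KiB"
```

On a real repository, recording size-pack for a few depth and window combinations gives a concrete basis for choosing between them.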

Answered by Sage Pointer

Caution. Do not run git gc --aggressive on a repository that is not synchronized with a remote if you have no backups.

This operation recreates deltas from scratch and could lead to data loss if it is interrupted ungracefully.

On my 8 GB computer, aggressive gc ran out of memory on a 1 GB repository with 10k small commits. When the OOM killer terminated the git process, it left me with an almost empty repository; only the working tree and a few deltas survived.

Of course, it was not the only copy of the repository, so I just recreated it and pulled from the remote (fetch did not work on the broken repo and deadlocked at the 'resolving deltas' step the few times I tried), but if your repo is a single-developer local repo with no remotes at all, back it up first.

Answered by VonC

Note: beware of using git gc --aggressive, as Git 2.22 (Q2 2019) clarifies.


See commit 0044f77, commit daecbf2, commit 7384504, commit 22d4e3b, commit 080a448, commit 54d56f5, commit d257e0f, commit b6a8d09 (07 Apr 2019), and commit fc559fb, commit cf9cd77, commit b11e856 (22 Mar 2019) by Ævar Arnfjörð Bjarmason (avar).
(Merged by Junio C Hamano -- gitster -- in commit ac70c53, 25 Apr 2019)

gc docs: downplay the usefulness of --aggressive

The existing "gc --aggressive" docs come just short of recommending to users that they run it regularly.
I've personally talked to many users who've taken these docs as an advice to use this option, and have, usually it's (mostly) a waste of time.

So let's clarify what it really does, and let the user draw their own conclusions.

Let's also clarify the "The effects [...] are persistent" to paraphrase a brief version of Jeff King's explanation.


That means the git-gc documentation now includes:


AGGRESSIVE

When the --aggressive option is supplied, git-repack will be invoked with the -f flag, which in turn will pass --no-reuse-delta to git-pack-objects.
This will throw away any existing deltas and re-compute them, at the expense of spending much more time on the repacking.

The effects of this are mostly persistent, e.g. when packs and loose objects are coalesced into one another pack the existing deltas in that pack might get re-used, but there are also various cases where we might pick a sub-optimal delta from a newer pack instead.

Furthermore, supplying --aggressive will tweak the --depth and --window options passed to git-repack.
See the gc.aggressiveDepth and gc.aggressiveWindow settings below.
By using a larger window size we're more likely to find more optimal deltas.

It's probably not worth it to use this option on a given repository without running tailored performance benchmarks on it.
It takes a lot more time, and the resulting space/delta optimization may or may not be worth it. Not using this at all is the right trade-off for most users and their repositories.
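The two settings named in that documentation excerpt can be set per repository, which makes it possible to benchmark different values without touching global configuration. A minimal sketch, assuming a scratch repository; the values shown are illustrative only, not recommendations.

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Repository-local overrides; the values are illustrative, not recommendations.
git config gc.aggressiveWindow 250
git config gc.aggressiveDepth 50

window=$(git config --get gc.aggressiveWindow)
depth=$(git config --get gc.aggressiveDepth)
echo "gc --aggressive would use window=$window depth=$depth"
```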


And (commit 080a448):


gc docs: note how --aggressive impacts --window & --depth

Since 07e7dbf (gc: default aggressive depth to 50, 2016-08-11, Git v2.10.1) we somewhat confusingly use the same depth under --aggressive as we do by default.

As noted in that commit, that makes sense: it was wrong to make more depth the default for "aggressive", and thus save disk space at the expense of runtime performance, which is usually the opposite of what someone who'd like "aggressive gc" wants.
