git find fat commit

Notice: this page is a Chinese-English side-by-side translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/1286183/

Time: 2020-09-19 03:47:18  Source: igfitidea

Tags: git, statistics, find, commit

Asked by tig

Is it possible to get info about how much space is taken up by the changes in each commit, so that I can find commits which added big files or a lot of files? This is all to try to reduce the git repo size (by rebasing and maybe filtering commits).

Accepted answer by tig

Forgot to reply, my answer is:

git rev-list --all --pretty=format:'%H%n%an%n%s'    # list all commits (hash, author name, subject)
git diff-tree -r -c -M -C --no-commit-id "$sha"     # list the blobs changed by commit $sha
git cat-file --batch-check < blob_ids.txt           # print type and size of each blob id (blob_ids.txt is a placeholder: one id per line)
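
The three commands above can be chained into one pass over the history. The following is only a minimal sketch under some assumptions: commit_sizes is a made-up helper name, --root is added so the first commit is counted, the -c/-M/-C options are left out (so merge commits report 0 and a renamed file counts as newly added), and submodule entries are skipped.

```shell
#!/usr/bin/env bash
# Sketch: print "<total bytes> <commit sha>" for every commit reachable from any ref.
# commit_sizes is a hypothetical helper name, not a git command.
commit_sizes() {
  git rev-list --all | while read -r sha; do
    # Raw diff lines look like ":oldmode newmode oldsha newsha status<TAB>path".
    # Field 2 is the new mode (160000 = submodule), field 4 the new blob id
    # (all zeros when the file was deleted).
    total=$(git diff-tree -r --root --no-commit-id "$sha" |
      awk '$2 != "160000" && $4 !~ /^0+$/ { print $4 }' |
      git cat-file --batch-check='%(objectsize)' |
      awk '{ s += $1 } END { print s + 0 }')
    echo "$total $sha"
  done
}
```

Run inside a repository, `commit_sizes | sort -n` puts the heaviest commits at the bottom. The totals are uncompressed blob sizes, so they only approximate what a commit costs in the packed repository.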

Answer by Pat Notz

You could do this:

git ls-tree -r -t -l --full-name HEAD | sort -n -k 4

This will show the largest files at the bottom (the fourth column is the file (blob) size).

If you need to look at different branches you'll want to change HEAD to those branch names. Or, put this in a loop over the branches, tags, or revs you are interested in.
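
A loop over the local branches could look roughly like this. It is a sketch, not part of the original answer: largest_files_per_branch and its "top" argument are made-up names, and only refs/heads is walked (pass refs/tags or other patterns to for-each-ref to cover more).

```shell
#!/usr/bin/env bash
# Sketch: show the largest files of every local branch.
# largest_files_per_branch and "top" are hypothetical names.
largest_files_per_branch() {
  local top="${1:-5}"   # how many files to show per branch
  git for-each-ref --format='%(refname:short)' refs/heads | while read -r branch; do
    echo "== $branch"
    # Without -t only blobs are listed; column 4 is the blob size in bytes.
    git ls-tree -r -l --full-name "$branch" | sort -n -k 4 | tail -n "$top"
  done
}
```

Calling `largest_files_per_branch 10` prints a header line per branch followed by its ten largest files, biggest last.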

Answer by knocte

All of the solutions provided here focus on file sizes, but the original question was about commit sizes, which in my opinion, and in my case, were more important to find (because what I wanted was to get rid of many small binaries introduced in a single commit, which together accounted for a lot of space but looked small when measured file by file).

A solution that focuses on commit sizes is the one provided here, which is this perl script:

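#!/usr/bin/perl
use strict;
use warnings;

# For every commit, sum the sizes of the blobs it adds or modifies.
foreach my $rev (`git rev-list --all --pretty=oneline`) {
  my $tot = 0;
  (my $sha = $rev) =~ s/\s.*$//;   # keep only the commit hash
  foreach my $line (`git diff-tree -r -c -M -C --no-commit-id $sha`) {
    my $blob = (split /\s/, $line)[3];   # fourth field: blob id after the commit
    # String comparison (eq); a numeric == would treat most hashes as 0 and skip them.
    next if $blob eq "0000000000000000000000000000000000000000"; # Deleted
    my $size = `echo $blob | git cat-file --batch-check`;
    $size = (split /\s/, $size)[2];      # output is "<id> <type> <size>"
    $tot += int($size) if defined $size;
  }
  my $revn = substr($rev, 0, 40);
  # Uncomment the if block to report only commits above roughly 1 MB.
#  if ($tot > 1000000) {
    print "$tot $revn " . `git show --pretty="format:" --name-only $revn | wc -l`;
#  }
}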

I call it like this:

./git-commit-sizes.pl | sort -n -k 1

Answer by Michael Baltaks

Personally, I found this answer to be most helpful when trying to find large files in the history of a git repo: Find files in git repo over x megabytes, that don't exist in HEAD

Answer by Stas Dashkovsky

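#!/bin/bash
# Usage: pass the commit to measure as the first argument.
COMMITSHA="$1"

# Sum the blob sizes in the commit's tree and in its first parent's tree;
# the difference is how much the commit grew (or shrank) the tracked content.
CURRENTSIZE=$(git ls-tree -lrt "$COMMITSHA" | grep blob | sed -E "s/.{53} *([0-9]*).*/\1/g" | paste -sd+ - | bc)
PREVSIZE=$(git ls-tree -lrt "$COMMITSHA^" | grep blob | sed -E "s/.{53} *([0-9]*).*/\1/g" | paste -sd+ - | bc)
echo "$CURRENTSIZE - $PREVSIZE" | bc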

Answer by Caustic

git fat find N, where N is in bytes, will return all the files in the whole history which are larger than N bytes.

You can find out more about git-fat here: https://github.com/cyaninc/git-fat
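
If installing git-fat is not an option, a similar listing can probably be produced with core git commands alone. This sketch is not git-fat and does not reproduce its output format; blobs_larger_than is a made-up helper name.

```shell
#!/usr/bin/env bash
# Sketch: list every blob in the whole history that is larger than N bytes,
# using only core git commands (this is not git-fat itself).
# blobs_larger_than is a hypothetical helper name.
blobs_larger_than() {
  local min_bytes="$1"
  # rev-list --objects prints "<id> <path>"; cat-file adds type and size,
  # and %(rest) carries the path through.
  git rev-list --objects --all |
    git cat-file --batch-check='%(objecttype) %(objectsize) %(objectname) %(rest)' |
    awk -v min="$min_bytes" '$1 == "blob" && $2 > min {
      path = $0
      sub(/^[^ ]+ [^ ]+ [^ ]+ /, "", path)   # keep paths that contain spaces intact
      print $2, $3, path
    }' |
    sort -n
}
```

For example, `blobs_larger_than 1000000` prints size, blob id and path for every blob over about 1 MB, including files that were deleted long ago but still live in the history.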

Answer by artagnon

git cat-file -s <object> prints the size of an object in bytes, where <object> can refer to a commit, blob, tree, or tag.
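
A small sketch of how this might be used; object_size is a made-up helper name. Note that for a commit, the reported size is that of the commit object itself (its metadata and message), not of the files the commit introduces.

```shell
#!/usr/bin/env bash
# Sketch: print "<type> <size in bytes>" for any object name git can resolve.
# object_size is a hypothetical helper name.
object_size() {
  echo "$(git cat-file -t "$1") $(git cat-file -s "$1")"
}
```

For example, `object_size HEAD` reports the commit object, `object_size 'HEAD^{tree}'` its top-level tree, and `object_size HEAD:path/to/file` (with a path of your own) the blob stored at that path.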
