git find fat commit
Original question: http://stackoverflow.com/questions/1286183/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on Stack Overflow.
Asked by tig
Is it possible to get information about how much space the changes in each commit take up, so that I can find commits that added big files or a lot of files? The goal is to reduce the size of the git repo (by rebasing and maybe filtering commits).
Accepted answer by tig
Forgot to reply, my answer is:
git rev-list --all --pretty=format:'%H%n%an%n%s'    # list all commits (hash, author name, subject)
git diff-tree -r -c -M -C --no-commit-id <sha>      # list the blobs changed by each commit
git cat-file --batch-check                          # reads blob IDs on stdin, prints the type and size of each
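The three steps above can be chained into one pipeline. The following is a minimal sketch, not the answerer's own script: the function names are hypothetical, the sizes are uncompressed blob sizes rather than on-disk pack sizes, and the root commit and merge commits get no special treatment.

```shell
#!/bin/sh
# Sum the sizes in `git cat-file --batch-check` output,
# which has one "<sha> <type> <size>" line per object.
sum_blob_sizes() {
    awk '$2 == "blob" { total += $3 } END { print total + 0 }'
}

# Total bytes of the blobs added or modified by one commit.
# Column 4 of diff-tree output is the new blob ID; all zeros means "deleted".
commit_added_bytes() {
    git diff-tree -r -c -M -C --no-commit-id "$1" |
        awk 'NF >= 4 && $4 !~ /^0+$/ { print $4 }' |
        git cat-file --batch-check |
        sum_blob_sizes
}

# Print "<bytes> <sha>" for every commit, smallest first.
list_commit_sizes() {
    git rev-list --all | while read -r sha; do
        printf '%s %s\n' "$(commit_added_bytes "$sha")" "$sha"
    done | sort -n
}
```

Running list_commit_sizes inside a repository puts the heaviest commits at the bottom of the output.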
Answer by Pat Notz
You could do this:
git ls-tree -r -t -l --full-name HEAD | sort -n -k 4
This will show the largest files at the bottom (the fourth column is the file (blob) size).
If you need to look at different branches, change HEAD to those branch names. Or put this in a loop over the branches, tags, or revs you are interested in.
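The loop over branches could look something like this. It is a sketch under assumptions: the function names are hypothetical, only local branches (refs/heads) are visited, and five entries per branch is an arbitrary choice.

```shell
#!/bin/sh
# Keep the N largest entries from `git ls-tree -l` output read on stdin
# (the fourth column is the blob size).
top_by_size() {
    sort -n -k 4 | tail -n "${1:-5}"
}

# Show the largest files in one branch, tag, or rev.
largest_in_ref() {
    git ls-tree -r -t -l --full-name "$1" | top_by_size "${2:-5}"
}

# Repeat for every local branch.
largest_per_branch() {
    git for-each-ref --format='%(refname:short)' refs/heads |
        while read -r branch; do
            echo "== $branch"
            largest_in_ref "$branch" 5
        done
}
```

Calling largest_per_branch inside a repository prints one section per branch, with the largest file last in each section.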
Answer by knocte
All of the solutions provided here focus on file sizes, but the original question was about commit sizes, which in my case was the more important thing to find: I wanted to get rid of many small binaries introduced in a single commit, which together took up a lot of space even though each file was small when measured individually.
A solution that focuses on commit sizes is provided here, in this Perl script:
#!/usr/bin/perl
# Prints "<total bytes> <sha> <number of files>" for every commit.
foreach my $rev (`git rev-list --all --pretty=oneline`) {
    my $tot = 0;
    # Keep only the hash: drop everything from the first whitespace on
    (my $sha = $rev) =~ s/\s.*$//;
    foreach my $blob (`git diff-tree -r -c -M -C --no-commit-id $sha`) {
        # The fourth field of a diff-tree line is the new blob ID
        $blob = (split /\s/, $blob)[3];
        # String comparison (eq, not ==): an all-zero ID means the file was deleted
        next if $blob eq "0000000000000000000000000000000000000000";
        my $size = `echo $blob | git cat-file --batch-check`;
        $size = (split /\s/, $size)[2];
        $tot += int($size);
    }
    my $revn = substr($rev, 0, 40);
    # if ($tot > 1000000) {
    print "$tot $revn " . `git show --pretty="format:" --name-only $revn | wc -l`;
    # }
}
I call it like this:
./git-commit-sizes.pl | sort -n -k 1
Answer by Michael Baltaks
Personally, I found this answer the most helpful when trying to find large files in the history of a git repo: "Find files in git repo over x megabytes, that don't exist in HEAD"
Answer by Stas Dashkovsky
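#!/bin/bash
# Commit to measure (assumed here to be passed as the first argument)
COMMITSHA=$1
# Total size of all blobs in the commit's tree: keep only the size column, join with "+", sum with bc
CURRENTSIZE=$(git ls-tree -lrt "$COMMITSHA" | grep blob | sed -E "s/.{53} *([0-9]*).*/\1/g" | paste -sd+ - | bc)
# Same total for the parent commit
PREVSIZE=$(git ls-tree -lrt "$COMMITSHA^" | grep blob | sed -E "s/.{53} *([0-9]*).*/\1/g" | paste -sd+ - | bc)
# Net size change introduced by the commit
echo "$CURRENTSIZE - $PREVSIZE" | bc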
Answer by Caustic
git fat find N
where N is in bytes, will return all the files in the whole history that are larger than N bytes.
You can find out more about git-fat here: https://github.com/cyaninc/git-fat
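If installing git-fat is not an option, a similar listing can be approximated with plain git commands. This is not how git-fat works internally. It is only a sketch using `git rev-list --objects` and `git cat-file --batch-check`, and the function names are hypothetical.

```shell
#!/bin/sh
# Keep "blob <sha> <size> <path>" lines whose size exceeds the given byte count.
filter_larger_than() {
    awk -v min="$1" '$1 == "blob" && $3 > min'
}

# List every blob in the whole history that is larger than N bytes, smallest first.
find_big_blobs() {
    git rev-list --objects --all |
        git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' |
        filter_larger_than "$1" |
        sort -n -k 3
}
```

For example, find_big_blobs 1000000 lists blobs over roughly 1 MB. The sizes are uncompressed object sizes.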
Answer by artagnon
git cat-file -s <object>
where <object> can refer to a commit, blob, tree, or tag.