Original question: http://stackoverflow.com/questions/4589731/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Git Blame Commit Statistics
Asked by Erik Aigner
How can I "abuse" blame (or some better suited function, and/or in conjunction with shell commands) to give me a statistic of how many lines (of code) currently in the repository originate from each committer?
Example Output:
Committer 1: 8046 Lines
Committer 2: 4378 Lines
Accepted answer by Alex
Update
git ls-tree -r -z --name-only HEAD -- */*.c | xargs -0 -n1 git blame \
--line-porcelain HEAD |grep "^author "|sort|uniq -c|sort -nr
I updated some things on the way.
For convenience, you can also put this into its own command:
#!/bin/bash
# Save as e.g. git-authors and set the executable flag.
# $1 is an optional path pattern; it is left unquoted so that an omitted argument expands to nothing.
git ls-tree -r -z --name-only HEAD -- $1 | xargs -0 -n1 git blame \
--line-porcelain HEAD | grep "^author " | sort | uniq -c | sort -nr
Store this somewhere in your path, or modify your path, and use it like this:
git authors '*/*.c' # look for all files recursively ending in .c
git authors '*/*.[ch]' # look for all files recursively ending in .c or .h
git authors 'Makefile' # just count lines of authors in the Makefile
Original Answer
While the accepted answer does the job, it's very slow.
$ git ls-tree --name-only -z -r HEAD|egrep -z -Z -E '\.(cc|h|cpp|hpp|c|txt)$' \
|xargs -0 -n1 git blame --line-porcelain|grep "^author "|sort|uniq -c|sort -nr
is almost instantaneous.
To get a list of files currently tracked you can use
git ls-tree --name-only -r HEAD
This solution avoids calling file to determine the filetype and uses grep to match the wanted extension for performance reasons. If all files should be included, just remove this from the line.
grep -E '\.(cc|h|cpp|hpp|c)$' # for C/C++ files
grep -E '\.py$' # for Python files
If the file names can contain spaces, which are bad for shells, you can use:
git ls-tree -z --name-only -r HEAD | egrep -Z -z '\.py'|xargs -0 ... # passes '\0' instead of newlines as the separator
Given a list of files (through a pipe), one can use xargs to call a command and distribute the arguments. Commands that allow multiple files to be processed omit the -n1. In this case we call git blame --line-porcelain, and for every call we use exactly 1 argument.
xargs -n1 git blame --line-porcelain
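To see what -n1 changes, here is a small sketch that uses echo as a stand-in for git blame, so it runs outside a repository. The function names and file names are made up for illustration, and printf '\0' emits the NUL separators that git ls-tree -z would produce.

```shell
# Hypothetical NUL-separated file list, like `git ls-tree -z --name-only` output.
list_files() {
    printf 'a.c\0dir/b c.c\0'
}

# With -n1, xargs runs the command once per file name.
one_per_call() {
    list_files | xargs -0 -n1 echo blame
}

# Without -n1, xargs packs the names into as few calls as possible (one here).
all_in_one_call() {
    list_files | xargs -0 echo blame
}

one_per_call
all_in_one_call
```

Because the names are NUL-separated, the space in the second file name does not split it into two arguments.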
We then filter the output for occurrences of "author ", sort the list, and count duplicate lines with:
grep "^author "|sort|uniq -c|sort -nr
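As a rough sketch of this counting stage, the function below reads git blame --line-porcelain output on standard input and prints one line per author. count_authors is a made-up name, not a git command, and the sed step that strips the "author " prefix is my addition rather than part of the original pipeline.

```shell
# count_authors: hypothetical helper, not a git command.
# Reads `git blame --line-porcelain` output on stdin and prints
# "<count> <author>" lines, highest count first.
count_authors() {
    grep '^author ' |          # keep only the "author <name>" header lines
        sed 's/^author //' |   # drop the prefix (an addition to the original pipeline)
        sort |                 # group identical names so uniq can count them
        uniq -c |              # count each run of identical names
        sort -nr               # largest count first
}
```

It could be used as git blame --line-porcelain path/to/file | count_authors. Source lines in porcelain output start with a tab, so a line of code that happens to begin with "author " is not counted as a header.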
Note
Other answers actually filter out lines that contain only whitespace.
grep -Pzo "author [^\n]*\n([^\n]*\n){10}[\w]*[^\w]"|grep "author "
The command above will print authors of lines containing at least one non-whitespace character. You can also use the match \w*[^\w#], which will also exclude lines where the first non-whitespace character is a # (comment in many scripting languages).
Answer by Linus Oleander
I wrote a gem called git-fame that might be useful.
Installation and usage:
$ gem install git_fame
$ cd /path/to/gitdir
$ git fame
Output:
Statistics based on master
Active files: 21
Active lines: 967
Total commits: 109
Note: Files matching MIME type image, binary has been ignored
+----------------+-----+---------+-------+---------------------+
| name           | loc | commits | files | distribution (%)    |
+----------------+-----+---------+-------+---------------------+
| Linus Oleander | 914 | 106     | 21    | 94.5 / 97.2 / 100.0 |
| f1yegor        | 47  | 2       | 7     | 4.9 / 1.8 / 33.3    |
| David Selassie | 6   | 1       | 2     | 0.6 / 0.9 / 9.5     |
+----------------+-----+---------+-------+---------------------+
Answer by Edward Anderson
git ls-tree -r HEAD|sed -re 's/^.{53}//'|while read filename; do file "$filename"; done|grep -E ': .*text'|sed -r -e 's/: .*//'|while read filename; do git blame -w "$filename"; done|sed -r -e 's/.*\((.*)[0-9]{4}-[0-9]{2}-[0-9]{2} .*/\1/' -e 's/ +$//'|sort|uniq -c
Step by step explanation:
List all the files under version control
git ls-tree -r HEAD|sed -re 's/^.{53}//'
Prune the list down to only text files
|while read filename; do file "$filename"; done|grep -E ': .*text'|sed -r -e 's/: .*//'
Git blame all the text files, ignoring whitespace changes
|while read filename; do git blame -w "$filename"; done
Pull out the author names
|sed -r -e 's/.*\((.*)[0-9]{4}-[0-9]{2}-[0-9]{2} .*/\1/' -e 's/ +$//'
Sort the list of authors, and have uniq count the number of consecutively repeating lines
|sort|uniq -c
Example output:
1334 Maneater
1924 Another guy
37195 Brian Ruby
1482 Anna Lambda
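The first step strips a fixed 53 characters because each git ls-tree line starts with a 6-digit mode, a 4-letter type such as blob, and a 40-character SHA-1 hash, separated by two spaces and followed by a tab. As a hedged alternative that is not part of the original answer, cutting at the tab does not depend on those widths. strip_prefix and sample_line are made-up names, and the sample line is invented:

```shell
# strip_prefix: hypothetical helper that keeps everything after the first tab,
# i.e. the path column of `git ls-tree -r HEAD` output.
strip_prefix() {
    cut -f2-
}

# Invented sample line: mode, type, 40-character hash, tab, path.
sample_line() {
    printf '100644 blob %s\t%s\n' \
        '0123456789abcdef0123456789abcdef01234567' 'src/my file.c'
}

sample_line | sed -E 's/^.{53}//'   # fixed-width approach from the answer
sample_line | strip_prefix          # tab-based alternative
```

Both commands print the path for this sample. Using git ls-tree -r --name-only HEAD, as the accepted answer does, avoids the stripping step altogether.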
Answer by adius
git summary provided by the git-extras package is exactly what you need. Check out the documentation at git-extras - git-summary:
git summary --line
Gives output that looks like this:
project : TestProject
lines : 13397
authors :
8927 John Doe 66.6%
4447 Jane Smith 33.2%
23 Not Committed Yet 0.2%
Answer by gtd
Erik's solution was awesome, but I had some problems with diacritics (despite my LC_* environment variables being set ostensibly correctly) and noise leaking through on lines of code that actually had dates in them. My sed-fu is poor, so I ended up with this Frankenstein snippet with Ruby in it, but it works for me flawlessly on 200,000+ LOC, and it sorts the results:
git ls-tree -r HEAD | gsed -re 's/^.{53}//' | \
while read filename; do file "$filename"; done | \
grep -E ': .*text' | gsed -r -e 's/: .*//' | \
while read filename; do git blame "$filename"; done | \
ruby -ne 'puts $1.strip if $_ =~ /^\w{8} \((.*?)\s*\d{4}-\d{2}-\d{2}/' | \
sort | uniq -c | sort -rg
Also note gsed instead of sed, because that's the binary Homebrew installs, leaving the system sed intact.
Answer by moinudin
Answer by ThorSummoner
Here is the primary snippet from @Alex's answer that actually does the operation of aggregating the blame lines. I've cut it down to operate on a single file rather than a set of files.
git blame --line-porcelain path/to/file.txt | grep "^author " | sort | uniq -c | sort -nr
I post this here because I come back to this answer often, and re-reading the post and re-digesting the examples to extract the portion I value is taxing. Nor is it generic enough for my use case; its scope is a whole C project.
I like to list stats per file, achieved with a bash for iterator instead of xargs, as I find xargs less readable and hard to use/memorize. The advantages/disadvantages of xargs vs for should be discussed elsewhere.
Here is a practical snippet that will show results for each file individually:
for file in $(git ls-files); do \
echo "$file"; \
git blame --line-porcelain "$file" \
| grep "^author " | sort | uniq -c | sort -nr; \
echo; \
done
And I tested: running this straight in a bash shell is ctrl+c safe. If you need to put this inside a bash script, you might need to trap SIGINT and SIGTERM if you want the user to be able to break your for loop.
Answer by Ivan
Check out the gitstats command available from http://gitstats.sourceforge.net/
Answer by Gabriel Diego
I have this solution that counts the blamed lines in all text files (excluding the binary files, even the versioned ones):
IFS=$'\n'
for file in $(git ls-files); do
git blame `git symbolic-ref --short HEAD` --line-porcelain "$file" | \
grep "^author " | \
grep -v "Binary file (standard input) matches" | \
grep -v "Not Committed Yet" | \
cut -d " " -f 2-
done | \
sort | \
uniq -c | \
sort -nr
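For the case mentioned earlier in the thread, where such a loop lives inside a script and should stop on Ctrl+C, a minimal sketch of trapping SIGINT and SIGTERM might look like this. process_files is a made-up name, and echo stands in for the git blame pipeline so the sketch runs anywhere.

```shell
# process_files: hypothetical helper that reads one file name per line on stdin.
# echo stands in for `git blame --line-porcelain "$file" | ...`.
process_files() {
    # On Ctrl+C (SIGINT) or SIGTERM, report and leave the whole script.
    trap 'echo "interrupted" >&2; exit 130' INT TERM
    while IFS= read -r file; do
        echo "processing $file"
    done
    trap - INT TERM   # restore default signal handling after the loop
}
```

In a real script it could be fed with git ls-files | process_files. The exit status 130 follows the common shell convention of 128 plus the signal number for SIGINT.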
Answer by Martin G
This works in any directory of the source structure of the repo, in case you want to inspect a certain source module.
find . -name '*.c' | xargs -n1 git blame --line-porcelain | grep "^author "|sort|uniq -c|sort -nr