Git Blame Commit Statistics

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license, cite the original URL, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/4589731/

Date: 2020-09-10 09:44:11  Source: igfitidea

git

Asked by Erik Aigner

How can I "abuse" blame (or some better-suited function, and/or in conjunction with shell commands) to get a statistic of how many lines (of code) currently in the repository originate from each committer?

Example Output:

Committer 1: 8046 Lines
Committer 2: 4378 Lines

Accepted answer by Alex

Update

git ls-tree -r -z --name-only HEAD -- */*.c | xargs -0 -n1 git blame \
--line-porcelain HEAD |grep  "^author "|sort|uniq -c|sort -nr

I updated some things along the way.

For convenience, you can also put this into its own command:

#!/bin/bash

# Save as e.g. git-authors and set the executable flag.
# $1 is the file pattern; it is left unquoted so the shell expands the glob.
git ls-tree -r -z --name-only HEAD -- $1 | xargs -0 -n1 git blame \
 --line-porcelain HEAD |grep  "^author "|sort|uniq -c|sort -nr

Store this somewhere in your PATH (or modify your PATH) and use it like this:

  • git authors '*/*.c' # look for all files recursively ending in .c
  • git authors '*/*.[ch]' # look for all files recursively ending in .c or .h
  • git authors 'Makefile' # just count lines of authors in the Makefile
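
The installation step can be sketched as follows. Git runs any executable named git-<name> found on the PATH as `git <name>`, which is why the script becomes `git authors`. The helper name `install_git_subcommand` and the paths in the usage comment are assumptions made for illustration, not part of the original answer.

```shell
# Hypothetical helper: copy a script into a directory and make it executable.
# Both arguments (script path, target directory) are illustrative.
install_git_subcommand() {
    script=$1
    bindir=$2
    mkdir -p "$bindir" &&
    cp "$script" "$bindir/" &&
    chmod +x "$bindir/$(basename "$script")"
}

# Usage sketch (assumed paths):
#   install_git_subcommand ./git-authors "$HOME/bin"
#   export PATH="$HOME/bin:$PATH"
#   git authors '*/*.c'
```

The PATH change only lasts for the current shell session; to keep it, the export line would go into your shell's startup file.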

Original Answer

While the accepted answer does the job, it's very slow.

$ git ls-tree --name-only -z -r HEAD|egrep -z -Z -E '\.(cc|h|cpp|hpp|c|txt)$' \
  |xargs -0 -n1 git blame --line-porcelain|grep "^author "|sort|uniq -c|sort -nr

is almost instantaneous.

To get a list of files currently tracked, you can use

git ls-tree --name-only -r HEAD

This solution avoids calling file to determine the file type and, for performance reasons, uses grep to match the wanted extensions instead. If all files should be included, just remove the grep stage from the pipeline.

grep -E '\.(cc|h|cpp|hpp|c)$' # for C/C++ files
grep -E '\.py$'               # for Python files

If the files can contain spaces, which are bad for shells, you can use:

git ls-tree -z --name-only -r HEAD | egrep -Z -z '\.py'|xargs -0 ... # file names are separated by '\0' instead of newlines
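
To see why NUL separators matter, here is a minimal sketch that feeds two sample paths, one containing a space, through `xargs -0 -n1`. The function name `show_paths` and the sample paths are made up for illustration; the `printf` input merely stands in for the output of `git ls-tree -z --name-only -r HEAD`.

```shell
# Print every NUL-terminated path read from stdin on its own line, in brackets.
show_paths() {
    xargs -0 -n1 printf '[%s]\n'
}

# Sample input standing in for `git ls-tree -z --name-only -r HEAD`.
printf 'src/a b.py\0src/c.py\0' | show_paths
# prints:
# [src/a b.py]
# [src/c.py]
```

Each path reaches the command as exactly one argument, so the space in the first name does not split it in two.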

Given a list of files (through a pipe), one can use xargs to call a command and distribute the arguments. Commands that allow multiple files to be processed can omit the -n1. In this case we call git blame --line-porcelain, and for every call we pass exactly one argument.

xargs -n1 git blame --line-porcelain

We then filter the output for occurrences of "author ", sort the list, and count duplicate lines with:

grep "^author "|sort|uniq -c|sort -nr
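
The filter-and-count step can also be wrapped in a small function that prints results in the "Committer: N Lines" shape the question asks for. This is only a sketch: the function name `author_line_counts` is invented here, and the sample input imitates the `author` lines of `git blame --line-porcelain` output rather than coming from a real repository.

```shell
# Summarise `git blame --line-porcelain` output read from stdin as
# "Name: N Lines", highest count first.
author_line_counts() {
    # "author " is 7 characters long, so the name starts at position 8.
    awk '/^author / { n[substr($0, 8)]++ }
         END { for (a in n) printf "%d\t%s\n", n[a], a }' |
    sort -nr |
    awk -F '\t' '{ printf "%s: %d Lines\n", $2, $1 }'
}

# Sample porcelain-like input: two lines by Alice, one by Bob.
printf 'author Alice\nauthor-mail <a@example.com>\n\tint x;\nauthor Bob\n\ty = 1;\nauthor Alice\n\treturn x;\n' |
    author_line_counts
# prints:
# Alice: 2 Lines
# Bob: 1 Lines
```

On a real repository the input would come from something like `git blame --line-porcelain path/to/file | author_line_counts`.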

Note

Other answers actually filter out lines that contain only whitespace.

grep -Pzo "author [^\n]*\n([^\n]*\n){10}[\w]*[^\w]"|grep "author "

The command above will print the authors of lines containing at least one non-whitespace character. You can also match \w*[^\w#], which additionally excludes lines whose first non-whitespace character is a # (a comment in many scripting languages).
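
As an alternative to the multi-line grep pattern, the same idea (count a line only if it contains a non-whitespace character) can be sketched with awk. This is a swapped-in technique, not the answer's own command. It assumes the porcelain layout in which each source line is emitted with a leading tab after its `author` header; the function name `count_nonblank_authors` is hypothetical.

```shell
# Count blamed lines per author, skipping lines that are empty or
# contain only spaces and tabs. Reads porcelain-style input on stdin.
count_nonblank_authors() {
    awk '/^author / { author = substr($0, 8) }
         /^\t/      { if (substr($0, 2) ~ /[^ \t]/) n[author]++ }
         END        { for (a in n) printf "%d %s\n", n[a], a }' |
    sort -nr
}

# Sample input: Bob's only line is whitespace, so he is not listed.
printf 'author Alice\n\tint x;\nauthor Bob\n\t   \nauthor Alice\n\treturn 0;\n' |
    count_nonblank_authors
# prints:
# 2 Alice
```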

Answer by Linus Oleander

I wrote a gem called git-fame that might be useful.

Installation and usage:

  1. $ gem install git_fame
  2. $ cd /path/to/gitdir
  3. $ git fame

Output:

Statistics based on master
Active files: 21
Active lines: 967
Total commits: 109

Note: Files matching MIME type image, binary has been ignored

+----------------+-----+---------+-------+---------------------+
| name           | loc | commits | files | distribution (%)    |
+----------------+-----+---------+-------+---------------------+
| Linus Oleander | 914 | 106     | 21    | 94.5 / 97.2 / 100.0 |
| f1yegor        | 47  | 2       | 7     |  4.9 /  1.8 / 33.3  |
| David Selassie | 6   | 1       | 2     |  0.6 /  0.9 /  9.5  |
+----------------+-----+---------+-------+---------------------+

Answer by Edward Anderson

git ls-tree -r HEAD|sed -re 's/^.{53}//'|while read filename; do file "$filename"; done|grep -E ': .*text'|sed -r -e 's/: .*//'|while read filename; do git blame -w "$filename"; done|sed -r -e 's/.*\((.*)[0-9]{4}-[0-9]{2}-[0-9]{2} .*/\1/' -e 's/ +$//'|sort|uniq -c

Step by step explanation:

List all the files under version control

git ls-tree -r HEAD|sed -re 's/^.{53}//'

Prune the list down to only text files

|while read filename; do file "$filename"; done|grep -E ': .*text'|sed -r -e 's/: .*//'

Git blame all the text files, ignoring whitespace changes

|while read filename; do git blame -w "$filename"; done

Pull out the author names

|sed -r -e 's/.*\((.*)[0-9]{4}-[0-9]{2}-[0-9]{2} .*/\1/' -e 's/ +$//'

Sort the list of authors, and have uniq count the number of consecutively repeating lines

|sort|uniq -c

Example output:

   1334 Maneater
   1924 Another guy
  37195 Brian Ruby
   1482 Anna Lambda
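
The author-extraction step of this kind of pipeline can be tried in isolation on a single line of default `git blame` output. The sketch below uses `sed -E` rather than `-r` (an assumption made for portability to BSD sed); the function name `blame_author` and the sample line are invented for illustration.

```shell
# Reduce a default `git blame` line to the author name: keep what sits
# between "(" and the YYYY-MM-DD date, then strip trailing spaces.
blame_author() {
    sed -E -e 's/.*\((.*)[0-9]{4}-[0-9]{2}-[0-9]{2} .*/\1/' -e 's/ +$//'
}

echo 'a1b2c3d4 (Brian Ruby 2011-01-03 12:00:00 +0100 1) int x;' | blame_author
# prints: Brian Ruby
```

Because the pattern keys on the first thing that looks like "(... date", source lines that themselves contain parentheses and dates can confuse it.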

Answer by adius

git summary, provided by the git-extras package, is exactly what you need. Check out the documentation at git-extras - git-summary:

git summary --line

Gives output that looks like this:

project  : TestProject
lines    : 13397
authors  :
8927 John Doe            66.6%
4447 Jane Smith          33.2%
  23 Not Committed Yet   0.2%

Answer by gtd

Erik's solution was awesome, but I had some problems with diacritics (despite my LC_* environment variables ostensibly being set correctly) and noise leaking through on lines of code that actually had dates in them. My sed-fu is poor, so I ended up with this Frankenstein snippet with ruby in it, but it works flawlessly for me on 200,000+ LOC, and it sorts the results:

git ls-tree -r HEAD | gsed -re 's/^.{53}//' | \
while read filename; do file "$filename"; done | \
grep -E ': .*text' | gsed -r -e 's/: .*//' | \
while read filename; do git blame "$filename"; done | \
ruby -ne 'puts $1.strip if $_ =~ /^\w{8} \((.*?)\s*\d{4}-\d{2}-\d{2}/' | \
sort | uniq -c | sort -rg

Also note gsed instead of sed, because that's the binary Homebrew installs, leaving the system sed intact.

Answer by moinudin

git shortlog -sn

This will show the number of commits per author.

Answer by ThorSummoner

Here is the primary snippet from @Alex's answer that actually does the work of aggregating the blame lines. I've cut it down to operate on a single file rather than a set of files.

git blame --line-porcelain path/to/file.txt | grep  "^author " | sort | uniq -c | sort -nr

I post this here because I come back to this answer often, and re-reading the post and re-digesting the examples to extract the portion I value is taxing. Nor is it generic enough for my use case; its scope is a whole C project.

I like to list stats per file, achieved with a bash for iterator instead of xargs, as I find xargs less readable and hard to use/memorize. The advantages/disadvantages of xargs vs for should be discussed elsewhere.

Here is a practical snippet that will show results for each file individually:

for file in $(git ls-files); do \
    echo $file; \
    git blame --line-porcelain $file \
        | grep  "^author " | sort | uniq -c | sort -nr; \
    echo; \
done

And I tested: running this straight in a bash shell is Ctrl+C safe. If you need to put this inside a bash script, you might need to trap SIGINT and SIGTERM if you want the user to be able to break your for loop.
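
A minimal sketch of trapping SIGINT and SIGTERM around a per-file loop is shown below. It is one possible arrangement, not something taken from the answer: the helper name `for_each_file` is hypothetical, and exit status 130 (the conventional value for SIGINT) is reused for SIGTERM for brevity.

```shell
# Hypothetical helper: run a command once per file inside a subshell, so the
# trap below ends the whole loop without killing the calling shell.
for_each_file() (
    trap 'echo "interrupted" >&2; exit 130' INT TERM
    cmd=$1
    shift
    for file in "$@"; do
        "$cmd" "$file"
    done
)

# Demo with echo standing in for a per-file blame command.
for_each_file echo README.md "docs/user guide.txt"
# prints:
# README.md
# docs/user guide.txt
```

The function body is written with parentheses instead of braces, so the trap and the `exit` apply only to the subshell running the loop.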

Answer by Ivan

Check out the gitstats command available from http://gitstats.sourceforge.net/

Answer by Gabriel Diego

I have this solution that counts the blamed lines in all text files (excluding the binary files, even the versioned ones):

IFS=$'\n'
for file in $(git ls-files); do
    git blame `git symbolic-ref --short HEAD` --line-porcelain "$file" | \
        grep  "^author " | \
        grep -v "Binary file (standard input) matches" | \
        grep -v "Not Committed Yet" | \
        cut -d " " -f 2-
    done | \
        sort | \
        uniq -c | \
        sort -nr

Answer by Martin G

This works in any directory of the repo's source tree, in case you want to inspect a certain source module.

find . -name '*.c' | xargs -n1 git blame --line-porcelain | grep "^author "|sort|uniq -c|sort -nr