git: How do you get the commit count for a repo using the GitHub V3 API?

Question

Asked by SteveCoffman

I am trying to count commits for many large GitHub repos using the API, so I would like to avoid getting the entire list of commits (for example via api.github.com/repos/jasonrudolph/keyboard/commits ) and counting them.

If I had the hash of the first (initial) commit, I could use this technique to compare the first commit to the latest, and it happily reports the total_commits in between (so I'd need to add one). Unfortunately, I cannot see how to elegantly get the first commit using the API.
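
As a rough illustration of that compare idea, here is a minimal Python sketch. It assumes the REST compare endpoint (`/repos/{owner}/{repo}/compare/{base}...{head}`) and a linear history starting at a single initial commit; the helper names `compare_url` and `commit_count_from_compare` are made up for this example.

```python
def compare_url(owner, repo, first_sha, head):
    """Build the compare URL from the initial commit to a branch or commit."""
    return (
        f"https://api.github.com/repos/{owner}/{repo}"
        f"/compare/{first_sha}...{head}"
    )


def commit_count_from_compare(compare_response):
    """Turn a decoded compare response (a dict) into a commit count.

    total_commits counts the commits after the base commit, so the
    initial commit itself has to be added back.
    """
    return compare_response["total_commits"] + 1
```

The missing piece is still the hash of the initial commit, which is what the rest of this question is about.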

The base repo URL does give me the created_at (for example: api.github.com/repos/jasonrudolph/keyboard ), so I could get a reduced commit set by limiting the commits to those made until the creation date (for example: api.github.com/repos/jasonrudolph/keyboard/commits?until=2013-03-30T16:01:43Z) and using the earliest one (always listed last?) or maybe the one with an empty parent (I'm not sure whether forked projects have initial parent commits).

Any better way to get the first commit hash for a repo?

Better yet, this whole thing seems convoluted for a simple statistic, and I wonder if I'm missing something. Any better ideas for using the API to get the repo commit count?

Edit: This somewhat similar question is trying to filter by certain files (" and within them to specific files."), so it has a different answer.

Answer 1

Accepted answer by Bertrand Martel

You can consider using the GraphQL API v4 to count commits for multiple repositories at the same time using aliases. The following fetches the commit count for every branch of 3 distinct repositories (up to 100 branches per repo):

{
  gson: repository(owner: "google", name: "gson") {
    ...RepoFragment
  }
  martian: repository(owner: "google", name: "martian") {
    ...RepoFragment
  }
  keyboard: repository(owner: "jasonrudolph", name: "keyboard") {
    ...RepoFragment
  }
}

fragment RepoFragment on Repository {
  name
  refs(first: 100, refPrefix: "refs/heads/") {
    edges {
      node {
        name
        target {
          ... on Commit {
            id
            history(first: 0) {
              totalCount
            }
          }
        }
      }
    }
  }
}

RepoFragment is a fragment, which avoids duplicating the query fields for each of those repos.

If you only need the commit count on the default branch, it's more straightforward:

{
  gson: repository(owner: "google", name: "gson") {
    ...RepoFragment
  }
  martian: repository(owner: "google", name: "martian") {
    ...RepoFragment
  }
  keyboard: repository(owner: "jasonrudolph", name: "keyboard") {
    ...RepoFragment
  }
}

fragment RepoFragment on Repository {
  name
  defaultBranchRef {
    name
    target {
      ... on Commit {
        id
        history(first: 0) {
          totalCount
        }
      }
    }
  }
}
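
If you'd rather send the default-branch query from a script than from the explorer, something like the following Python sketch might do. It is only an illustration: the helper names (`build_query`, `extract_counts`, `fetch_counts`) are made up here, the aliases are generated as `repo0`, `repo1`, ... because repository names can contain characters that are not valid in a GraphQL alias, and it assumes you have a personal access token, since the GraphQL endpoint requires authentication.

```python
import json
import urllib.request

GRAPHQL_URL = "https://api.github.com/graphql"

# Same fields as the default-branch fragment above.
FRAGMENT = """fragment RepoFragment on Repository {
  name
  defaultBranchRef {
    name
    target {
      ... on Commit {
        history(first: 0) {
          totalCount
        }
      }
    }
  }
}
"""


def build_query(repos):
    """Build one aliased query for a list of (owner, name) pairs."""
    parts = []
    for index, (owner, name) in enumerate(repos):
        # json.dumps gives a double-quoted, escaped string literal
        parts.append(
            f"  repo{index}: repository(owner: {json.dumps(owner)}, "
            f"name: {json.dumps(name)}) {{\n    ...RepoFragment\n  }}"
        )
    return "{\n" + "\n".join(parts) + "\n}\n\n" + FRAGMENT


def extract_counts(repos, data):
    """Map "owner/name" to totalCount (None for missing or empty repos)."""
    counts = {}
    for index, (owner, name) in enumerate(repos):
        node = data.get(f"repo{index}")
        branch = node and node.get("defaultBranchRef")
        counts[f"{owner}/{name}"] = (
            branch["target"]["history"]["totalCount"] if branch else None
        )
    return counts


def fetch_counts(repos, token):
    """POST the query; assumes the response contains a "data" object."""
    body = json.dumps({"query": build_query(repos)}).encode("utf-8")
    request = urllib.request.Request(
        GRAPHQL_URL,
        data=body,
        headers={
            "Authorization": f"bearer {token}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    return extract_counts(repos, payload["data"])
```

Calling `fetch_counts([("google", "gson"), ("jasonrudolph", "keyboard")], token)` should then return one count per repository from a single request.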

Answer 2

Answered by Ivan Zuzak

If you're looking for the total number of commits in the default branch, you might consider a different approach.

Use the Repo Contributors API to fetch a list of all contributors:

https://developer.github.com/v3/repos/#list-contributors

Each item in the list will contain a contributions field which tells you how many commits the user authored in the default branch. Sum those fields across all contributors and you should get the total number of commits in the default branch.

The list of contributors is often much shorter than the list of commits, so it should take fewer requests to compute the total number of commits in the default branch.
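
A minimal Python sketch of that summing, using only the standard library. The function names are made up for this example, it pages by incrementing `page` until an empty page comes back, and it passes `anon=1` on the assumption that you also want anonymous contributors counted; drop that parameter if you don't.

```python
import json
import urllib.request


def fetch_json(url):
    """GET a URL and decode the JSON body (an empty body becomes [])."""
    with urllib.request.urlopen(url) as response:
        body = response.read()
    return json.loads(body) if body else []


def contributor_pages(owner, repo, fetch=fetch_json, per_page=100):
    """Yield pages of contributors until an empty page is returned."""
    page = 1
    while True:
        url = (
            f"https://api.github.com/repos/{owner}/{repo}/contributors"
            f"?per_page={per_page}&anon=1&page={page}"
        )
        items = fetch(url)
        if not items:
            break
        yield items
        page += 1


def total_contributions(pages):
    """Sum the contributions field over every contributor on every page."""
    return sum(item["contributions"] for page in pages for item in page)
```

`total_contributions(contributor_pages("jasonrudolph", "keyboard"))` would then give the estimate. Unauthenticated requests are rate limited, so for many repos you would want to add an Authorization header.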

Answer 3

Answered by snowe

Simple solution: look at the page number. GitHub paginates for you, so you can calculate the number of commits by taking the last page number from the Link header, subtracting one (the last page is counted separately), multiplying by the page size, and then adding the number of results on the last page. That's a maximum of two API calls!
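
The arithmetic itself, as a small Python sketch. The helper names are made up for illustration, and the Link header parsing is deliberately naive: it splits on commas, which is fine for the URLs GitHub returns here but is not a general-purpose parser.

```python
import re


def last_page_number(link_header):
    """Return the page number of the rel="last" link, or None if absent."""
    for part in link_header.split(","):
        match = re.search(r'<([^>]*)>\s*;\s*rel="last"', part)
        if match:
            page = re.search(r"[?&]page=(\d+)", match.group(1))
            return int(page.group(1)) if page else None
    return None


def total_from_pages(last_page, per_page, items_on_last_page):
    """All pages before the last one are full; the last one is counted as-is."""
    return (last_page - 1) * per_page + items_on_last_page
```

For example, with a page size of 100, a last page number of 7 and 42 commits on that last page, the total is 6 * 100 + 42 = 642.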

Here is my implementation of grabbing the total number of commits for an entire organization using the octokit gem in ruby:

require 'octokit'

# `key` is assumed to hold a GitHub access token
@github = Octokit::Client.new access_token: key, auto_traversal: true, per_page: 100

Octokit.auto_paginate = true
repos = @github.org_repos('my_company', per_page: 100)

# * take the pagination number
# * get the last page
# * see how many items are on it
# * multiply the number of pages - 1 by the page size
# * and add the two together. Boom. Commit count in 2 api calls
def calc_total_commits(repos)
    total_sum_commits = 0

    repos.each do |e| 
        repo = Octokit::Repository.from_url(e.url)
        number_of_commits_in_first_page = @github.commits(repo).size
        repo_sum = 0
        if number_of_commits_in_first_page >= 100
            links = @github.last_response.rels

            unless links.empty?
                last_page_url = links[:last].href

                /.*page=(?<page_num>\d+)/ =~ last_page_url
                repo_sum += (page_num.to_i - 1) * 100 # we add the last page manually
                repo_sum += links[:last].get.data.size
            end
        else
            repo_sum += number_of_commits_in_first_page
        end
        puts "Commits for #{e.name} : #{repo_sum}"
        total_sum_commits += repo_sum
    end
    puts "TOTAL COMMITS #{total_sum_commits}"
end

And yes, I know the code is dirty; this was just thrown together in a few minutes.

Answer 4

Answered by buckley

Using the GraphQL API v4 is probably the way to handle this if you're starting out in a new project, but if you're still using the REST API v3 you can get around the pagination issue by limiting the request to just 1 result per page. By setting that limit, the number of pages returned in the last link will be equal to the total number of commits.

For example, using python3 and the requests library:

import os
import urllib.parse

import requests


def commit_count(project, sha='master', token=None):
    """
    Return the number of commits to a project
    """
    token = token or os.environ.get('GITHUB_API_TOKEN')
    url = f'https://api.github.com/repos/{project}/commits'
    headers = {
        'Accept': 'application/json',
        'Content-Type': 'application/json',
        'Authorization': f'token {token}',
    }
    params = {
        'sha': sha,
        'per_page': 1,
    }
    resp = requests.request('GET', url, params=params, headers=headers)
    if (resp.status_code // 100) != 2:
        raise Exception(f'invalid github response: {resp.content}')
    # check the resp count, just in case there are 0 commits
    commit_count = len(resp.json())
    last_page = resp.links.get('last')
    # if there are no more pages, the count must be 0 or 1
    if last_page:
        # extract the query string from the last page url
        qs = urllib.parse.urlparse(last_page['url']).query
        # extract the page number from the query string
        commit_count = int(dict(urllib.parse.parse_qsl(qs))['page'])
    return commit_count

Answer 5

Answered by fnkr

I just made a little script to do this. It may not work with large repositories, since it does not handle GitHub's rate limits. It also requires the Python requests package.

#!/usr/bin/env python3
import requests

GITHUB_API_BRANCHES = 'https://%(token)s@api.github.com/repos/%(namespace)s/%(repository)s/branches'
GUTHUB_API_COMMITS = 'https://%(token)s@api.github.com/repos/%(namespace)s/%(repository)s/commits?sha=%(sha)s&page=%(page)i'


def github_commit_counter(namespace, repository, access_token=''):
    commit_store = list()

    branches = requests.get(GITHUB_API_BRANCHES % {
        'token': access_token,
        'namespace': namespace,
        'repository': repository,
    }).json()

    print('Branch'.ljust(47), 'Commits')
    print('-' * 55)

    for branch in branches:
        page = 1
        branch_commits = 0

        while True:
            commits = requests.get(GUTHUB_API_COMMITS % {
                'token': access_token,
                'namespace': namespace,
                'repository': repository,
                'sha': branch['name'],
                'page': page
            }).json()

            page_commits = len(commits)

            for commit in commits:
                commit_store.append(commit['sha'])

            branch_commits += page_commits

            if page_commits == 0:
                break

            page += 1

        print(branch['name'].ljust(45), str(branch_commits).rjust(9))

    commit_store = set(commit_store)
    print('-' * 55)
    print('Total'.ljust(42), str(len(commit_store)).rjust(12))

# for private repositories, get your own token from
# https://github.com/settings/tokens
# github_commit_counter('github', 'gitignore', access_token='fnkr:xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx')
github_commit_counter('github', 'gitignore')
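
If you want to make the script a bit more tolerant of rate limits, one possible approach is to look at the `X-RateLimit-Remaining` and `X-RateLimit-Reset` response headers and sleep until the reset time once the quota is used up. A minimal sketch; the function name `seconds_to_wait` is made up for this example and it is not wired into the script above:

```python
import time


def seconds_to_wait(headers, now=None):
    """Return how long to sleep before the next request (0 if quota remains).

    `headers` can be any mapping of response headers. Note that a plain
    dict is case-sensitive, unlike the headers object requests returns.
    """
    now = time.time() if now is None else now
    remaining = int(headers.get("X-RateLimit-Remaining", 1))
    if remaining > 0:
        return 0.0
    # X-RateLimit-Reset is the reset time in UTC epoch seconds
    reset_at = int(headers.get("X-RateLimit-Reset", now))
    return max(0.0, reset_at - now)
```

Inside the paging loop you could keep the response object around and call `time.sleep(seconds_to_wait(response.headers))` before asking for the next page.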

Answer 6

Answered by Arcsector

I used Python to create a generator which yields the list of contributors, sum up the total commit count, and then check it against a limit. It returns True if the repo has fewer commits than the limit, and False if it has the same number or more. The only thing you have to fill in is the requests session that uses your credentials. Here's what I wrote for you:

from requests import session


def login():
    sess = session()

    # login here and return session with valid creds
    return sess

def generateList(link):
    # you need to login before you do anything
    sess = login()

    # because of the way that requests works, you must start out by creating an object to
    # imitate the response object. This will help you to cleanly while-loop through
    # github's pagination
    class response_immitator:
        links = {'next': {'url':link}}
    response = response_immitator() 
    while 'next' in response.links:
        response = sess.get(response.links['next']['url'])
        for repo in response.json():
            yield repo

def check_commit_count(baseurl, user_name, repo_name, max_commit_count=None):
    # login first
    sess = login()
    if max_commit_count != None:
        totalcommits = 0

        # construct url to paginate
        url = baseurl+"repos/" + user_name + '/' + repo_name + "/stats/contributors"
        for stats in generateList(url):
            totalcommits+=stats['total']

        if totalcommits >= max_commit_count:
            return False
        else:
            return True

def main():
    # what user do you want to check for commits
    user_name = "arcsector"

    # what repo do you want to check for commits
    repo_name = "EyeWitness"

    # github's base api url
    baseurl = "https://api.github.com/"

    # call function
    check_commit_count(baseurl, user_name, repo_name, 30)

if __name__ == "__main__":
    main()
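
One thing the code above does not deal with: the statistics endpoints can answer with a 202 status while GitHub is still computing the data, in which case there is nothing to sum yet. A hedged sketch of a retry wrapper follows; `fetch_stats` and `total_commits` are names made up for this example, and `get` is a hypothetical callable returning a `(status_code, decoded_json)` pair that you would build on top of your own session.

```python
import time


def fetch_stats(get, url, attempts=5, delay=2.0, sleep=time.sleep):
    """Call get(url) until it returns 200, retrying while the status is 202."""
    for _ in range(attempts):
        status, body = get(url)
        if status == 200:
            return body
        if status != 202:
            raise RuntimeError(f"unexpected status {status}")
        # 202 means the statistics are still being computed; try again later
        sleep(delay)
    raise TimeoutError("statistics were not ready in time")


def total_commits(stats):
    """Sum the per-contributor totals from /stats/contributors."""
    return sum(entry["total"] for entry in stats)
```

With a requests session, `get` could be as small as `lambda url: (lambda r: (r.status_code, r.json() if r.status_code == 200 else None))(sess.get(url))`.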
