How do I check for valid Git branch names?

Disclaimer: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me). Original Stack Overflow URL: http://stackoverflow.com/questions/12093748/

Time: 2020-09-19 07:24:32  Source: igfitidea

Tags: python, regex, git, githooks

Asked by Alex Chamberlain

I'm developing a git post-receive hook in Python. Data is supplied on stdin with lines similar to

ef4d4037f8568e386629457d4d960915a85da2ae 61a4033ccf9159ae69f951f709d9c987d3c9f580 refs/heads/master

The first hash is the old-ref, the second the new-ref and the third column is the reference being updated.

I want to split this into 3 variables, whilst also validating input. How do I validate the branch name?

I am currently using the following regular expression

^([0-9a-f]{40}) ([0-9a-f]{40}) refs/heads/([0-9a-zA-Z]+)$

This doesn't accept all possible branch names, as set out by man git-check-ref-format. For example, it excludes a branch by the name of build-master, which is valid.
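
To make the limitation concrete, here is a minimal Python sketch that applies the pattern quoted above to two sample lines. The names HOOK_LINE, OLD, NEW and parse are illustrative and not part of the original hook; the hashes are the ones from the sample line.

```python
import re

# The pattern quoted in the question: the branch name is limited to ASCII letters and digits.
HOOK_LINE = re.compile(r"^([0-9a-f]{40}) ([0-9a-f]{40}) refs/heads/([0-9a-zA-Z]+)$")

OLD = "ef4d4037f8568e386629457d4d960915a85da2ae"
NEW = "61a4033ccf9159ae69f951f709d9c987d3c9f580"


def parse(line):
    """Return (old, new, branch), or None when the line does not match."""
    match = HOOK_LINE.match(line.rstrip("\n"))
    return match.groups() if match else None


print(parse(f"{OLD} {NEW} refs/heads/master"))        # matches: three groups
print(parse(f"{OLD} {NEW} refs/heads/build-master"))  # None: "-" is outside [0-9a-zA-Z]
```

The second call returns None only because of the hyphen, so every valid name containing "-", "/", "_" or "." is rejected as well.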

Bonus marks

I actually want to exclude any branch that starts with "build-". Can this be done in the same regex?

Tests

Given the great answers below, I wrote some tests, which can be found at https://github.com/alexchamberlain/githooks/blob/master/miscellaneous/git-branch-re-test.py.

Status: All the regexes below are failing to compile. This could indicate there's a problem with my script or incompatible syntaxes.

Answered by Joey

Let's dissect the various rules and build regex parts from them:

  1. They can include slash / for hierarchical (directory) grouping, but no slash-separated component can begin with a dot . or end with the sequence .lock.

    # must not contain /.
    (?!.*/\.)
    # must not end with .lock
    (?<!\.lock)$
    
  2. They must contain at least one /. This enforces the presence of a category like heads/, tags/ etc. but the actual names are not restricted. If the --allow-onelevel option is used, this rule is waived.

    .+/.+  # may get more precise later
    
  3. They cannot have two consecutive dots .. anywhere.

    (?!.*\.\.)
    
  4. They cannot have ASCII control characters (i.e. bytes whose values are lower than \040, or \177 DEL), space, tilde ~, caret ^, or colon : anywhere.

    [^\000-\037\177 ~^:]+ # pattern for allowed characters
    
  5. They cannot have question-mark ?, asterisk *, or open bracket [ anywhere. See the --refspec-pattern option below for an exception to this rule.

    [^\000-\037\177 ~^:?*[]+ # new pattern for allowed characters
    
  6. They cannot begin or end with a slash / or contain multiple consecutive slashes (see the --normalize option below for an exception to this rule)

    ^(?!/)
    (?<!/)$
    (?!.*//)
    
  7. They cannot end with a dot ..

    (?<!\.)$
    
  8. They cannot contain a sequence @{.

    (?!.*@\{)
    
  9. They cannot be the single character @.

    (?!@$)
    
  10. They cannot contain a \.

    (?!.*\\)
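
The ten rules can also be checked one at a time with plain string operations, which is easier to debug than a single pattern. The following is a minimal Python sketch and not part of the original answer: is_valid_ref, FORBIDDEN and the allow_onelevel parameter are hypothetical names, and git check-ref-format remains the authoritative check.

```python
import re

# Rules 4, 5 and 10: control characters (\000-\037), DEL (\177), space,
# ~ ^ : ? * [ and backslash may not appear anywhere.
FORBIDDEN = re.compile(r"[\x00-\x1f\x7f ~^:?*\[\\]")


def is_valid_ref(name, allow_onelevel=False):
    """Apply the ten rules in order; a sketch, not a replacement for git itself."""
    components = name.split("/")
    if any(c.startswith(".") or c.endswith(".lock") for c in components):
        return False  # rule 1: no component starts with "." or ends with ".lock"
    if not allow_onelevel and len(components) < 2:
        return False  # rule 2: at least one "/" unless waived
    if ".." in name:
        return False  # rule 3: no two consecutive dots
    if FORBIDDEN.search(name):
        return False  # rules 4, 5 and 10: forbidden characters
    if "" in components:
        return False  # rule 6: no leading, trailing or doubled slash
    if name.endswith("."):
        return False  # rule 7: no trailing dot
    if "@{" in name:
        return False  # rule 8: no "@{" sequence
    if name == "@":
        return False  # rule 9: not the single character "@"
    return True
```

Calling is_valid_ref("refs/heads/build-master") returns True, while is_valid_ref("master") returns False unless allow_onelevel=True is passed, mirroring rule 2.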

Piecing it all together we arrive at the following monstrosity:

    ^(?!.*/\.)(?!.*\.\.)(?!/)(?!.*//)(?!.*@\{)(?!@$)(?!.*\\)[^\000-\037\177 ~^:?*[]+/[^\000-\037\177 ~^:?*[]+(?<!\.lock)(?<!/)(?<!\.)$

And if you want to exclude those that start with build- then just add another lookahead:

    ^(?!build-)(?!.*/\.)(?!.*\.\.)(?!/)(?!.*//)(?!.*@\{)(?!@$)(?!.*\\)[^\000-\037\177 ~^:?*[]+/[^\000-\037\177 ~^:?*[]+(?<!\.lock)(?<!/)(?<!\.)$

This can be optimized a bit as well by conflating a few things that look for common patterns:

    ^(?!@$|build-|/|.*([/.]\.|//|@\{|\\))[^\000-\037\177 ~^:?*[]+/[^\000-\037\177 ~^:?*[]+(?<!\.lock|[/.])$

Answered by Brendan Byrd

There's no need to write monstrosities in Perl. Just use /x:

    # RegExp rules based on git-check-ref-format
    my $valid_ref_name = qr%
       ^
       (?!
          # begins with
          /|                # (from #6)   cannot begin with /
          # contains
          .*(?:
             [/.]\.|        # (from #1,3) cannot contain /. or ..
             //|            # (from #6)   cannot contain multiple consecutive slashes
             @\{|           # (from #8)   cannot contain a sequence @{
             \\             # (from #9)   cannot contain a \
          )
       )
                            # (from #2)   (waiving this rule; too strict)
       [^\040\177 ~^:?*[]+  # (from #4-5) valid character rules
    
       # ends with
       (?<!\.lock)          # (from #1)   cannot end with .lock
       (?<![/.])            # (from #6-7) cannot end with / or .
       $
    %x;
    
    foreach my $branch (qw(
       master
       .master
       build/master
       ref/HEAD/blah
       /HEAD/blah
       HEAD/blah/
       master.lock
       head/@{block}
       master.
       build//master
       build\master
       build\\master
    ),
       'master blaster',
    ) {
       print "$branch --> ".($branch =~ $valid_ref_name)."\n";
    }

Joey++ for some of the code, though I made some corrections.

Answered by murgatroid99

Taking the rules directly from the linked page, the following regular expression should match only valid branch names in refs/heads not starting with "build-":

    refs/heads/(?!\.)(?!build-)(((?!\.\.)(?!@{)[^\cA-\cZ ~^:?*[\\])+)(?<!\.)(?<!\.lock)

This starts with refs/heads as yours does.

Then (?!build-) checks that the next 6 characters are not build- and (?!\.) checks that the branch does not start with a dot.

The entire group (((?!\.\.)(?!@{)[^\cA-\cZ ~^:?*[\\])+) matches the branch name.

(?!\.\.) checks that there are no instances of two periods in a row, and (?!@{) checks that the branch does not contain @{.

Then [^\cA-\cZ ~^:?*[\\] matches any of the allowed characters by excluding control characters \cA-\cZ and all of the rest of the characters that are specifically forbidden.

Finally, (?<!\.) makes sure that the branch name did not end with a period and (?<!\.lock) checks that it did not end with .lock.

To similarly match valid branch names in arbitrary folders, you can use

    (?!\.)(((?!\.\.)(?!@{)[^\cA-\cZ ~^:?*[\\])+)(/(?!\.)(((?!\.\.)(?!@{)[^\cA-\cZ ~^:?*[\\])+))*?/(?!\.)(?!build-)(((?!\.\.)(?!@{)[^\cA-\cZ ~^:?*[\\])+)(?<!\.)(?<!\.lock)

This applies basically the same rules to each piece of the branch name, but only checks that the last one does not start with build-.

Answered by zanbaldwin

For anyone coming to this question looking for the PCRE regular expression to match a valid Git branch name, it is the following:

    ^(?!/|.*([/.]\.|//|@\{|\\))[^\040\177 ~^:?*\[]+(?<!\.lock|[/.])$

This is an amended version of the regular expression written by Joey. In this version, however, an oblique (slash) is not required: it is for matching branchName rather than refs/heads/branchName.

Please refer to his correct answer to this question. He provides a full breakdown of each part of the regex and how it relates to each requirement specified in the git-check-ref-format(1) manual page.
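
Since the question is about a Python hook, one portability detail matters: Python's re module requires every alternative inside a lookbehind to have the same width, so (?<!\.lock|[/.]) has to be split into two lookbehinds before it compiles there. The sketch below is an adaptation under stated assumptions, not the author's code. BRANCH_RE, HOOK_LINE_RE and parse_hook_line are hypothetical names; the checks for a leading dot and for .lock/ inside the name are additions taken from rule 1; and the result has not been verified against git check-ref-format.

```python
import re

# Branch name only (no refs/heads/ prefix), adapted for Python's re module.
BRANCH_RE = re.compile(
    r"^(?!build-|@\Z|[/.]"                # not build-..., not just "@", no leading / or .
    r"|.*(?:[/.]\.|//|@\{|\\|\.lock/))"   # no "/." ".." "//" "@{" "\" or ".lock/" anywhere
    r"[^\x00-\x20\x7f~^:?*\[]+"           # no control chars, space, DEL, ~ ^ : ? * [
    r"(?<!\.lock)(?<![/.])\Z"             # fixed-width lookbehinds: no trailing .lock, / or .
)

HOOK_LINE_RE = re.compile(r"([0-9a-f]{40}) ([0-9a-f]{40}) refs/heads/(.+)")


def parse_hook_line(line):
    """Return (old, new, branch) for an acceptable stdin line, else None."""
    match = HOOK_LINE_RE.fullmatch(line.rstrip("\n"))
    if match is None or BRANCH_RE.match(match.group(3)) is None:
        return None
    return match.groups()
```

Keeping the line format and the branch-name rules in two separate patterns means the hook can report which of the two checks failed, at the cost of a second match call.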