Notice: this page reproduces a popular StackOverflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/68633/

Time: 2020-08-11 07:54:12  Source: igfitidea

Regex that Will Match a Java Method Declaration

java regex methods

Asked by Anton

I need a regex that will match a Java method declaration. I have come up with one that matches a method declaration, but it requires the opening brace of the method to be on the same line as the declaration. If you have any suggestions to improve my regex, or simply have a better one, please submit an answer.

Here is my regex: "\w+ +\w+ *\(.*\) *\{"

For those who do not know what a Java method looks like, here is a basic one:

int foo()
{

}

Java methods can have several optional parts as well, but the ones shown are the only parts a method is guaranteed to have.

Update: My current regex is "\w+ +\w+ *\([^\)]*\) *\{", so as to prevent the situation that Mike and akdom described.

Accepted answer by Mike Stone

Have you considered matching the actual possible keywords? For example:

(?:(?:public|private|static|protected)\s+)*

It might be a bit more likely to match correctly, though it might also make the regex harder to read...
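To make the keyword idea concrete, here is a minimal Java sketch. It assumes the modifier alternation is grouped so that the trailing `\s+` applies to every keyword, and it appends a simple return type, name and parameter list taken from the question's own regex. The class and method names (`ModifierPrefixDemo`, `methodName`) are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ModifierPrefixDemo {
    // Zero or more modifier keywords, each followed by whitespace.
    // The alternation is grouped so that \s+ applies to every keyword.
    static final Pattern METHOD = Pattern.compile(
        "(?:(?:public|private|protected|static)\\s+)*" // optional modifiers
        + "\\w+\\s+"                                    // return type
        + "(\\w+)\\s*"                                  // method name (group 1)
        + "\\([^)]*\\)");                               // parameter list

    // Returns the method name found in the line, or null if nothing matches.
    static String methodName(String line) {
        Matcher m = METHOD.matcher(line);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(methodName("public static int foo(int a)")); // foo
        System.out.println(methodName("int bar()"));                    // bar
        System.out.println(methodName("x = foo(1)"));                   // null
    }
}
```

Only four keywords are listed here, as in the answer; other modifiers such as `final` or `synchronized` would need to be added to the alternation.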

Answer by akdom

I'm pretty sure Java's regex engine is greedy by default, meaning that "\w+ +\w+ *\(.*\) *\{" will never match, since the .* within the parentheses will eat everything after the opening paren. I recommend you replace the .* with [^)]*; this way it will select all non-closing-paren characters.

NOTE: Mike Stone corrected me in the comments, and since most people don't really open the comments (I know I frequently don't notice them), here is his correction:

Greedy doesn't mean it will never match... but it will eat parens if there are more parens after to satisfy the rest of the regex... so for example "public void foo(int arg) { if (test) { System.exit(0); } }" will not match properly...
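The difference can be checked with a short Java sketch. It runs the question's original regex and the `[^)]*` variant against the example line from the correction above; the class and helper names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyDemo {
    // The question's original regex: greedy .* inside the parentheses.
    static final Pattern GREEDY = Pattern.compile("\\w+ +\\w+ *\\(.*\\) *\\{");
    // The suggested variant: the parameter list may not contain ')'.
    static final Pattern NEGATED = Pattern.compile("\\w+ +\\w+ *\\([^)]*\\) *\\{");

    // Returns the text of the first match, or null if there is none.
    static String firstMatch(Pattern p, String s) {
        Matcher m = p.matcher(s);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String code = "public void foo(int arg) { if (test) { System.exit(0); } }";
        // Greedy .* runs past the parameter list to the last ") {" it can find.
        System.out.println(firstMatch(GREEDY, code));  // void foo(int arg) { if (test) {
        // The negated class stops at the first closing paren.
        System.out.println(firstMatch(NEGATED, code)); // void foo(int arg) {
    }
}
```

Neither match includes `public`, because both patterns only look for two words before the parenthesis.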

Answer by UnkwnTech

I came up with this:

\b\w*\s*\w*\(.*?\)\s*\{[\x21-\x7E\s]*\}

I tested it against a PHP function, but it should work just the same for Java. This is the snippet of code I used:

function getProfilePic($url)
 {
    if(@open_image($url) !== FALSE)
     {
        @imagepng($image, 'images/profiles/' . $_SESSION['id'] . '.png');
        @imagedestroy($image);
        return TRUE;
     }
    else 
     {
        return FALSE;
     }
 }

MORE INFO:

Options: case insensitive

Assert position at a word boundary «\b»
Match a single character that is a “word character” (letters, digits, etc.) «\w*»
   Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
Match a single character that is a “whitespace character” (spaces, tabs, line breaks, etc.) «\s*»
   Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
Match a single character that is a “word character” (letters, digits, etc.) «\w*»
   Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
Match the character “(” literally «\(»
Match any single character that is not a line break character «.*?»
   Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
Match the character “)” literally «\)»
Match a single character that is a “whitespace character” (spaces, tabs, line breaks, etc.) «\s*»
   Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
Match the character “{” literally «\{»
Match a single character present in the list below «[\x21-\x7E\s]*»
   Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
   A character in the range between ASCII character 0x21 (33 decimal) and ASCII character 0x7E (126 decimal) «\x21-\x7E»
   A whitespace character (spaces, tabs, line breaks, etc.) «\s»
Match the character “}” literally «\}»


Created with RegexBuddy

Answer by UnkwnTech

A tip:

If you are going to write the regex in Perl, please use the "xms" options so that you can leave spaces and document the regex. For example you can write a regex like:

 m{\w+ \s+        # return type
   \w+ \s*        # function name
   [(] [^)]* [)]  # params
   \s* [{]        # open brace
  }xms

One of the options (I think x) allows # comments inside a regex. Also use \s instead of a " ". \s stands for any whitespace character, so tabs would also match, which is what you want. In Perl you don't need to use / /; you can use { } or < > or | |.

Not sure if other languages have this ability. If they do, then please use them.
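Java does have a comparable facility: the `Pattern.COMMENTS` flag makes the engine ignore whitespace in the pattern and treat `#` up to the end of the line as a comment. The following is a rough Java counterpart of the Perl regex above; the class name, the capture group around the method name and the sample input are additions for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CommentedRegexDemo {
    // Pattern.COMMENTS: whitespace in the pattern is ignored and
    // '#' starts a comment that runs to the end of the line.
    static final Pattern METHOD = Pattern.compile(
        "\\w+ \\s+        # return type\n"
      + "(\\w+) \\s*      # method name, captured as group 1\n"
      + "[(] [^)]* [)]    # params\n"
      + "\\s* [{]         # open brace\n",
        Pattern.COMMENTS);

    // Returns the method name, or null if the text does not match.
    static String methodName(String text) {
        Matcher m = METHOD.matcher(text);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // \s* before the brace also matches a line break.
        System.out.println(methodName("int foo()\n{")); // foo
    }
}
```

Because `\s*` precedes the brace, this version also accepts the opening brace on the next line, which is what the question asked for.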

Answer by Georgios Gousios

(public|protected|private|static|\s) +[\w\<\>\[\]]+\s+(\w+) *\([^\)]*\) *(\{?|[^;])

I think that the above regexp can match almost all possible combinations of Java method declarations, even those with generics and arrays as return types, which the regexp provided by the original author did not match.
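As a quick check of the generics and array cases, here is a small Java sketch that compiles the regex above unchanged and reads the method name from its second capture group. The class name, helper name and sample declarations are invented for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ReturnTypeDemo {
    // The regex from this answer, written as a Java string literal.
    static final Pattern METHOD = Pattern.compile(
        "(public|protected|private|static|\\s) +[\\w\\<\\>\\[\\]]+\\s+(\\w+) *\\([^\\)]*\\) *(\\{?|[^;])");

    // Group 2 holds the method name; returns null when nothing matches.
    static String methodName(String line) {
        Matcher m = METHOD.matcher(line);
        return m.find() ? m.group(2) : null;
    }

    public static void main(String[] args) {
        System.out.println(methodName("public List<String> getNames() {")); // getNames
        System.out.println(methodName("private int[] values(int n)"));      // values
    }
}
```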

Answer by idbrii

I built a vim regex to do this for ctrlp/funky, based on Georgios Gousios's answer.

    let regex = '\v^\s+'                " preamble
    let regex .= '%(<\w+>\s+){0,3}'     " visibility, static, final
    let regex .= '%(\w|[<>[\]])+\s+'    " return type
    let regex .= '\w+\s*'               " method name
    let regex .= '\([^\)]*\)'           " method parameters
    let regex .= '%(\w|\s|\{)+$'        " postamble

I'd guess that looks like this in Java:

^\s+(?:\b\w+\b\s+){0,3}(?:\w|[<>\[\]])+\s+\w+\s*\([^\)]*\)(?:\w|\s|\{)+$
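In vim's very-magic mode, `<` and `>` are word-boundary atoms, so a Java version needs `\b` in their place rather than literal angle brackets. Here is a sketch of the pattern under that assumption, split into the same commented pieces as the vim version; the class and method names are hypothetical.

```java
import java.util.regex.Pattern;

class VimToJavaRegexDemo {
    static final Pattern METHOD = Pattern.compile(
        "^\\s+"                        // preamble (indentation is required)
      + "(?:\\b\\w+\\b\\s+){0,3}"      // visibility, static, final
      + "(?:\\w|[<>\\[\\]])+\\s+"      // return type
      + "\\w+\\s*"                     // method name
      + "\\([^)]*\\)"                  // method parameters
      + "(?:\\w|\\s|\\{)+$");          // postamble

    // True if the single line looks like an indented method declaration.
    static boolean looksLikeMethod(String line) {
        return METHOD.matcher(line).find();
    }

    public static void main(String[] args) {
        System.out.println(looksLikeMethod("    public static int foo() {"));   // true
        System.out.println(looksLikeMethod("    List<String> names(int n) {")); // true
        System.out.println(looksLikeMethod("    return foo();"));               // false
    }
}
```

The pattern is a heuristic meant for picking out declaration lines in an editor, so some non-declarations with the same shape (for example `else if (x) {`) still match.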

Answer by sbaltes

I also needed such a regular expression and came up with this solution:

(?:(?:public|private|protected|static|final|native|synchronized|abstract|transient)+\s+)+[$_\w<>\[\]\s]*\s+[$_\w]+\([^\)]*\)?\s*\{?[^\}]*\}?

This grammar and Georgios Gousios's answer were useful in building the regex.

EDIT: Incorporated tharindu_DG's feedback, made groups non-capturing, improved formatting.

Answer by aliteralmind

After looking through the other answers, here is what I came up with:

#permission
   ^[ \t]*(?:(?:public|protected|private)\s+)?
#keywords
   (?:(static|final|native|synchronized|abstract|threadsafe|transient|{#insert zJRgx123GenericsNotInGroup})\s+){0,}
#return type
   #If return type is "return" then it's actually a 'return funcName();' line. Ignore.
   (?!return)
   \b([\w.]+)\b(?:|{#insert zJRgx123GenericsNotInGroup})((?:\[\]){0,})\s+
#function name
   \b\w+\b\s*
#parameters
   \(
      #one
         \s*(?:\b([\w.]+)\b(?:|{#insert zJRgx123GenericsNotInGroup})((?:\[\]){0,})(\.\.\.)?\s+(\w+)\b(?![>\[])
      #two and up
         \s*(?:,\s+\b([\w.]+)\b(?:|{#insert zJRgx123GenericsNotInGroup})((?:\[\]){0,})(\.\.\.)?\s+(\w+)\b(?![>\[])\s*){0,})?\s*
   \)
#post parameters
   (?:\s*throws [\w.]+(\s*,\s*[\w.]+))?
#close-curly (concrete) or semi-colon (abstract)
   \s*(?:\{|;)[ \t]*$


Where {#insert zJRgx123GenericsNotInGroup} equals

`(?:<[?\w\[\] ,.&]+>)|(?:<[^<]*<[?\w\[\] ,.&]+>[^>]*>)|(?:<[^<]*<[^<]*<[?\w\[\] ,.&]+>[^>]*>[^>]*>)`

Limitations:

  • ANY parameter can have an ellipsis: "..." (Java allows only last)
  • Three levels of nested generics at most: (<...<...<...>...>...> okay, <...<...<...<...>...>...>...> bad). The syntax inside generics can be very bogus, and still seem okay to this regex.
  • Requires no spaces between types and their (optional) opening generics '<'
  • Recognizes inner classes, but doesn't prevent two dots next to each other, such as Class....InnerClass

Below is the raw PhraseExpress code (auto-text and description on line 1, body on line 2). Call {#insert zJRgxJavaFuncSigThrSemicOrOpnCrly}, and you get this:

^[ \t]*(?:(?:public|protected|private)\s+)?(?:(static|final|native|synchronized|abstract|threadsafe|transient|(?:<[?\w\[\] ,&]+>)|(?:<[^<]*<[?\w\[\] ,&]+>[^>]*>)|(?:<[^<]*<[^<]*<[?\w\[\] ,&]+>[^>]*>[^>]*>))\s+){0,}(?!return)\b([\w.]+)\b(?:|(?:<[?\w\[\] ,&]+>)|(?:<[^<]*<[?\w\[\] ,&]+>[^>]*>)|(?:<[^<]*<[^<]*<[?\w\[\] ,&]+>[^>]*>[^>]*>))((?:\[\]){0,})\s+\b\w+\b\s*\(\s*(?:\b([\w.]+)\b(?:|(?:<[?\w\[\] ,&]+>)|(?:<[^<]*<[?\w\[\] ,&]+>[^>]*>)|(?:<[^<]*<[^<]*<[?\w\[\] ,&]+>[^>]*>[^>]*>))((?:\[\]){0,})(\.\.\.)?\s+(\w+)\b(?![>\[])\s*(?:,\s+\b([\w.]+)\b(?:|(?:<[?\w\[\] ,&]+>)|(?:<[^<]*<[?\w\[\] ,&]+>[^>]*>)|(?:<[^<]*<[^<]*<[?\w\[\] ,&]+>[^>]*>[^>]*>))((?:\[\]){0,})(\.\.\.)?\s+(\w+)\b(?![>\[])\s*){0,})?\s*\)(?:\s*throws [\w.]+(\s*,\s*[\w.]+))?\s*(?:\{|;)[ \t]*$


Raw code:

zJRgx123GenericsNotInGroup -- To precede return-type    (?:<[?\w\[\] ,.&]+>)|(?:<[^<]*<[?\w\[\] ,.&]+>[^>]*>)|(?:<[^<]*<[^<]*<[?\w\[\] ,.&]+>[^>]*>[^>]*>)  zJRgx123GenericsNotInGroup
zJRgx0OrMoreParams  \s*(?:{#insert zJRgxParamTypeName}\s*(?:,\s+{#insert zJRgxParamTypeName}\s*){0,})?\s*   zJRgx0OrMoreParams
zJRgxJavaFuncNmThrClsPrn_M_fnm -- Needs zvFOBJ_NAME (?<=\s)\b{#insert zvFOBJ_NAME}{#insert zzJRgxPostFuncNmThrClsPrn}   zJRgxJavaFuncNmThrClsPrn_M_fnm
zJRgxJavaFuncSigThrSemicOrOpnCrly -(**)-    {#insert zzJRgxJavaFuncSigPreFuncName}\w+{#insert zzJRgxJavaFuncSigPostFuncName}    zJRgxJavaFuncSigThrSemicOrOpnCrly
zJRgxJavaFuncSigThrSemicOrOpnCrly_M_fnm -- Needs zvFOBJ_NAME    {#insert zzJRgxJavaFuncSigPreFuncName}{#insert zvFOBJ_NAME}{#insert zzJRgxJavaFuncSigPostFuncName}  zJRgxJavaFuncSigThrSemicOrOpnCrly_M_fnm
zJRgxOptKeywordsBtwScopeAndRetType  (?:(static|final|native|synchronized|abstract|threadsafe|transient|{#insert zJRgx123GenericsNotInGroup})\s+){0,}    zJRgxOptKeywordsBtwScopeAndRetType
zJRgxOptionalPubProtPriv    (?:(?:public|protected|private)\s+)?    zJRgxOptionalPubProtPriv
zJRgxParamTypeName -(**)- Ends w/ '\b(?![>\[])' to NOT find <? 'extends XClass'> or ...[]>  (*Original: zJRgxParamTypeName, Needed by: zJRgxParamTypeName[4FQPTV,ForDel[NmsOnly,Types]]*){#insert zJRgxTypeW0123GenericsArry}(\.\.\.)?\s+(\w+)\b(?![>\[])   zJRgxParamTypeName
zJRgxTypeW0123GenericsArry -- Grp1=Type, Grp2='[]', if any  \b([\w.]+)\b(?:|{#insert zJRgx123GenericsNotInGroup})((?:\[\]){0,}) zJRgxTypeW0123GenericsArry
zvTTL_PRMS_stL1c    {#insert zCutL1c}{#SETPHRASE -description zvTTL_PRMS -content {#INSERTCLIPBOARD} -autotext zvTTL_PRMS -folder ctvv_folder}  zvTTL_PRMS_stL1c
zvTTL_PRMS_stL1cSvRstrCB    {#insert zvCB_CONTENTS_stCB}{#insert zvTTL_PRMS_stL1c}{#insert zSetCBToCB_CONTENTS} zvTTL_PRMS_stL1cSvRstrCB
zvTTL_PRMS_stPrompt {#SETPHRASE -description zvTTL_PRMS -content {#INPUT -head How many parameters? -single} -autotext zvTTL_PRMS -folder ctvv_folder}  zvTTL_PRMS_stPrompt
zzJRgxJavaFuncNmThrClsPrn_M_fnmTtlp -- Needs zvFOBJ_NAME, zvTTL_PRMS    (?<=[ \t])\b{#insert zvFOBJ_NAME}\b\s*\(\s*{#insert {#COND -if {#insert zvTTL_PRMS} = 0 -then z1slp -else zzParamsGT0_M_ttlp}}\)    zzJRgxJavaFuncNmThrClsPrn_M_fnmTtlp
zzJRgxJavaFuncSigPostFuncName   {#insert zzJRgxPostFuncNmThrClsPrn}(?:\s*throws \b(?:[\w.]+)\b(\s*,\s*\b(?:[\w.]+)\b))?\s*(?:\{|;)[ \t]*$   zzJRgxJavaFuncSigPostFuncName
zzJRgxJavaFuncSigPreFuncName    (*If a type has generics, there may be no spaces between it and the first open '<', also requires generics with three nestings at the most (<...<...<...>...>...> okay, <...<...<...<...>...>...>...> not)*)^[ \t]*{#insert zJRgxOptionalPubProtPriv}{#insert zJRgxOptKeywordsBtwScopeAndRetType}(*To prevent 'return funcName();' from being recognized:*)(?!return){#insert zJRgxTypeW0123GenericsArry}\s+\b  zzJRgxJavaFuncSigPreFuncName
zzJRgxPostFuncNmThrClsPrn   \b\s*\({#insert zJRgx0OrMoreParams}\)   zzJRgxPostFuncNmThrClsPrn
zzParamsGT0_M_ttlp -- Needs zvTTL_PRMS  {#insert zJRgxParamTypeName}\s*{#insert {#COND -if {#insert zvTTL_PRMS} = 1 -then z1slp -else zzParamsGT1_M_ttlp}}  zzParamsGT0_M_ttlp
zzParamsGT1_M_ttlp  {#LOOP ,\s+{#insert zJRgxParamTypeName}\s* -count {#CALC {#insert zvTTL_PRMS} - 1 -round 0 -thousands none}}    zzParamsGT1_M_ttlp

Answer by Dexygen

This is for a more specific use case, but it's so much simpler that I believe it's worth sharing. I did this to find 'public static void' methods, i.e. Play controller actions, and I did it from the Windows/Cygwin command line using grep; see: https://stackoverflow.com/a/7167115/34806

cat Foobar.java | grep -Pzo '(?s)public static void.*?\)\s+{'

The last two entries from my output are as follows:

public static void activeWorkEventStations (String type,
            String symbol,
            String section,
            String day,
            String priority,
            @As("yyyy-MM-dd") Date scheduleDepartureDate) {
public static void getActiveScheduleChangeLogs(String type,
            String symbol,
            String section,
            String day,
            String priority,
            @As("yyyy-MM-dd") Date scheduleDepartureDate) {
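The same search can be sketched in Java, assuming the source file has already been read into a string. `Pattern.DOTALL` plays the role of the `(?s)` flag, and the brace has to be escaped because Java does not accept a bare `{` there. The class name, method name and sample source are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MultiLineDeclarationDemo {
    // DOTALL lets '.' cross line breaks, so parameter lists that are
    // spread over several lines are still matched; '.*?' is lazy.
    static final Pattern DECL =
        Pattern.compile("public static void.*?\\)\\s+\\{", Pattern.DOTALL);

    // Collects every 'public static void ... ) {' header found in the source.
    static List<String> findDeclarations(String source) {
        List<String> found = new ArrayList<>();
        Matcher m = DECL.matcher(source);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String source =
            "public static void list(String type,\n"
          + "            String symbol) {\n"
          + "    render();\n"
          + "}\n"
          + "public static void show() {\n"
          + "}\n";
        for (String declaration : findDeclarations(source)) {
            System.out.println(declaration);
        }
    }
}
```

As with the grep command, the lazy `.*?` stops at the first `)` that is followed by whitespace and `{`, so a declaration whose parameters contain such a sequence would be cut short.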

Answer by tharindu_DG

I found seba229's answer useful; it captures most of the scenarios, but not the following:

public <T> T name(final Class<T> x, final T y)

This regex will capture that also.

((public|private|protected|static|final|native|synchronized|abstract|transient)+\s)+[$_\w\<\>\w\s\[\]]*\s+[$_\w]+\([^\)]*\)?\s*

Hope this helps.
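For anyone who wants to try it, here is a minimal Java sketch that compiles tharindu_DG's regex unchanged and checks it against the generic method signature quoted in the answer. The class and method names are hypothetical.

```java
import java.util.regex.Pattern;

class GenericMethodDemo {
    // tharindu_DG's regex, written as a Java string literal.
    static final Pattern METHOD = Pattern.compile(
        "((public|private|protected|static|final|native|synchronized|abstract|transient)+\\s)+"
      + "[$_\\w\\<\\>\\w\\s\\[\\]]*\\s+[$_\\w]+\\([^\\)]*\\)?\\s*");

    // True if the whole string is matched by the regex.
    static boolean matchesDeclaration(String declaration) {
        return METHOD.matcher(declaration).matches();
    }

    public static void main(String[] args) {
        System.out.println(
            matchesDeclaration("public <T> T name(final Class<T> x, final T y)")); // true
        // At least one modifier keyword is required, so this does not match.
        System.out.println(matchesDeclaration("int x = 5;"));                      // false
    }
}
```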