Note: this page reproduces a popular StackOverflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and authors, and attribute the content to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/1088554/

Scala Regex enable Multiline option

Tags: regex, scala, multiline

Asked by ed.

I'm learning Scala, so this is probably pretty noob-irific.

I want to have a multiline regular expression.

In Ruby it would be:

MY_REGEX = /com:Node/m

My Scala looks like:

val ScriptNode =  new Regex("""<com:Node>""")

Here's my match function:

def matchNode( value : String ) : Boolean = value match 
{
    case ScriptNode() => System.out.println( "found" + value ); true
    case _ => System.out.println("not found: " + value ) ; false
}

And I'm calling it like so:

matchNode( "<root>\n<com:Node>\n</root>" ) // doesn't work
matchNode( "<com:Node>" ) // works

I've tried:

val ScriptNode =  new Regex("""<com:Node>?m""")

And I'd really like to avoid having to use java.util.regex.Pattern. Any tips greatly appreciated.

Answered by Daniel C. Sobral

This is a very common problem when first using Scala Regex.

When you use pattern matching in Scala, it tries to match the whole string, as if the pattern were wrapped in "^" and "$" (with multi-line mode off; multi-line mode is what makes ^ and $ also match at line breaks).
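
A minimal sketch of that behaviour, using the regex from the question. The object and helper names (AnchoredMatchDemo, matchesWhole, containsNode) are illustrative assumptions, not part of the original post:

```scala
import scala.util.matching.Regex

// Illustrative names; only the regex itself comes from the question.
object AnchoredMatchDemo {
  val ScriptNode: Regex = new Regex("""<com:Node>""")

  // Used as an extractor in a pattern match, the regex must match the entire input.
  def matchesWhole(value: String): Boolean = value match {
    case ScriptNode() => true
    case _            => false
  }

  // findFirstIn searches for the pattern anywhere inside the input.
  def containsNode(value: String): Boolean =
    ScriptNode.findFirstIn(value).isDefined

  def main(args: Array[String]): Unit = {
    val doc = "<root>\n<com:Node>\n</root>"
    println(matchesWhole("<com:Node>")) // true: the pattern covers the whole string
    println(matchesWhole(doc))          // false: extra text surrounds the node
    println(containsNode(doc))          // true: the node occurs somewhere in the string
  }
}
```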

The way to do what you want would be one of the following:

def matchNode( value : String ) : Boolean = 
  (ScriptNode findFirstIn value) match {    
    case Some(v) => println( "found" + v ); true    
    case None => println("not found: " + value ) ; false
  }

This finds the first instance of ScriptNode inside value and returns that instance as v (if you want the whole string, just print value). Or else:

val ScriptNode =  new Regex("""(?s).*<com:Node>.*""")
def matchNode( value : String ) : Boolean = 
  value match {    
    case ScriptNode() => println( "found" + value ); true    
    case _ => println("not found: " + value ) ; false
  }

This prints the whole of value. In this example, (?s) activates dotall matching (i.e., "." also matches newlines), and the .* before and after the searched-for pattern lets the regex match the entire string around it. If you want "v" as in the first example, you could do this:

val ScriptNode =  new Regex("""(?s).*(<com:Node>).*""")
def matchNode( value : String ) : Boolean = 
  value match {    
    case ScriptNode(v) => println( "found" + v ); true    
    case _ => println("not found: " + value ) ; false
  }
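
To see why (?s) is needed, here is a small sketch comparing the same pattern with and without it. It also shows unanchored, which is available on Regex in Scala 2.10 and later and is not mentioned in the original answer; treat it as one more possible option. The names DotallDemo and extract are made up for illustration:

```scala
import scala.util.matching.Regex

// Illustrative sketch; object and method names are assumptions.
object DotallDemo {
  val doc = "<root>\n<com:Node>\n</root>"

  // Without (?s), '.' stops at line terminators, so .* cannot span the whole input.
  val withoutDotall: Regex = """.*(<com:Node>).*""".r
  // With (?s), '.' also matches line terminators.
  val withDotall: Regex = """(?s).*(<com:Node>).*""".r
  // unanchored (Scala 2.10+) lets the extractor match anywhere in the input.
  val anywhere: Regex = """(<com:Node>)""".r.unanchored

  // Returns the captured group if the regex matches as an extractor.
  def extract(regex: Regex, value: String): Option[String] = value match {
    case regex(v) => Some(v)
    case _        => None
  }

  def main(args: Array[String]): Unit = {
    println(extract(withoutDotall, doc)) // None
    println(extract(withDotall, doc))    // Some(<com:Node>)
    println(extract(anywhere, doc))      // Some(<com:Node>)
  }
}
```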

Answered by Tristan Juricek

Just a quick and dirty addendum: the .r method on RichString (StringOps since Scala 2.8) converts any string to a scala.util.matching.Regex, so you can do something like this:

"""(?s)a.*b""".r replaceAllIn ( "a\nb\nc\n", "A\nB" )

And that will return

A
B
c

I use this all the time for quick and dirty regex-scripting in the scala console.

Or in this case:

def matchNode( value : String ) : Boolean = {
    // Build the regex with .r; (?s) lets '.' match newlines so .* can span the whole input
    val ScriptNode = """(?s).*(<com:Node>).*""".r
    value match {
       case ScriptNode(v) => println( "found" + v ); true
       case _ => println("not found: " + value ) ; false
    }
}

Just my attempt to reduce the use of the word new in code worldwide. ;)

Answered by Eran Medan

Just a small addition: you tried to use the (?m) (multiline) flag. It might not be suitable here, but this is the right way to use it:

e.g. instead of

val ScriptNode =  new Regex("""<com:Node>?m""")

use

val ScriptNode =  new Regex("""(?m)<com:Node>""")

But again, the (?s) flag is more suitable for this question. (I am adding this answer only because the title is "Scala Regex enable Multiline option".)
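
For comparison, a rough sketch of what (?m) actually changes: it affects where ^ and $ can match, not whether a pattern match has to cover the whole string. The sample text and the names below (MultilineFlagDemo, plain, multiline, matchesWhole) are assumptions made for illustration:

```scala
import scala.util.matching.Regex

// Illustrative sketch; the sample text and all names are assumptions.
object MultilineFlagDemo {
  val text = "first\n<com:Node>\nlast"

  // Without (?m), ^ matches only at the start of the input and
  // $ only at its end (or before a final line terminator).
  val plain: Regex = """^<com:Node>$""".r
  // With (?m), ^ and $ also match just after and just before line terminators.
  val multiline: Regex = """(?m)^<com:Node>$""".r

  // A pattern match still requires the regex to cover the entire input.
  def matchesWhole(value: String): Boolean = value match {
    case multiline() => true
    case _           => false
  }

  def main(args: Array[String]): Unit = {
    println(plain.findFirstIn(text))     // None
    println(multiline.findFirstIn(text)) // Some(<com:Node>)
    println(matchesWhole(text))          // false
  }
}
```

So (?m) helps when searching for a node that sits on its own line with findFirstIn, while (?s) together with a surrounding .* is what makes the whole-string pattern match succeed.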