Scala: how do I split this string with a regular expression?

Question

Asked by Freewind

I have some strings that look like this:

div#title.title.top
#main.main
a.bold#empty.red

They are similar to Haml, and I want to split them with a regex, but I don't know how to write it.

val r = """???""".r // HELP
val items = "a.bold#empty.red".split(r)
items // -> "a", ".bold", "#empty", ".red"

How can I do this?

UPDATE

Sorry, everyone, but I need to make this question harder. I'm very interested in this one:

val r = """(?<=\w)\b"""

But it fails to parse more complex strings such as:

div#question-title.title-1.h-222_333

I would like it to be parsed into:

div
#question-title
.title-1
.h-222_333

How can I improve that regex?

Answer 1

Accepted answer by Josh M.

I'm not completely sure what you need here, but this should help:

(?:\.|#)?\w+

It means a "term" is defined as an optional dot or hash followed by one or more word characters.

You will end up with:

div
#title
.title
.top
#main
.main
a
.bold
#empty
.red

Answer 2

Answer by Daniel C. Sobral

val r = """(?<=\w)\b(?!-)"""

Note that split takes a String representing a regular expression, not a Regex, so you must not convert r from String to Regex.

Brief explanation of the regex:

(?<=...) is a look-behind. It states that this match must be preceded by the pattern ..., or, in your case, \w, meaning you want the match position to follow a digit, letter, or underscore.
\b means word boundary. It is a zero-length match that happens between a word character (digit, letter, or underscore) and a non-word character, or vice versa. Because it is zero-length, split won't remove any characters when splitting.
(?!...) is a negative look-ahead. Here I use it to say that I'm not interested in word boundaries from a letter to a dash.
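
To see the three pieces working together, here is a minimal sketch that applies the pattern from this answer to the strings in the question. The object name SplitOnBoundaries and the helper parts are made up for illustration; only the pattern itself comes from the answer.

```scala
object SplitOnBoundaries {
  // Pattern from the answer; kept as a String because String.split expects one.
  val pattern: String = """(?<=\w)\b(?!-)"""

  // Hypothetical helper: split a selector at every qualifying zero-length boundary.
  def parts(selector: String): List[String] = selector.split(pattern).toList

  def main(args: Array[String]): Unit = {
    println(parts("a.bold#empty.red"))
    // prints List(a, .bold, #empty, .red)
    println(parts("div#question-title.title-1.h-222_333"))
    // prints List(div, #question-title, .title-1, .h-222_333)
  }
}
```

The boundary at the very end of the string also matches, but split drops the trailing empty string, so no extra element appears.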

Answer 3

Answer by Ken Bloom

Starting with Josh M.'s answer: he has a good regular expression, but since split takes a regular expression matching the "delimiter", you need to use findAllIn as follows:

val r = """(?:\.|#)?\w+""".r
val items = r findAllIn "a.bold#empty.red"
    //maybe you want a toList on the end also

Then you get these results:

div#title.title.top    -> List(div, #title, .title, .top)
#main.main             -> List(#main, .main)
a.bold#empty.red       -> List(a, .bold, #empty, .red)
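
The pattern above uses \w+, which does not match a hyphen, so the harder input from the update would be broken up at each dash. One possible variation, which is an assumption of this sketch and not part of any answer, is to allow hyphens inside a term with the character class [\w-]. The names FindTerms, term, and parts are hypothetical.

```scala
import scala.util.matching.Regex

object FindTerms {
  // Assumed variation: an optional '.' or '#', then word characters or hyphens.
  val term: Regex = """[.#]?[\w-]+""".r

  // Hypothetical helper: collect every term found in the selector, in order.
  def parts(selector: String): List[String] = term.findAllIn(selector).toList

  def main(args: Array[String]): Unit = {
    println(parts("div#question-title.title-1.h-222_333"))
    // prints List(div, #question-title, .title-1, .h-222_333)
    println(parts("a.bold#empty.red"))
    // prints List(a, .bold, #empty, .red)
  }
}
```

Unlike the split-based approach, this one matches the terms themselves, so any character outside the pattern (a space, for example) is silently skipped instead of being kept in a piece.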
