How to match a pound (#) symbol in a regex in PHP (for hashtags)

Note: this page reproduces a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/9421948/

php regex

Asked by J-Rou

Very simple: I need to match the # symbol using a regex. I'm working on a hashtag detector.

I've tried searching on Google and on Stack Overflow. One related post is here, but since he wanted to remove the # symbol from the string, he didn't use a regex.

I've tried the regexes /\b\#\w\w+/ and /\b#\w\w+/, and they don't work; if I remove the #, it detects the word.

Accepted answer by webbiedave

You don't need to escape it (it's probably the \b that's throwing it off):

if (preg_match('/^\w+#(\w+)/', 'abc#def', $matches)) {
    print_r($matches);
}

/* output of $matches:
Array
(
    [0] => abc#def
    [1] => def
)
*/

Answer by Niet the Dark Absol

# does not have any special meaning in a regex, unless you use it as the delimiter. So just put it straight in and it should work.
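
As a minimal sketch of the delimiter point, the snippet below runs the same hashtag pattern with / and with # as the delimiter. The sample string and variable names are assumptions made for illustration. The third call goes slightly beyond the answer: it assumes the x (extended) modifier, under which an unescaped # starts a comment, so the escaped form is used there.

```php
<?php
// Hypothetical sample input; not taken from the original answers.
$text = 'see #php';

// With / as the delimiter, # is an ordinary character and needs no escape.
preg_match('/#(\w+)/', $text, $slashDelimited);

// With # as the delimiter, a literal # inside the pattern must be escaped.
preg_match('#\#(\w+)#', $text, $hashDelimited);

// Assumption beyond the answer: under the x (extended) modifier an
// unescaped # starts a comment, so the escaped form is the safe one there.
preg_match('/\#(\w+)/x', $text, $extended);

echo $slashDelimited[1], ' ', $hashDelimited[1], ' ', $extended[1], PHP_EOL;
```

All three calls capture the same word, so escaping the # is harmless even where it is not required.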

Note that \b detects a word boundary, and in #abc, the word boundary is after the # and before the abc. A \b in front of the # therefore only matches when a word character precedes the #, so drop it: you just need #\w\w+.
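
The boundary behaviour can be checked with a few calls to preg_match. This is only a sketch; the input strings and variable names are made up for illustration.

```php
<?php
// Hypothetical sample inputs; not taken from the original question.
$pattern = '/\b#\w\w+/';

// No word character precedes the #, so \b cannot match in front of it.
$atStart = preg_match($pattern, '#php');          // 0

// Here the boundary sits between "c" and "#", so the pattern matches.
$afterWord = preg_match($pattern, 'abc#def');     // 1

// Dropping \b finds the hashtag wherever it appears.
$noBoundary = preg_match('/#\w\w+/', '#php', $m); // 1, $m[0] is "#php"

var_dump($atStart, $afterWord, $noBoundary);
```

This reproduces the symptom from the question: with \b in front, a hashtag at the start of the text is not detected.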

Answer by Lasse Nielsen

Going by your comment on the earlier answer, you want to avoid matching x#x. In that case, you don't need \b but \B:

\B#(\w\w+)

(if you really need two-or-more word characters after the #).

The \B means NON-word-boundary, and since # is not a word character, this matches exactly when the previous character is not a word character (or when the # is at the start of the string).
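
To see the effect of \B, the sketch below runs the pattern with and without it over the same text using preg_match_all. The sample text and variable names are assumptions for illustration only.

```php
<?php
// Hypothetical sample text; not taken from the original question.
$text = 'foo#bar #php (#regex) #a';

// \B requires that the # is NOT preceded by a word character.
preg_match_all('/\B#(\w\w+)/', $text, $withNonBoundary);

// Without \B, the # inside "foo#bar" is picked up as well.
preg_match_all('/#(\w\w+)/', $text, $withoutNonBoundary);

print_r($withNonBoundary[1]);    // php, regex
print_r($withoutNonBoundary[1]); // bar, php, regex
```

In both runs "#a" is skipped because \w\w+ requires at least two word characters after the #.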

Answer by will

For what it is worth, I only managed to match a hash (#) character as a string. In awk, the parser strips out comments first. The only syntax that can 'hold' a # is

"#"

So in my case I took out lines containing only comments like this:

 == "#" { next; }

I also attempted to make the hash a regex:

HASH_PATTERN = "^#"

 ~ HASH_PATTERN { next; }

... This also works. So I'm thinking you can put the whole expression in a string like HASH_PATTERN.

The string equality comparison does work quite well. It isn't a perfect solution, just a starter.

Answer by Highway of Life

You could use the following regex: /\#(\w+)/ to match a hashtag and capture just the hashtag word, or /\#\w+/ to match the entire hashtag including the hash.
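
One way to collect both forms at once is preg_match_all with the capturing pattern, since the full matches and the captured words come back in separate arrays. The helper name extractHashtags, its return shape, and the sample sentence are hypothetical and only serve the illustration.

```php
<?php
// Hypothetical helper; the name and return shape are illustrative only.
function extractHashtags(string $text): array
{
    preg_match_all('/\#(\w+)/', $text, $matches);

    return [
        'tags'  => $matches[0], // full matches, including the #
        'words' => $matches[1], // first capture group, without the #
    ];
}

$result = extractHashtags('Learning #php and #regex today');
print_r($result);
```

Note that this pattern has no \B in front, so it would also pick up a # that sits in the middle of a word.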
