Original URL: http://stackoverflow.com/questions/9421948/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverFlow
How to match a pound (#) symbol in a regex in php (for hashtags)
Asked by J-Rou
Very simple, I need to match the # symbol using a regex. I'm working on a hashtag detector.
I've tried searching on Google and on Stack Overflow. One related post is here, but since he wanted to remove the # symbol from the string, he didn't use a regex.
I've tried the regexes /\b\#\w\w+/ and /\b#\w\w+/ and they don't work, and if I remove the #, it detects the word.
Accepted answer by webbiedave
You don't need to escape it (it's probably the \b that's throwing it off):
if (preg_match('/^\w+#(\w+)/', 'abc#def', $matches)) {
    print_r($matches);
}
/* output of $matches:
Array
(
    [0] => abc#def
    [1] => def
)
*/
Answer by Niet the Dark Absol
# does not have any special meaning in a regex, unless you use it as the delimiter. So just put it straight in and it should work.
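The delimiter point can be illustrated with a minimal sketch. The sample text and variable names below are hypothetical, and the third case (the x modifier, under which PCRE treats an unescaped # as the start of a comment) is an addition not mentioned in the answer:

```php
<?php
// Hypothetical sample text, not from the original post.
$text = 'learning #php today';

// '/' as delimiter: '#' is an ordinary character and needs no escaping.
preg_match('/#(\w+)/', $text, $slashDelim);

// '#' as delimiter: a literal '#' inside the pattern must be escaped.
preg_match('#\#(\w+)#', $text, $hashDelim);

// With the x (extended) modifier an unescaped '#' starts a comment,
// so it has to be escaped to be matched literally.
preg_match('/\#(\w+)/x', $text, $extended);

print_r([$slashDelim[1], $hashDelim[1], $extended[1]]);
```

All three calls should capture the same word, which suggests that escaping # is harmless but only required in the last two situations.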
Note that \b detects a word boundary, and in #abc, the word boundary is after the # and before the abc. Therefore, the \b is superfluous and you just need #\w\w+.
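To see the word-boundary behaviour described here, the following sketch compares the question's pattern with the plain one. The input strings and variable names are made up for illustration:

```php
<?php
// Hypothetical inputs chosen to show where \b can and cannot match.
$patternWithB = '/\b#\w\w+/';
$patternPlain = '/#\w\w+/';

// No word character precedes '#', so there is no boundary in front of
// it and the pattern with \b finds nothing.
$startsWithHash = preg_match($patternWithB, '#php rocks');

// After a word character there is a boundary right before '#'.
$afterLetter = preg_match($patternWithB, 'x#php');

// Without \b, every '#' followed by two or more word characters matches.
preg_match_all($patternPlain, '#php rocks, x#php', $all);

var_dump($startsWithHash, $afterLetter, $all[0]);
```

Note that the plain pattern also matches the #php inside x#php, which may or may not be wanted for a hashtag detector.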
Answer by Lasse Nielsen
With the comment on the earlier answer, you want to avoid matching x#x. In that case, you don't need \b but \B:
\B#(\w\w+)
(if you really need two-or-more word characters after the #).
The \B means NON-word-boundary, and since # is not a word character, this matches exactly when the preceding character is not a word character (or when # is at the start of the string).
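A short sketch of the \B pattern in use, assuming a made-up sample string that mixes real hashtags with cases that should be rejected:

```php
<?php
// Hypothetical sample text, not from the original answer.
$text = '#php is fun, see x#skip and (#regex) or #a';

// \B requires that the character before '#' is not a word character;
// (\w\w+) captures a tag of two or more word characters.
preg_match_all('/\B#(\w\w+)/', $text, $matches);

print_r($matches[1]);
```

Here x#skip is rejected because a word character precedes the #, and #a is rejected because the tag has only one word character.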
Answer by will
For what it is worth, I only managed to match a hash (#) character as a string. In awk, the parser strips out comments first. The only syntax that can 'hold' a # is
"#"
So in my case I took out lines containing only comments with:
== "#" { next; }
I also attempted to make the hash a regex:
HASH_PATTERN = "^#"
~ HASH_PATTERN { next; }
... This also works. So I'm thinking you can put the whole expression in a string like HASH_PATTERN.
The string equals does work quite well. It isn't a perfect solution, just a starter.
Answer by Highway of Life
You could use the regex /\#(\w+)/ to capture just the hashtag word, or /\#\w+/ to match the entire hashtag including the hash.
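With preg_match_all, a single pattern with a capture group can provide both forms at once. This is a minimal sketch; the sample text and variable names are hypothetical:

```php
<?php
// Hypothetical sample text, not from the original answer.
$text = 'Loving #php and #regex';

// Index 0 holds the full matches, index 1 the first capture group.
preg_match_all('/#(\w+)/', $text, $found);

$fullTags = $found[0]; // whole hashtags, including the '#'
$words    = $found[1]; // only the captured word after the '#'

print_r($fullTags);
print_r($words);
```

The # is left unescaped here because / is the delimiter and no x modifier is used; escaping it as \# should behave the same way.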