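PHP regex to only allow alphanumeric characters, commas, hyphens, underscores and semicolons
Disclaimer: this page is a translated copy of a popular StackOverFlow question, provided under the CC BY-SA 4.0 license. If you use it, you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow
Original URL: http://stackoverflow.com/questions/9333325/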
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverFlow
Regex to only allow alphanumeric, comma, hyphen, underscore and semicolon
Asked by Robin
I've already got a bit of working code but I need someone to help explain why it works if they can!
I am using PHP to replace anything in a string if it is not either a-z, A-Z, 0-9, a comma, a semicolon, an underscore or a hyphen (which ultimately should represent either a single username, or a comma/semicolon separated list of usernames).
The following works:
$data = preg_replace('/[^,;a-zA-Z0-9_-]/s', '', $data);
But the following does not:
$data = preg_replace('/[^a-zA-Z0-9_-,;]/s', '', $data);
Why will this only work when the comma and semicolon are at the start? Putting them at the end seems to break things (this is what I tried initially when I came across /[^a-zA-Z0-9_-]/s).
As an aside, I am also using the following to trim any trailing semicolons (plural) or commas (plural). Can someone suggest a more efficient and/or elegant way to do this?
if (preg_match('/;$/', $data))
{
    $data = rtrim($data, ';');
}
if (preg_match('/,$/', $data))
{
    $data = rtrim($data, ',');
}
Thanks for any help :)
Answered by Justin Morgan
It's not the comma and semicolon causing your problem; it's the hyphen. Look at the parts of your character class and consider what they mean:
0-9 # Anything from '0' to '9', meaning 0, 1, 2, ... 9
A-Z # Anything from 'A' to 'Z', meaning A, B, C, ... Z
_-, # Anything from '_' to ',', meaning...uh...hmmm.
There's no valid progression from _ to , (in ASCII, _ is 95 and , is 44, so the range runs backwards), so the regex engine refuses to compile the pattern: preg_replace() emits a "range out of order in character class" warning and returns NULL. In character classes, if you want a hyphen to be interpreted literally, it needs to be at the very beginning of the class (right after the ^ in a negated class) or at the very end, or be escaped with a backslash. So any of these will work:
[^,;a-zA-Z0-9_-]
[^-,;a-zA-Z0-9_]
[^a-zA-Z0-9_\-,;]
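To see the difference in practice, here is a minimal sketch that runs the three working classes and the broken one against the same string. The sample input and variable names are made up for illustration; the @ operator is only there to hide the compile warning so the return value can be inspected.

```php
<?php
// Hypothetical sample input: usernames plus characters that should be stripped.
$input = "alice, bob-smith;carol_99!?";

// Hyphen last, hyphen first (right after ^), and hyphen escaped: all literal.
$working = [
    '/[^,;a-zA-Z0-9_-]/',
    '/[^-,;a-zA-Z0-9_]/',
    '/[^a-zA-Z0-9_\-,;]/',
];

$results = [];
foreach ($working as $pattern) {
    $results[] = preg_replace($pattern, '', $input);
}

// Unescaped hyphen between '_' and ',': a backwards range, so the pattern
// fails to compile and preg_replace() returns NULL instead of a string.
$broken = @preg_replace('/[^a-zA-Z0-9_-,;]/', '', $input);

var_dump($results); // three copies of "alice,bob-smith;carol_99"
var_dump($broken);  // NULL
```

Checking for a NULL return is worthwhile in real code, since assigning it back to $data silently throws the input away.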
As for trimming off the end, you can do all of this in one regex replace (the + makes it remove a whole run of trailing commas and semicolons, not just the last one):
$data = preg_replace('/[^,;a-zA-Z0-9_-]|[,;]+$/s', '', $data);
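One caveat with a single-pass pattern of this kind: matching happens against the original string, so separators that only become trailing after disallowed characters are stripped are left behind. The sketch below, using a made-up input and a + quantifier on the trailing part, compares one pass with two passes.

```php
<?php
// Hypothetical input: junk characters follow the trailing separators.
$data = "alice,bob;;!!";

// One pass: strip disallowed characters OR a run of separators at the end.
// The ";;" is not at the end of the original string, so it survives.
$onePass = preg_replace('/[^,;a-zA-Z0-9_-]|[,;]+$/', '', $data);

// Two passes: strip disallowed characters first, then trim the separators.
$stripped = preg_replace('/[^,;a-zA-Z0-9_-]/', '', $data);
$twoPass  = preg_replace('/[,;]+$/', '', $stripped);

var_dump($onePass); // "alice,bob;;"
var_dump($twoPass); // "alice,bob"
```

If the input can contain junk after the last separator, the two-pass version is probably the safer choice.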
Answered by Devin Ceartas
I believe it's the placement of the hyphen that matters: it has to be at the start or end of the character class to be a literal hyphen; otherwise it is used to define a range.
Answered by iDifferent
You can escape the hyphen as \- and then put it anywhere in the character class.
As for the trailing semicolons and commas, try /[,;]+$/ which should match any commas and semicolons at the end, even if there are many.
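As a rough comparison, the sketch below runs that pattern next to two rtrim() variants on a made-up input ending in a mixed run of separators. rtrim() accepts a list of characters as its second argument, so a single call can cover both commas and semicolons; trimming them one after the other, as in the question, can stop early.

```php
<?php
// Hypothetical input ending in a mixed run of separators.
$data = "alice,bob;,,;";

// The question's approach: trim semicolons, then commas, once each.
// It stops at the semicolon that sat in front of the commas.
$sequential = rtrim(rtrim($data, ';'), ',');

// Regex from the answer: one run of commas/semicolons at the end.
$viaRegex = preg_replace('/[,;]+$/', '', $data);

// No regex at all: rtrim() with both characters in its character list.
$viaRtrim = rtrim($data, ',;');

var_dump($sequential); // "alice,bob;"
var_dump($viaRegex);   // "alice,bob"
var_dump($viaRtrim);   // "alice,bob"
```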