Regular Expression Sanitize (PHP)
Original question: http://stackoverflow.com/questions/3022185/
Note: this content comes from Stack Overflow and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on Stack Overflow.
Asked by Atif Mohammed Ameenuddin
I would like to sanitize a string so that it can be used in a URL. This is basically what I need:
- Everything must be removed except alphanumeric characters, spaces, and dashes.
- Spaces should be converted into dashes.
For example:
This, is the URL!
must return
this-is-the-url
Answer by SilentGhost
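function slug($z) {
    $z = strtolower($z);                              // lowercase everything
    $z = preg_replace('/[^a-z0-9 -]+/', '', $z);      // keep only letters, digits, spaces and dashes
    $z = str_replace(' ', '-', $z);                   // spaces become dashes
    return trim($z, '-');                             // drop leading and trailing dashes
}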
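One edge case in the function above: every space becomes its own dash, so an input such as 'a  b' or 'a - b' ends up with several dashes in a row. A minimal sketch of a variant that collapses such runs, assuming that behaviour is wanted (the name slugCollapsed is hypothetical and not part of the answer):

```php
<?php
// Hypothetical variant of slug(): same steps, but a run of spaces and/or
// dashes collapses into a single dash.
function slugCollapsed(string $text): string
{
    $text = strtolower($text);
    // Drop everything except lowercase letters, digits, spaces and dashes.
    $text = preg_replace('/[^a-z0-9 -]+/', '', $text);
    // Turn each run of spaces and/or dashes into one dash.
    $text = preg_replace('/[ -]+/', '-', $text);
    // Remove dashes left at either end.
    return trim($text, '-');
}

echo slugCollapsed('This, is the URL!'), "\n"; // this-is-the-url
```

For the example in the question both versions give the same result; they only differ when the input contains repeated separators.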
Answer by Rooneyl
First, strip the unwanted characters:
$new_string = preg_replace("/[^a-zA-Z0-9\s]/", "", $string);
Then replace the whitespace with dashes:
$url = preg_replace('/\s/', '-', $new_string);
Finally, URL-encode it so that it is ready for use:
$new_url = urlencode($url);
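The three steps above can be put together as one function. This is only a sketch: the name toUrlPart is hypothetical, and the body reuses the answer's own patterns unchanged. Two things are worth knowing about them: the first pattern does not allow dashes, so dashes already in the input are removed, and letter case is preserved, so strtolower() would still be needed to get the lowercase result from the question.

```php
<?php
// Hypothetical wrapper around the three steps from the answer.
function toUrlPart(string $string): string
{
    // 1. Strip everything except ASCII letters, digits and whitespace.
    $stripped = preg_replace('/[^a-zA-Z0-9\s]/', '', $string);
    // 2. Replace each whitespace character (space, tab, newline) with a dash.
    $dashed = preg_replace('/\s/', '-', $stripped);
    // 3. URL-encode; letters, digits and dashes pass through unchanged.
    return urlencode($dashed);
}

echo toUrlPart('This, is the URL!'), "\n"; // This-is-the-URL
```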
Answer by user1484291
This will do it in a Unix shell (I just tried it on macOS):
$ tr -cs A-Za-z '-' < infile.txt > outfile.txt
I got the idea from the blog post "More Shell, Less Egg".
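For comparison with the PHP answers: tr -cs A-Za-z '-' turns every run of characters that are not ASCII letters into a single dash, so digits are dropped, case is kept, and the trailing newline becomes a dash too. A rough PHP counterpart might look like this (the function name lettersAndDashes is hypothetical, and the regex works byte-wise, which is an assumption about how closely it matches tr on non-ASCII input):

```php
<?php
// Rough PHP counterpart of: tr -cs A-Za-z '-'
// Each run of characters that are not ASCII letters becomes one dash.
function lettersAndDashes(string $text): string
{
    return preg_replace('/[^A-Za-z]+/', '-', $text);
}

echo lettersAndDashes("This, is the URL!\n"), "\n"; // This-is-the-URL-
```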
Answer by Abhishek Goel
Try this:
function clean($string) {
    $string = str_replace(' ', '-', $string); // Replaces all spaces with hyphens.
    $string = preg_replace('/[^A-Za-z0-9\-]/', '', $string); // Removes special chars.
    return preg_replace('/-+/', '-', $string); // Replaces multiple hyphens with a single one.
}
Usage:
echo clean('a|"bc!@£de^&$f g');
Will output: abcdef-g
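Note that clean() keeps letter case and any hyphens at the start or end, so ' This, is the URL! ' would come out as '-This-is-the-URL-'. A sketch of an extension that also trims and lowercases, to match the format asked for in the question (cleanSlug is a hypothetical name, and the first three steps are the answer's own):

```php
<?php
// Hypothetical extension of clean(): same three steps, then trim and
// lowercase so the result matches the format asked for in the question.
function cleanSlug(string $string): string
{
    $string = str_replace(' ', '-', $string);                 // spaces -> hyphens
    $string = preg_replace('/[^A-Za-z0-9\-]/', '', $string);  // drop other characters
    $string = preg_replace('/-+/', '-', $string);             // collapse hyphen runs
    return strtolower(trim($string, '-'));                    // tidy ends, lowercase
}

echo cleanSlug(' This, is the URL! '), "\n"; // this-is-the-url
```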
Answer by Denis Matafonov
All the previous answers deal with URLs, but in case someone needs to sanitize a string for a login (for example) and keep it as text, here you go:
function sanitizeText($str) {
    $withSpecCharacters = htmlspecialchars($str);
    $splitted_str = str_split($str);
    $result = '';
    foreach ($splitted_str as $letter) {
        if (strpos($withSpecCharacters, $letter) !== false) {
            $result .= $letter;
        }
    }
    return $result;
}
echo sanitizeText('ОРРииыфвсси ajvnsakjvnHB "&nvsp;\n" <script>alert()</script>');
//ОРРииыфвсси ajvnsakjvnHB &nvsp;\n scriptalert()/script
// No injections possible, and as much of the input as possible is kept
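The loop above keeps each byte of the input that still appears somewhere in the htmlspecialchars() output. In effect it drops <, > and " while keeping &, and it also drops ' when single quotes are escaped, which is the default from PHP 8.1 on. A more direct sketch of that effect, assuming this is the intent (stripHtmlDelimiters is a hypothetical name, and unlike the original it removes single quotes on every PHP version):

```php
<?php
// Hypothetical, more direct version: remove the characters that
// htmlspecialchars() would escape, except for the ampersand.
function stripHtmlDelimiters(string $str): string
{
    return str_replace(['<', '>', '"', "'"], '', $str);
}

echo stripHtmlDelimiters('a "b" <i>c</i> & d'), "\n"; // a b ic/i & d
```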
Answer by DjimOnDev
You should use the slugify package and not reinvent the wheel ;)
Answer by Adeel Raza Azeemi
The following will replace spaces with dashes.
$str = str_replace(' ', '-', $str);
Then the following statement will remove everything except alphanumeric characters and dashes. (There are no spaces left at this point, because the previous step replaced them with dashes.)
// Character ranges: 0-9, A-Z, a-z, and the dash
$str = preg_replace('/[^\x30-\x39\x41-\x5A\x61-\x7A\x2D]/', '', $str);
Which is equivalent to
$str = preg_replace('/[^0-9A-Za-z-]+/', '', $str);
FYI: To remove every character outside the printable ASCII range (control characters and non-ASCII bytes) from a string, use
$str = preg_replace('/[^\x20-\x7E]/', '', $str);
\x20 is the hexadecimal code for the space character, which is the first printable ASCII character, and \x7E is the tilde, which is the last one. See Wikipedia: https://en.wikipedia.org/wiki/ASCII#Printable_characters
FYI: look at the Hex column for the range 20-7E.
Printable characters: "Codes 20hex to 7Ehex, known as the printable characters, represent letters, digits, punctuation marks, and a few miscellaneous symbols. There are 95 printable characters in total."
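To illustrate the printable-range pattern from this answer, here is a small sketch (printableAsciiOnly is a hypothetical name). Because the pattern is applied byte by byte, a multi-byte UTF-8 character such as é is removed entirely, along with tabs and newlines:

```php
<?php
// Keep only printable ASCII (space through tilde); everything else,
// including control characters and non-ASCII bytes, is removed.
function printableAsciiOnly(string $str): string
{
    return preg_replace('/[^\x20-\x7E]/', '', $str);
}

echo printableAsciiOnly("Caf\u{00E9}\tmenu~\n"), "\n"; // Cafmenu~
```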

