PHP: How do I detect non-ASCII characters in a string?
Original question: http://stackoverflow.com/questions/6497685/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Asked by rid
If I have a PHP string, how can I determine if it contains at least one non-ASCII character or not, in an efficient way? And by non-ASCII character, I mean any character that is not part of this table, http://www.asciitable.com/, positions 32 - 126 inclusive.
So not only does it have to be part of the ASCII table, but it also has to be printable. I want to detect a string that contains at least one character that does not meet these specifications (either non-printable ASCII, or a different character altogether, such as a Unicode character that is not part of that table).
Answered by Karolis
I found it more useful to detect whether any character falls outside the list:
if(preg_match('/[^\x20-\x7e]/', $string))
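The condition above can be wrapped in a small helper. This is a minimal sketch: the function name `has_non_printable_ascii` is hypothetical, and only the pattern comes from the answer. Without the `u` modifier the pattern is matched byte by byte, so any byte of a multi-byte UTF-8 character also counts as a match.

```php
<?php
// Hypothetical helper name; the pattern is the one from the answer above.
function has_non_printable_ascii(string $string): bool
{
    // preg_match() returns 1 on a match, 0 on no match, false on error.
    return preg_match('/[^\x20-\x7e]/', $string) === 1;
}

var_dump(has_non_printable_ascii('Hello, world!')); // bool(false)
var_dump(has_non_printable_ascii("tab\there"));      // bool(true): tab is byte 9
var_dump(has_non_printable_ascii("caf\xc3\xa9"));    // bool(true): UTF-8 bytes above 0x7E
```

Note that an empty string yields false here, because it contains no offending byte.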
Answered by Gumbo
You can use mb_detect_encoding and check for ASCII:
mb_detect_encoding($str, 'ASCII', true)
This will return false if $str contains at least one non-ASCII character (byte value > 0x7F).
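A short sketch of that check in use, assuming the mbstring extension is loaded. The helper name `is_seven_bit_ascii` is made up for illustration. The check is about 7-bit bytes, not the printable range 32-126 from the question: a newline passes. Only a newline is used as the control-character sample here, since the handling of other control bytes may differ between PHP versions.

```php
<?php
// Hypothetical helper around the call from the answer; needs ext-mbstring.
function is_seven_bit_ascii(string $str): bool
{
    // In strict mode the function returns the encoding name or false.
    return mb_detect_encoding($str, 'ASCII', true) !== false;
}

var_dump(is_seven_bit_ascii('Hello'));        // bool(true)
var_dump(is_seven_bit_ascii("line\nbreak"));  // bool(true): newline is still 7-bit
var_dump(is_seven_bit_ascii("caf\xc3\xa9"));  // bool(false): bytes above 0x7F
```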
Answered by Hans Kerkhof
Try (mb_detect_encoding)
Answered by Steffen
The function ctype_print returns true iff all characters fall into the ASCII range 32-126 (PHP unit test).
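A minimal sketch of ctype_print on a few inputs, assuming the ctype extension is available. The `setlocale` call is an addition for this example, since the function is locale-aware and pinning the "C" locale keeps the result for bytes above 0x7F predictable.

```php
<?php
// Assumption for this sketch: pin the character-type locale to "C".
setlocale(LC_CTYPE, 'C');

var_dump(ctype_print('Hello, world!')); // bool(true)
var_dump(ctype_print("tab\there"));      // bool(false): tab is a control character
var_dump(ctype_print("caf\xc3\xa9"));    // bool(false): bytes outside 32-126
var_dump(ctype_print(''));               // bool(false): the empty string is rejected
```

Because the empty string returns false, callers that treat an empty value as acceptable need to handle that case separately.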
Answered by Hamid Sarfraz
Try: (Source)
function is_ascii( $string = '' ) {
    return ( bool ) ! preg_match( '/[\x80-\xff]+/' , $string );
}
Although all of the above answers are correct, these solutions may give wrong answers depending on the input. See the last section in this ASCII validation post.
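One kind of input where the checks in this thread visibly disagree is control characters. This is only an illustration and not necessarily the case the linked post describes. is_ascii rejects only bytes 0x80-0xFF, so a 7-bit control byte passes it, while the 32-126 pattern from the earlier answer flags the same string.

```php
<?php
// is_ascii() as given in the answer above.
function is_ascii($string = '') {
    return (bool) ! preg_match('/[\x80-\xff]+/', $string);
}

$input = "bell\x07"; // 7-bit, but byte 7 is not printable

var_dump(is_ascii($input));                             // bool(true)
var_dump((bool) preg_match('/[^\x20-\x7e]/', $input));  // bool(true): outside 32-126
```

Which result is "right" depends on whether the requirement is 7-bit ASCII or printable ASCII; the question asks for the latter.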
Answered by fyr
You could use:
but it may not be as precise as you want it to be.
Answered by Ole Media
I suggest you look into utf8_encode or utf8_decode in the PHP manual:
http://www.php.net/manual/en/function.utf8-encode.php
Look at the examples further down that page; even if you do not find exactly what you are looking for, something there may point you in the right direction.
Answered by loretoparisi
If you do not want to deal with regex, in JavaScript you can do:
detectUf8 : function(s) {
    var utf8 = s.split('').filter(function(C) {
        return C.charCodeAt(0) > 127;
    });
    return (utf8.join('').length > 0);
},
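For readers staying in PHP, the same regex-free idea (scan the string and compare each value against 127) might look roughly like the sketch below. The helper name `has_high_byte` is hypothetical. PHP strings are byte sequences, so this inspects bytes, whereas the JavaScript version inspects UTF-16 code units. Like that version, it does not flag control characters below 32.

```php
<?php
// Hypothetical helper: true if any byte in $s is above 0x7F.
function has_high_byte(string $s): bool
{
    $length = strlen($s);
    for ($i = 0; $i < $length; $i++) {
        if (ord($s[$i]) > 127) {
            return true; // stop at the first non-ASCII byte
        }
    }
    return false;
}

var_dump(has_high_byte('plain text'));   // bool(false)
var_dump(has_high_byte("caf\xc3\xa9"));  // bool(true)
```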