Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and authors, and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/6497685/

Time: 2020-08-26 00:33:47  Source: igfitidea

How do I detect non-ASCII characters in a string?

php string

Asked by rid

If I have a PHP string, how can I determine if it contains at least one non-ASCII character or not, in an efficient way? And by non-ASCII character, I mean any character that is not part of this table, http://www.asciitable.com/, positions 32 - 126 inclusive.

So not only does it have to be part of the ASCII table, but it also has to be printable. I want to detect a string that contains at least one character that does not meet these specifications (either non-printable ASCII, or a different character altogether, such as a Unicode character that is not part of that table).

Answered by Karolis

I found it more useful to detect whether any character falls outside the list:

if(preg_match('/[^\x20-\x7e]/', $string))
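
As a minimal sketch, the check above can be wrapped in a reusable function. The name has_non_printable_ascii is hypothetical and not part of the original answer. Without the /u modifier the pattern is applied byte by byte, so every byte of a multi-byte UTF-8 character falls outside the range and triggers a match.

```php
<?php
// Hypothetical helper (the name is an assumption, not from the original answer).
// Returns true when $string contains at least one byte outside 0x20-0x7E.
function has_non_printable_ascii(string $string): bool
{
    // preg_match() returns 1 on a match, 0 on no match, and false on error.
    return preg_match('/[^\x20-\x7e]/', $string) === 1;
}

var_dump(has_non_printable_ascii('Hello, world!')); // bool(false)
var_dump(has_non_printable_ascii("tab\there"));      // bool(true): tab is a control character
var_dump(has_non_printable_ascii("na\xC3\xAFve"));   // bool(true): UTF-8 bytes of "ï"
```

Comparing against 1 rather than relying on truthiness keeps a regex error (false) from being mistaken for "no match".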

Answered by Gumbo

You can use mb_detect_encoding and check for ASCII:

mb_detect_encoding($str, 'ASCII', true)

This will return false if $str contains at least one non-ASCII character (byte value > 0x7F).

Answered by Hans Kerkhof

Answered by Steffen

The function ctype_print returns true iff all characters fall into the ASCII range 32-126 (PHP unit test).
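
A short sketch of how this behaves in practice. Two points here are assumptions on my part rather than something stated in the answer: ctype functions follow the current LC_CTYPE locale, so the example pins it to "C", and an empty string is reported as not printable, which may matter if empty input should count as valid.

```php
<?php
// Assumption: pin the locale so the 32-126 range described above applies.
setlocale(LC_CTYPE, 'C');

var_dump(ctype_print('Hello, world!')); // bool(true)
var_dump(ctype_print("line\nbreak"));   // bool(false): newline is a control character
var_dump(ctype_print("na\xC3\xAFve"));  // bool(false): bytes above 0x7E
var_dump(ctype_print(''));              // bool(false): empty strings never pass
```

If empty strings should be accepted, check for `$s === ''` separately before calling ctype_print.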

Answered by Hamid Sarfraz

Try: (Source)

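// Returns true when $string contains no byte in the range 0x80-0xFF.
// Control characters (0x00-0x1F, 0x7F) still count as ASCII here.
function is_ascii( $string = '' ) {
    return ! preg_match( '/[\x80-\xff]/', $string );
}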

Although all of the above answers are correct, these solutions may give wrong answers depending on the input. See the last section in this ASCII validation post.
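
One input-dependent difference can be sketched as follows. This illustrates a general point and is not necessarily the case the linked post describes: a 7-bit check such as is_ascii accepts control characters, while a printable-range check rejects them. The helper is_printable_ascii is a hypothetical name introduced only for this comparison.

```php
<?php
// Same behavior as is_ascii() above: true when no byte is >= 0x80.
function is_ascii($string = '') {
    return !preg_match('/[\x80-\xff]/', $string);
}

// Hypothetical helper: true when every byte is printable ASCII (0x20-0x7E).
function is_printable_ascii(string $string): bool {
    return preg_match('/[^\x20-\x7e]/', $string) === 0;
}

var_dump(is_ascii("tab\there"));              // bool(true): a tab is 7-bit ASCII
var_dump(is_printable_ascii("tab\there"));    // bool(false): but it is not printable
var_dump(is_ascii("na\xC3\xAFve"));           // bool(false)
var_dump(is_printable_ascii("na\xC3\xAFve")); // bool(false)
```

Which of the two is "right" depends on whether the requirement is 7-bit ASCII or the printable range 32-126 that the question asks for.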

Answered by fyr

You could use:

mb_detect_encoding

but it may not be as precise as you want it to be.

Answered by Ole Media

I suggest you look into utf8_encode or utf8_decode in PHP's manual:

http://www.php.net/manual/en/function.utf8-encode.php

Look at the examples further down that page; they may point you in the right direction if you do not find exactly what you are looking for.

Answered by loretoparisi

If you do not want to deal with Regex in JavaScript, you can do:

// Returns true if s contains at least one character with a code above 127.
function detectUf8(s) {
  var utf8 = s.split('').filter(function(C) {
    return C.charCodeAt(0) > 127;
  });
  return utf8.length > 0;
}