Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and authors, and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/9361303/

Date: 2020-08-26 06:39:25  Source: igfitidea

Can I get the Unicode value of a character, or vice versa, with PHP?

Tags: php, unicode, utf-8

Asked by Totoro

Is it possible to input a character and get its Unicode value back? For example, I can put &#12103; in HTML to output "⽇". Is it possible to pass that character as an argument to a function and get the number as output, without building a Unicode table?

$val = someFunction("⽇"); // returns 12103

or the reverse?

$val2 = someOtherFunction(12103); // returns "⽇"

I would like to be able to output the actual characters to the page, not the codes, and I would also like to be able to get the code from the character if possible. The closest I have come to what I want is php.net/manual/en/function.mb-decode-numericentity.php, but I can't get it working. Is this the code I need, or am I on the wrong track?

Answered by Mark Baker

function _uniord($c) {
    // $c[0] replaces the original $c{0}; curly-brace offsets were removed in PHP 8.
    $h = ord($c[0]);                //  lead byte determines the sequence length
    if ($h <= 127)                  //  1 byte (ASCII)
        return $h;
    if ($h >= 192 && $h <= 223)     //  2 bytes
        return ($h-192)*64 + (ord($c[1])-128);
    if ($h >= 224 && $h <= 239)     //  3 bytes
        return ($h-224)*4096 + (ord($c[1])-128)*64 + (ord($c[2])-128);
    if ($h >= 240 && $h <= 247)     //  4 bytes
        return ($h-240)*262144 + (ord($c[1])-128)*4096 + (ord($c[2])-128)*64 + (ord($c[3])-128);
    if ($h >= 248 && $h <= 251)     //  5 bytes (legacy form, not valid in current UTF-8)
        return ($h-248)*16777216 + (ord($c[1])-128)*262144 + (ord($c[2])-128)*4096 + (ord($c[3])-128)*64 + (ord($c[4])-128);
    if ($h >= 252 && $h <= 253)     //  6 bytes (legacy form, not valid in current UTF-8)
        return ($h-252)*1073741824 + (ord($c[1])-128)*16777216 + (ord($c[2])-128)*262144 + (ord($c[3])-128)*4096 + (ord($c[4])-128)*64 + (ord($c[5])-128);
    if ($h >= 254)                  //  error
        return FALSE;
    return 0;                       //  continuation byte (128-191) in lead position
}   //  function _uniord()

and

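function _unichr($o) {
    if (function_exists('mb_convert_encoding')) {
        // Note: the 'HTML-ENTITIES' encoding is deprecated as of PHP 8.2.
        return mb_convert_encoding('&#'.intval($o).';', 'UTF-8', 'HTML-ENTITIES');
    } else {
        // Fallback yields a single byte, so it is only correct for ASCII (0-127).
        return chr(intval($o));
    }
}   // function _unichr()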
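
As a worked example of the arithmetic in _uniord, the sketch below decodes "⽇" (the character from the question) by hand. It assumes the character's UTF-8 bytes are E2 BD 87; the variable names are illustrative only.

```php
<?php
// UTF-8 bytes of U+2F47 (decimal 12103), written as escapes so the
// source file encoding does not matter.
$char = "\xE2\xBD\x87";

// Split into byte values: 226, 189, 135.
$bytes = array_map('ord', str_split($char));

// The lead byte 226 lies in 224..239, so the three-byte branch applies:
// (226-224)*4096 + (189-128)*64 + (135-128) = 8192 + 3904 + 7
$codePoint = ($bytes[0] - 224) * 4096
           + ($bytes[1] - 128) * 64
           + ($bytes[2] - 128);

echo $codePoint, "\n"; // 12103
```

Each multiplier is a power of 64 because every continuation byte contributes six payload bits.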

Answered by bobince

Here's a more compact implementation of unichr/uniord based on pack:

// code point to UTF-8 string
function unichr($i) {
    return iconv('UCS-4LE', 'UTF-8', pack('V', $i));
}

// UTF-8 string to code point
function uniord($s) {
    return unpack('V', iconv('UTF-8', 'UCS-4LE', $s))[1];
}
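
A possible variant of the same pack/iconv idea with basic error handling, in case the input may be empty or malformed. The function names codepoint_to_utf8 and utf8_to_codepoint are hypothetical, and the choice to throw exceptions is an assumption, not part of the original answer.

```php
<?php
// Hypothetical helper names; the conversions follow the pack/iconv approach above.

// Code point to UTF-8 string; throws if iconv rejects the value.
function codepoint_to_utf8(int $codePoint): string {
    $utf8 = @iconv('UCS-4LE', 'UTF-8', pack('V', $codePoint));
    if ($utf8 === false) {
        throw new InvalidArgumentException("Cannot encode code point $codePoint");
    }
    return $utf8;
}

// First character of a UTF-8 string to its code point;
// throws on empty input or when iconv reports a failure.
function utf8_to_codepoint(string $s): int {
    $ucs4 = @iconv('UTF-8', 'UCS-4LE', $s);
    if ($ucs4 === false || strlen($ucs4) < 4) {
        throw new InvalidArgumentException('Expected a non-empty, valid UTF-8 string');
    }
    // 'V' reads one unsigned 32-bit little-endian integer,
    // so only the first character is considered.
    return unpack('V', $ucs4)[1];
}

echo utf8_to_codepoint(codepoint_to_utf8(12103)), "\n"; // 12103
```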

Answered by MAChitgarha

If you're using PHP 7.2 (or later), you don't need to define a new function. The Multibyte String extension provides two functions for this purpose!

To get the code point of a character (i.e. its Unicode value), use mb_ord(); to get the character for a given code point, use mb_chr().

E.g.:

mb_chr(12103, "utf8"); // ⽇
mb_ord("⽇", "utf8"); // 12103
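
To go beyond a single character, the following sketch maps every character of a string to its code point with mb_ord() and rebuilds the string with mb_chr(). It assumes the mbstring extension is loaded and PHP 7.2 or later; the sample text and variable names are made up for illustration.

```php
<?php
// Requires the mbstring extension and PHP 7.2+ for mb_ord()/mb_chr().
// Sample text: "A", U+2F47, and an emoji outside the Basic Multilingual Plane.
$text = "A\u{2F47}\u{1F600}";

// Split into characters with a Unicode-aware regex,
// then map each character to its code point.
$chars = preg_split('//u', $text, -1, PREG_SPLIT_NO_EMPTY);
$codePoints = array_map(function ($ch) {
    return mb_ord($ch, 'UTF-8');
}, $chars);

echo implode(', ', $codePoints), "\n"; // 65, 12103, 128512

// Map the code points back to characters and join them.
$rebuilt = implode('', array_map(function ($cp) {
    return mb_chr($cp, 'UTF-8');
}, $codePoints));

var_dump($rebuilt === $text); // bool(true)
```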

Answered by user23127

This also works (for someone who understands bit shifting, this might be more readable than Mark Baker's answer):

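public function ordinal($str){
    // Take the first character; its byte length is the UTF-8 sequence length
    $charString = mb_substr($str, 0, 1, 'utf-8');
    $size = strlen($charString);
    // Mask off the length-marker bits of the lead byte
    $ordinal = ord($charString[0]) & (0xFF >> $size);
    // Merge the payload bits of each continuation byte into the value
    for($i = 1; $i < $size; $i++){
        $ordinal = $ordinal << 6 | (ord($charString[$i]) & 127);
    }
    return $ordinal;
}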
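
The reverse direction can be written with the same shifts and masks. This is a minimal sketch, not part of the original answer: the name utf8_from_codepoint is hypothetical, and it only checks the overall range (it does not reject surrogate code points U+D800 to U+DFFF).

```php
<?php
// Hypothetical function name; encodes one code point as UTF-8 using shifts and masks.
function utf8_from_codepoint(int $cp): string {
    if ($cp < 0 || $cp > 0x10FFFF) {
        throw new InvalidArgumentException("Code point out of range: $cp");
    }
    if ($cp < 0x80) {       // 1 byte: 0xxxxxxx
        return chr($cp);
    }
    if ($cp < 0x800) {      // 2 bytes: 110xxxxx 10xxxxxx
        return chr(0xC0 | ($cp >> 6))
             . chr(0x80 | ($cp & 0x3F));
    }
    if ($cp < 0x10000) {    // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return chr(0xE0 | ($cp >> 12))
             . chr(0x80 | (($cp >> 6) & 0x3F))
             . chr(0x80 | ($cp & 0x3F));
    }
    // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return chr(0xF0 | ($cp >> 18))
         . chr(0x80 | (($cp >> 12) & 0x3F))
         . chr(0x80 | (($cp >> 6) & 0x3F))
         . chr(0x80 | ($cp & 0x3F));
}

echo bin2hex(utf8_from_codepoint(12103)), "\n"; // e2bd87
```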

Answered by Akhil Thayyil

You can use the following functions:

For encoding (ISO-8859-1 to UTF-8):

string utf8_encode ( string $data )

http://php.net/manual/en/function.utf8-encode.php

For decoding (UTF-8 to ISO-8859-1):

string utf8_decode ( string $data )

http://php.net/manual/en/function.utf8-decode.php

Also check:

http://php.net/manual/en/function.htmlspecialchars.php

<?php

echo htmlspecialchars_decode("&#12103;"); // the entity is output as-is; a browser then renders it as ⽇

?>
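
To convert the numeric entity to the actual character on the PHP side rather than relying on the browser, something along these lines may help. It is a sketch that assumes the mbstring extension is available; the conversion map values are one common choice, not the only valid one, and the variable names are illustrative.

```php
<?php
// Requires the mbstring extension for the mb_* calls.
$entity = '&#12103;';

// html_entity_decode() handles numeric references as well as named entities.
$char = html_entity_decode($entity, ENT_QUOTES | ENT_HTML5, 'UTF-8');

// Conversion map: [first code point, last code point, offset, mask].
$map = [0x80, 0x10FFFF, 0, 0x1FFFFF];
$sameChar = mb_decode_numericentity($entity, $map, 'UTF-8');

// And back again: every non-ASCII character becomes a decimal reference.
$backToEntity = mb_encode_numericentity($char, $map, 'UTF-8');

echo bin2hex($char), "\n";  // e2bd87
echo $backToEntity, "\n";   // &#12103;
```

The last two calls use mb_decode_numericentity, the function linked in the question, together with its counterpart mb_encode_numericentity.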