Java: get the Unicode value of a character
Original source: http://stackoverflow.com/questions/2220366/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Get unicode value of a character
Asked by Saurabh
Is there any way in Java to get the Unicode equivalent of any character? For example, suppose there is a method getUnicode(char c). A call to getUnicode('÷') should return \u00f7.
Accepted answer by SyntaxT3rr0r
You can do it for any Java char using this one-liner:
System.out.println( "\\u" + Integer.toHexString('÷' | 0x10000).substring(1) );
But it only works for Unicode characters up to Unicode 3.0, which is why I specified that you can do it for any Java char.
Java was designed well before Unicode 3.1 arrived, so Java's char primitive is inadequate to represent Unicode 3.1 and up: there is no longer a "one Unicode character to one Java char" mapping (a monstrous hack is used instead).
So you really have to check your requirements here: do you need to support a Java char or any possible Unicode character?
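To illustrate the distinction drawn above, here is a minimal sketch that escapes every char (UTF-16 code unit) of a String. The class name EscapeSketch and the helper escape are hypothetical names introduced for this example and are not part of the original answer. A character outside the 16-bit range is stored as two chars, so it comes out as two escapes.

```java
class EscapeSketch {
    // Hypothetical helper: writes every UTF-16 code unit of s as a
    // backslash-u escape with four hex digits.
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            sb.append(String.format("\\u%04x", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The division sign from the question; it fits in a single char.
        String division = "\u00f7";
        // U+1F600 lies outside the 16-bit range, so Java stores it as a surrogate pair.
        String emoji = new String(Character.toChars(0x1F600));
        System.out.println(escape(division));
        System.out.println(escape(emoji));
    }
}
```

Running it prints \u00f7 for the division sign and \ud83d\ude00 for U+1F600, that is, one escape per char rather than one per Unicode character.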
Answer by Chathuranga Chandrasekara
I found this nice code on the web.
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

public class Unicode {
    public static void main(String[] args) {
        System.out.println("Use CTRL+C to quit the program.");
        // Create the reader for reading in the text typed in the console.
        InputStreamReader inputStreamReader = new InputStreamReader(System.in);
        BufferedReader bufferedReader = new BufferedReader(inputStreamReader);
        try {
            String line;
            // readLine() returns null at end of input, so check for that before calling length().
            while ((line = bufferedReader.readLine()) != null && line.length() > 0) {
                for (int index = 0; index < line.length(); index++) {
                    // Convert the integer to a hexadecimal code.
                    String hexCode = Integer.toHexString(line.codePointAt(index)).toUpperCase();
                    // Pad with leading zeros and keep the last four digits
                    // (this truncates code points above U+FFFF).
                    String hexCodeWithAllLeadingZeros = "0000" + hexCode;
                    String hexCodeWithLeadingZeros = hexCodeWithAllLeadingZeros.substring(hexCodeWithAllLeadingZeros.length() - 4);
                    System.out.println("\\u" + hexCodeWithLeadingZeros);
                }
            }
        } catch (IOException ioException) {
            ioException.printStackTrace();
        }
    }
}
Answer by Aaron Digulla
If you have Java 5, use char c = ...; String s = String.format ("\\u%04x", (int)c);
If your source isn't a Unicode character (char) but a String, you must use charAt(index) to get the Unicode character at position index.
Don't use codePointAt(index), because it returns full Unicode code points (up to 21 bits, U+10FFFF), which can't always be represented with just 4 hex digits (they can need up to 6). See the docs for an explanation.
[EDIT] To make it clear: this answer doesn't deal with Unicode code points but with the way Java represents Unicode characters (i.e. UTF-16 code units and surrogate pairs), since char is 16 bits wide and Unicode code points need up to 21 bits. The question should be: "How can I convert char to a 4-digit hex number", since it's not (really) about Unicode.
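For contrast with the char-based approach, the following sketch walks a String by code point instead of by char and prints each code point in U+XXXX notation, which may have more than four hex digits. The class name CodePointSketch and the helper describe are hypothetical, and String.codePoints() assumes Java 8 or later.

```java
import java.util.stream.Collectors;

class CodePointSketch {
    // Hypothetical helper: lists the code points of s in U+XXXX notation,
    // using at least four hex digits per code point.
    static String describe(String s) {
        return s.codePoints()
                .mapToObj(cp -> String.format("U+%04X", cp))
                .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        // "a" followed by U+1F600, which needs a surrogate pair in UTF-16.
        String text = "a" + new String(Character.toChars(0x1F600));
        System.out.println(text.length());                          // number of chars (code units)
        System.out.println(text.codePointCount(0, text.length()));  // number of code points
        System.out.println(describe(text));
    }
}
```

For this input the char count is 3 while the code point count is 2, and describe returns "U+0061 U+1F600". Which of the two views is appropriate depends on whether the output has to be a Java-style escape or a Unicode code point.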
Answer by Yogesh Dubey
private static String toUnicode(char ch) {
    // The backslash must be doubled; a lone backslash-u in a Java string literal does not compile.
    return String.format("\\u%04x", (int) ch);
}
Answer by Deepak Sharma
char c = 'a';
String a = Integer.toHexString(c); // gives you---> a = "61"
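Note that Integer.toHexString does not pad its result, so the snippet above yields "61" rather than "0061". The sketch below compares it with the two padding approaches that appear in the other answers. The class and method names are hypothetical and used only for illustration.

```java
class PaddingSketch {
    // Padding via the OR trick: setting bit 16 forces five hex digits,
    // and substring(1) drops the leading "1".
    static String padWithOr(char c) {
        return Integer.toHexString(c | 0x10000).substring(1);
    }

    // Padding via a format string with a minimum width of four digits.
    static String padWithFormat(char c) {
        return String.format("%04x", (int) c);
    }

    public static void main(String[] args) {
        char c = 'a';
        System.out.println(Integer.toHexString(c)); // 61
        System.out.println(padWithOr(c));           // 0061
        System.out.println(padWithFormat(c));       // 0061
    }
}
```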
Answer by Jordan Doerksen
Do you specifically need the Unicode escape? In Java it is simpler to write your program around the decimal value (as used in HTML codes), because then you can simply cast between char and int.
char a = 98;
char b = 'b';
char c = (char) (b+0002);
System.out.println(a);
System.out.println((int)b);
System.out.println((int)c);
System.out.println(c);
This gives the following output:
b
98
100
d
Answer by Josiel Novaes
First, I get the high byte of the char. Then I get the low byte. I convert both to hex strings and add the prefix.
char c = '÷';                                    // example input
int hs = (int) c >> 8;                           // high byte of the char
int ls = (int) c & 0x00FF;                       // low byte of the char
String highSide = String.format("%02x", hs);     // two hex digits, zero-padded
String lowSide = String.format("%02x", ls);      // two hex digits, zero-padded
System.out.println(c + " = " + "\\u" + highSide + lowSide);