Original question: http://stackoverflow.com/questions/5535988/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
String to binary and vice versa: extended ASCII
Asked by anonymous001
I want to convert a String to binary by putting it in a byte array (String.getBytes()) and then storing the binary string for each byte (Integer.toBinaryString(bytearray[i])) in a String[]. Then I want to convert back to a normal String via Byte.parseByte(stringarray[i], 2). This works great for the standard ASCII table, but not for the extended one. For example, an A gives me 1000001, but an Ä returns:
11111111111111111111111111000011
11111111111111111111111110000100
Any ideas how to manage this?
public class BinString {
    public static void main(String[] args) {
        String s = "Ä";
        System.out.println(binToString(stringToBin(s)));
    }

    public static String[] stringToBin(String s) {
        System.out.println("Converting: " + s);
        // getBytes() without arguments uses the platform default charset.
        byte[] b = s.getBytes();
        String[] sa = new String[b.length];
        for (int i = 0; i < b.length; i++) {
            // Mask with 0xFF so negative bytes are not sign-extended to 32 bits.
            sa[i] = Integer.toBinaryString(b[i] & 0xFF);
        }
        return sa;
    }

    public static String binToString(String[] strar) {
        byte[] bar = new byte[strar.length];
        for (int i = 0; i < strar.length; i++) {
            // Byte.parseByte throws NumberFormatException for values above 127,
            // so parse as an int and narrow to byte instead.
            bar[i] = (byte) Integer.parseInt(strar[i], 2);
            System.out.println(bar[i]);
        }
        // Decodes with the platform default charset, matching getBytes() above.
        return new String(bar);
    }
}
Answered by Joachim Sauer
First off: "extended ASCII" is a very misleading title that's used to refer to a ton of different encodings.
Second: byte in Java is signed, while bytes in encodings are usually handled as unsigned. Since you use Integer.toBinaryString(), the byte will be converted to an int using sign extension (because byte values > 127 are represented by negative values in Java).
To avoid this, simply use & 0xFF to mask all but the lower 8 bits, like this:
String binary = Integer.toBinaryString(byteArray[i] & 0xFF);
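To make the difference visible, here is a minimal sketch that prints the same byte with and without the mask. The class name SignExtensionDemo and the helper toBits are made up for illustration and are not part of the original answer; the left-padding to 8 digits is an optional extra, not something the answer requires.

```java
public class SignExtensionDemo {
    /** Hypothetical helper: returns the 8-digit binary form of one byte. */
    public static String toBits(byte value) {
        // & 0xFF keeps only the low 8 bits, so negative bytes map to 128..255.
        String bits = Integer.toBinaryString(value & 0xFF);
        // Left-pad with zeros to a fixed width of 8 characters.
        return String.format("%8s", bits).replace(' ', '0');
    }

    public static void main(String[] args) {
        byte b = (byte) 0xC3; // stored as -61 because byte is signed

        // Without the mask, the byte is sign-extended to a 32-bit int.
        System.out.println(Integer.toBinaryString(b)); // 11111111111111111111111111000011
        // With the mask, only the 8 bits of the byte remain.
        System.out.println(toBits(b));                 // 11000011
        System.out.println(toBits((byte) 'A'));        // 01000001
    }
}
```

The first printed value matches the first line of output reported in the question, which suggests the unmasked conversion is what produced it.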
Answered by McDowell
To expand on Joachim's point about "extended ASCII" I'd add...
Note that getBytes() is a transcoding operation that converts data from UTF-16 to the platform default encoding. The encoding varies from system to system and sometimes even between users on the same PC. This means that results are not consistent across platforms, and if a legacy encoding is the default (as it is on Windows), data can be lost.
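As a rough illustration of how much the result depends on the encoding, the sketch below encodes the single character Ä with three charsets that every Java platform provides. The class name CharsetBytesDemo and the helper unsignedBytes are assumptions made for this example only.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class CharsetBytesDemo {
    /** Hypothetical helper: encodes text and returns each byte as an unsigned value (0..255). */
    public static int[] unsignedBytes(String text, Charset charset) {
        byte[] raw = text.getBytes(charset);
        int[] result = new int[raw.length];
        for (int i = 0; i < raw.length; i++) {
            result[i] = raw[i] & 0xFF;
        }
        return result;
    }

    public static void main(String[] args) {
        // The character Ä, written as an escape so the source file encoding does not matter.
        String s = "\u00C4";

        // One byte in ISO-8859-1, two bytes in UTF-8.
        System.out.println(Arrays.toString(unsignedBytes(s, StandardCharsets.ISO_8859_1))); // [196]
        System.out.println(Arrays.toString(unsignedBytes(s, StandardCharsets.UTF_8)));      // [195, 132]
        // US-ASCII cannot represent Ä, so it is replaced by '?' (63) and the data is lost.
        System.out.println(Arrays.toString(unsignedBytes(s, StandardCharsets.US_ASCII)));   // [63]
    }
}
```

The UTF-8 bytes 195 and 132 are 11000011 and 10000100 in binary, the same low 8 bits that appear in the output shown in the question.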
To make the operation symmetrical, you need to provide an encoding explicitly (preferably a Unicode encoding such as UTF-8 or UTF-16).
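// Requires: import java.nio.charset.Charset;
Charset encoding = Charset.forName("UTF-16");
byte[] b = s1.getBytes(encoding);    // encode with an explicit charset
String s2 = new String(b, encoding); // decode with the same charset
assert s1.equals(s2);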
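Putting both answers together, a full round trip might look like the sketch below. It assumes UTF-8 as the explicit encoding, reuses the method names from the question inside a made-up class BinaryRoundTrip, and parses with Integer.parseInt because Byte.parseByte rejects values above 127. Treat it as one possible arrangement rather than the method either answer prescribes.

```java
import java.nio.charset.StandardCharsets;

public class BinaryRoundTrip {
    /** Encodes the string as UTF-8 and returns one binary string per byte. */
    public static String[] stringToBin(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        String[] bits = new String[bytes.length];
        for (int i = 0; i < bytes.length; i++) {
            // Mask to avoid sign extension of bytes above 127.
            bits[i] = Integer.toBinaryString(bytes[i] & 0xFF);
        }
        return bits;
    }

    /** Parses each binary string back into a byte and decodes the bytes as UTF-8. */
    public static String binToString(String[] bits) {
        byte[] bytes = new byte[bits.length];
        for (int i = 0; i < bits.length; i++) {
            // Parse as int (0..255), then narrow to the signed byte with the same bit pattern.
            bytes[i] = (byte) Integer.parseInt(bits[i], 2);
        }
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "A\u00C4"; // "AÄ"
        String[] bits = stringToBin(original);
        System.out.println(String.join(" ", bits));             // 1000001 11000011 10000100
        System.out.println(binToString(bits).equals(original)); // true
    }
}
```

Because the same charset is named on both sides, the result no longer depends on the platform default.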