Java: converting a byte array to an intelligible String

Question

Asked by Mike B

I have a program that handles byte arrays in Java, and now I would like to write one of them into an XML file. However, I am unsure how to convert the following byte array into a sensible String to write to a file. Assuming that it held Unicode characters, I attempted the following code:

String temp = new String(encodedBytes, "UTF-8");

Only to have the debugger show that the encodedBytes contain "\ufffd\ufffd ^\ufffd\ufffd-m\ufffd\ufffd\/ufffd \ufffd\ufffdIA\ufffd\ufffd". The String should contain a hash in alphanumerical format.

How would I turn the above String into a sensible String for output?

Answer 1

Answered by trashgod

The byte array doesn't look like UTF-8. Note that \ufffd (named REPLACEMENT CHARACTER) is "used to replace an incoming character whose value is unknown or unrepresentable in Unicode."

Addendum: Here's a simple example of how this can happen. When cast to a byte, the code point for ñ is neither valid UTF-8 nor valid US-ASCII, but it is valid ISO-8859-1. In effect, you have to know what the bytes represent before you can decode them into a String.

public class Hello {

    public static void main(String[] args)
            throws java.io.UnsupportedEncodingException {
        String s = "Hola, señor!";
        System.out.println(s);
        // Narrow each code point to a single byte; ñ (U+00F1) becomes -15 (0xF1).
        byte[] b = new byte[s.length()];
        for (int i = 0; i < b.length; i++) {
            int cp = s.codePointAt(i);
            b[i] = (byte) cp;
            System.out.print((byte) cp + " ");
        }
        System.out.println();
        // Decode the same bytes under three different charsets.
        System.out.println(new String(b, "UTF-8"));      // 0xF1 is malformed here
        System.out.println(new String(b, "US-ASCII"));   // 0xF1 is outside 7-bit ASCII
        System.out.println(new String(b, "ISO-8859-1")); // 0xF1 maps to ñ
    }
}

Output:

Hola, señor!
72 111 108 97 44 32 115 101 -15 111 114 33 
Hola, se�or!
Hola, se�or!
Hola, señor!
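
As a complement to the example above, here is a minimal sketch of how one might check whether a byte array is well-formed in a given charset instead of silently getting \ufffd back. The class name StrictDecode and the helper isValid are hypothetical names chosen for illustration; the decoder calls are from the standard java.nio.charset package.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class StrictDecode {

    // Hypothetical helper: true only if b decodes without any malformed
    // or unmappable input in the given charset.
    public static boolean isValid(byte[] b, Charset charset) {
        try {
            charset.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(b));
            return true;
        } catch (CharacterCodingException e) {
            // The decoder reports the problem instead of substituting U+FFFD.
            return false;
        }
    }

    public static void main(String[] args) {
        // Same bytes as printed by the Hello example.
        byte[] b = {72, 111, 108, 97, 44, 32, 115, 101, -15, 111, 114, 33};
        System.out.println("UTF-8:      " + isValid(b, StandardCharsets.UTF_8));
        System.out.println("US-ASCII:   " + isValid(b, StandardCharsets.US_ASCII));
        System.out.println("ISO-8859-1: " + isValid(b, StandardCharsets.ISO_8859_1));
    }
}
```

Note that ISO-8859-1 assigns a character to every byte value, so a successful decode there does not prove the bytes were ever text. For a hash, they are not.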

Answer 2

Answered by trashgod

If your bytes are the output of a password hashing scheme (which it looks like they might be), then I think you will need to Base64-encode them in order to put them into plain text.

Standard procedure, if you have raw bytes you want to output to a text file, is to use Base64 encoding. The Commons Codec library provides a Base64 encoder/decoder for you to use.
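
To make this concrete, here is a rough sketch of encoding raw bytes as text. It is not the Commons Codec API mentioned above: it assumes Java 8 or later and swaps in the JDK's own java.util.Base64 so that it runs without extra dependencies. The class name HashToText, the helpers toBase64 and toHex, and the sample bytes are all made up for illustration. The hex variant is included because the question asks for an alphanumeric result, and Base64 output can also contain '+', '/' and '='.

```java
import java.util.Arrays;
import java.util.Base64;

public class HashToText {

    // Base64: 4 output characters for every 3 input bytes.
    public static String toBase64(byte[] bytes) {
        return Base64.getEncoder().encodeToString(bytes);
    }

    // Hexadecimal: 2 output characters per input byte, digits and a-f only.
    public static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            // Mask to 0..255 so negative bytes do not sign-extend.
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Made-up sample bytes standing in for the asker's encodedBytes.
        byte[] encodedBytes = {(byte) 0xff, (byte) 0xfd, 0x5e, 0x2d, 0x6d};

        String base64 = toBase64(encodedBytes);
        System.out.println("Base64: " + base64);
        System.out.println("Hex:    " + toHex(encodedBytes));

        // Unlike new String(bytes, "UTF-8"), the encoding is reversible.
        byte[] roundTrip = Base64.getDecoder().decode(base64);
        System.out.println("Round trip intact: " + Arrays.equals(encodedBytes, roundTrip));
    }
}
```

Either string can be written into the XML file as ordinary element text. Whoever reads the file back needs to know which of the two encodings was used.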

Hope this helps.
