Java: convert a byte array to an understandable String

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and authors, and attribute it to the original authors (not me): StackOverflow. Original: http://stackoverflow.com/questions/2654145/

Date: 2020-10-29 22:14:10  Source: igfitidea

Convert byte array to understandable String

Tags: java, unicode, ascii, hash

Asked by Mike B

I have a program that handles byte arrays in Java, and now I would like to write one of them to an XML file. However, I am unsure how to convert the following byte array into a sensible String to write to the file. Assuming that it held Unicode characters, I tried the following code:

String temp = new String(encodedBytes, "UTF-8");

Only to have the debugger show that the encodedBytes contain "\ufffd\ufffd ^\ufffd\ufffd-m\ufffd\ufffd\/ufffd \ufffd\ufffdIA\ufffd\ufffd". The String should contain a hash in alphanumerical format.

How would I turn the above String into a sensible String for output?

Answered by trashgod

The byte array doesn't look like UTF-8. Note that \ufffd (named REPLACEMENT CHARACTER) is "used to replace an incoming character whose value is unknown or unrepresentable in Unicode."

Addendum: Here's a simple example of how this can happen. When cast to a byte, the code point for ñ is neither valid UTF-8 nor valid US-ASCII, but it is valid ISO-8859-1. In effect, you have to know what the bytes represent before you can decode them into a String.

public class Hello {

    public static void main(String[] args)
            throws java.io.UnsupportedEncodingException {
        String s = "Hola, señor!";
        System.out.println(s);
        // Narrow each code point to a single byte; ñ (U+00F1) becomes -15.
        byte[] b = new byte[s.length()];
        for (int i = 0; i < b.length; i++) {
            int cp = s.codePointAt(i);
            b[i] = (byte) cp;
            System.out.print((byte) cp + " ");
        }
        System.out.println();
        // Decode the same bytes under three different charsets.
        System.out.println(new String(b, "UTF-8"));
        System.out.println(new String(b, "US-ASCII"));
        System.out.println(new String(b, "ISO-8859-1"));
    }
}

Output:

Hola, señor!
72 111 108 97 44 32 115 101 -15 111 114 33 
Hola, se?or!
Hola, se?or!
Hola, señor!
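
The example above decodes leniently, so bad input silently turns into replacement characters. As a rough sketch of how one might check up front whether a byte array is well-formed UTF-8, a CharsetDecoder can be told to report errors instead of replacing them. The class name StrictDecode and the helper isValidUtf8 are illustrative names, not part of the original answer.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class StrictDecode {

    // Hypothetical helper: true only if the bytes are well-formed UTF-8.
    static boolean isValidUtf8(byte[] bytes) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            // With REPORT, bad input throws instead of yielding \ufffd.
            decoder.decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // "señor" as ISO-8859-1 bytes: ñ is the single byte 0xF1.
        byte[] latin1 = {115, 101, (byte) 0xF1, 111, 114};
        // The same text encoded as UTF-8: ñ becomes two bytes.
        byte[] utf8 = "se\u00f1or".getBytes(StandardCharsets.UTF_8);
        System.out.println(isValidUtf8(latin1));
        System.out.println(isValidUtf8(utf8));
    }
}
```

A failed check only says the bytes are not UTF-8 text. It does not reveal what they are, which is why raw hash output needs a binary-to-text encoding rather than a charset.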

Answered by trashgod

If your string is the output of a password hashing scheme (which it looks like it might be) then I think you will need to Base64 encode in order to put it into plain text.
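
A minimal sketch of that Base64 step, assuming Java 8 or later, where the JDK ships java.util.Base64 (this is the JDK class, not the library the answer goes on to mention). The class name Base64Demo, the helpers toBase64 and fromBase64, and the sample bytes are illustrative stand-ins for the question's encodedBytes.

```java
import java.util.Arrays;
import java.util.Base64;

public class Base64Demo {

    // Encode arbitrary bytes as ASCII text that is safe to store in XML.
    static String toBase64(byte[] raw) {
        return Base64.getEncoder().encodeToString(raw);
    }

    // Recover the original bytes from the Base64 text.
    static byte[] fromBase64(String text) {
        return Base64.getDecoder().decode(text);
    }

    public static void main(String[] args) {
        // Assumed sample data: bytes that are not valid text in any charset.
        byte[] encodedBytes = {(byte) 0xFF, (byte) 0xFD, 0x5E, (byte) 0x80, 0x2D, 0x6D};
        String text = toBase64(encodedBytes);
        System.out.println(text);
        // The round trip is lossless, unlike new String(bytes, "UTF-8").
        System.out.println(Arrays.equals(encodedBytes, fromBase64(text)));
    }
}
```

Unlike decoding with a charset, the round trip returns exactly the bytes that went in, so the reader of the XML file can reconstruct the hash.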

Standard procedure, if you have raw bytes you want to output to a text file, is to use Base64 encoding. The Commons Codec library provides a Base64 encoder / decoder for you to use.
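
Since the question expects the hash in alphanumeric form, hexadecimal is another common text representation worth considering. The sketch below is an alternative to Base64, not the method this answer recommends, and it uses only the JDK. The class name HexDemo, the helper toHex, and the choice of SHA-256 over the input "hello" are assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class HexDemo {

    // Render each byte as two lowercase hex digits.
    static String toHex(byte[] raw) {
        StringBuilder sb = new StringBuilder(raw.length * 2);
        for (byte b : raw) {
            // Mask to 0..255 so negative bytes do not sign-extend.
            sb.append(String.format("%02x", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws NoSuchAlgorithmException {
        // Assumed input: a digest standing in for the question's hash bytes.
        byte[] digest = MessageDigest.getInstance("SHA-256")
                .digest("hello".getBytes(StandardCharsets.UTF_8));
        // A 32-byte digest becomes 64 characters drawn from 0-9 and a-f.
        System.out.println(toHex(digest));
    }
}
```

Hex output is longer than Base64 (two characters per byte instead of roughly four per three bytes), but it contains only digits and the letters a to f.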

Hope this helps.
