Notice: this content comes from a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/6484369/

Time: 2020-10-30 16:03:31  Source: igfitidea

Issue with Base64 encoding/decoding: decoded string is '?'

Tags: java, base64, java-io

Asked by Ankit

I am trying to read an image, use Base64 encoding to convert it into a byte array, and then convert that to a string to send it over the network. The problem is that when I try to decode the Base64 encoded string, I get incorrect data.

For example, I am having trouble with the special character below.

I am using the following code for encoding:

byte[] b = Base64.encodeBase64(IOUtils.toByteArray(loInputStream));
String ab = new String(b);

IOUtils is org.apache.commons.io.IOUtils.

and loInput

Code for decoding:

byte[] c = Base64.decodeBase64(ab.getBytes());
String ca = new String(c);
System.out.println(ca);

It prints ? for the decoded String.

Can anyone please tell me what the issue is?

Answered by nos

If your input is an image, it makes sense to encode it as base64 - base64 is text, and can be represented by a String.

Decoding it again though, you get the original image. An image is usually a binary format; it does not make sense to try to convert that to a string - it is not text.

That is, the last 2 lines:

   String ca = new String(c);
   System.out.println(ca);

simply do not make sense.

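If you want to check that decoding reproduces the original input, compare the decoded bytes with the bytes you read from the image before encoding. Note that b in the question holds the Base64-encoded bytes, so it will not equal c; assuming the raw bytes were kept in a variable such as original (a name not in the question), do e.g.

  System.out.println("Original and decoded are the same: " + Arrays.equals(original, c));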

(Or save the byte array to a file and view the image in an image viewer)
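
A minimal sketch of that check follows. It is an illustration only: it uses the JDK's java.util.Base64 (Java 8+) instead of the Commons Codec class from the question so that it runs without extra libraries, and the class name RoundTripCheck, the method roundTrip, and the sample bytes are all made up for the example.

```java
import java.util.Arrays;
import java.util.Base64;

class RoundTripCheck {

    // Encodes the input to Base64 and decodes it again, returning the decoded bytes.
    public static byte[] roundTrip(byte[] original) {
        byte[] encoded = Base64.getEncoder().encode(original); // ASCII bytes of the Base64 text
        return Base64.getDecoder().decode(encoded);            // back to the raw binary data
    }

    public static void main(String[] args) {
        // Arbitrary non-text bytes standing in for image data (hypothetical sample).
        byte[] original = {(byte) 0xFF, (byte) 0xD8, (byte) 0xFF, (byte) 0xE0, 0x00, 0x10};
        byte[] decoded = roundTrip(original);

        // Compare bytes with bytes; never turn the decoded image into a String.
        System.out.println("Original and decoded are the same: " + Arrays.equals(original, decoded));
    }
}
```

The comparison is between the raw input and the decoded output; the intermediate encoded array is never part of it.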

Answered by ninjalj

As I've said elsewhere, in Java, String is for text, and byte[] is for binary data.

String ≠ byte[]

Text ≠ Binary Data

An image is binary data. Base64 is an encoding which allows transmission of binary data over US_ASCII compatible text channels (there is a similar encoding for supersets of ASCII text: Quoted Printable).

So, it goes like:

Image (binary data) → Image (text, Base64 encoded binary data) → Image (binary data)

where you would use String encodeBase64String(byte[]) to encode, and byte[] decode(String) to decode. These are the only sane APIs for Base64; byte[] encodeBase64(byte[]) is misleading, because the result is US_ASCII-compatible text (so, a String, not byte[]).
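
The binary → text → binary shape described above could look roughly like the sketch below. It assumes the JDK's java.util.Base64 (encodeToString and decode(String)) as a stand-in for the Commons Codec methods named in the answer; the class Base64TextApi and its two helper methods are hypothetical names for illustration.

```java
import java.util.Base64;

class Base64TextApi {

    // Binary data in, text out.
    public static String encode(byte[] binary) {
        return Base64.getEncoder().encodeToString(binary);
    }

    // Text in, binary data out.
    public static byte[] decode(String base64Text) {
        return Base64.getDecoder().decode(base64Text);
    }

    public static void main(String[] args) {
        // First four bytes of a PNG signature, used here as sample binary data.
        byte[] binary = {(byte) 0x89, 'P', 'N', 'G'};

        String text = encode(binary);            // safe to send over a text channel
        byte[] restored = decode(text);          // the receiver gets the binary data back

        System.out.println(text);                // prints "iVBORw=="
        System.out.println(restored.length);     // prints 4
    }
}
```

Only the Base64 text ever lives in a String; the image data stays in byte[] on both sides.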

Now, text has a charset and an encoding. String uses a fixed Unicode/UTF-16 charset/encoding combination internally, and you have to specify a charset/encoding when converting something from/to a String, either explicitly or implicitly, using the platform's default encoding (which is what PrintStream.println() does). Base64 text is pure US_ASCII, so you need to use that, or a superset of US_ASCII. org.apache.commons.codec.binary.Base64 uses UTF8, which is a superset of US_ASCII, so all is well. (OTOH, the internal java.util.prefs.Base64 uses the platform's default encoding, so I guess it would break if you started your JVM with, say, a UTF-16 encoding.)
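
To make the charset point concrete, here is a small sketch (not from the original answer) showing that raw binary bytes generally do not survive a trip through String, while Base64 text does. The class CharsetDemo, the helper survivesAsText, and the sample bytes are assumptions made for the example, and it uses the JDK's java.util.Base64 rather than Commons Codec.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

class CharsetDemo {

    // Returns true if the bytes are unchanged after bytes -> String -> bytes in the given charset.
    public static boolean survivesAsText(byte[] data, Charset charset) {
        String asText = new String(data, charset);
        return Arrays.equals(data, asText.getBytes(charset));
    }

    public static void main(String[] args) {
        // Arbitrary non-text bytes standing in for image data (hypothetical sample).
        byte[] binary = {(byte) 0xFF, (byte) 0xD8, (byte) 0xFF, (byte) 0xE0};

        // These bytes are not valid UTF-8, so decoding substitutes replacement characters.
        System.out.println("binary via UTF-8:    " + survivesAsText(binary, StandardCharsets.UTF_8));

        // Base64 output is pure US_ASCII, so it round-trips through a String unchanged.
        byte[] base64Ascii = Base64.getEncoder().encode(binary);
        System.out.println("Base64 via US_ASCII: " + survivesAsText(base64Ascii, StandardCharsets.US_ASCII));
    }
}
```

Bytes that a charset cannot decode are replaced during the conversion, which is one plausible route to the lone ? seen in the question.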

Back on topic: you've tried to print the decoded image (binary data) as text, which obviously hasn't worked. PrintStream has write() methods that can write binary data, so you could use those, and you would get the same garbage as if you had written the original image. It would be much better to use a FileOutputStream and compare the resulting file with the original image file.
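
A sketch of the FileOutputStream approach is shown below, under some assumptions: it uses the JDK's java.util.Base64 instead of Commons Codec, writes to a temporary file rather than a real image path, and the names WriteDecodedImage, writeDecoded, and roundTripThroughFile are invented for illustration.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.Base64;

class WriteDecodedImage {

    // Decodes Base64 text and writes the raw bytes to the target file.
    public static void writeDecoded(String base64Text, Path target) throws IOException {
        byte[] decoded = Base64.getDecoder().decode(base64Text);
        try (FileOutputStream out = new FileOutputStream(target.toFile())) {
            out.write(decoded);
        }
    }

    // Sends bytes through Base64 text and a temporary file, then compares the file with the input.
    public static boolean roundTripThroughFile(byte[] original) {
        try {
            Path target = Files.createTempFile("decoded", ".bin");
            try {
                writeDecoded(Base64.getEncoder().encodeToString(original), target);
                return Arrays.equals(original, Files.readAllBytes(target));
            } finally {
                Files.deleteIfExists(target);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Arbitrary non-text bytes standing in for image data (hypothetical sample).
        byte[] original = {(byte) 0xFF, (byte) 0xD8, (byte) 0xFF, (byte) 0xE0, 0x00, 0x10};
        System.out.println("File matches original: " + roundTripThroughFile(original));
    }
}
```

With a real image, the target would be a file you can open in an image viewer or diff against the source file.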
