Disclaimer: this page reproduces a popular StackOverflow question under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original address and author information, and attribute it to the original authors (not me): StackOverflow. Original: http://stackoverflow.com/questions/20778/

Time: 2020-08-11 07:14:05  Source: igfitidea

How do you convert binary data to Strings and back in Java?

java serialization

Asked by Bill the Lizard

I have binary data in a file that I can read into a byte array and process with no problem. Now I need to send parts of the data over a network connection as elements in an XML document. My problem is that when I convert the data from an array of bytes to a String and back to an array of bytes, the data is getting corrupted. I've tested this on one machine to isolate the problem to the String conversion, so I now know that it isn't getting corrupted by the XML parser or the network transport.

What I've got right now is

byte[] buffer = ...; // read from file
// a few lines that prove I can process the data successfully
String element = new String(buffer);
byte[] newBuffer = element.getBytes();
// a few lines that try to process newBuffer and fail because it is not the same data anymore

Does anyone know how to convert binary to String and back without data loss?

Answered: Thanks Sam. I feel like an idiot. I had this answered yesterday because my SAX parser was complaining. For some reason when I ran into this seemingly separate issue, it didn't occur to me that it was a new symptom of the same problem.

EDIT: Just for the sake of completeness, I used the Base64 class from the Apache Commons Codec package to solve this problem.

Accepted answer by Sam

If you encode it in Base64, this will turn any data into ASCII-safe text, but Base64-encoded data is larger than the original data (every 3 input bytes become 4 output characters, so roughly a third larger).
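
As a rough illustration of that round trip and of the size overhead, here is a minimal sketch. It assumes Java 8 or later and uses `java.util.Base64` from the JDK, which is not the library the question ended up using; the class name `Base64RoundTrip`, the helper names, and the sample bytes are made up for the example.

```java
import java.util.Arrays;
import java.util.Base64;

class Base64RoundTrip {
  // Turns arbitrary bytes into ASCII-only text.
  static String toText(byte[] data) {
    return Base64.getEncoder().encodeToString(data);
  }

  // Recovers the exact original bytes from the Base64 text.
  static byte[] fromText(String text) {
    return Base64.getDecoder().decode(text);
  }

  public static void main(String[] args) {
    // Sample bytes (assumption): values that are not valid text in most encodings.
    byte[] original = {(byte) 0xFF, (byte) 0xFE, 0x00, 0x41, (byte) 0x80};
    String text = toText(original);
    byte[] restored = fromText(text);
    System.out.println(text);
    System.out.println("bytes in: " + original.length + ", chars out: " + text.length());
    System.out.println("same data? " + Arrays.equals(original, restored));
  }
}
```

The encoded text uses only letters, digits, `+`, `/` and `=` padding, which is why it survives being carried as a String.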

Answer by Herms

How are you building your XML document? If you use Java's built-in XML classes then the string encoding should be handled for you.

Take a look at the javax.xml and org.xml packages. That's what we use for generating XML docs, and it handles all the string encoding and decoding quite nicely.

---EDIT:

Hmm, I think I misunderstood the problem. You're not trying to encode a regular string, but some set of arbitrary binary data? In that case the Base64 encoding suggested in an earlier comment is probably the way to go. I believe that's a fairly standard way of encoding binary data in XML.

Answer by basszero

See this question: How do you embed binary data in XML? Instead of converting the byte[] into a String and then pushing it into XML somewhere, convert the byte[] to a String via BASE64 encoding (some XML libraries have a type to do this for you). Then BASE64-decode once you get the String back from the XML.

Use http://commons.apache.org/codec/

Your data may be getting messed up due to all sorts of weird character set restrictions and the presence of non-printing characters. Stick with BASE64.
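
The encode, embed, parse, decode sequence described in this answer could look roughly like the sketch below. It is only an illustration under stated assumptions: it uses the JDK's DOM classes and `java.util.Base64` (Java 8+) rather than Commons Codec, and the class name `BinaryInXml`, the method names, and the `payload` element name are hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.Arrays;
import java.util.Base64;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class BinaryInXml {
  // Wraps the bytes, Base64-encoded, in a single <payload> element.
  static String toXml(byte[] data) throws Exception {
    DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
    Document doc = builder.newDocument();
    Element payload = doc.createElement("payload"); // hypothetical element name
    payload.setTextContent(Base64.getEncoder().encodeToString(data));
    doc.appendChild(payload);
    // Serialize the DOM tree to a String with the default (identity) transformer.
    StringWriter out = new StringWriter();
    TransformerFactory.newInstance().newTransformer()
        .transform(new DOMSource(doc), new StreamResult(out));
    return out.toString();
  }

  // Parses the XML and decodes the element text back into bytes.
  static byte[] fromXml(String xml) throws Exception {
    DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
    Document doc = builder.parse(new InputSource(new StringReader(xml)));
    String text = doc.getDocumentElement().getTextContent();
    // trim() guards against whitespace around the text, which the basic decoder rejects.
    return Base64.getDecoder().decode(text.trim());
  }

  public static void main(String[] args) throws Exception {
    byte[] original = {(byte) 0xFF, (byte) 0xFE, 0x00, 0x41, (byte) 0x80};
    String xml = toXml(original);
    System.out.println(xml);
    System.out.println("same data? " + Arrays.equals(original, fromXml(xml)));
  }
}
```

Because the Base64 alphabet contains no characters that XML treats specially, the element text needs no escaping and the parser hands it back unchanged.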

Answer by McDowell

String(byte[]) treats the data as text in the platform's default character encoding. So, how bytes get converted from 8-bit values to 16-bit Java Unicode chars will vary not only between operating systems, but can even vary between different users using different codepages on the same machine! This constructor is only good for decoding one of your own text files. Do not try to convert arbitrary bytes to chars in Java!
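
To see this failure in isolation, here is a minimal sketch. The class name `LossyRoundTrip` and the sample bytes are made up for illustration, and UTF-8 is named explicitly (an assumption, standing in for whatever the platform default happens to be) so that the behavior does not depend on the machine.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class LossyRoundTrip {
  // Decodes the bytes as UTF-8 text, then encodes that text back to bytes.
  static byte[] roundTrip(byte[] data) {
    String text = new String(data, StandardCharsets.UTF_8);
    return text.getBytes(StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    // 0xFF and 0xFE can never appear in well-formed UTF-8.
    byte[] original = {(byte) 0xFF, (byte) 0xFE, 0x00, 0x41};
    byte[] result = roundTrip(original);
    System.out.println("original: " + Arrays.toString(original));
    System.out.println("result:   " + Arrays.toString(result));
    System.out.println("same data? " + Arrays.equals(original, result));
  }
}
```

The String constructor replaces byte sequences it cannot decode with a replacement character, so the original values are gone before `getBytes` is ever called. Plain ASCII bytes survive, which is why the problem often goes unnoticed with text-like test data.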

Encoding as base64 is a good solution. This is how files are sent over SMTP (e-mail). The (free) Apache Commons Codec project will do the job.

import org.apache.commons.codec.binary.Base64;

byte[] bytes = loadFile(file);
// all chars in encoded are guaranteed to be 7-bit ASCII
byte[] encoded = Base64.encodeBase64(bytes);
String printMe = new String(encoded, "US-ASCII");
System.out.println(printMe);
// decoding yields the original bytes again
byte[] decoded = Base64.decodeBase64(encoded);

Alternatively, you can use the Java 6 DatatypeConverter:

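import java.io.*;
import java.nio.channels.*;
import java.util.Arrays;
import javax.xml.bind.DatatypeConverter;

public class EncodeDecode {
  public static void main(String[] args) throws Exception {
    File file = new File("/bin/ls");
    byte[] bytes = loadFile(file, new ByteArrayOutputStream()).toByteArray();
    String encoded = DatatypeConverter.printBase64Binary(bytes);
    System.out.println(encoded);
    byte[] decoded = DatatypeConverter.parseBase64Binary(encoded);
    // check that the round trip preserved every byte
    if (!Arrays.equals(bytes, decoded)) {
      throw new AssertionError("round trip changed the data");
    }
  }

  private static <T extends OutputStream> T loadFile(File file, T out)
                                                       throws IOException {
    FileChannel in = new FileInputStream(file).getChannel();
    try {
      // Copy outside of any assert: assertions are disabled by default,
      // which would silently skip the transfer.
      WritableByteChannel target = Channels.newChannel(out);
      long position = 0;
      long size = in.size();
      while (position < size) {
        long n = in.transferTo(position, size - position, target);
        if (n <= 0) {
          throw new IOException("unexpected end of file: " + file);
        }
        position += n;
      }
      return out;
    } finally {
      in.close();
    }
  }
}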