Java: How do you embed binary data in XML?
Note: this page reproduces a popular Stack Overflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors: Stack Overflow.
Original: http://stackoverflow.com/questions/19893/
How do you embed binary data in XML?
Asked by Bill the Lizard
I have two applications written in Java that communicate with each other using XML messages over the network. I'm using a SAX parser at the receiving end to get the data back out of the messages. One of the requirements is to embed binary data in an XML message, but SAX doesn't like this. Does anyone know how to do this?
UPDATE: I got this working with the Base64 class from the Apache Commons Codec library, in case anyone else is trying something similar.
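A minimal sketch of the round trip described in the update, assuming Java 8 or later. It swaps in the JDK's built-in java.util.Base64 for the Apache Commons Codec class mentioned above so that it runs without extra dependencies. The element names message and payload and the class name Base64SaxDemo are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class Base64SaxDemo {

    // Sender side: wrap the Base64 text in a (hypothetical) <payload> element.
    public static String toXml(byte[] data) {
        String encoded = Base64.getEncoder().encodeToString(data);
        return "<message><payload>" + encoded + "</payload></message>";
    }

    // Receiver side: collect the element text with SAX, then decode it.
    public static byte[] fromXml(String xml) throws Exception {
        final StringBuilder text = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            private boolean inPayload;

            @Override
            public void startElement(String uri, String localName, String qName,
                                     Attributes attributes) {
                if ("payload".equals(qName)) {
                    inPayload = true;
                    text.setLength(0);
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // SAX may deliver the text in several chunks, so append each one.
                if (inPayload) {
                    text.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if ("payload".equals(qName)) {
                    inPayload = false;
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        // The MIME decoder skips line breaks and other non-alphabet characters.
        return Base64.getMimeDecoder().decode(text.toString());
    }

    public static void main(String[] args) throws Exception {
        // Sample bytes include values that would be '<' and '&' if written raw.
        byte[] original = {0, 1, 2, (byte) 0xFF, (byte) 0x80, 60, 38};
        byte[] roundTripped = fromXml(toXml(original));
        System.out.println(Arrays.equals(original, roundTripped));
    }
}
```

The text is accumulated in characters() and decoded only after parsing because a SAX parser is free to split element content across several callbacks.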
Accepted answer by Greg Hurlman
You could encode the binary data using base64 and put it into a Base64 element; the below article is a pretty good one on the subject.
Answer by mercutio
Maybe encode them into a known set - something like base 64 is a popular choice.
Answer by basszero
Try Base64 encoding/decoding your binary data. Also look into CDATA sections.
Answer by Anders Sandvig
I usually encode the binary data with MIME Base64 or URL encoding.
Answer by Mo.
XML is so versatile...
<DATA>
  <BINARY>
    <BIT index="0">0</BIT>
    <BIT index="1">0</BIT>
    <BIT index="2">1</BIT>
    ...
    <BIT index="n">1</BIT>
  </BINARY>
</DATA>
XML is like violence - If it doesn't solve your problem, you're not using enough of it.
EDIT:
BTW: Base64 + CDATA is probably the best solution
(EDIT2:
Whoever upmods me, please also upmod the real answer. We don't want any poor soul to come here and actually implement my method because it was the highest ranked on SO, right?)
Answer by Boris Terzic
Base64 is indeed the right answer, but CDATA is not. CDATA basically says "this could be anything", whereas this content must not be just anything; it has to be Base64-encoded binary data. XML Schema defines Base64 binary (xs:base64Binary) as a primitive datatype which you can use in your XSD.
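To illustrate the schema point, here is a rough sketch that validates a document against an XSD whose element is typed as xs:base64Binary, using the validation API that ships with the JDK. The schema, the payload element name, and the class name Base64SchemaDemo are assumptions made for this example, not something taken from the answer.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

public class Base64SchemaDemo {

    // Hypothetical schema: a single <payload> element typed as xs:base64Binary.
    static final String XSD =
        "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
      + "<xs:element name=\"payload\" type=\"xs:base64Binary\"/>"
      + "</xs:schema>";

    // Returns true when the document validates against the schema above.
    public static boolean isValid(String xml) throws Exception {
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
        try {
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // Thrown when the content does not match the declared type.
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(isValid("<payload>AQID</payload>"));
        System.out.println(isValid("<payload>%%%</payload>"));
    }
}
```

With the type declared in the schema, malformed content is rejected at validation time instead of surfacing later as a decoding error.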
Answer by Andrei Savu
Answer by Jarek Przygódzki
Any binary-to-text encoding will do the trick. I use something like this:
<data encoding="yEnc">
<![CDATA[ encoded binary data ]]>
</data>
Answer by Baxter Tidwell
I had this problem just last week. I had to serialize a PDF file and send it, inside an XML file, to a server.
If you're using .NET, you can convert a binary file directly to a base64 string and stick it inside an XML element.
string base64 = Convert.ToBase64String(File.ReadAllBytes(fileName));
Or, there is a method built right into the XmlWriter object. In my particular case, I had to include Microsoft's datatype namespace:
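// Build the XML in memory.
StringBuilder sb = new StringBuilder();
System.Xml.XmlWriter xw = XmlWriter.Create(sb);
xw.WriteStartElement("doc");
xw.WriteStartElement("serialized_binary");
// Mark the element content as Base64 using Microsoft's datatype namespace.
xw.WriteAttributeString("types", "dt", "urn:schemas-microsoft-com:datatypes", "bin.base64");
byte[] b = File.ReadAllBytes(fileName);
// WriteBase64 encodes the bytes and writes them as the element's text.
xw.WriteBase64(b, 0, b.Length);
xw.WriteEndElement();
xw.WriteEndElement();
// Flush the writer so its buffered output reaches the StringBuilder.
xw.Flush();
string abc = sb.ToString();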
The string abc looks something like this:
<?xml version="1.0" encoding="utf-16"?>
<doc>
<serialized_binary types:dt="bin.base64" xmlns:types="urn:schemas-microsoft-com:datatypes">
JVBERi0xLjMKJaqrrK0KNCAwIG9iago8PCAvVHlwZSAvSW5mbw...(plus lots more)
</serialized_binary>
</doc>
Answer by Jamie
While the other answers are mostly fine, you could try another, more space-efficient encoding method such as yEnc. With yEnc you also get checksum capability right "out of the box". See the link below. Of course, because XML does not have a native yEnc type, your XML schema should be updated to properly describe the encoded node.
Why: Because of how they encode data, Base64, uuencode, and similar encodings increase the amount of data you need to store and transfer by roughly 33-40% (vs. yEnc's 1-2%). Depending on what you're encoding, that overhead could become an issue.
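As a quick check of the Base64 figure: Base64 maps every 3 input bytes to 4 output characters, so plain Base64 adds about 33%, and MIME-style line breaks add a little more. The sketch below measures this with the JDK's java.util.Base64; the class and method names are made up for illustration.

```java
import java.util.Base64;

public class Base64Overhead {

    // Percentage by which the Base64 output is larger than the raw input.
    public static double overheadPercent(int rawLength) {
        byte[] raw = new byte[rawLength];
        int encodedLength = Base64.getEncoder().encode(raw).length;
        return 100.0 * (encodedLength - rawLength) / rawLength;
    }

    public static void main(String[] args) {
        // 3000 raw bytes become 4000 Base64 characters.
        System.out.printf("%.1f%%%n", overheadPercent(3000));
    }
}
```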
yEnc - Wikipedia abstract: https://en.wikipedia.org/wiki/YEnc
yEnc is a binary-to-text encoding scheme for transferring binary files in messages on Usenet or via e-mail. ... An additional advantage of yEnc over previous encoding methods, such as uuencode and Base64, is the inclusion of a CRC checksum to verify that the decoded file has been delivered intact.