Original source: http://stackoverflow.com/questions/34413681/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Convert a byte array from Encoding A to Encoding B
Asked by Display name
I have a pretty interesting topic, at least for me. Given a ByteArrayOutputStream holding bytes in, for example, UTF-8, I need a function that can "translate" those bytes into another, new ByteArrayOutputStream in, for example, UTF-16, ASCII, or any other encoding. My naive approach was to use an InputStreamReader and pass in the desired encoding, but that didn't work because it reads into a char[] and I can only write byte[] to the new BAOS.
    public byte[] convertStream(Charset encoding) {
        ByteArrayInputStream original = new ByteArrayInputStream(raw.toByteArray());
        InputStreamReader contentReader = new InputStreamReader(original, encoding);
        ByteArrayOutputStream converted = new ByteArrayOutputStream();
        int readCount;
        char[] buffer = new char[4096];
        while ((readCount = contentReader.read(buffer, 0, buffer.length)) != -1)
            // Does not compile: ByteArrayOutputStream has no write(char[], int, int)
            converted.write(buffer, 0, readCount);
        return converted.toByteArray();
    }
Now, this obviously doesn't work and I'm looking for a way to make this scenario possible, without building a String out of the byte[].
@Edit: Since the obvious parts seem rather hard to read: 1) raw is a ByteArrayOutputStream containing the bytes of a BINARY object sent to us by clients. The bytes usually arrive in UTF-8 as part of an HTTP message. 2) The goal is to forward this BINARY data to an internal system that is not flexible (it is an internal system, after all) and accepts such attachments in UTF-16. I don't know why, so don't even ask; it just does.
So, to restate my question: is there a way to convert a byte array from Charset A to Charset B, or to an encoding of your choice? Once again, building a String is NOT what I'm after.
Thank you and hope that clears up questionable parts :).
Answered by Jon Skeet
As mentioned in comments, I'd just convert to a string:
    String text = new String(raw.toByteArray(), encoding);
    byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
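For reference, the String round trip can be wrapped in a small self-contained helper. This is only a sketch: the class name StringTranscoder and the method transcode are made up for illustration, and the target charset is a parameter so the same helper can produce UTF-16 for the internal system described in the question.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the original answer.
final class StringTranscoder {

    // Decodes the input with the source charset, then re-encodes it with the target charset.
    // Note: new String(byte[], Charset) silently replaces malformed input with U+FFFD.
    static byte[] transcode(byte[] input, Charset source, Charset target) {
        String text = new String(input, source);
        return text.getBytes(target);
    }

    public static void main(String[] args) {
        byte[] utf8 = "h\u00e9llo".getBytes(StandardCharsets.UTF_8);
        byte[] utf16 = transcode(utf8, StandardCharsets.UTF_8, StandardCharsets.UTF_16BE);
        System.out.println(utf8.length + " -> " + utf16.length); // prints "6 -> 10"
    }
}
```

UTF_16BE is used in the demo so that no byte order mark is written; StandardCharsets.UTF_16 would prepend a two-byte BOM.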
However, if that's not feasible (for some unspecified reason...) what you've got now is nearly there - you just need to add an OutputStreamWriter into the mix:
    // Nothing here should throw IOException in reality - work out what you want to do.
    public byte[] convertStream(Charset encoding) throws IOException {
        ByteArrayInputStream original = new ByteArrayInputStream(raw.toByteArray());
        InputStreamReader contentReader = new InputStreamReader(original, encoding);
        int readCount;
        char[] buffer = new char[4096];
        try (ByteArrayOutputStream converted = new ByteArrayOutputStream()) {
            try (Writer writer = new OutputStreamWriter(converted, StandardCharsets.UTF_8)) {
                while ((readCount = contentReader.read(buffer, 0, buffer.length)) != -1) {
                    writer.write(buffer, 0, readCount);
                }
            }
            return converted.toByteArray();
        }
    }
Note that you're still creating an extra temporary copy of the data in memory, admittedly in UTF-8 rather than UTF-16... but fundamentally this is hardly any more efficient than creating a string.
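If the objection is to the String type itself rather than to memory use, java.nio offers a variant that never creates a String. The sketch below is an illustration only (the class name BufferTranscoder is made up); it relies on Charset.decode and Charset.encode, and the same caveat applies: the CharBuffer is still a full decoded copy of the data.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;

// Hypothetical helper, not part of the original answer.
final class BufferTranscoder {

    // Decodes into a CharBuffer and re-encodes, without building a String.
    // Charset.decode/encode replace malformed or unmappable data instead of throwing.
    static byte[] transcode(byte[] input, Charset source, Charset target) {
        CharBuffer chars = source.decode(ByteBuffer.wrap(input));
        ByteBuffer encoded = target.encode(chars);
        // The backing array may be larger than the encoded data, so copy only the valid part.
        byte[] result = new byte[encoded.remaining()];
        encoded.get(result);
        return result;
    }
}
```

A call such as BufferTranscoder.transcode(raw.toByteArray(), StandardCharsets.UTF_8, StandardCharsets.UTF_16) should give the same bytes as the String-based version for well-formed input.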
If memory efficiency is a particular concern, you could perform multiple passes in order to work out how many bytes will be required, create a byte array of the right length, and then adjust the code to write straight into that byte array.
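One way the multi-pass idea could look is sketched below, using CharsetDecoder and CharsetEncoder with small fixed-size buffers. This is an assumption about how to implement it, not code from the original answer: the names TwoPassTranscoder, transcode and pump are invented, and it assumes the encoder produces the same number of bytes on both passes, which should hold for the standard UTF charsets.

```java
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;

// Hypothetical two-pass transcoder: pass 1 counts bytes, pass 2 writes into an exact-size array.
final class TwoPassTranscoder {

    static byte[] transcode(byte[] input, Charset source, Charset target) {
        try {
            int size = pump(input, source, target, null);          // pass 1: count only
            byte[] result = new byte[size];
            pump(input, source, target, ByteBuffer.wrap(result));  // pass 2: write in place
            return result;
        } catch (CharacterCodingException e) {
            // New decoders/encoders report malformed or unmappable data by default.
            throw new UncheckedIOException(e);
        }
    }

    // Decodes and re-encodes chunk by chunk. With dest == null the bytes go to a
    // reusable scratch buffer and are only counted; otherwise they go into dest.
    private static int pump(byte[] input, Charset source, Charset target, ByteBuffer dest)
            throws CharacterCodingException {
        CharsetDecoder decoder = source.newDecoder();
        CharsetEncoder encoder = target.newEncoder();
        ByteBuffer in = ByteBuffer.wrap(input);
        CharBuffer chars = CharBuffer.allocate(1024);
        ByteBuffer scratch = ByteBuffer.allocate(4096);
        int total = 0;
        boolean decodingDone = false;
        boolean encodingDone = false;

        while (!encodingDone) {
            if (!decodingDone) {
                CoderResult dr = decoder.decode(in, chars, true);
                if (dr.isUnderflow()) dr = decoder.flush(chars);
                if (dr.isError()) dr.throwException();
                decodingDone = dr.isUnderflow(); // overflow means more chars are pending
            }
            chars.flip();
            while (true) {
                ByteBuffer out = (dest != null) ? dest : scratch;
                CoderResult er = encoder.encode(chars, out, decodingDone);
                if (er.isUnderflow() && decodingDone) er = encoder.flush(out);
                if (er.isError()) er.throwException();
                if (dest == null) {
                    total += scratch.position();
                    scratch.clear();
                } else if (er.isOverflow()) {
                    throw new IllegalStateException("second pass produced more bytes than counted");
                }
                if (er.isUnderflow()) break;
            }
            chars.compact(); // keeps any char the encoder could not consume yet
            encodingDone = decodingDone;
        }
        return total;
    }
}
```

The trade-off is that the input is decoded and encoded twice, in exchange for never holding a full intermediate copy; only the 1024-char and 4096-byte working buffers are allocated besides the result.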