Java: How to convert a Reader to InputStream and a Writer to OutputStream?
Notice: this page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original question: http://stackoverflow.com/questions/62241/
How to convert a Reader to InputStream and a Writer to OutputStream?
Asked by Andrei Savu
Is there an easy way to avoid dealing with text encoding problems?
Accepted answer by Peter
You can't really avoid dealing with the text encoding issues, but there are existing solutions in Apache Commons:
- Reader to InputStream: ReaderInputStream
- Writer to OutputStream: WriterOutputStream
You just need to pick the encoding of your choice.
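To make the idea concrete without pulling in a dependency, here is a minimal JDK-only sketch of what a Reader-to-InputStream adapter does. The class name SimpleReaderInputStream and its readAll helper are hypothetical, and the sketch assumes UTF-8 only; it is not the Apache Commons implementation, which supports arbitrary charsets and is the better choice in real projects.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical, simplified stand-in for a ReaderInputStream; UTF-8 only. */
class SimpleReaderInputStream extends InputStream {
    private final Reader reader;
    private final char[] chars = new char[1024];
    private byte[] bytes = new byte[0]; // encoded bytes not yet handed out
    private int pos = 0;

    SimpleReaderInputStream(Reader reader) {
        this.reader = reader;
    }

    @Override
    public int read() throws IOException {
        while (pos >= bytes.length) {
            if (!refill()) {
                return -1; // underlying Reader is exhausted
            }
        }
        return bytes[pos++] & 0xFF;
    }

    /** Reads the next chunk of chars and encodes it as UTF-8. */
    private boolean refill() throws IOException {
        // Leave one slot free so a surrogate pair is never split across chunks.
        int n = reader.read(chars, 0, chars.length - 1);
        if (n == -1) {
            return false;
        }
        if (n > 0 && Character.isHighSurrogate(chars[n - 1])) {
            int low = reader.read();
            if (low != -1) {
                chars[n++] = (char) low;
            }
        }
        bytes = new String(chars, 0, n).getBytes(StandardCharsets.UTF_8);
        pos = 0;
        return true;
    }

    @Override
    public void close() throws IOException {
        reader.close();
    }

    /** Demo helper (an assumption for this sketch): drains the stream into a byte array. */
    static byte[] readAll(String text) {
        try (InputStream in = new SimpleReaderInputStream(new StringReader(text));
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            int b;
            while ((b = in.read()) != -1) {
                out.write(b);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String text = "h\u00e9llo \uD83D\uDE00";
        byte[] viaStream = readAll(text);
        System.out.println(Arrays.equals(viaStream, text.getBytes(StandardCharsets.UTF_8)));
    }
}
```

Even in this small form the encoding has to be chosen somewhere; here it is hard-coded to UTF-8.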
Answer by Tom Hawtin - tackline
The obvious names for these classes are ReaderInputStream and WriterOutputStream. Unfortunately, these are not included in the Java library. However, Google is your friend.
I'm not sure that it is going to get around all text encoding problems, which are nightmarish.
There is an RFE, but it is closed as "will not fix".
Answer by Sam Barnum
Are you trying to write the contents of a Reader to an OutputStream? If so, you'll have an easier time wrapping the OutputStream in an OutputStreamWriter and writing the chars from the Reader to the Writer, instead of trying to convert the Reader to an InputStream:
// data is the Reader whose contents are being sent
final Writer writer = new BufferedWriter(new OutputStreamWriter(urlConnection.getOutputStream(), "UTF-8"));
int charsRead;
char[] cbuf = new char[1024];
while ((charsRead = data.read(cbuf)) != -1) {
    writer.write(cbuf, 0, charsRead);
}
writer.flush();
// don't forget to close the writer in a finally {} block
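A self-contained variant of the same loop might look like the sketch below. The class name ReaderToOutputStream and the encode helper are hypothetical, a ByteArrayOutputStream stands in for urlConnection.getOutputStream(), and try-with-resources replaces the finally block.

```java
import java.io.BufferedWriter;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class ReaderToOutputStream {
    /** Copies all chars from the reader to the stream, encoding them with the given charset. */
    static void copy(Reader data, OutputStream out, Charset charset) throws IOException {
        // try-with-resources closes the writer (and the wrapped stream) even on failure
        try (Writer writer = new BufferedWriter(new OutputStreamWriter(out, charset))) {
            char[] cbuf = new char[1024];
            int charsRead;
            while ((charsRead = data.read(cbuf)) != -1) {
                writer.write(cbuf, 0, charsRead);
            }
        }
    }

    /** Demo helper (an assumption for this sketch): encodes a String through copy(). */
    static byte[] encode(String text, Charset charset) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try {
            copy(new StringReader(text), sink, charset);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        byte[] bytes = encode("na\u00efve", StandardCharsets.UTF_8);
        System.out.println(bytes.length + " bytes written");
    }
}
```

Note that closing the writer also closes the wrapped OutputStream; if the stream must stay open, flush the writer instead of closing it.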
Answer by Phil Harvey
Also note that, if you're starting off with a String, you can skip creating a StringReader and create an InputStream in one step using org.apache.commons.io.IOUtils from Commons IO, like so:
InputStream myInputStream = IOUtils.toInputStream(reportContents, "UTF-8");
Of course you still need to think about the text encoding, but at least the conversion is happening in one step.
Answer by Ritesh Tendulkar
If you are starting off with a String you can also do the following:
new ByteArrayInputStream(inputString.getBytes("UTF-8"))
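As a small runnable sketch of the same one-liner: the class and method names below are hypothetical, and StandardCharsets.UTF_8 (available since Java 7) is swapped in for the "UTF-8" string so that no checked UnsupportedEncodingException needs handling.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;

class StringToInputStream {
    /**
     * Wraps the UTF-8 bytes of the string in an in-memory stream.
     * Returning ByteArrayInputStream (rather than InputStream) keeps the
     * demo free of checked IOExceptions; callers can still treat it as an InputStream.
     */
    static ByteArrayInputStream toInputStream(String inputString) {
        // StandardCharsets.UTF_8 avoids the checked UnsupportedEncodingException
        // that getBytes("UTF-8") declares.
        return new ByteArrayInputStream(inputString.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        ByteArrayInputStream in = toInputStream("report contents");
        System.out.println(in.available() + " bytes available");
    }
}
```

This copies the whole string into a byte array up front, which is fine for small inputs but doubles memory use for large ones.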
Answer by dfrankow
You can't avoid text encoding issues, but Apache commons-io has ReaderInputStream and WriterOutputStream.
Note that these are the classes referred to in Peter's answer via koders.com; this answer just links to the library instead of the source code.
Answer by Peter Ford
Well, a Reader deals with characters and an InputStream deals with bytes. The encoding specifies how you wish to represent your characters as bytes, so you can't really ignore the issue. As for avoiding problems, my opinion is: pick one charset (e.g. "UTF-8") and stick with it.
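A short sketch of why the charset cannot be ignored: the same four characters become a different number of bytes under different encodings, and decoding with the wrong charset silently yields different text. The class and method names are hypothetical and only illustrate the point.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodingMatters {
    /** Number of bytes the text occupies in the given charset. */
    static int encodedLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    /** Encodes as UTF-8 but decodes as ISO-8859-1, a typical mismatch. */
    static String misread(String text) {
        return new String(text.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String text = "caf\u00e9"; // 4 chars, the last one is e-acute
        System.out.println("UTF-8:      " + encodedLength(text, StandardCharsets.UTF_8));
        System.out.println("ISO-8859-1: " + encodedLength(text, StandardCharsets.ISO_8859_1));
        System.out.println("UTF-16BE:   " + encodedLength(text, StandardCharsets.UTF_16BE));
        System.out.println("round trip with mismatched charsets equal? "
                + text.equals(misread(text)));
    }
}
```

Picking one charset and using it on both the encoding and decoding side, as suggested above, avoids exactly this mismatch.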
Regarding how to actually do it, as has been pointed out, "the obvious names for these classes are ReaderInputStream and WriterOutputStream." Surprisingly, "these are not included in the Java library" even though the 'opposite' classes, InputStreamReader and OutputStreamWriter, are included.
So, lots of people have come up with their own implementations, including Apache Commons IO. Depending on licensing issues, you will probably be able to include the commons-io library in your project, or even copy a portion of the source code (which is downloadable here).
- Apache ReaderInputStream: API/ source code direct link
- Apache WriterOutputStream: API/ source code direct link
As you can see, both classes' documentation states that "all charset encodings supported by the JRE are handled correctly".
N.B. A comment on one of the other answers here mentions this bug. But that affects the Apache Ant ReaderInputStream class (here), not the Apache Commons IO ReaderInputStream class.
Answer by romeara
A warning when using WriterOutputStream: it doesn't always handle writing binary data to a file properly, or the same way a regular output stream does. I had an issue with this that took me a while to track down.
If you can, I'd recommend using an output stream as your base, and if you need to write strings, use an OutputStreamWriter wrapper around the stream to do it. It is far more reliable to convert text to bytes than the other way around, which is likely why WriterOutputStream is not a part of the standard Java library.
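The asymmetry can be illustrated with plain JDK calls. The sketch below is not Commons IO code and may not be the exact problem described above; it only shows, under the assumption of UTF-8, that text survives a trip through bytes while arbitrary binary data does not survive a trip through text. The class name BinaryRoundTrip is hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class BinaryRoundTrip {
    /** Decodes bytes as UTF-8 text and encodes the text back to bytes. */
    static byte[] throughText(byte[] data) {
        // Malformed byte sequences are replaced with U+FFFD during decoding,
        // so the original bytes cannot be recovered afterwards.
        String asText = new String(data, StandardCharsets.UTF_8);
        return asText.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] binary = {(byte) 0xFF, (byte) 0xFE, 0x00, 0x41};
        byte[] text = "plain text".getBytes(StandardCharsets.UTF_8);
        System.out.println("text survives:   " + Arrays.equals(text, throughText(text)));
        System.out.println("binary survives: " + Arrays.equals(binary, throughText(binary)));
    }
}
```

Any adapter that turns bytes into chars has to make this kind of decoding decision, which is why keeping binary data on a plain OutputStream is the safer design.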
Answer by Oliv
Use:
new CharSequenceInputStream(html, StandardCharsets.UTF_8);
This way does not require an upfront conversion to String and then to byte[], which allocates a lot more heap memory when the report is large. It converts to bytes on the fly as the stream is read, right from the StringBuffer.
It uses CharSequenceInputStream from the Apache Commons IO project.