Java: reading/writing a binary file with Strings?
Original question: http://stackoverflow.com/questions/11584508/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Reading/writing a BINARY File with Strings?
Asked by Bob Let
How can I write/read a string from a binary file?
I've tried using writeUTF/readUTF (DataOutputStream/DataInputStream), but it was too much of a hassle.
Thanks.
Answered by Joop Eggen
Forget about FileWriter, DataOutputStream for a moment.
- For binary data one uses the OutputStream and InputStream classes. They handle byte[].
- For text data one uses the Reader and Writer classes. They handle String, which can store all kinds of text, as it internally uses Unicode.
The crossover from text to binary data can be done by specifying the encoding, which defaults to the OS encoding.
new OutputStreamWriter(outputStream, encoding)
string.getBytes(encoding)
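As a minimal sketch of the two calls above, the following encodes the same text once through an OutputStreamWriter and once through getBytes, with the encoding named explicitly. The class name EncodingCrossover, the helper encodeWithWriter, and the sample text are illustrative assumptions, not part of the answer.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class EncodingCrossover {

    // Hypothetical helper: encode text by wrapping a byte stream in a Writer
    // that uses an explicitly named charset.
    public static byte[] encodeWithWriter(String text, Charset charset) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(out, charset)) {
            writer.write(text);
        } // closing the writer flushes the encoded bytes into 'out'
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        String text = "caf\u00e9"; // "café", four chars

        byte[] viaWriter = encodeWithWriter(text, StandardCharsets.UTF_8);
        byte[] viaGetBytes = text.getBytes(StandardCharsets.UTF_8);

        // Both routes produce the same bytes for the same charset.
        System.out.println(Arrays.equals(viaWriter, viaGetBytes)); // true

        // The byte count depends on the encoding, not only on the text.
        System.out.println(viaGetBytes.length);                                // 5
        System.out.println(text.getBytes(StandardCharsets.ISO_8859_1).length); // 4
    }
}
```

Naming the charset in every call avoids depending on whatever default the platform happens to use.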
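So if you want to avoid byte[] and use String, you must abuse an encoding that maps all 256 byte values, in any order, each to its own character. So no "UTF-8", but maybe "ISO-8859-1", which maps every byte value to exactly one char. ("windows-1252", also named "Cp1252", is often suggested too, but it leaves a few byte values such as 0x81 unassigned, so it is not a safe choice.)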
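One way to check whether a given encoding can carry arbitrary bytes through a String is to decode all 256 byte values and encode them again. The sketch below does that for two charsets; ISO-8859-1 is chosen here because it is known to map each byte value to one char. The class and method names are illustrative assumptions.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class ByteRoundTrip {

    // Returns true if decoding the bytes to a String and encoding them again
    // with the same charset reproduces the original bytes exactly.
    public static boolean survivesRoundTrip(byte[] data, Charset charset) {
        String asText = new String(data, charset);
        return Arrays.equals(data, asText.getBytes(charset));
    }

    // All 256 possible byte values, 0x00 through 0xFF.
    public static byte[] allByteValues() {
        byte[] all = new byte[256];
        for (int i = 0; i < all.length; i++) {
            all[i] = (byte) i;
        }
        return all;
    }

    public static void main(String[] args) {
        byte[] all = allByteValues();
        // ISO-8859-1 maps byte n to char n, so nothing is lost.
        System.out.println("ISO-8859-1: " + survivesRoundTrip(all, StandardCharsets.ISO_8859_1)); // true
        // In UTF-8 most of these bytes are malformed input and are replaced on decoding.
        System.out.println("UTF-8:      " + survivesRoundTrip(all, StandardCharsets.UTF_8));      // false
    }
}
```

The same check can be run against any other charset before trusting it with binary data.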
But internally there is a conversion, and in very rare cases problems might happen. For instance, é can in Unicode be one code point, or two: e followed by the combining acute accent (U+0301). There exists a conversion function (java.text.Normalizer) for that.
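A short sketch of that situation: the two spellings of é compare as different strings until they are normalized to the same form. NFC (composed form) is picked here as an assumption; the class name NormalizeDemo and the helper toComposed are illustrative.

```java
import java.text.Normalizer;

public class NormalizeDemo {

    // Compose base letters and combining marks into single code points (NFC).
    public static String toComposed(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFC);
    }

    public static void main(String[] args) {
        String precomposed = "\u00e9";  // é as one code point
        String decomposed = "e\u0301";  // e followed by the combining acute accent

        // The strings render the same but hold different chars.
        System.out.println(precomposed.equals(decomposed));             // false
        System.out.println(decomposed.length());                        // 2

        // After normalization they are equal.
        System.out.println(precomposed.equals(toComposed(decomposed))); // true
    }
}
```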
One case where this has already led to problems is file names on different operating systems: macOS uses a different Unicode normalisation than Windows, and hence file names in version control systems need special attention.
So on principle it is better to use the more cumbersome byte arrays, or ByteArrayInputStream, or java.nio buffers. Mind also that String chars are 16 bit.
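As one possible illustration of the byte-array route, the sketch below stores a string as a 4-byte length followed by its UTF-8 bytes, using a java.nio ByteBuffer. The record layout, the class name LengthPrefixedRecord, and the temp-file name are assumptions made for this example, not something the answer prescribes.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class LengthPrefixedRecord {

    // Encode as: 4-byte big-endian length, then the UTF-8 bytes of the text.
    public static byte[] encode(String s) {
        byte[] text = s.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buf = ByteBuffer.allocate(Integer.BYTES + text.length);
        buf.putInt(text.length).put(text);
        return buf.array();
    }

    // Decode a record produced by encode().
    public static String decode(byte[] record) {
        ByteBuffer buf = ByteBuffer.wrap(record);
        byte[] text = new byte[buf.getInt()];
        buf.get(text);
        return new String(text, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        String original = "caf\u00e9";

        // Write the record to a temporary binary file and read it back.
        Path file = Files.createTempFile("strings", ".bin");
        try {
            Files.write(file, encode(original));
            String restored = decode(Files.readAllBytes(file));
            System.out.println(original.equals(restored)); // true
        } finally {
            Files.delete(file);
        }
    }
}
```

The binary data stays in byte[] the whole time; the String only appears at the two points where an explicit encoding is applied.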
Answered by Peter Lawrey
If you want to write text you can use Writers and Readers.
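You can use Data*Stream writeUTF/readUTF, but each string's encoded form must be no longer than 65535 bytes, so the strings have to be less than 64K characters long (fewer if they contain non-ASCII characters).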
import java.io.*;
import java.util.ArrayList;
import java.util.List;

public class WordsFile {
    public static void main(String... args) throws IOException {
        // Generate a million pseudo-random words.
        List<String> words = new ArrayList<String>();
        for (int i = 0; i < 1000000; i++)
            words.add(Long.toHexString(System.nanoTime()));

        writeStrings("words", words);
        List<String> words2 = readWords("words");
        System.out.println("Words are the same is " + words.equals(words2));
    }

    // Reads the word count, then that many strings written with writeUTF.
    public static List<String> readWords(String filename) throws IOException {
        // try-with-resources closes the stream even if reading fails.
        try (DataInputStream dis = new DataInputStream(
                new BufferedInputStream(new FileInputStream(filename)))) {
            int count = dis.readInt();
            List<String> words = new ArrayList<String>(count);
            while (words.size() < count)
                words.add(dis.readUTF());
            return words;
        }
    }

    // Writes the word count followed by each word in modified UTF-8.
    public static void writeStrings(String filename, List<String> words) throws IOException {
        try (DataOutputStream dos = new DataOutputStream(
                new BufferedOutputStream(new FileOutputStream(filename)))) {
            dos.writeInt(words.size());
            for (String word : words)
                dos.writeUTF(word);
        }
    }
}
prints
Words are the same is true