Java: reading/writing a binary file with strings?

Note: this page reproduces a popular StackOverflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/11584508/

Time: 2020-10-31 05:41:01  Source: igfitidea

Reading/writing a BINARY File with Strings?

Tags: java, string, binary, datainputstream, dataoutputstream

Asked by Bob Let

How can I write/read a string from a binary file?

I've tried using writeUTF/readUTF (DataOutputStream/DataInputStream), but it was too much of a hassle.

Thanks.

Answered by Joop Eggen

Forget about FileWriter, DataOutputStream for a moment.

  • For binary data one uses the OutputStream and InputStream classes. They handle byte[].
  • For text data one uses the Reader and Writer classes. They handle String, which can store all kinds of text, as it internally uses Unicode.

The crossover from text to binary data is done by specifying the encoding; if none is given, the platform's default encoding is used.

  • new OutputStreamWriter(outputStream, encoding)
  • string.getBytes(encoding)
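
The two crossover points listed above can be sketched as follows. This is only an illustration, assuming UTF-8 as the encoding; the class and method names (EncodingCrossover, viaWriter, viaGetBytes) are made up for the example and do not come from the answer.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class: two ways to turn text into bytes with an explicit encoding.
class EncodingCrossover {
    // Text -> bytes through a Writer wrapped around an OutputStream.
    static byte[] viaWriter(String text) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(out, StandardCharsets.UTF_8)) {
            writer.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    // Text -> bytes directly on the String.
    static byte[] viaGetBytes(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String text = "h\u00e9llo"; // "héllo", written with an escape to avoid source-encoding issues
        byte[] a = viaWriter(text);
        byte[] b = viaGetBytes(text);
        System.out.println(Arrays.equals(a, b)); // true: both paths produce the same bytes
        System.out.println(a.length);            // 6: é takes two bytes in UTF-8
        System.out.println(new String(a, StandardCharsets.UTF_8).equals(text)); // true
    }
}
```

Passing the encoding explicitly, as both calls do here, keeps the output independent of the machine's default encoding.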

So if you want to avoid byte[] and use String, you must abuse an encoding which covers all 256 byte values in any order. So no "UTF-8", but maybe "windows-1252" (also named "Cp1252").
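
A rough sketch of what "abusing" a single-byte encoding looks like. Note the substitution: the answer names windows-1252, whereas this example uses ISO-8859-1, because that charset maps every byte value n to the character with code n and so round-trips all 256 values. The names BytesInString, pack, unpack and roundTrips are invented for the illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class: carrying arbitrary bytes inside a String.
class BytesInString {
    // Wrap raw bytes in a String; ISO-8859-1 maps byte n to char n.
    static String pack(byte[] data) {
        return new String(data, StandardCharsets.ISO_8859_1);
    }

    // Recover the raw bytes from a String produced by pack().
    static byte[] unpack(String packed) {
        return packed.getBytes(StandardCharsets.ISO_8859_1);
    }

    // True if decoding and re-encoding with the given charset returns the same bytes.
    static boolean roundTrips(byte[] data, Charset charset) {
        return Arrays.equals(data, new String(data, charset).getBytes(charset));
    }

    public static void main(String[] args) {
        byte[] all = new byte[256];
        for (int i = 0; i < all.length; i++) {
            all[i] = (byte) i;
        }
        System.out.println(roundTrips(all, StandardCharsets.ISO_8859_1)); // true
        System.out.println(roundTrips(all, StandardCharsets.UTF_8));      // false: invalid sequences are replaced
    }
}
```

The same roundTrips helper can be pointed at Charset.forName("windows-1252") to check how that encoding behaves on a particular JVM before relying on it.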

But internally there is a conversion, and in very rare cases problems might happen. For instance, é can be represented in Unicode as one code point, or as two: e + a combining acute accent. A conversion function (java.text.Normalizer) exists for that.
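
The two representations of é, and what java.text.Normalizer does with them, can be seen in a small sketch. The choice of NFC as the target form and the names NormalizeDemo and nfc are assumptions made for the example.

```java
import java.text.Normalizer;

// Hypothetical demo class: composed vs. decomposed forms of the same character.
class NormalizeDemo {
    // Normalize to NFC (canonical composition).
    static String nfc(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFC);
    }

    public static void main(String[] args) {
        String composed = "\u00e9";    // é as a single code point
        String decomposed = "e\u0301"; // e followed by a combining acute accent
        System.out.println(composed.equals(decomposed));           // false: different char sequences
        System.out.println(nfc(composed).equals(nfc(decomposed))); // true after normalization
        System.out.println(decomposed.length() + " -> " + nfc(decomposed).length()); // 2 -> 1
    }
}
```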

One case where this has already led to problems is file names on different operating systems; macOS uses a different Unicode normalisation than Windows, so file names need special attention in version control systems.

So in principle it is better to use the more cumbersome byte arrays, or ByteArrayInputStream, or java.nio buffers. Mind also that String chars are 16 bits wide.

Answered by Peter Lawrey

If you want to write text you can use Writers and Readers.

You can use Data*Stream writeUTF/readUTF, but each string's encoded form has to be smaller than 64 KB (at most 65,535 bytes), because writeUTF stores the length in two bytes.

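import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// The class name is illustrative; the original snippet showed only the methods.
public class WordsFile {
    public static void main(String... args) throws IOException {
        // Generate a million pseudo-random words.
        List<String> words = new ArrayList<>();
        for (int i = 0; i < 1000000; i++)
            words.add(Long.toHexString(System.nanoTime()));

        writeStrings("words", words);
        List<String> words2 = readWords("words");
        System.out.println("Words are the same is " + words.equals(words2));
    }

    // Reads the word count, then that many strings in writeUTF format.
    public static List<String> readWords(String filename) throws IOException {
        // try-with-resources closes the stream even if reading fails.
        try (DataInputStream dis = new DataInputStream(
                new BufferedInputStream(new FileInputStream(filename)))) {
            int count = dis.readInt();
            List<String> words = new ArrayList<>(count);
            while (words.size() < count)
                words.add(dis.readUTF());
            return words;
        }
    }

    // Writes the word count, followed by each word via writeUTF.
    public static void writeStrings(String filename, List<String> words) throws IOException {
        try (DataOutputStream dos = new DataOutputStream(
                new BufferedOutputStream(new FileOutputStream(filename)))) {
            dos.writeInt(words.size());
            for (String word : words)
                dos.writeUTF(word);
        }
    }
}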

prints

Words are the same is true
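
If strings can exceed the writeUTF size limit, one possible workaround is to write the length yourself. The following is a minimal sketch, assuming UTF-8 as the encoding and a 4-byte length prefix; the names LongStrings, writeString, readString and roundTrip are made up for the illustration, and the resulting format is not compatible with readUTF.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: length-prefixed strings without the writeUTF size limit.
class LongStrings {
    // Writes a 4-byte length followed by the UTF-8 bytes of the string.
    static void writeString(DataOutputStream out, String s) throws IOException {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        out.writeInt(bytes.length);
        out.write(bytes);
    }

    // Reads a string written by writeString().
    static String readString(DataInputStream in) throws IOException {
        int length = in.readInt();
        byte[] bytes = new byte[length];
        in.readFully(bytes); // blocks until all bytes are read, or throws EOFException
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Writes the string to an in-memory buffer and reads it back.
    static String roundTrip(String s) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(buffer)) {
                writeString(out, s);
            }
            try (DataInputStream in = new DataInputStream(
                    new ByteArrayInputStream(buffer.toByteArray()))) {
                return readString(in);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 100_000; i++) {
            sb.append('x');
        }
        String big = sb.toString(); // longer than writeUTF would accept
        System.out.println(big.equals(roundTrip(big))); // true
    }
}
```

The same writeString/readString pair works on file streams; the in-memory buffers are used here only to keep the example self-contained.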