Java: writing large files?
Original source: http://stackoverflow.com/questions/2017868/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on Stack Overflow, not to this page.
Asked by Ashika Umanga Umagiliya
Greetings, I fetch a huge number of records from a database and write them to a file. I was wondering what the best way to write huge files (1 GB to 10 GB) is.
Currently I am using BufferedWriter:
BufferedWriter mbrWriter = new BufferedWriter(new FileWriter(memberCSV));
while (!done) {
    // write the next records
}
mbrWriter.close();
Answered by BalusC
If you really insist on using Java for this, then the best way would be to write each record immediately, as soon as the data comes in, rather than collecting all the data from the ResultSet into Java's memory first. Otherwise you would need at least that much free memory in Java.
For example:
while (resultSet.next()) {
    writer.write(resultSet.getString("columnname"));
    // ...
}
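The loop above can be sketched as a small self-contained program. This is only an illustration under stated assumptions: an `Iterator<String[]>` stands in for the JDBC `ResultSet` so the example runs without a database, the class name `StreamingCsvExport` and method `writeRows` are hypothetical, and no CSV quoting or escaping is attempted.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Iterator;
import java.util.List;

// Hypothetical helper: the Iterator stands in for a JDBC ResultSet.
final class StreamingCsvExport {

    // Writes each row as soon as it is pulled from the source, so only one
    // row is held in memory at a time. Returns the number of rows written.
    static long writeRows(Iterator<String[]> rows, Path target) throws IOException {
        long count = 0;
        try (BufferedWriter writer = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            while (rows.hasNext()) {
                writer.write(String.join(",", rows.next())); // no CSV escaping here
                writer.write('\n');
                count++;
            }
        } // try-with-resources flushes and closes the writer, even on error
        return count;
    }

    public static void main(String[] args) throws IOException {
        // Made-up sample data standing in for database rows.
        List<String[]> sample = List.of(
                new String[] {"1", "alice"},
                new String[] {"2", "bob"});
        Path target = Files.createTempFile("members", ".csv");
        long written = writeRows(sample.iterator(), target);
        System.out.println(written + " rows written to " + target);
    }
}
```

With a real database, the `while (rows.hasNext())` loop would become `while (resultSet.next())`, as in the snippet above.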
That said, most decent DBs ship with built-in export-to-CSV capabilities, which are undoubtedly far more efficient than anything you could do in Java. You didn't mention which one you're using, but if it were MySQL, for example, you could have used SELECT ... INTO OUTFILE for this (LOAD DATA INFILE is its import counterpart). Just refer to the DB-specific documentation. Hope this gives new insights.
Answered by Stephen C
The default buffer size for a BufferedWriter is 8192 characters. If you are going to be writing multi-gigabyte files, you might want to increase this using the two-argument constructor, e.g.
int buffSize = ... // 1 megabyte or so
BufferedWriter mbrWriter = new BufferedWriter(new FileWriter(memberCSV), buffSize);
This should reduce the number of syscalls needed to write the file.
But I doubt that this would make more than a couple of percent difference. Pulling rows from the result set will probably be the main performance bottleneck. For significant performance improvements you'd need to use the database's native bulk export facilities.
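As a rough, runnable illustration of the larger buffer, the sketch below writes a number of lines through a BufferedWriter created with the two-argument constructor. The class name `LargeBufferWrite`, the method `writeLines`, the record text, and the buffer size of about one million chars are assumptions made for the example, not values from the answer. Note that the size argument counts chars, not bytes.

```java
import java.io.BufferedWriter;
import java.io.FileWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical example class; names and sizes are illustrative only.
final class LargeBufferWrite {

    // BufferedWriter sizes are given in chars: roughly one million chars here.
    static final int BUFFER_CHARS = 1 << 20;

    // Writes lineCount made-up records through a large buffer, so the
    // underlying FileWriter is called far less often than once per line.
    static void writeLines(Path target, int lineCount) throws IOException {
        try (BufferedWriter writer =
                     new BufferedWriter(new FileWriter(target.toFile()), BUFFER_CHARS)) {
            for (int i = 0; i < lineCount; i++) {
                writer.write("record-" + i);
                writer.newLine();
            }
        } // closing the writer flushes whatever is still buffered
    }

    public static void main(String[] args) throws IOException {
        Path target = Files.createTempFile("members", ".csv");
        writeLines(target, 100_000);
        System.out.println("Wrote " + Files.size(target) + " bytes to " + target);
    }
}
```

Whether the larger buffer helps measurably depends on the workload; as the answer says, fetching rows is likely to dominate.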
Answered by Henry Hammond
I'm not 100% sure, but it appears that BufferedReader loads the data into a buffer in RAM. Java can use 128 MB of RAM (unless otherwise specified), so the BufferedReader will likely overflow Java's memory, causing an error. Try using InputStreamReader and FileInputStream to read the data and store it in a char, then just write that char using a FileOutputStream.
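One possible reading of this suggestion is to copy data through a fixed-size char array, so that memory use stays bounded regardless of file size. The sketch below assumes that reading; the class name `BoundedCopy`, the method `copy`, the 8192-char array, and the UTF-8 encoding are all choices made for the example rather than anything stated in the answer.

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;

// Hypothetical example class illustrating a bounded-memory copy.
final class BoundedCopy {

    // Copies source to target through one reusable char array and returns
    // the number of chars copied. Only the array is held in memory.
    static long copy(Path source, Path target) throws IOException {
        char[] buffer = new char[8192];
        long total = 0;
        try (Reader in = new InputStreamReader(
                     new FileInputStream(source.toFile()), StandardCharsets.UTF_8);
             Writer out = new OutputStreamWriter(
                     new FileOutputStream(target.toFile()), StandardCharsets.UTF_8)) {
            int read;
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read); // write only the chars actually read
                total += read;
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: BoundedCopy <source> <target>");
            return;
        }
        System.out.println(copy(Path.of(args[0]), Path.of(args[1])) + " chars copied");
    }
}
```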

