Notice: this page reproduces a popular Stack Overflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow
Original question: http://stackoverflow.com/questions/19435658/
java.lang.OutOfMemoryError: GC overhead limit exceeded excel reader
Asked by Eduardo Dennis
I am getting a java.lang.OutOfMemoryError: GC overhead limit exceeded exception when I try to run the program below. The program's main method accesses a specified directory and iterates over all the files whose names contain .xlsx. This works fine; I tested it before adding any of the other logic. The method it calls, xlsx, which converts the xlsx file into csv and appends it to an existing file, works fine as well. But when I call it inside the for loop, I get this exception. I am guessing there is a conflict: after it has opened the first xlsx and converted it to csv, and it is time to open the second, maybe I have to somehow close this line:
File inputFile = new File("C:\\Users\\edennis.AD\\Desktop\\test\\"+nameOfFile);
That's my only guess right now: that this file is interfering when the second iteration of the loop comes. I am using the Apache POI libraries to manipulate the Excel files. Thanks in advance!
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.xssf.usermodel.XSSFSheet;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;

public class ExcelMan {

    public static void main(String[] args) throws FileNotFoundException {
        int i = 0;
        File dir = new File("C:\\Users\\edennis.AD\\Desktop\\test\\");
        for (File child : dir.listFiles()) {
            // initializing whether the sheet sent to method is first or not, and
            // counting iterations for each time the for loop as run
            boolean firstSheet = true;
            i++;
            String nameOfFile = child.getName();
            if (nameOfFile.contains(".xlsx")) {
                System.out.println(nameOfFile);
                if (i != 0)
                    firstSheet = false;
                File inputFile = new File("C:\\Users\\edennis.AD\\Desktop\\test\\" + nameOfFile);
                // writing excel data to csv
                File outputFile = new File("C:\\Users\\edennis.AD\\Desktop\\test\\memb.csv");
                xlsx(inputFile, outputFile, firstSheet);
            }
            // }
        }
    }

    static void xlsx(File inputFile, File outputFile, boolean firstSheet) {
        // For storing data into CSV files
        StringBuffer data = new StringBuffer();
        try {
            FileOutputStream fos = new FileOutputStream(outputFile, true);
            // Get the workbook object for XLSX file
            XSSFWorkbook wBook = new XSSFWorkbook(new FileInputStream(inputFile));
            // Get first sheet from the workbook
            XSSFSheet sheet = wBook.getSheetAt(7);
            Row row;
            Cell cell;
            // Iterate through each rows from first sheet
            java.util.Iterator<Row> rowIterator = sheet.iterator();
            while (rowIterator.hasNext()) {
                if (firstSheet != true)
                    rowIterator.next();
                row = rowIterator.next();
                // For each row, iterate through each columns
                java.util.Iterator<Cell> cellIterator = row.cellIterator();
                while (cellIterator.hasNext()) {
                    cell = cellIterator.next();
                    switch (cell.getCellType()) {
                        case Cell.CELL_TYPE_BOOLEAN:
                            data.append(cell.getBooleanCellValue() + "^");
                            break;
                        case Cell.CELL_TYPE_NUMERIC:
                            data.append(cell.getNumericCellValue() + "^");
                            break;
                        case Cell.CELL_TYPE_STRING:
                            data.append(cell.getStringCellValue() + "^");
                            break;
                        case Cell.CELL_TYPE_BLANK:
                            data.append("" + "^");
                            break;
                        default:
                            data.append(cell + "^");
                    }
                }
                data.append("\r\n");
            }
            fos.write(data.toString().getBytes());
            fos.close();
        } catch (Exception ioe) {
            ioe.printStackTrace();
        }
    }
}
Additional info:
Below is the stack trace:
MR.xlsx
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at org.apache.xmlbeans.impl.store.Cur$CurLoadContext.attr(Cur.java:3039)
at org.apache.xmlbeans.impl.store.Cur$CurLoadContext.attr(Cur.java:3060)
at org.apache.xmlbeans.impl.store.Locale$SaxHandler.startElement(Locale.java:3250)
at org.apache.xmlbeans.impl.piccolo.xml.Piccolo.reportStartTag(Piccolo.java:1082)
at org.apache.xmlbeans.impl.piccolo.xml.PiccoloLexer.parseAttributesNS(PiccoloLexer.java:1802)
at org.apache.xmlbeans.impl.piccolo.xml.PiccoloLexer.parseOpenTagNS(PiccoloLexer.java:1521)
at org.apache.xmlbeans.impl.piccolo.xml.PiccoloLexer.parseTagNS(PiccoloLexer.java:1362)
at org.apache.xmlbeans.impl.piccolo.xml.PiccoloLexer.parseXMLNS(PiccoloLexer.java:1293)
at org.apache.xmlbeans.impl.piccolo.xml.PiccoloLexer.parseXML(PiccoloLexer.java:1261)
at org.apache.xmlbeans.impl.piccolo.xml.PiccoloLexer.yylex(PiccoloLexer.java:4808)
at org.apache.xmlbeans.impl.piccolo.xml.Piccolo.yylex(Piccolo.java:1290)
at org.apache.xmlbeans.impl.piccolo.xml.Piccolo.yyparse(Piccolo.java:1400)
at org.apache.xmlbeans.impl.piccolo.xml.Piccolo.parse(Piccolo.java:714)
at org.apache.xmlbeans.impl.store.Locale$SaxLoader.load(Locale.java:3439)
at org.apache.xmlbeans.impl.store.Locale.parseToXmlObject(Locale.java:1270)
at org.apache.xmlbeans.impl.store.Locale.parseToXmlObject(Locale.java:1257)
at org.apache.xmlbeans.impl.schema.SchemaTypeLoaderBase.parse(SchemaTypeLoaderBase.java:345)
at org.openxmlformats.schemas.spreadsheetml.x2006.main.WorksheetDocument$Factory.parse(Unknown Source)
at org.apache.poi.xssf.usermodel.XSSFSheet.read(XSSFSheet.java:138)
at org.apache.poi.xssf.usermodel.XSSFSheet.onDocumentRead(XSSFSheet.java:130)
at org.apache.poi.xssf.usermodel.XSSFWorkbook.onDocumentRead(XSSFWorkbook.java:286)
at org.apache.poi.POIXMLDocument.load(POIXMLDocument.java:159)
at org.apache.poi.xssf.usermodel.XSSFWorkbook.<init>(XSSFWorkbook.java:207)
at ExcelMan.xlsx(ExcelMan.java:71)
at ExcelMan.main(ExcelMan.java:47)
The Excel files are pretty big. There are going to be around 30 or so in the directory, and the biggest one is about 170 MB. With these file sizes, should I switch away from POI?
Accepted answer by Ortwin Angermeier
What's the size of your Excel file? I had a similar problem once, creating csv out of xls. In my case I had to switch to the event-driven model; take a look at XSSF and SAX (Event API). I too ran out of memory (with -Xmx8g).
A quote from the linked site:
Further effort on HSSF is going to focus on the following major areas:
- Performance: POI currently uses a lot of memory for large sheets.
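The event-driven idea mentioned in this answer can be illustrated without POI at all. The following is only a minimal sketch using the JDK's built-in SAX parser on a small hand-written XML string; it is not POI's XSSF event API, and the class and method names (StreamingSheetSketch, RowHandler, convert) are hypothetical. The element names row, c and v mirror the SpreadsheetML worksheet layout, but a real worksheet stores most strings as indexes into a shared strings table, which this sketch does not handle.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class StreamingSheetSketch {

    /** Handles one cell value at a time; no tree of the whole sheet is ever built. */
    static class RowHandler extends DefaultHandler {
        // Stands in for a Writer on the CSV file (hypothetical simplification).
        final StringBuilder out = new StringBuilder();
        private final StringBuilder cellText = new StringBuilder();
        private boolean inValue;

        @Override
        public void startElement(String uri, String local, String qName, Attributes atts) {
            if (qName.equals("v")) {        // <v> holds a cell's raw value
                inValue = true;
                cellText.setLength(0);
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // May be called several times for one value, so append rather than assign.
            if (inValue) {
                cellText.append(ch, start, length);
            }
        }

        @Override
        public void endElement(String uri, String local, String qName) {
            if (qName.equals("v")) {
                inValue = false;
                out.append(cellText).append('^');   // same '^' separator as the question
            } else if (qName.equals("row")) {
                out.append("\r\n");                 // end of a spreadsheet row
            }
        }
    }

    /** Parses worksheet-like XML and returns the '^'-separated text. */
    static String convert(String sheetXml) {
        try {
            RowHandler handler = new RowHandler();
            SAXParserFactory.newInstance().newSAXParser().parse(
                    new ByteArrayInputStream(sheetXml.getBytes(StandardCharsets.UTF_8)),
                    handler);
            return handler.out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<worksheet><sheetData>"
                + "<row><c><v>1</v></c><c><v>2.5</v></c></row>"
                + "<row><c><v>3</v></c></row>"
                + "</sheetData></worksheet>";
        System.out.print(convert(xml));
    }
}
```

The point of the design is that memory use depends on the size of one cell rather than the size of the whole sheet, which is why the event model copes with files that the full XSSFWorkbook object model cannot load.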
Answered by Deadron
File objects do not need to be closed. As long as you aren't maintaining references to them, they will be garbage collected once they fall out of scope.
The line if (i != 0) will always evaluate to true, since you increment the variable i at least once before reaching this conditional. Thus firstSheet is always set to false.
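One way to get a flag that really is true only for the first matching file is to declare it outside the loop and clear it after the first match has been handled. This is a sketch of that idea only, not necessarily what the asker intended; the class and method names (FirstSheetFlagSketch, flags) are hypothetical, and plain strings stand in for the directory listing.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class FirstSheetFlagSketch {

    /** Returns, for each name containing ".xlsx", the firstSheet value it would be processed with. */
    static List<Boolean> flags(List<String> names) {
        List<Boolean> result = new ArrayList<>();
        boolean firstSheet = true;              // declared once, outside the loop
        for (String name : names) {
            if (!name.contains(".xlsx")) {
                continue;                       // skip non-Excel files, as in the question
            }
            result.add(firstSheet);             // here the question's code would call xlsx(...)
            firstSheet = false;                 // cleared only after the first match was handled
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(flags(Arrays.asList("notes.txt", "a.xlsx", "b.xlsx")));
    }
}
```

With this shape the counter i is no longer needed, and a non-Excel file appearing first in the directory does not consume the "first" slot.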
The line
File inputFile = new File("C:\\Users\\edennis.AD\\Desktop\\test\\"+nameOfFile);
creates a new File object. However, you already have a File object for this path, represented by child.
You are always writing to the same output file, yet you create a new File object and a new FileOutputStream for it on every iteration over the directory's children.
You are not closing your FileOutputStream in a finally block, so it may not be closed properly under error conditions.
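On Java 7 or later (an assumption about the asker's environment), a try-with-resources statement has the same effect as closing the stream in a finally block, with less code. The sketch below shows that form rather than an explicit finally; the class and method names (CloseOnErrorSketch, append, roundTrip) and the temporary file are hypothetical.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

class CloseOnErrorSketch {

    /** Appends text to the file; the stream is closed even if write() throws. */
    static void append(File target, String text) {
        try (FileOutputStream fos = new FileOutputStream(target, true)) {
            fos.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Appends two rows to a temporary file and returns the file's content. */
    static String roundTrip() {
        try {
            File tmp = File.createTempFile("memb", ".csv");
            tmp.deleteOnExit();
            append(tmp, "a^b^\r\n");
            append(tmp, "c^d^\r\n");
            return new String(Files.readAllBytes(tmp.toPath()), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.print(roundTrip());
    }
}
```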
Use StringBuilder instead of StringBuffer unless you need synchronized methods for building the string.
Consider using a FileWriter instead of an intermediary StringBuilder. Instead of writing to a builder, use
PrintWriter writer = new PrintWriter(new BufferedWriter(new FileWriter(outputFile, true)));
Instead of calling data.append, use writer.print or writer.println.
Note: the PrintWriter and BufferedWriter wrappers aren't strictly necessary, but they are useful.
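A minimal sketch of the writer-based approach follows, assuming the rows are already available as string arrays (in the question they would come from the POI cell iterator instead). The class and method names (StreamingCsvWriterSketch, writeRows) and the temporary file are hypothetical. Each cell goes to the writer as soon as it is seen, so the whole sheet is never held in one in-memory buffer.

```java
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

class StreamingCsvWriterSketch {

    /** Writes rows cell by cell to a temporary file and returns what was written. */
    static String writeRows(String[][] rows) {
        try {
            File tmp = File.createTempFile("rows", ".csv");
            tmp.deleteOnExit();
            // 'true' opens the file in append mode, as in the question's FileOutputStream.
            try (PrintWriter writer =
                         new PrintWriter(new BufferedWriter(new FileWriter(tmp, true)))) {
                for (String[] row : rows) {
                    for (String cell : row) {
                        writer.print(cell);     // replaces data.append(...)
                        writer.print('^');
                    }
                    writer.print("\r\n");       // explicit line ending, same as the question
                }
            }
            return new String(Files.readAllBytes(tmp.toPath()), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.print(writeRows(new String[][] {{"1", "2"}, {"3"}}));
    }
}
```

Note that this only removes the output buffer from memory; loading the workbook itself through XSSFWorkbook still needs the full object model.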
If you refer to the XSSFWorkbook javadocs for the constructor options, you will see that they say: "Using an InputStream requires more memory than using a File, so if a File is available then you should instead do something like 'example follows'" http://poi.apache.org/apidocs/org/apache/poi/xssf/usermodel/XSSFWorkbook.html#XSSFWorkbook(java.io.InputStream)
Increasing your heap size will likely be a workable solution if all else fails, assuming you don't expect files significantly larger than the ones you are currently testing with. See: Increase heap size in Java
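The maximum heap is set with the JVM's -Xmx option, for example java -Xmx4g ExcelMan, where 4g is only an illustrative value. To confirm which limit the running JVM actually received, a small check like the following can help; the class and method names (HeapInfoSketch, maxHeapMiB) are hypothetical.

```java
class HeapInfoSketch {

    /** Returns the maximum heap the JVM will attempt to use, in MiB. */
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap: " + maxHeapMiB() + " MiB");
    }
}
```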