Notice: this page is a Chinese-English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/20043181/

Time: 2020-08-12 22:37:13  Source: igfitidea

Read large CSV in java

Tags: java, file-io, opencsv

Asked by Ninad Pingale

I want to read a large CSV file containing around 500,000 rows. I am using the OpenCSV library for it. My code looks like this:

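    // 'strategy' is the column-mapping strategy, configured elsewhere (not shown)
    CsvToBean<User> csvConvertor = new CsvToBean<User>();
    List<User> list = null;
    try {
        // parse() returns every row as a User in one list, so the whole file ends up in memory
        list = csvConvertor.parse(strategy, new BufferedReader(new FileReader(filepath)));
    } catch (FileNotFoundException e) {
        e.printStackTrace();
    }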

Up to about 200,000 records, the data is read into a list of User bean objects without problems. With more data than that, I get:

    java.lang.OutOfMemoryError: Java heap space

I have these memory settings in my "eclipse.ini" file:

    -Xms256m
    -Xmx1024m

One solution I am considering is splitting the huge file into separate files and reading those one after another, but that seems like a long-winded approach.

Is there any other way to avoid the OutOfMemoryError?

Accepted answer by urbiwanus

Read the file line by line, something like this:

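    // try-with-resources closes the reader when the loop finishes or fails
    try (CSVReader reader = new CSVReader(new FileReader("yourfile.csv"))) {
        String[] nextLine;
        // only the current row is held in memory
        while ((nextLine = reader.readNext()) != null) {
            // nextLine[] is an array of values from the line
            System.out.println(nextLine[0] + nextLine[1] + "etc...");
        }
    }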
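
If each row still has to become a User object, the same streaming idea can be applied by building one object per row and handing it to a callback instead of collecting everything into a list. The following is a minimal sketch using only the JDK, not OpenCSV. The `User` fields (`name`, `email`), the `forEachUser` helper, and the naive comma split are assumptions made for illustration, since the real User class is not shown in the question.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.function.Consumer;

class StreamingUsers {

    // Hypothetical bean; the real User class from the question is not shown.
    static final class User {
        final String name;
        final String email;

        User(String name, String email) {
            this.name = name;
            this.email = email;
        }
    }

    // Reads one line at a time and passes each User to the handler,
    // so only the current row is held in memory.
    // Returns the number of rows handed to the handler.
    static long forEachUser(Reader source, Consumer<User> handler) {
        long count = 0;
        try (BufferedReader br = new BufferedReader(source)) {
            String line;
            while ((line = br.readLine()) != null) {
                if (line.isEmpty()) {
                    continue; // skip blank lines
                }
                // Naive split: assumes no quoted fields containing commas.
                String[] cols = line.split(",", -1);
                if (cols.length < 2) {
                    continue; // skip rows with too few columns
                }
                handler.accept(new User(cols[0], cols[1]));
                count++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    public static void main(String[] args) {
        // In real use, pass new FileReader("yourfile.csv") instead of this demo input.
        Reader demo = new StringReader("alice,alice@example.com\nbob,bob@example.com\n");
        long rows = forEachUser(demo, u -> System.out.println(u.name + " <" + u.email + ">"));
        System.out.println("rows: " + rows);
    }
}
```

The handler decides what happens to each row (write it to a database, update a running total, and so on), so memory use stays roughly constant regardless of file size, as long as the handler itself does not keep every object.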

Answer by Saša Šijak

In this case you must set the -Xmx value for your app, not for Eclipse. The settings in eclipse.ini apply only to the JVM that runs the IDE itself; a program launched from Eclipse runs in its own JVM. In "Run Configurations", select your app, go to the "Arguments" tab, and set the value under "VM arguments", for example -Xmx1024m. You can open Run Configurations by right-clicking the file you wish to run, selecting Run As, and then selecting "Run Configurations...".
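
One way to check whether the setting reached the program is to print the heap limit from inside it. The sketch below uses the standard `Runtime.maxMemory()` call; the class name `HeapCheck` and the helper `maxHeapMegabytes` are made up for this example.

```java
class HeapCheck {

    // Maximum amount of memory the JVM will attempt to use, in whole megabytes.
    static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap: " + maxHeapMegabytes() + " MB");
    }
}
```

Run it with the same run configuration as the CSV program. The reported figure may be somewhat lower than the -Xmx value, depending on the garbage collector in use, but it should change when the VM argument changes.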

Answer by Gautam Viradiya

With the example below you can read any number of records from a CSV file, one line at a time.

    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.io.IOException;

    public class ReadCSV
    {
        public static void main(String[] args)
        {
            String csvFile = "C:/Users/LENOVO/Downloads/Compressed/GeoIPCountryWhois.csv";
            String csvSplitBy = ",";

            // try-with-resources closes the reader even if an exception is thrown
            try (BufferedReader br = new BufferedReader(new FileReader(csvFile)))
            {
                String line;
                // only the current line is held in memory
                while ((line = br.readLine()) != null)
                {
                    // use comma as separator (does not handle commas inside quoted fields)
                    String[] country = line.split(csvSplitBy);

                    // skip rows that do not have the expected six columns
                    if (country.length < 6)
                    {
                        continue;
                    }
                    System.out.println("Country [code= " + country[4] + " , name=" + country[5] + "]");
                }
            }
            catch (IOException e)
            {
                // FileNotFoundException is a subclass of IOException
                e.printStackTrace();
            }
            System.out.println("Done");
        }
    }
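
Splitting on a plain comma, as the example above does, breaks when a field is quoted and contains a comma itself (a country name such as "Korea, Republic of", for instance). A rough sketch of a quote-aware split for a single line is shown below. The class `QuotedCsvSplit` and the method `splitLine` are hypothetical names, and the logic is a simplification: it does not handle quoted fields that span several lines, which a library such as OpenCSV does.

```java
import java.util.ArrayList;
import java.util.List;

class QuotedCsvSplit {

    // Splits one CSV line on commas, treating text inside double quotes as a
    // single field. A doubled quote ("") inside a quoted field becomes one quote.
    // Simplification: quoted fields that span multiple lines are not supported.
    static List<String> splitLine(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                        current.append('"'); // escaped quote inside a quoted field
                        i++;
                    } else {
                        inQuotes = false; // closing quote
                    }
                } else {
                    current.append(c);
                }
            } else if (c == '"') {
                inQuotes = true; // opening quote
            } else if (c == ',') {
                fields.add(current.toString()); // field boundary
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString()); // last field on the line
        return fields;
    }

    public static void main(String[] args) {
        for (String field : splitLine("\"1.0.0.0\",\"AU\",\"Korea, Republic of\"")) {
            System.out.println(field);
        }
    }
}
```

Calling `splitLine(line)` in place of `line.split(",")` inside the read loop keeps the line-by-line memory profile while also removing the surrounding quotes from each field.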