Notice: this page is taken from a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL, and attribute it to the original authors (not me): StackOverflow. Original: http://stackoverflow.com/questions/31320795/

Time: 2020-11-02 18:26:27  Source: igfitidea

How to check for duplicate records in Excel using POI?

Tags: java, excel, apache-poi

Asked by coder

Below is the code for reading the Excel file using POI, which works fine:

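import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Iterator;

import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.xssf.usermodel.XSSFSheet;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;

public class ReadExcelDemo {
    public static void main(String[] args) {
        // try-with-resources closes both the stream and the workbook
        try (FileInputStream file = new FileInputStream(new File("demo.xlsx"));
             XSSFWorkbook workbook = new XSSFWorkbook(file)) {

            XSSFSheet sheet = workbook.getSheetAt(0);
            // Form is the POJO described below (id, firstname, lastname)
            ArrayList<Form> vipList = new ArrayList<Form>();

            Iterator<Row> rowIterator = sheet.iterator();
            while (rowIterator.hasNext()) {
                Row row = rowIterator.next();

                Iterator<Cell> cellIterator = row.cellIterator();
                while (cellIterator.hasNext()) {
                    Cell cell = cellIterator.next();

                    // CELL_TYPE_* are the POI 3.x constants; newer POI releases
                    // use the CellType enum instead (case NUMERIC: / case STRING:)
                    switch (cell.getCellType()) {
                        case Cell.CELL_TYPE_NUMERIC:
                            System.out.print(cell.getNumericCellValue() + "\t");
                            break;
                        case Cell.CELL_TYPE_STRING:
                            System.out.print(cell.getStringCellValue() + "\t");
                            break;
                    }
                }
            }
        } catch (IOException e) {
            // the original snippet had no catch block; one is needed to compile
            e.printStackTrace();
        }
    }
}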

Now, if the Excel file contains duplicate records, I should be able to print a simple error message. How do I do that?

Example:

ID    Firstname     Lastname     Address
  1     Ron           wills      Paris
  1     Ron           wills      London

I want to check for duplicates on only 3 columns taken together: ID, Firstname and Lastname. If these columns together contain the same data, as in the example above, the record must be considered a duplicate.

I have a POJO class Form consisting of id, firstname and lastname, with getters and setters. Each record read is written to the POJO using the setter methods. I then read the values back with the getters and add them to the ArrayList object, so the list now contains all the records. How do I compare them?

Accepted answer by coder

import java.util.ArrayList;

import org.apache.struts.actions.DispatchAction; // assuming Struts 1

public class ProcessAction extends DispatchAction {

    // mLogger is a logger field declared elsewhere in the original class
    String dupValue = null;
    ArrayList<String> dupList = new ArrayList<String>();

    private String validateDuplicateRecords(ProcessForm process) {
        String errorMessage = null;

        // Build one key from the three columns that define a duplicate
        dupValue = process.getId().trim() + "    " + process.getFirstname().trim() + "    " + process.getLastname().trim();
        mLogger.debug("id, firstname, lastname: " + dupValue);
        if (dupList.contains(dupValue)) {
            mLogger.debug("value not added");
            errorMessage = "Duplicate Record Exists";
        } else {
            dupList.add(dupValue);
        }

        return errorMessage;
    }
}

Don't forget to clear the duplicate ArrayList. In my case, after performing certain tasks such as writing the ArrayList to a file, I clear the duplicate ArrayList using:

dupList.clear();

If you don't do this, then when you upload the same data again, records that are not duplicates within the file will still be reported as duplicates, because the dupList ArrayList still contains the previously uploaded data.
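
The effect of clearing can be sketched in isolation. The class below is hypothetical (DuplicateChecker, isDuplicate and reset are names made up for illustration, not part of the original answer); it keeps the seen keys in a field the way dupList does, and it assumes a HashSet in place of the ArrayList so that lookups stay fast:

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper (not from the original answer): remembers the keys seen so far.
public class DuplicateChecker {
    private final Set<String> seen = new HashSet<>();

    // Returns true if the same id/firstname/lastname was already seen in this upload.
    public boolean isDuplicate(String id, String firstname, String lastname) {
        String key = id.trim() + "\t" + firstname.trim() + "\t" + lastname.trim();
        return !seen.add(key); // add() returns false when the key is already present
    }

    // Call once an upload has been processed, like dupList.clear().
    public void reset() {
        seen.clear();
    }

    public static void main(String[] args) {
        DuplicateChecker checker = new DuplicateChecker();
        System.out.println(checker.isDuplicate("1", "Ron", "wills")); // false
        System.out.println(checker.isDuplicate("1", "Ron", "wills")); // true
        checker.reset(); // start of the next upload
        System.out.println(checker.isDuplicate("1", "Ron", "wills")); // false
    }
}
```

Without the reset() call, the third check would report a duplicate even though the record appears only once in the second upload.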

Answer by Arnfinn Gj?rvad

Put the data in a Set and call contains before adding each new entry. With a HashSet the lookup is quick. For the comparison you can simply treat everything as a String.

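    // Fragment based on the cell loop from the question.
    // Needs java.util.Set, java.util.HashSet and org.apache.poi.ss.usermodel.DataFormatter.
    Set<String> data = new HashSet<String>();
    DataFormatter formatter = new DataFormatter();

    while (cellIterator.hasNext()) {
        Cell cell = cellIterator.next();

        // Compare everything as a String; getStringCellValue() throws on numeric cells
        String value = formatter.formatCellValue(cell);
        if (data.contains(value)) {
            // the original used IllegalDataException, which is not a JDK class
            throw new IllegalStateException("Duplicate value: " + value);
        }
        data.add(value);

        switch (cell.getCellType()) {
            case Cell.CELL_TYPE_NUMERIC:
                System.out.print(cell.getNumericCellValue() + "\t");
                break;
            case Cell.CELL_TYPE_STRING:
                System.out.print(cell.getStringCellValue() + "\t");
                break;
        }
    }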

If you need to compare the whole row, you can create a class with all the fields and override its equals method (and hashCode along with it, since a HashSet relies on both). Then put the objects in a set and compare.
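
A minimal sketch of that idea, assuming a hypothetical RowKey class (the name and constructor are illustrative, not from the answer) that holds only the three columns from the question. A HashSet uses hashCode to locate an entry and equals to confirm it, so both are overridden here over the same fields:

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical key class: only the columns that define a duplicate take part in equals/hashCode.
public class RowKey {
    private final String id;
    private final String firstname;
    private final String lastname;

    public RowKey(String id, String firstname, String lastname) {
        this.id = id;
        this.firstname = firstname;
        this.lastname = lastname;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof RowKey)) return false;
        RowKey other = (RowKey) o;
        return Objects.equals(id, other.id)
                && Objects.equals(firstname, other.firstname)
                && Objects.equals(lastname, other.lastname);
    }

    @Override
    public int hashCode() {
        // Must use the same fields as equals
        return Objects.hash(id, firstname, lastname);
    }

    public static void main(String[] args) {
        Set<RowKey> seen = new HashSet<>();
        // The two example rows differ only in Address, which is not part of the key
        System.out.println(seen.add(new RowKey("1", "Ron", "wills"))); // true
        System.out.println(seen.add(new RowKey("1", "Ron", "wills"))); // false: duplicate
    }
}
```

Set.add returns false when an equal element is already present, so the separate contains check is optional.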