Java - Removing duplicates from an ArrayList

Question

Asked by Will

I'm working on a program that uses an ArrayList to store Strings. The program prompts the user with a menu and allows the user to choose an operation to perform, such as adding Strings to the list or printing the entries. What I want to do is create a method called removeDuplicates(). This method will search the ArrayList and remove any duplicated values, leaving one instance of each duplicated value in the list. I also want this method to return the total number of duplicates removed.

I've been trying to use nested loops to accomplish this, but I've been running into trouble because when entries get deleted, the indexing of the ArrayList gets altered and things don't work as they should. I know conceptually what I need to do, but I'm having trouble implementing this idea in code.

Here is some pseudo code:

start with first entry; check each subsequent entry in the list and see if it matches the first entry; remove each subsequent entry in the list that matches the first entry;

after all entries have been examined, move on to the second entry; check each subsequent entry in the list and see if it matches the second entry; remove each subsequent entry in the list that matches the second entry;

repeat for each entry in the list

Here's the code I have so far:

public int removeDuplicates()
{
  int duplicates = 0;

  for ( int i = 0; i < strings.size(); i++ )
  {
     for ( int j = 0; j < strings.size(); j++ )
     {
        if ( i == j )
        {
          // i & j refer to same entry so do nothing
        }

        else if ( strings.get( j ).equals( strings.get( i ) ) )
        {
           strings.remove( j );
           duplicates++;
        }
     }
 }

   return duplicates;
}

UPDATE: It appears that Will is looking for a homework solution that involves developing the algorithm to remove duplicates, rather than a pragmatic solution using Sets. See his comment:

Thx for the suggestions. This is part of an assignment and I believe the teacher had intended for the solution to not include sets. In other words, I am to come up with a solution that will search for and remove duplicates without using a HashSet. The teacher suggested using nested loops, which is what I'm trying to do, but I've been having some problems with the indexing of the ArrayList after certain entries are removed.

Answer 1

Answered by matt b

Why not use a collection such as Set (and an implementation like HashSet), which naturally prevents duplicates?

Answer 2

Answered by Peter

Just to clarify my comment on matt b's answer, if you really want to count the number of duplicates removed, use this code:

List<String> list = new ArrayList<String>();

// list gets populated from user input...

Set<String> set = new HashSet<String>(list);
int numDuplicates = list.size() - set.size();

Answer 3

Answered by Theo

Using a set is the best option to remove the duplicates:

If you have an ArrayList, you can remove the duplicates and still retain ArrayList features:

 List<String> strings = new ArrayList<String>();
 //populate the array
 ...
 List<String> dedupped = new ArrayList<String>(new HashSet<String>(strings));
 int numdups = strings.size() - dedupped.size();
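
One caveat with the snippet above: a HashSet makes no guarantee about iteration order, so the deduplicated list may come back in a different order than the original. If order matters, a LinkedHashSet keeps the first occurrence of each element in insertion order. A minimal sketch, where the class name OrderedDedup and the method name dedup are made up for illustration:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

class OrderedDedup {
    // Returns a new list with duplicates dropped, keeping the first
    // occurrence of each element in its original position.
    static List<String> dedup(List<String> strings) {
        return new ArrayList<String>(new LinkedHashSet<String>(strings));
    }

    public static void main(String[] args) {
        List<String> strings = Arrays.asList("b", "a", "b", "c", "a");
        List<String> deduped = dedup(strings);
        int numDups = strings.size() - deduped.size();
        // Prints: [b, a, c] (2 duplicates removed)
        System.out.println(deduped + " (" + numDups + " duplicates removed)");
    }
}
```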

If you can't use a set, sort the list (Collections.sort()) and iterate over it, checking whether the current element is equal to the previous element. If it is, remove it.
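
A rough sketch of that sort-then-scan idea, assuming the original order of the list does not need to be preserved. The class name SortedDedup is made up, and walking the list backwards is just one way to avoid index trouble while removing:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class SortedDedup {
    // Sorts the list, then removes every element equal to its predecessor.
    // Returns the number of elements removed. The original order is lost.
    static int removeDuplicates(List<String> list) {
        Collections.sort(list);
        int removed = 0;
        // Walk backwards so that removing index i never shifts an
        // element that has not been examined yet.
        for (int i = list.size() - 1; i > 0; i--) {
            if (list.get(i).equals(list.get(i - 1))) {
                list.remove(i);
                removed++;
            }
        }
        return removed;
    }

    public static void main(String[] args) {
        List<String> list =
            new ArrayList<String>(Arrays.asList("c", "a", "b", "a", "c", "a"));
        int removed = removeDuplicates(list);
        // Prints: [a, b, c] (3 duplicates removed)
        System.out.println(list + " (" + removed + " duplicates removed)");
    }
}
```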

Answer 4

Answered by Thirler

Using a set is the best option (as others suggested).

If you want to compare all elements in a list with each other, you should slightly adapt your for loops:

for(int i = 0; i < max; i++)
    for(int j = i+1; j < max; j++)

This way you compare each pair of elements only once instead of twice, because the second loop starts at the element after the one the first loop is currently on.

Also, when removing from a list while iterating over it (even when you use a for loop instead of an iterator), keep in mind that you reduce the size of the list. A common solution is to keep a second list of the items you want to delete and then, once you have finished deciding which to delete, remove them from the original list.
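
Here is one way the "decide first, delete afterwards" approach could look. This is only a sketch: the class name DeferredRemoval is made up, and recording indexes (rather than the items themselves) is an assumption made so that exactly the later copies get removed:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class DeferredRemoval {
    static int removeDuplicates(List<String> list) {
        // Pass 1: record the index of every entry that repeats an earlier one.
        // Nothing is removed yet, so the indexes stay stable.
        List<Integer> toDelete = new ArrayList<Integer>();
        for (int i = 0; i < list.size(); i++) {
            for (int j = i + 1; j < list.size(); j++) {
                if (list.get(i).equals(list.get(j)) && !toDelete.contains(j)) {
                    toDelete.add(j);
                }
            }
        }
        // Pass 2: delete from the highest index down, so that each removal
        // leaves the remaining recorded indexes valid.
        Collections.sort(toDelete, Collections.reverseOrder());
        for (int index : toDelete) {
            list.remove(index); // int argument, so this is remove(int index)
        }
        return toDelete.size();
    }

    public static void main(String[] args) {
        List<String> list =
            new ArrayList<String>(Arrays.asList("a", "a", "b", "a"));
        int removed = removeDuplicates(list);
        // Prints: [a, b] (2 duplicates removed)
        System.out.println(list + " (" + removed + " duplicates removed)");
    }
}
```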

Answer 5

Answered by OscarRyz

I've been trying to use nested loops to accomplish this but I've been running into trouble because when entries get deleted, the indexing of the ArrayList gets altered and things don't work as they should

Why don't you just decrease the counter each time you delete an entry?

When you delete an entry the elements will move too:

e.g.:

String[] a = {"a", "a", "b", "c"};

positions:

a[0] = "a";
a[1] = "a";    
a[2] = "b";
a[3] = "c";

After you remove your first "a" the indexes are:

a[0] = "a";
a[1] = "b";
a[2] = "c";

So, you should take this into consideration and decrease the value of j (j--) after each removal to avoid "jumping" over a value.
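
Applied to the code in the question, the j-- adjustment might look roughly like this. The class name InPlaceDedup and passing the list as a parameter are assumptions for the sake of a self-contained example; starting the inner loop at i + 1 follows the suggestion from another answer in this thread and also makes the i == j check unnecessary:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class InPlaceDedup {
    static int removeDuplicates(List<String> strings) {
        int duplicates = 0;
        for (int i = 0; i < strings.size(); i++) {
            // Start at i + 1: entries before i have already been handled.
            for (int j = i + 1; j < strings.size(); j++) {
                if (strings.get(j).equals(strings.get(i))) {
                    strings.remove(j);
                    duplicates++;
                    j--; // the next entry has shifted into slot j; re-check it
                }
            }
        }
        return duplicates;
    }

    public static void main(String[] args) {
        List<String> strings =
            new ArrayList<String>(Arrays.asList("a", "a", "b", "c"));
        int removed = removeDuplicates(strings);
        // Prints: [a, b, c] (1 duplicates removed)
        System.out.println(strings + " (" + removed + " duplicates removed)");
    }
}
```

Without the j--, a list such as ["a", "a", "a"] would keep two copies, because the third "a" slides into the slot that was just checked and is then skipped.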

Answer 6

Answered by Will Hartung

public <T> Collection<T> removeDuplicates(Collection<T> c) {
    // Returns a new collection with duplicates removed from the passed collection.
    // The first occurrence of each element is kept, in the original order.
    Collection<T> result = new ArrayList<T>();

    for (T o : c) {
        if (!result.contains(o)) {
            result.add(o);
        }
    }

    return result;
}

or

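public <T extends Comparable<? super T>> void removeDuplicates(List<T> l) {
    // Removes duplicates in place from an existing list.
    // Sorting puts equal elements next to each other, so the original order is lost.
    T last = null;
    Collections.sort(l);

    Iterator<T> i = l.iterator();
    while (i.hasNext()) {
        T o = i.next();
        if (o.equals(last)) {
            i.remove(); // removing through the iterator is safe while iterating
        } else {
            last = o;
        }
    }
}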

Both untested.

Answer 7

Answered by Will Hartung

public ArrayList<String> removeDuplicates(ArrayList<String> inArray)
{
    ArrayList<String> outArray = new ArrayList<String>();
    for (int i = 0; i < inArray.size(); i++)
    {
        String testString = inArray.get(i);
        boolean doAdd = true;
        // Only entries before index i need checking: a match there means
        // testString has already been copied to outArray.
        for (int j = 0; j < i; j++)
        {
            if (inArray.get(j).equals(testString))
            {
                doAdd = false;
                break;
            }
        }
        if (doAdd)
        {
            outArray.add(testString);
        }
    }
    return outArray;
}

Answer 8

Answered by Smalltown2k

You could replace each duplicate with an empty string*, thus keeping the indexing intact. Then, once you've finished, you can strip out the empty strings.

*But only if an empty string isn't valid in your implementation.
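
A possible sketch of the placeholder idea, assuming (as noted above) that the empty string never appears as real data. The class name PlaceholderDedup is made up, and stripping the markers with removeAll(Collections.singleton("")) is just one convenient way to do the final cleanup:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class PlaceholderDedup {
    static int removeDuplicates(List<String> list) {
        int duplicates = 0;
        for (int i = 0; i < list.size(); i++) {
            if (list.get(i).isEmpty()) {
                continue; // already marked as a duplicate of an earlier entry
            }
            for (int j = i + 1; j < list.size(); j++) {
                if (list.get(j).equals(list.get(i))) {
                    list.set(j, ""); // mark only; size and indexes stay unchanged
                    duplicates++;
                }
            }
        }
        // Strip every marker in one go once the scan is complete.
        list.removeAll(Collections.singleton(""));
        return duplicates;
    }

    public static void main(String[] args) {
        List<String> list =
            new ArrayList<String>(Arrays.asList("a", "b", "a", "a", "b"));
        int removed = removeDuplicates(list);
        // Prints: [a, b] (3 duplicates removed)
        System.out.println(list + " (" + removed + " duplicates removed)");
    }
}
```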

Answer 9

Answered by Carl

public <Foo> Entry<Integer,List<Foo>> uniqueElementList(List<Foo> listWithPossibleDuplicates) {
  List<Foo> result = new ArrayList<Foo>();//...might want to pre-size here, if you have reliable info about the number of dupes
  Set<Foo> found = new HashSet<Foo>(); //...again with the pre-sizing
  for (Foo f : listWithPossibleDuplicates) if (found.add(f)) result.add(f);
  return entryFactory(listWithPossibleDuplicates.size()-found.size(), result);
}

and then some entryFactory(Integer key, List<Foo> value) method. If you want to mutate the original list (possibly not a good idea, but whatever) instead:

public <Foo> int removeDuplicates(List<Foo> listWithPossibleDuplicates) {
  int original = listWithPossibleDuplicates.size();
  Iterator<Foo> iter = listWithPossibleDuplicates.iterator();
  Set<Foo> found = new HashSet<Foo>();
  while (iter.hasNext()) if (!found.add(iter.next())) iter.remove();
  return original - found.size();
}

For your particular case using strings, you may need to deal with some additional equality constraints (e.g., are upper and lower case versions the same or different?).
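
If upper and lower case versions should count as the same string, one option (an assumption, not something the question asks for) is to track seen values in a TreeSet ordered by String.CASE_INSENSITIVE_ORDER instead of a HashSet. The class name CaseInsensitiveDedup is made up for this sketch, and it assumes the list contains no null entries:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class CaseInsensitiveDedup {
    static int removeDuplicates(List<String> list) {
        int original = list.size();
        // With this comparator the set treats "One" and "one" as the same element.
        Set<String> found = new TreeSet<String>(String.CASE_INSENSITIVE_ORDER);
        Iterator<String> iter = list.iterator();
        while (iter.hasNext()) {
            if (!found.add(iter.next())) {
                iter.remove(); // seen before (ignoring case), so drop it
            }
        }
        return original - list.size();
    }

    public static void main(String[] args) {
        List<String> list = new ArrayList<String>(
            Arrays.asList("One", "one", "TWO", "two", "three"));
        int removed = removeDuplicates(list);
        // Prints: [One, TWO, three] (2 duplicates removed)
        System.out.println(list + " (" + removed + " duplicates removed)");
    }
}
```

The first spelling encountered is the one that stays in the list.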

EDIT: ah, this is homework. Look up Iterator/Iterable in the Java Collections framework, as well as Set, and see if you don't come to the same conclusion I offered. The generics part is just gravy.

Answer 10

Answered by Jared Russell

Assuming you can't use a Set like you said, the easiest way of solving the problem is to use a temporary list, rather than attempting to remove the duplicates in place:

import java.util.ArrayList;
import java.util.List;

public class Duplicates {

    public static void main(String[] args) {
        List<String> list = new ArrayList<String>();
        list.add("one");
        list.add("one");
        list.add("two");
        list.add("three");
        list.add("three");
        list.add("three");

        System.out.println("Prior to removal: " + list);
        System.out.println("There were " + removeDuplicates(list) + " duplicates.");
        System.out.println("After removal: " + list);
    }

    public static int removeDuplicates(List<String> list) {
        int removed = 0;
        List<String> temp = new ArrayList<String>();

        for(String s : list) {
            if(!temp.contains(s)) {
                temp.add(s);
            } else {
                //if the string is already in the list, then ignore it and increment the removed counter
                removed++;
            }
        }

        //put the contents of temp back in the main list
        list.clear();
        list.addAll(temp);

        return removed;
    }

}
