Java: How to remove duplicates from a list?
Notice: this page is a Chinese-English side-by-side translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/2849450/
How to remove duplicates from a list?
Asked by Mercer
I want to remove duplicates from a list but what I am doing is not working:
List<Customer> listCustomer = new ArrayList<Customer>();
for (Customer customer : tmpListCustomer)
{
    if (!listCustomer.contains(customer))
    {
        listCustomer.add(customer);
    }
}
Accepted answer by Stephen C
If that code doesn't work, you probably have not implemented equals(Object) on the Customer class appropriately.
Presumably there is some key (let us call it customerId) that uniquely identifies a customer; e.g.
class Customer {
    private String customerId;
    ...
An appropriate definition of equals(Object) would look like this:
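@Override
public boolean equals(Object obj) {
    // Same reference: trivially equal.
    if (obj == this) {
        return true;
    }
    // Also rejects null, since null is never an instance of Customer.
    if (!(obj instanceof Customer)) {
        return false;
    }
    Customer other = (Customer) obj;
    return this.customerId.equals(other.customerId);
}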
For completeness, you should also implement hashCode so that two Customer objects that are equal will return the same hash value. A matching hashCode for the above definition of equals would be:
@Override
public int hashCode() {
    return customerId.hashCode();
}
It is also worth noting that this is not an efficient way to remove duplicates if the list is large. (For a list with N customers, you will need to perform N*(N-1)/2 comparisons in the worst case; i.e. when there are no duplicates.) For a more efficient solution you should use something like a HashSet to do the duplicate checking.
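As a rough sketch of the HashSet-based duplicate check suggested here: the helper below keeps the first occurrence of each element and preserves the list order. The names DedupeSketch and dedupe are made up for illustration, and the approach assumes the element type implements equals and hashCode consistently.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of the original answer.
class DedupeSketch {
    // Returns a new list holding the first occurrence of each element, in the original order.
    static <T> List<T> dedupe(List<T> input) {
        Set<T> seen = new HashSet<>();
        List<T> result = new ArrayList<>();
        for (T item : input) {
            // Set.add returns false when an equal element is already present.
            if (seen.add(item)) {
                result.add(item);
            }
        }
        return result;
    }
}
```

Each membership check against the HashSet is expected to take roughly constant time, so the whole pass should be close to linear in the list size instead of quadratic.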
Answer by Péter Török
I suspect you might not have Customer.equals() implemented properly (or at all).
List.contains() uses equals() to verify whether any of its elements is equal to the object passed as parameter. However, the default implementation of equals tests for physical identity, not value identity. So if you have not overridden it in Customer, it will return false for two distinct Customer objects having identical state.
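To make the identity-versus-value point concrete, here is a small sketch. PlainCustomer is a hypothetical stand-in for a Customer class that does not override equals; the real class was not shown in the question.

```java
import java.util.ArrayList;
import java.util.List;

class IdentityDemo {
    // Hypothetical class that does NOT override equals, so Object.equals (reference comparison) applies.
    static class PlainCustomer {
        final String customerId;
        PlainCustomer(String customerId) { this.customerId = customerId; }
    }

    // Asks a list holding one customer whether it contains a second, distinct object with the same id.
    static boolean containsEqualState() {
        List<PlainCustomer> list = new ArrayList<>();
        list.add(new PlainCustomer("42"));
        return list.contains(new PlainCustomer("42"));
    }

    // Asks the list whether it contains the very same instance that was added.
    static boolean containsSameInstance() {
        PlainCustomer c = new PlainCustomer("42");
        List<PlainCustomer> list = new ArrayList<>();
        list.add(c);
        return list.contains(c);
    }
}
```

With the inherited equals, containsEqualState() returns false while containsSameInstance() returns true, which matches the behaviour described in the question.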
Here are the nitty-gritty details of how to implement equals (and hashCode, which is its pair: you must practically always implement both if you need to implement either of them). Since you haven't shown us the Customer class, it is difficult to give more concrete advice.
As others have noted, you are better off using a Set rather than doing the job by hand, but even for that, you still need to implement those methods.
Answer by folone
List → Set → List (distinct)
Just add all your elements to a Set: it does not allow its elements to be repeated. If you need a list afterwards, use the new ArrayList(theSet) constructor (where theSet is your resulting set).
Answer by folone
The correct answer for Java is to use a Set. If you already have a List<Customer> and want to deduplicate it:
Set<Customer> s = new HashSet<Customer>(listCustomer);
Otherwise just use a Set implementation (HashSet, TreeSet) directly and skip the List construction phase.
You will need to override hashCode() and equals() on your domain classes that are put in the Set as well, to make sure that the behavior you want is actually what you get. equals() can be as simple as comparing unique ids of the objects or as complex as comparing every field. hashCode() can be as simple as returning the hashCode() of the unique id's String representation or the hashCode().
Answer by Uri
As others have mentioned, you are probably not implementing equals() correctly.
However, you should also note that this code is considered quite inefficient, since the runtime could be proportional to the number of elements squared.
You might want to consider using a Set structure instead of a List, or building a Set first and then turning it into a list.
Answer by mikera
Two suggestions:
Use a HashSet instead of an ArrayList. This will speed up the contains() checks considerably if you have a long list.
Make sure Customer.equals() and Customer.hashCode() are implemented properly, i.e. they should be based on the combined values of the underlying fields in the customer object.
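One possible way to base both methods on the combined field values is sketched below. The class name FieldCustomer and its fields name and email are assumptions chosen for illustration, since the real Customer class was never shown.

```java
import java.util.Objects;

// Hypothetical class; the fields name and email are assumed for illustration only.
final class FieldCustomer {
    private final String name;
    private final String email;

    FieldCustomer(String name, String email) {
        this.name = name;
        this.email = email;
    }

    @Override
    public boolean equals(Object obj) {
        if (obj == this) {
            return true;
        }
        if (!(obj instanceof FieldCustomer)) {
            return false;
        }
        FieldCustomer other = (FieldCustomer) obj;
        // Objects.equals is null-safe, so null fields do not cause a NullPointerException.
        return Objects.equals(name, other.name) && Objects.equals(email, other.email);
    }

    @Override
    public int hashCode() {
        // Combines the same fields that equals compares, keeping the two methods consistent.
        return Objects.hash(name, email);
    }
}
```

The important design point is that hashCode uses exactly the fields that equals compares; if equals were based on a single unique id instead, hashCode should use only that id.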
Answer by DJClayworth
The "contains" method searches for whether the list contains an entry that returns true from Customer.equals(Object o). If you have not overridden equals(Object) in Customer or one of its parents then it will only search for an existing occurrence of the same object. It may be that this was what you wanted, in which case your code should work. But if you want to avoid having two objects that both represent the same customer, then you need to override equals(Object) to return true when that is the case.
It is also true that using one of the implementations of Set instead of List would give you duplicate removal automatically, and faster (for anything other than very small Lists). You will still need to provide code for equals.
You should also override hashCode() when you override equals().
Answer by Scott Fines
Does Customer implement the equals() contract?
If it doesn't implement equals() and hashCode(), then listCustomer.contains(customer) will check whether the exact same instance already exists in the list (by instance I mean the exact same object: memory address, etc.). If what you are looking for is to test whether the same Customer (perhaps it's the same customer if they have the same customer name or customer number) is in the list already, then you would need to override equals() to ensure that it checks whether the relevant fields (e.g. customer names) match.
Note: Don't forget to override hashCode() if you are going to override equals()! Otherwise, you might get into trouble with your HashMaps and other data structures. For good coverage of why this is and what pitfalls to avoid, consider having a look at Josh Bloch's Effective Java chapters on equals() and hashCode() (the link only contains information about why you must implement hashCode() when you implement equals(), but there is good coverage of how to override equals() too).
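The kind of trouble meant here can be sketched as follows. EqualsOnly is a hypothetical class that overrides equals but deliberately leaves hashCode inherited from Object, so two equal objects will normally report different hash codes.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class HashCodePitfall {
    // Hypothetical class that overrides equals but (deliberately) not hashCode.
    static class EqualsOnly {
        final String customerId;
        EqualsOnly(String customerId) { this.customerId = customerId; }

        @Override
        public boolean equals(Object obj) {
            return obj instanceof EqualsOnly
                    && customerId.equals(((EqualsOnly) obj).customerId);
        }
        // hashCode is inherited from Object, so equal objects usually hash differently.
    }

    // Number of elements a HashSet keeps when given the two objects.
    static int sizeInHashSet(EqualsOnly a, EqualsOnly b) {
        Set<EqualsOnly> set = new HashSet<>(Arrays.asList(a, b));
        return set.size();
    }
}
```

For two objects with the same customerId, a.equals(b) is true, yet whenever their inherited hash codes differ (which is the usual case) the HashSet keeps both, so the duplicates are not removed.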
By the way, is there an ordering restriction on your set? If there isn't, a slightly easier way to solve this problem is to use a Set<Customer> like so:
Set<Customer> noDups = new HashSet<Customer>();
noDups.addAll(tmpListCustomer);
return new ArrayList<Customer>(noDups);
This will nicely remove duplicates for you, since Sets don't allow duplicates. However, it will lose any ordering that was applied to tmpListCustomer, since HashSet has no explicit ordering (you can get around that by using a TreeSet, but that's not exactly related to your question). This can simplify your code a little bit.
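A brief sketch of the TreeSet variant, using String elements so that it is self-contained (the names TreeSetSketch and sortedDistinct are made up for illustration). Note that a TreeSet yields sorted order rather than the original insertion order, and it decides what counts as a duplicate through compareTo (or a supplied Comparator), not through equals and hashCode.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

class TreeSetSketch {
    // Returns the distinct elements of input in sorted (not insertion) order.
    static List<String> sortedDistinct(List<String> input) {
        // The TreeSet constructor sorts by natural ordering and drops elements that compare as equal.
        TreeSet<String> set = new TreeSet<>(input);
        return new ArrayList<>(set);
    }
}
```

For a Customer class this would require either implementing Comparable or passing a Comparator to the TreeSet constructor.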
Answer by Tom Hawtin - tackline
Assuming you want to keep the current order and don't want a Set, perhaps the easiest is:
List<Customer> dedupeCustomers =
    new ArrayList<>(new LinkedHashSet<>(customers));
If you want to change the original list:
Set<Customer> dedupeCustomers = new LinkedHashSet<>(customers);
customers.clear();
customers.addAll(dedupeCustomers);
Answer by Eduardo
The cleanest way is:
List<XXX> lstConsultada = dao.findByPropertyList(YYY);
List<XXX> lstFinal = new ArrayList<XXX>(new LinkedHashSet<XXX>(lstConsultada));
and override hashCode and equals based on the Id properties of each entity.