Remove duplicate strings in a List in Java
Original source: http://stackoverflow.com/questions/14040331/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Asked by user121196
Update: I guess HashSet.add(Object obj) does not call contains. Is there a way to implement what I want (remove duplicate strings, ignoring case, using a Set)?
Original question: I am trying to remove duplicates from a list of Strings in Java. However, in the following code CaseInsensitiveSet.contains(Object ob) is not getting called. Why?
    public static List<String> removeDupList(List<String> list, boolean ignoreCase) {
        Set<String> set = (ignoreCase ? new CaseInsensitiveSet() : new LinkedHashSet<String>());
        set.addAll(list);
        List<String> res = new Vector<String>(set);
        return res;
    }

    public class CaseInsensitiveSet extends LinkedHashSet<String> {
        @Override
        public boolean contains(Object obj) {
            // this is not getting called
            if (obj instanceof String) {
                return super.contains(((String) obj).toLowerCase());
            }
            return super.contains(obj);
        }
    }
Answered by Evgeniy Dorofeev
Try:
    // Typed collections instead of raw types; behavior is unchanged
    Set<String> set = new TreeSet<String>(String.CASE_INSENSITIVE_ORDER);
    set.addAll(list);
    return new ArrayList<String>(set);
UPDATE: As Tom Anderson mentioned, this does not preserve the initial order. If that is really an issue, try:
    Set<String> set = new TreeSet<String>(String.CASE_INSENSITIVE_ORDER);
    Iterator<String> i = list.iterator();
    while (i.hasNext()) {
        String s = i.next();
        if (set.contains(s)) {
            // already seen (ignoring case): drop it from the list
            i.remove();
        } else {
            set.add(s);
        }
    }
prints
[2, 1]
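For comparison, here is a self-contained sketch that puts the two snippets above side by side. The class name DedupDemo, the method names, and the sample input are made up for illustration and are not part of the answer; the second method is a slightly condensed form that relies on the return value of add instead of calling contains first.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical demo class, not from the original answer.
class DedupDemo {
    // First snippet: dedupes ignoring case, but returns the survivors in sorted order.
    static List<String> dedupSorted(List<String> list) {
        Set<String> set = new TreeSet<String>(String.CASE_INSENSITIVE_ORDER);
        set.addAll(list);
        return new ArrayList<String>(set);
    }

    // Second snippet: removes later duplicates from the list in place,
    // so the first occurrence of each string keeps its original position.
    static void dedupInPlace(List<String> list) {
        Set<String> seen = new TreeSet<String>(String.CASE_INSENSITIVE_ORDER);
        Iterator<String> i = list.iterator();
        while (i.hasNext()) {
            // add returns false when an equal (ignoring case) string is already present
            if (!seen.add(i.next())) {
                i.remove();
            }
        }
    }

    public static void main(String[] args) {
        List<String> input = new ArrayList<String>(Arrays.asList("b", "A", "a", "B"));
        System.out.println(dedupSorted(input)); // [A, b]
        dedupInPlace(input);
        System.out.println(input);              // [b, A]
    }
}
```

Note that the in-place version needs a list whose iterator supports remove, which is why the sample input is wrapped in an ArrayList.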
Answered by Peter Lawrey
contains is not called because LinkedHashSet is not implemented that way.
If you want add() to call contains(), you will need to override add() as well.
It is not implemented this way because calling contains first would mean performing two lookups instead of one, which would be slower.
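A minimal sketch of what overriding add() could look like, assuming insertion order should be kept. The class name CaseInsensitiveLinkedSet and the auxiliary keys field are assumptions made for this example; remove() and clear() are deliberately left out, so this is not a complete Set implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical class: a LinkedHashSet whose add() consults a case-insensitive contains().
class CaseInsensitiveLinkedSet extends LinkedHashSet<String> {
    // Lower-cased copies of the elements, used only for membership checks.
    // Not kept in sync by remove() or clear(), which this sketch does not override.
    private final Set<String> keys = new HashSet<String>();

    private static String key(String s) {
        return s == null ? null : s.toLowerCase(Locale.ROOT);
    }

    @Override
    public boolean add(String s) {
        if (contains(s)) {
            return false; // an equal string (ignoring case) is already present
        }
        keys.add(key(s));
        return super.add(s); // stores the string with its original casing
    }

    @Override
    public boolean contains(Object o) {
        if (o == null || o instanceof String) {
            return keys.contains(key((String) o));
        }
        return false;
    }

    public static void main(String[] args) {
        Set<String> set = new CaseInsensitiveLinkedSet();
        // addAll is inherited from AbstractCollection and calls add() per element
        set.addAll(Arrays.asList("Foo", "foo", "Bar", "FOO"));
        System.out.println(new ArrayList<String>(set)); // [Foo, Bar]
    }
}
```

This pays the extra lookup described above: each add() performs a check in keys before the insertion into the underlying set.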
Answered by Rahul
The add() method of LinkedHashSet does not call contains() internally; otherwise your method would have been called as well.
Instead of a LinkedHashSet, why don't you use a SortedSet with a case-insensitive comparator? With the String.CASE_INSENSITIVE_ORDER comparator, your code is reduced to:
    public static List<String> removeDupList(List<String> list, boolean ignoreCase) {
        Set<String> set = (ignoreCase ? new TreeSet<String>(String.CASE_INSENSITIVE_ORDER) : new LinkedHashSet<String>());
        set.addAll(list);
        List<String> res = new ArrayList<String>(set);
        return res;
    }
If you wish to preserve the order, as @tom anderson specified in his comment, you can keep an auxiliary ordered collection alongside the TreeSet (a LinkedHashSet, or an ArrayList as in the code below).
Try adding each element to the TreeSet; if add returns true, also add it to the ordered collection, otherwise skip it.
    public static List<String> removeDupList(List<String> list) {
        Set<String> sortedSet = new TreeSet<String>(String.CASE_INSENSITIVE_ORDER);
        List<String> orderedList = new ArrayList<String>();
        for (String str : list) {
            // add returns true if the element was not already present, false otherwise
            if (sortedSet.add(str)) {
                orderedList.add(str);
            }
        }
        return orderedList;
    }
Answered by Yogesh Patil
Try:
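    @Override
    public boolean addAll(Collection<? extends String> c) {
        boolean changed = false;
        for (String s : c) {
            if (!this.contains(s)) {
                changed |= this.add(s);
            }
        }
        // Do not call super.addAll(c) here: it would add every element again
        // without going through contains.
        return changed;
    }

    @Override
    public boolean contains(Object o) {
        // Do your checking here
        return super.contains(o);
    }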
This makes sure the contains method is called, if you want the code to go through it.
Answered by Tom Anderson
Here's another approach, using a HashSet of the strings for deduplication, but building the result list directly:
    public static List<String> removeDupList(List<String> list, boolean ignoreCase) {
        HashSet<String> seen = new HashSet<String>();
        ArrayList<String> deduplicatedList = new ArrayList<String>();
        for (String string : list) {
            // add returns true only the first time a given key is seen
            if (seen.add(ignoreCase ? string.toLowerCase() : string)) {
                deduplicatedList.add(string);
            }
        }
        return deduplicatedList;
    }
This is fairly simple, makes only one pass over the elements, and does only a lowercase conversion, a hash lookup, and a list append for each element.
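One detail that may matter with this approach: toLowerCase() without an argument uses the JVM's default locale, and under a Turkish locale an uppercase "I" lowercases to a dotless "ı", so "TITLE" and "title" could end up with different keys. The sketch below is a variant of the same idea that passes Locale.ROOT explicitly and skips null elements when lowercasing. The class name LocaleSafeDedup and the sample data are assumptions for illustration, not part of the answer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical demo class, not from the original answer.
class LocaleSafeDedup {
    static List<String> removeDupList(List<String> list, boolean ignoreCase) {
        Set<String> seen = new HashSet<String>();
        List<String> result = new ArrayList<String>();
        for (String s : list) {
            // Locale.ROOT keeps the key independent of the default locale
            String key = (ignoreCase && s != null) ? s.toLowerCase(Locale.ROOT) : s;
            if (seen.add(key)) {
                result.add(s); // keep the first occurrence with its original casing
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("Title", "TITLE", "title", "Other");
        System.out.println(removeDupList(input, true));  // [Title, Other]
        System.out.println(removeDupList(input, false)); // [Title, TITLE, title, Other]
    }
}
```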