Original source: http://stackoverflow.com/questions/1494812/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me):
StackOverFlow
Why can't I retrieve an item from a HashSet without enumeration?
Asked by sooniln
I'm looking for insight into the heads of HashSet designers. As far as I am aware, my question applies to both Java and C# HashSets, making me think there must be some good reason for it, though I can't think of any myself.
After I have inserted an item into a HashSet, why is it impossible to retrieve that item without enumeration, hardly an efficient operation? Especially since a HashSet is explicitly built in a way which supports efficient retrieval.
It would often be useful to me to have Remove(x) and Contains(x) return the actual item that is being removed or contained. This is not necessarily the item I pass into the Remove(x) or Contains(x) function. Sure, I guess I could achieve the same effect through a HashMap but why waste all that space and effort when it should be perfectly possible to do this with a set?
I can appreciate that there may be design concerns that adding this functionality would allow uses of HashSet that are inconsistent with its role or future role in the framework, but if so, what are these design concerns?
Edit
To answer some more questions, here are more details:
I am using an immutable reference type with overridden hashcode, equals, etc. to emulate a value type in C#. Let's say the type has members A, B, and C. Hashcode, equals, etc. depend only on A and B. Given some A and B, I want to be able to retrieve the equivalent item from a hashset and get its C. It appears I won't be able to use HashSet for this, but I would at least like to know if there is any good reason for this. Pseudo code follows:
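using System.Collections.Generic;

public sealed class X
{
    private readonly object A;
    private readonly object B;
    public readonly object Extra; // ignored by GetHashCode and Equals

    public X(object a, object b, object extra = null)
    {
        A = a; B = b; Extra = extra;
    }

    public override int GetHashCode() => A.GetHashCode() + B.GetHashCode();

    // object.Equals compares by value, so boxed values such as 1 and 2 match
    public override bool Equals(object obj) =>
        obj is X other && object.Equals(other.A, A) && object.Equals(other.B, B);
}

var hashset = new HashSet<X>();
hashset.Add(new X(1, 2, extra1));
hashset.Contains(new X(1, 2)); // returns true, but I can't retrieve extra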
Accepted answer by Peter
How were you proposing to retrieve the item from the hash set? A set is by definition not ordered in any way, and therefore there is no index with which to retrieve the object in question.
Sets, as a concept, are used to test inclusion, i.e. whether or not the element in question is in the hash data set. If you're looking to retrieve a value from a data source using a key value or index, I would suggest looking into either a Map or a List.
EDIT: Additional answer based on the Edit to the original question
Soonil, based on your new information, it looks like you might be interested in implementing your data as a Java Enum, something similar to this:
public enum SoonilsDataType {
    A, B, C;

    // Just an example of what's possible
    public static SoonilsDataType getCompositeValue(SoonilsDataType item1,
                                                    SoonilsDataType item2) {
        if (item1 == A && item2 == B) {
            return C;
        }
        return null; // no composite value defined for other combinations
    }
}
Enums automatically get a values() method, which returns an array of all values in the enum's "set", and you can use it to test inclusion in the same way as with the Set. Also, because it's a full class, you can define new static methods to do the composite logic (as I was trying to allude to in the example code). The only catch with the Enum is that you can't add new instances at runtime, which may not be what you want (though if the set's data size isn't going to grow at runtime, the Enum is what you want).
Answer by aperkins
If you change an object after it has been inserted, its hash may have changed (this is especially likely if hashCode() has been overridden). If the hash changes, a lookup of it in the set will fail, as you will be attempting to look up an object that is hashed to a different location than the one it is stored in.
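The failure mode described above can be reproduced in a few lines. This is only an illustrative sketch: the `Point` class and the `MutatedKeyDemo` wrapper are hypothetical names, not part of the original answer.

```java
import java.util.HashSet;
import java.util.Set;

public class MutatedKeyDemo {
    // Hypothetical mutable type whose hash code depends on a mutable field.
    static final class Point {
        int x;
        Point(int x) { this.x = x; }

        @Override public boolean equals(Object o) {
            return o instanceof Point && ((Point) o).x == x;
        }

        @Override public int hashCode() { return Integer.hashCode(x); }
    }

    // Reports whether the set still finds p after p is mutated in place.
    static boolean foundAfterMutation() {
        Set<Point> set = new HashSet<>();
        Point p = new Point(1);
        set.add(p);              // stored under the hash of x == 1
        p.x = 2;                 // hash code changes; the set is not told
        return set.contains(p);  // probes the bucket for x == 2
    }

    public static void main(String[] args) {
        System.out.println(foundAfterMutation()); // prints "false"
    }
}
```

The object is still physically in the set, so enumerating the set would find it; only the hash-based lookup misses it.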
Also, you need to make sure you have overridden hashCode and equals in your object if you want to look up equal objects that are different instances.
Note that this is all for Java. I am assuming C# has something similar, but as it has been several years since I used C#, I will let others speak to its capabilities.
Answer by penpen
Set objects in those languages were mostly designed as sets of values, not for mutable objects. They check that objects put in them are unique by using equals. That is why contains and remove return a boolean, not the object: they check for or remove the value you pass to them.
And actually, if you do a contains(X) on a set and expect to get a different object Y, that would mean X and Y are equal (i.e. X.equals(Y) => true) but somehow different, which seems wrong.
Answer by Adamski
I imagine the designers of the Set interface and HashSet class wanted to ensure that the remove(Object) method defined on the Collection interface was also applicable to Set; this method returns a boolean denoting whether the object was successfully removed. If the designers wanted to provide functionality whereby remove(Object) returned the "equal" object already in the Set, this would mean a different method signature.
Also, given that the object being removed is logically equal to the object passed to remove(Object), the value added by returning the contained object is arguable. However, I have had this problem myself before and have used a Map to solve it.
Note that in Java, a HashSet uses a HashMap internally, and so there isn't additional storage overhead in using a HashMap instead.
Answer by Robert Munteanu
Looks to me like you're actually looking for a Map<X,Y>, where Y is the type of extra1.
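One way to read this suggestion is to split the type into a key that carries only the equality-defining fields and a separate value for the extra data. The sketch below assumes integer A and B fields and a String extra; the `Key` class and `KeyToExtraDemo` wrapper are made-up names for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class KeyToExtraDemo {
    // Hypothetical key holding only the fields that define equality (A and B).
    static final class Key {
        final int a;
        final int b;

        Key(int a, int b) { this.a = a; this.b = b; }

        @Override public boolean equals(Object o) {
            if (!(o instanceof Key)) return false;
            Key k = (Key) o;
            return a == k.a && b == k.b;
        }

        @Override public int hashCode() { return Objects.hash(a, b); }
    }

    // Stores extra data under (1, 2) and looks it up with a fresh, equal key.
    static String lookup() {
        Map<Key, String> extras = new HashMap<>();
        extras.put(new Key(1, 2), "extra1");
        return extras.get(new Key(1, 2));
    }

    public static void main(String[] args) {
        System.out.println(lookup()); // prints "extra1"
    }
}
```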
(rant below)
The equals and hashCode methods define meaningful object equality. The HashSet class assumes that if two objects are equal as defined by Object.equals(Object), there is no difference between these two objects.
I'd go as far as to say that if the object extra is meaningful, your design is not ideal.
Answer by sooniln
I was given an interesting suggestion as to a way to use a Map, by having my own objects define themselves as KeyValuePairs. While a good concept, unfortunately KeyValuePair is not an interface (why not?) and is a struct, which shoots that plan out of the air. In the end I will roll my own Set, as my constraints allow me this option.
Answer by leat
In .Net, what you are probably looking for is KeyedCollection http://msdn.microsoft.com/en-us/library/ms132438.aspx
You can get around the nastiness of re-implementing this abstract class each time with some "generic" cleverness. (See IKeyedObject`1.)
Note: Any data transfer object which implements IKeyedObject`1 should have an overridden GetHashCode method simply returning this.Key.GetHashCode(); and same goes for equals...
My Base Class Library usually ends up with something like this in it:
using System.Collections.Generic;

// A keyed collection in which every item acts as its own key.
public class KeyedCollection<TItem> : System.Collections.ObjectModel.KeyedCollection<TItem, TItem>
    where TItem : class
{
    public KeyedCollection() : base()
    {
    }

    public KeyedCollection(IEqualityComparer<TItem> comparer) : base(comparer)
    {
    }

    protected override TItem GetKeyForItem(TItem item)
    {
        return item;
    }
}

// A keyed collection for items that expose their key through IKeyedObject<TKey>.
public class KeyedObjectCollection<TKey, TItem> : System.Collections.ObjectModel.KeyedCollection<TKey, TItem>
    where TItem : class, IKeyedObject<TKey>
    where TKey : struct
{
    public KeyedObjectCollection() : base()
    {
    }

    protected override TItem GetKeyForItem(TItem item)
    {
        return item.Key;
    }
}

///<summary>
/// I almost always implement this explicitly so the only
/// classes that have access without some rigmarole
/// are generic collections built to be aware that an object
/// is keyed.
///</summary>
public interface IKeyedObject<TKey>
{
    TKey Key { get; }
}
Answer by Richard Petheram
Short answer: because the items cannot be guaranteed to be immutable.
I've hit the exact problem you describe, where the HashCode is based on fixed fields within the member class, but the class holds additional information that can be updated without changing the hash.
My solution was to implement a generic MyHashSet<T> based on ICollection<T> but wrapped round a Dictionary<int, List<T>> to provide the required lookup efficiency, where the int key is the HashCode of T. However, this shows that if the HashCode of the member objects can change then the dictionary lookup followed by equality comparison of items in the list will never find the changed items. There is no mechanism for forcing the members to be immutable so the only solution is to enumerate the lot.
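The answer describes its solution in C# terms (a Dictionary<int, List<T>> keyed by hash code). A rough Java analogue might look like the sketch below. The class name `RetrievableSet` and its two methods are invented for illustration and are not the author's actual code; like the original, it only works while the hash codes of stored items stay unchanged.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class RetrievableSet<T> {
    // Maps a hash code to every stored item that reported that hash code.
    private final Map<Integer, List<T>> buckets = new HashMap<>();

    // Adds item unless an equal one is already stored; returns true if added.
    public boolean add(T item) {
        List<T> bucket = buckets.computeIfAbsent(item.hashCode(), h -> new ArrayList<>());
        for (T existing : bucket) {
            if (existing.equals(item)) return false;
        }
        bucket.add(item);
        return true;
    }

    // Returns the stored instance that equals probe, or null if there is none.
    public T get(T probe) {
        List<T> bucket = buckets.get(probe.hashCode());
        if (bucket == null) return null;
        for (T existing : bucket) {
            if (existing.equals(probe)) return existing;
        }
        return null;
    }

    public static void main(String[] args) {
        RetrievableSet<String> set = new RetrievableSet<>();
        String stored = new String("ab");
        set.add(stored);
        // An equal but distinct probe yields the originally stored instance.
        System.out.println(set.get(new String("ab")) == stored);
    }
}
```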
Answer by user1111929
Why not just use a HashMap<X,X>? This does exactly what you want. Just do .put(x,x) every time, and then you can get the stored element equal to x with .get(x).
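A minimal sketch of the put(x, x) / get(x) idea, assuming a stand-in class `X` modelled on the question (equality on a and b only, with an extra field that is ignored). The field types and the `CanonicalInstanceDemo` wrapper are assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;

public class CanonicalInstanceDemo {
    // Hypothetical stand-in for the question's X: equality uses a and b only.
    static final class X {
        final int a;
        final int b;
        final String extra; // not part of equals or hashCode

        X(int a, int b, String extra) { this.a = a; this.b = b; this.extra = extra; }

        @Override public boolean equals(Object o) {
            if (!(o instanceof X)) return false;
            X other = (X) o;
            return a == other.a && b == other.b;
        }

        @Override public int hashCode() { return 31 * a + b; }
    }

    // Stores one X as both key and value, then fetches it with an equal probe.
    static String extraFor(int a, int b) {
        Map<X, X> items = new HashMap<>();
        X stored = new X(1, 2, "extra1");
        items.put(stored, stored);
        X found = items.get(new X(a, b, null)); // probe carries no extra
        return found == null ? null : found.extra;
    }

    public static void main(String[] args) {
        System.out.println(extraFor(1, 2)); // prints "extra1"
    }
}
```

The map returns the stored instance rather than the probe, which is exactly the retrieval the question asks a set to provide.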
Answer by user1852503
After wondering the same thing, and finally being able to look at the source code:
source: http://referencesource.microsoft.com/#System.Core/System/Collections/Generic/HashSet.cs
A set is a collection of unique items (objects or values). In the .NET implementation, an item is the same as another item (not unique) if the Equals method of the comparer returns true for the two items, not if the two items merely have the same hash code. So checking for the existence of an item is a two-step process: first the hash code is used to minimize the number of items to compare, then the comparison itself is performed.
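The two-step check can be observed with two values that share a hash code but are not equal. The answer discusses .NET; the sketch below uses Java's HashSet, which behaves the same way in this respect, and relies on the documented String.hashCode formula, under which "Aa" and "BB" both hash to 2112. The `CollisionDemo` class name is made up for the example.

```java
import java.util.HashSet;
import java.util.Set;

public class CollisionDemo {
    // "Aa" and "BB" collide under the documented String.hashCode formula.
    static boolean sameHash() {
        return "Aa".hashCode() == "BB".hashCode();
    }

    // A matching hash alone is not enough: equals decides membership.
    static boolean containsOther() {
        Set<String> set = new HashSet<>();
        set.add("Aa");
        return set.contains("BB");
    }

    public static void main(String[] args) {
        System.out.println(sameHash());      // prints "true"
        System.out.println(containsOther()); // prints "false"
    }
}
```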
If you wish to retrieve an item, you must be able to supply the retrieving function with a unique identifier. You might know the hash code of the item you want, but that is not enough, as more than one item can have that same hash. You will also need to supply the item itself so that the Equals method can be called. And clearly, if you have the item, there is no reason to get it.
One could create a data structure that demands that no two unique items ever return the same hash code, and then you could get an item from it. It will be faster at adding*, and retrieving will be possible if you know the hash. If two items that are not equal but return the same hash are put into it, the first will be overwritten. As far as I know this type doesn't exist in .NET, and no, this is not the same as a dictionary.
*given that the GetHash method is the same.