.NET: Adding dictionary items - check whether the key exists, or allow the exception?

Question

Asked by ScottE

I'm adding items to a StringDictionary and it's possible that a duplicate key will come up. This will of course throw an exception.

If the chance of duplicates is very low (ie it will rarely happen), am I better off using a Try Catch block and leaving it unhandled, or should I always do a .ContainsKey check before adding each entry?

I'm assuming that if the likelihood of duplicate keys was high, then allowing exceptions would be a poor decision as they are expensive.

Thoughts?

Edit

I used Reflector on the generic Dictionary and found the following for ContainsKey and TryGetValue, as both were mentioned below.

public bool TryGetValue(TKey key, out TValue value)
{
    int index = this.FindEntry(key);
    if (index >= 0)
    {
        value = this.entries[index].value;
        return true;
    }
    value = default(TValue);
    return false;
}

And

public bool ContainsKey(TKey key)
{
    return (this.FindEntry(key) >= 0);
}

Am I missing something, or is TryGetValue doing more work than ContainsKey?

I appreciate the responses, and for my current purpose I'm going to go with doing a ContainsKey call as the collection will be small, and the code more readable.

Answer 1

Answered by Fredrik Mörk

How to approach this depends on what you want to do if a collision happens. If you want to keep the first inserted value, you should use ContainsKey to check before inserting. If, on the other hand, you want to use the last value for that key, you can do it like so:

// c# sample:
myDictionary[key] = value;
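
A minimal sketch of the two collision policies, assuming a Dictionary<string, string>; the variable names and sample data here are illustrative and not from the original answer:

```csharp
using System;
using System.Collections.Generic;

var keepFirst = new Dictionary<string, string>();
var keepLast = new Dictionary<string, string>();

// Hypothetical sample data containing one duplicate key ("a").
var items = new[] { ("a", "first"), ("b", "only"), ("a", "second") };

foreach (var (key, value) in items)
{
    // Keep the first value: test with ContainsKey, then Add.
    if (!keepFirst.ContainsKey(key))
        keepFirst.Add(key, value);

    // Keep the last value: the indexer adds a new key or overwrites
    // an existing one, so it does not throw on a duplicate.
    keepLast[key] = value;
}

Console.WriteLine(keepFirst["a"]); // first
Console.WriteLine(keepLast["a"]);  // second
```

Both dictionaries end up with two entries; they differ only in which value survives for the duplicated key.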

As a side note: I would probably, if possible, use Dictionary<string, string> instead of StringDictionary. If nothing else, that will give you access to some more LINQ extension methods.

Answer 2

Answered by Kelsey

I would do the Contains check.

My reasoning is that exceptions should be saved for those things that just shouldn't happen. If they do, then alarm bells should be rung and the cavalry brought in. It just seems odd to me to use exceptions for handling a known case, especially when you can test for it.

Answer 3

Answered by Pavel Minaev

If at all possible, replace StringDictionary with Dictionary<string, string>, and use TryGetValue. This avoids both exception handling overhead and double lookup.
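
As a rough sketch of what that could look like (the keys, values, and variable names are assumptions for illustration): on the duplicate path, a single TryGetValue call reports both that the key exists and what it maps to, so no second lookup is needed to read the stored value. On the miss path, Add still performs its own lookup.

```csharp
using System;
using System.Collections.Generic;

var dict = new Dictionary<string, string> { ["a"] = "first" };
var log = new List<string>();

// Hypothetical incoming pairs; "a" collides with the existing entry.
foreach (var (key, value) in new[] { ("a", "second"), ("b", "other") })
{
    if (dict.TryGetValue(key, out var existing))
    {
        // Key already present: 'existing' holds the stored value
        // without a separate indexer read.
        log.Add($"{key} already maps to {existing}");
    }
    else
    {
        dict.Add(key, value);
        log.Add($"{key} added");
    }
}

foreach (var line in log)
    Console.WriteLine(line);
```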

Answer 4

Answered by nawfal

I did some benchmarking regarding this. But I have to reiterate Kelsey's point:

exceptions should be saved for those things that just shouldn't happen. If they do, then alarm bells should be rung and the cavalry brought in. It just seems odd to me to use exceptions for handling a known case, especially when you can test for it.

It makes sense because the performance you gain by going for try-catch (if any) is trivial, while the "catch" itself can be far more penalizing. Here's the test:

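// Requires: using System; using System.Collections.Generic; using System.Diagnostics;
//           using System.IO; using System.Windows.Forms;
public static void Benchmark(Action method, int iterations = 10000)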
{
    Stopwatch sw = new Stopwatch();
    sw.Start();
    for (int i = 0; i < iterations; i++)
        method();

    sw.Stop();
    MessageBox.Show(sw.Elapsed.TotalMilliseconds.ToString());
}

public static string GetRandomAlphaNumeric()
{
    return Path.GetRandomFileName().Replace(".", "").Substring(0, 8);
}

var dict = new Dictionary<string, int>();

No duplicates:

Benchmark(() =>
{
    // approach 1
    var key = GetRandomAlphaNumeric();
    if (!dict.ContainsKey(key))
        dict.Add(key, 0);

    // approach 2
    try
    {
        dict.Add(GetRandomAlphaNumeric(), 0);
    }
    catch (ArgumentException)
    {

    }
}, 100000);

50% duplicates:

for (int i = 0; i < 50000; i++)
{
    dict.Add(GetRandomAlphaNumeric(), 0);  
}

var lst = new List<string>();
for (int i = 0; i < 50000; i++)
{
    lst.Add(GetRandomAlphaNumeric());
}
lst.AddRange(dict.Keys);
Benchmark(() =>
{
    foreach (var key in lst)
    {
        // approach 1
        if (!dict.ContainsKey(key))
            dict.Add(key, 0);

        // approach 2
        try
        {
            dict.Add(key, 0);
        }
        catch (ArgumentException)
        {

        }
    }
}, 1);

100% duplicates:

var key = GetRandomAlphaNumeric();
dict.Add(key, 0);
Benchmark(() =>
{
    // approach 1
    if (!dict.ContainsKey(key))
        dict.Add(key, 0);

    // approach 2
    try
    {
        dict.Add(key, 0);
    }
    catch (ArgumentException)
    {

    }
}, 100000);

Results:

No duplicates
    approach 1: debug -> 630 ms - 680 ms; release -> 620 ms - 640 ms
    approach 2: debug -> 640 ms - 690 ms; release -> 640 ms - 670 ms
50% duplicates
    approach 1: debug -> 26 ms - 39 ms; release -> 25 ms - 33 ms
    approach 2: debug -> 1340 ms; release -> 1260 ms
100% duplicates
    approach 1: debug -> 7 ms; release -> 7 ms
    approach 2: debug -> 2600 ms; release -> 2400 ms

You can see that as duplicates increase, try-catch performs poorly. Even in the case least favorable to the ContainsKey check, where there are no duplicates at all, the performance gain of try-catch isn't anything substantial.

Answer 5

Answered by Eric J.

Unless this is a very large dictionary or in a critical inner loop of code, you will probably not see a difference.

The .ContainsKey check will cost you a little performance every time, while a thrown exception will cost you a bit more performance, but only rarely. If the chances of a duplicate are indeed low, go with allowing the exception.
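
A small sketch of the allow-the-exception approach, assuming a Dictionary<string, string> and made-up sample keys. Dictionary<TKey, TValue>.Add throws ArgumentException when the key is already present, so that is the exception type caught here:

```csharp
using System;
using System.Collections.Generic;

var dict = new Dictionary<string, string>();
int duplicates = 0;

// Hypothetical input in which duplicates are expected to be rare.
foreach (var key in new[] { "a", "b", "a" })
{
    try
    {
        dict.Add(key, "value");
    }
    catch (ArgumentException)
    {
        // Thrown by Add for a key that already exists; the
        // dictionary is left unchanged and we only count the hit.
        duplicates++;
    }
}

Console.WriteLine($"{dict.Count} added, {duplicates} duplicate(s) rejected");
// prints "2 added, 1 duplicate(s) rejected"
```

Catching only ArgumentException keeps unrelated failures visible; note that Add also throws ArgumentNullException (a subclass of ArgumentException) for a null key, which this catch block would swallow as well.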

If you actually do want to be able to manage duplicate keys, you might look at MultiDictionary in PowerCollections.

Answer 6

Answered by cemdev

You could extend the Dictionary with an AddOrUpdate() method:

http://williablog.net/williablog/post/2011/08/30/Generic-AddOrUpdate-Extension-for-IDictionary.aspx

Answer 7

Answered by Dave Black

There are 2 ways to look at this...

Performance

If you're looking at it from a performance perspective, you have to consider:

how expensive it is to calculate the hash of the key's type, and
how expensive it is to create and throw exceptions

I can't think of any hash calculation that would be more expensive than throwing an exception. Remember, when exceptions are thrown, they have to be marshaled across interop to the Win32 API, be created, crawl the entire stack trace, and stop at and process any catch block encountered. Exception throwing in the CLR is still treated as HRESULTs handled by the Win32 API under the covers. From Chris Brumme's blog:

Of course, we can't talk about managed exceptions without first considering Windows Structured Exception Handling (SEH). And we also need to look at the C++ exception model. That's because both managed exceptions and C++ exceptions are implemented on top of the underlying SEH mechanism, and because managed exceptions must interoperate with both SEH and C++ exceptions.

Performance Conclusion: Avoid exceptions

Best Practice

The .NET Framework Design Guidelines are a good ruleset to follow (with rare exception, pun intended); see "Design Guidelines Update: Exception Throwing". The guidelines mention something called the "Tester-Doer" pattern, and in this case the recommendation is to avoid the exceptions:

Consider the Tester-Doer pattern for members which may throw exceptions in common scenarios to avoid performance problems related to exceptions.

Exceptions should also never be used as a control-flow mechanism - again written in the CLR Design Guidelines.

Best Practice Conclusion: Avoid Exceptions

Answer 8

Answered by iSpain17

This issue is a lot simpler now on .NET Standard 2.1+ and .NET Core 2.0+. Just use the TryAdd method. It handles all the issues you have in the most graceful way.
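
For illustration, a minimal sketch of TryAdd on a Dictionary<string, int> (the key and values are made up). TryAdd returns false instead of throwing when the key already exists, and leaves the stored value untouched:

```csharp
using System;
using System.Collections.Generic;

var dict = new Dictionary<string, int>();

bool firstAdded = dict.TryAdd("a", 1);   // key is new, so the pair is added
bool secondAdded = dict.TryAdd("a", 2);  // key exists, so nothing changes

Console.WriteLine(firstAdded);   // True
Console.WriteLine(secondAdded);  // False
Console.WriteLine(dict["a"]);    // 1
```

This gives the keep-the-first-value behavior with a single call; to keep the last value instead, the indexer assignment dict[key] = value remains the simpler choice.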

Answer 9

Answered by SqlRyan

I try to avoid Exceptions everywhere I can - they're expensive to handle, and they can complicate the code. Since you know that a collision is possible, and it's trivial to do the .Contains check, I'd do that.
