C++: Best way to compare std::strings

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original address and author information, and attribute it to the original authors (not me): StackOverflow, original address: http://stackoverflow.com/questions/4772325/

Date: 2020-08-28 16:25:12  Source: igfitidea

Best way to compare std::strings

c++ string comparison io string-comparison

Asked by Ben

What is the best way to compare std::strings? The obvious way would be with if/else:

std::string input;
std::cin >> input;

if ( input == "blahblahblah" )
{
    // do something.
}

else if ( input == "blahblah" )
{
    // do something else.
}

else if ( input == "blah" )
{
    // do something else yet.
}

// etc. etc. etc.

Another possibility is to use a std::map and a switch/case. What is the best way when doing lots (like 8, 10, 12+) of these comparisons?

Answered by Ben

Here's an example using std::map.

#include <map>
#include <string>
#include <iostream>
#include <utility>

void first()
{
  std::cout << "first\n";
}

void second()
{
  std::cout << "second\n";
}

void third()
{
  std::cout << "third\n";
}


int main()
{
  typedef void(*StringFunc)();
  std::map<std::string, StringFunc> stringToFuncMap;

  stringToFuncMap.insert(std::make_pair("blah", &first));
  stringToFuncMap.insert(std::make_pair("blahblah", &second));
  stringToFuncMap.insert(std::make_pair("blahblahblah", &third));

  stringToFuncMap["blahblah"]();
  stringToFuncMap["blahblahblah"]();
  stringToFuncMap["blah"]();
}

Output is:

second
third
first

The benefits of this approach are:

  • It's easily extensible.
  • It forces you to break out the string-handling routines into separate functions (programming by intention).
  • Function lookup is O(log n), whereas your example is O(n)

Look into using boost::function to make the syntax a bit nicer, especially with class member functions.

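As a rough sketch of the suggestion above: std::function (the C++11 standard-library counterpart of boost::function) can hold lambdas as well as plain function pointers. The names HandlerMap, makeHandlers and dispatch are made up for illustration, and the handlers return strings instead of printing so the result is easy to check. The sketch also looks keys up with find(), because operator[] on a missing key would insert an empty entry, and calling through that entry is an error.

```cpp
#include <functional>
#include <map>
#include <string>

// Hypothetical handler table type: maps a command string to a callable.
using HandlerMap = std::map<std::string, std::function<std::string()>>;

// Builds the table. The keys mirror the strings from the question; the
// lambdas stand in for real handlers.
HandlerMap makeHandlers()
{
  return {
    {"blah",         [] { return std::string("first"); }},
    {"blahblah",     [] { return std::string("second"); }},
    {"blahblahblah", [] { return std::string("third"); }},
  };
}

// Looks the key up with find() so that an unknown string neither inserts
// an empty entry nor calls through an empty std::function.
std::string dispatch(const HandlerMap& handlers, const std::string& input)
{
  const auto it = handlers.find(input);
  if (it == handlers.end())
    return "unknown";
  return it->second();
}
```

For example, dispatch(makeHandlers(), "blahblah") returns "second", and an unrecognized input returns "unknown" without modifying the map.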

Answered by Evan Teran

Using operator== is pretty good, but if performance is really critical, you can improve on it depending on your use case. If the goal is to pick one of a few choices and perform a specific action, you can use a trie. Also, if the strings are different enough, you could do something like this:

switch(s[0]) {
case 'a':
    // only compare to strings which start with an 'a'
    if(s == "abcd") {

    } else if (s == "abcde") {

    }
    break;
case 'b':
    // only compare to strings which start with a 'b'
    if(s == "bcd") {

    } else if (s == "bcde") {

    }
    break;
default:
    // we know right away it doesn't match any of the above 4 choices...
    break;
}

Basically, use a character in the string that has good uniqueness (it doesn't have to be the first: if all strings are at least N characters long, any character before N will do!) to do a switch, then do a series of compares on the subset of strings that match that characteristic.

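The trie mentioned in this answer could look roughly like the following. This is a minimal sketch rather than a tuned implementation: TrieNode, trieInsert, trieFind and the integer action ids are hypothetical names chosen for illustration.

```cpp
#include <map>
#include <memory>
#include <string>

// Minimal trie node. Each node owns its children, keyed by the next character.
struct TrieNode
{
  std::map<char, std::unique_ptr<TrieNode>> children;
  int action = -1; // hypothetical action id; -1 means "no key ends here"
};

// Adds a key together with the action id to report when that exact key matches.
void trieInsert(TrieNode& root, const std::string& key, int action)
{
  TrieNode* node = &root;
  for (char c : key) {
    std::unique_ptr<TrieNode>& child = node->children[c];
    if (!child)
      child.reset(new TrieNode());
    node = child.get();
  }
  node->action = action;
}

// Walks the trie one character at a time and stops at the first character
// that no stored key shares at that position.
int trieFind(const TrieNode& root, const std::string& s)
{
  const TrieNode* node = &root;
  for (char c : s) {
    const auto it = node->children.find(c);
    if (it == node->children.end())
      return -1;
    node = it->second.get();
  }
  return node->action;
}
```

With "abcd", "abcde" and "bcd" inserted under action ids 0, 1 and 2, trieFind returns the matching id for each key and -1 for anything else, including a bare prefix such as "abc".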

Answered by Etienne de Martel

"12" isn't a lot... but anyway.

You can only use a switch for integral types (char, int, etc.), so that's out of the question for std::string. Using a map would probably be more readable.

Then again, it all depends on how you define "best".

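One way to combine the map with a switch, as the question proposes, is to let the map translate each string into an enum value and then switch on that value. The following is only a sketch under that assumption; Command, toCommand and describe are hypothetical names, and the returned descriptions echo the comments in the question's if/else chain.

```cpp
#include <map>
#include <string>

// Hypothetical command ids, one per string from the question.
enum class Command { Blah, BlahBlah, BlahBlahBlah, Unknown };

// Translates the input to an enum value; unknown strings map to Command::Unknown.
Command toCommand(const std::string& input)
{
  static const std::map<std::string, Command> table = {
    {"blah",         Command::Blah},
    {"blahblah",     Command::BlahBlah},
    {"blahblahblah", Command::BlahBlahBlah},
  };
  const auto it = table.find(input);
  return it == table.end() ? Command::Unknown : it->second;
}

// The switch is legal here because it runs over an enum, not a std::string.
const char* describe(const std::string& input)
{
  switch (toCommand(input)) {
    case Command::Blah:         return "do something else yet";
    case Command::BlahBlah:     return "do something else";
    case Command::BlahBlahBlah: return "do something";
    case Command::Unknown:      break;
  }
  return "no match";
}
```

Listing every enumerator in the switch without a default label lets many compilers warn when a new Command is added but not handled.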

Answered by Edward Strange

The answer to this question is all too dependent upon the problem. You've named two examples. You could add to your options things like hash tables, regular expressions, etc...

Answered by mike.dld

With 8, 10 and even 12 comparisons you can still use an if ... else if ... scheme; there's nothing wrong with that. If you want 100 or so, I'd recommend writing a function that calculates a hash of the string (even by simply XORing all the characters, though a better method would be preferable for better distribution) and then switching on its result as Evan proposed. If the function returns unique numbers for all the possible input strings, that's even better and doesn't require additional comparisons.

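A hash-then-switch scheme along these lines might be sketched as follows, assuming C++14 or later (for the loop inside a constexpr function). The hash is 32-bit FNV-1a, picked here only as an example of a better-distributed function than plain XOR; fnv1a, hashLiteral, classify and the returned action ids are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// 32-bit FNV-1a over n bytes. constexpr so it can also produce case labels.
constexpr std::uint32_t fnv1a(const char* s, std::size_t n)
{
  std::uint32_t h = 2166136261u;       // FNV offset basis
  for (std::size_t i = 0; i < n; ++i) {
    h ^= static_cast<unsigned char>(s[i]);
    h *= 16777619u;                    // FNV prime
  }
  return h;
}

// Hashes a string literal, excluding its terminating '\0'.
template <std::size_t N>
constexpr std::uint32_t hashLiteral(const char (&s)[N])
{
  return fnv1a(s, N - 1);
}

// Returns a hypothetical action id for the input, or -1 when nothing matches.
int classify(const std::string& input)
{
  switch (fnv1a(input.data(), input.size())) {
    case hashLiteral("blah"):
      // The hash only narrows the candidates; the comparison rules out
      // an unrelated input that happens to share the same hash.
      return input == "blah" ? 0 : -1;
    case hashLiteral("blahblah"):
      return input == "blahblah" ? 1 : -1;
    case hashLiteral("blahblahblah"):
      return input == "blahblahblah" ? 2 : -1;
    default:
      return -1;
  }
}
```

Because the case labels are computed at compile time, two known strings with the same hash would produce duplicate case labels and be rejected by the compiler. The final comparison can be dropped only when every possible input is known in advance.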

Answered by Haozhun

If by "the best" you mean "most efficient", read on.

I suggest the following method if there really are a lot of strings.
Strings in switch is actually something that will be in Java 7 (as part of Project Coin).

According to the proposal, this is the way the Java language will implement it.
First, the hash value of each string is calculated. The problem then becomes a "switch int" problem, which is supported by most current languages and is efficient. In each case statement, you then check whether this really is the string (in very rare cases different strings could hash to the same int).
Personally, I sometimes skip the last step in practice, as its necessity depends on the situation your specific program is in, i.e. whether the possible strings are under the programmer's control and how robust the program needs to be.

Sample pseudocode that corresponds to this:

String s = ...
switch(s) {
  case "quux":
    processQuux(s);
    // fall-through

  case "foo":
  case "bar":
    processFooOrBar(s);
    break;

  case "baz":
    processBaz(s);
    // fall-through

  default:
    processDefault(s);
    break;
}

Here is how that switch is translated, from the aforementioned proposal, to help you understand:

// Advanced example
{  // new scope for synthetic variables
  boolean $take_default = false;
  boolean $fallthrough = false;
  $default_label: {
      switch(s.hashCode()) { // cause NPE if s is null
      case 3482567: // "quux".hashCode()
          if (!s.equals("quux")) {
              $take_default = true;
              break $default_label;
          }
          processQuux(s);
          $fallthrough = true;
      case 101574: // "foo".hashCode()
          if (!$fallthrough && !s.equals("foo")) {
              $take_default = true;
              break $default_label;
          }
          $fallthrough = true;
      case 97299:  // "bar".hashCode()
          if (!$fallthrough && !s.equals("bar")) {
              $take_default = true;
              break $default_label;
          }
          processFooOrBar(s);
          break;

      case 97307: // "baz".hashCode()
          if (!s.equals("baz")) {
              $take_default = true;
              break $default_label;
          }
          processBaz(s);
          $fallthrough = true;

      default:
          $take_default = true;
          break $default_label;
      }
  }
  if($take_default)
      processDefault(s);
}