C++ 将字符串转换为 int 的最有效方法(比 atoi 快)

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/16826422/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-08-27 20:40:51  来源:igfitidea点击:

C++ most efficient way to convert string to int (faster than atoi)

c++stringperformanceinttype-conversion

提问by user788171

As mentioned in the title, I'm looking for something that can give me more performance than atoi. Presently, the fastest way I know is

正如标题中提到的,我正在寻找比 atoi 可以提供更多性能的东西。目前,我所知道的最快的方法是

atoi(mystring.c_str())

Finally, I would prefer a solution that doesn't rely on Boost. Does anybody have good performance tricks for doing this?

最后,我更喜欢不依赖 Boost 的解决方案。有没有人有这样做的良好性能技巧?

Additional Information: int will not exceed 2 billion, it is always positive, the string has no decimal places in it.

附加信息:int 不会超过 20 亿,始终为正数,字符串中没有小数位。

回答by paddy

I experimented with solutions using lookup tables, but found them fraught with issues, and actually not very fast. The fastest solution turned out to be the least imaginitive:

我尝试了使用查找表的解决方案,但发现它们充满了问题,而且实际上速度不是很快。结果证明最快的解决方案是最缺乏想象力的:

int fast_atoi( const char * str )
{
    int val = 0;
    while( *str ) {
        val = val*10 + (*str++ - '0');
    }
    return val;
}

Running a benchmark with a million randomly generated strings:

使用一百万个随机生成的字符串运行基准测试:

fast_atoi : 0.0097 seconds
atoi      : 0.0414 seconds

To be fair, I also tested this function by forcing the compiler not to inline it. The results were still good:

公平地说,我还通过强制编译器不要内联它来测试这个函数。结果还是不错的:

fast_atoi : 0.0104 seconds
atoi      : 0.0426 seconds

Provided your data conforms to the requirements of the fast_atoifunction, that is pretty reasonable performance. The requirements are:

如果您的数据符合fast_atoi函数的要求,那是相当合理的性能。要求是:

  1. Input string contains only numeric characters, or is empty
  2. Input string represents a number from 0 up to INT_MAX
  1. 输入字符串只包含数字字符,或者为空
  2. 输入字符串表示从 0 到 INT_MAX

回答by Scott Jones

atoican be improved upon significantly, given certain assumptions. This was demonstrated powerfully in a presentation by Andrei Alexandrescu at the C++ and Beyond 2012 conference. Hi s replacement used loop unrolling and ALU parallelism to achieve orders of magnitude in perf improvement. I don't have his materials, but this link uses a similar technique: http://tombarta.wordpress.com/2008/04/23/specializing-atoi/

atoi考虑到某些假设,可以显着改进。Andrei Alexandrescu 在 C++ and Beyond 2012 会议上的演讲有力地证明了这一点。Hi 的替代品使用循环展开和 ALU 并行性来实现性能改进的数量级。我没有他的材料,但这个链接使用了类似的技术:http: //tombarta.wordpress.com/2008/04/23/specializing-atoi/

回答by x-x

This pagecompares conversion speed between different string->int functions using different compilers. The naive function, which offers no error checking, offers speeds roughly twice as fast as atoi(), according to the results presented.

本页比较了使用不同编译器的不同 string->int 函数之间的转换速度。根据显示的结果,不提供错误检查的天真函数提供的速度大约是 atoi() 的两倍。

// Taken from http://tinodidriksen.com/uploads/code/cpp/speed-string-to-int.cpp
int naive(const char *p) {
    int x = 0;
    bool neg = false;
    if (*p == '-') {
        neg = true;
        ++p;
    }
    while (*p >= '0' && *p <= '9') {
        x = (x*10) + (*p - '0');
        ++p;
    }
    if (neg) {
        x = -x;
    }
    return x;
}

it is always positive

它总是积极的

Remove the negative checks in the above code for a micro optimization.

删除上面代码中的否定检查以进行微优化。

If you can guarantee the string will not have anything but numeric characters, you can micro optimize further by changing the loop

如果你能保证字符串除了数字字符之外没有任何东西,你可以通过改变循环来进一步微优化

while (*p >= '0' && *p <= '9') {

to

while (*p != '
unsigned int naive(const char *p) {
    unsigned int x = 0;
    while (*p != '
uint64_t n = digit_value(*p);
unsigned d;

while ((d = digit_value(*++p)) <= 9)
{
   n = n * 10 + d;
}
') { x = (x*10) + (*p - '0'); ++p; } return x; }
' ) {

Which leaves you with

这让你有

unsigned digit_value (char c)
{
   return unsigned(c - '0');
}

bool is_digit (char c)
{
   return digit_value(c) <= 9;
}

uint64_t extract_uint64 (char const **read_ptr)
{
   char const *p = *read_ptr;
   uint64_t n = digit_value(*p);
   unsigned d;

   while ((d = digit_value(*++p)) <= 9)
   {
      n = n * 10 + d;
   }

   *read_ptr = p;

   return n;
}

回答by DarthGizka

Quite a few of the code examples here are quite complex and do unnecessary work, meaning the code could be slimmer and faster.

这里的相当多的代码示例非常复杂并且做了不必要的工作,这意味着代码可以更精简、更快。

Conversion loops are often written to do three different things with each character:

转换循环通常被编写为对每个字符执行三种不同的操作:

  • bail out if it is the end-of-string character
  • bail out if it is not a digit
  • convert it from its code point to the actual digit value
  • 如果它是字符串结尾字符,则退出
  • 如果不是数字则退出
  • 将其从其代码点转换为实际数字值

First observation: there is no need to check for the end-of-string character separately, since it is not a digit. Hence the check for 'digitness' covers the EOS condition implicitly.

第一个观察:不需要单独检查字符串结尾字符,因为它不是数字。因此,对“数字性”的检查隐含地涵盖了 EOS 条件。

Second observation: double conditions for range testing as in (c >= '0' && c <= '9')can be converted to a single test condition by using an unsigned type and anchoring the range at zero; that way there can be no unwanted values below the beginning of the range, all unwanted values are mapped to the range above the upper limit: (uint8_t(c - '0') <= 9)

第二个观察:(c >= '0' && c <= '9')通过使用无符号类型并将范围锚定为零,可以将范围测试的双重条件转换为单个测试条件;这样就不会有低于范围开始的不需要的值,所有不需要的值都映射到高于上限的范围:(uint8_t(c - '0') <= 9)

It just so happens that c - '0'needs to be computed here anyway...

碰巧c - '0'无论如何都需要在这里计算......

Hence the inner conversion loop can be slimmed to

因此,内部转换循环可以缩减为

unsigned int fast_atou(const char *str)
{
    unsigned int val = 0;
    while(*str) {
        val = (val << 1) + (val << 3) + *(str++) - 48;
    }
    return val;
}

The code here is called with the precondition that pbe pointing at a digit, which is why the first digit is extracted without further ado (which also avoids a superfluous MUL).

调用这里的代码的前提p是指向一个数字,这就是为什么不费吹灰之力就提取第一个数字的原因(这也避免了多余的 MUL)。

That precondition is less outlandish than might appear at first, since ppointing at a digit is the reason why this code is called by the parser in the first place. In my code the whole shebang looks like this (assertions and other production-quality noise elided):

这个先决条件不像最初可能出现的那样古怪,因为p指向一个数字是解析器首先调用此代码的原因。在我的代码中,整个shebang看起来像这样(断言和其他生产质量的噪音被忽略了):

int fast_atoi(const char *buff)
{
    int c = 0, sign = 0, x = 0;
    const char *p = buff;

    for(c = *(p++); (c < 48 || c > 57); c = *(p++)) {if (c == 45) {sign = 1; c = *(p++); break;}}; // eat whitespaces and check sign
    for(; c > 47 && c < 58; c = *(p++)) x = (x << 1) + (x << 3) + c - 48;

    return sign ? -x : x;
} 

The first call to digit_value()is often elided by the compiler, if the code gets inlined and the calling code has already computed that value by calling is_digit().

第一次调用digit_value()编译器往往省略掉,如果代码被内联,并通过调用调用代码已经计算出该值is_digit()

n * 10happens to be faster than manual shifting (e.g. n = (n << 3) + (n << 1) + d), at least on my machine with gcc 4.8.1 and VC++ 2013. My guess is that both compilers use LEAwith index scaling for adding up to three values in one go and scaling one of them by 2, 4, or 8.

n * 10恰好比手动移位(例如n = (n << 3) + (n << 1) + d)更快,至少在我的机器上使用 gcc 4.8.1 和 VC++ 2013。我的猜测是,两个编译器都使用LEA索引缩放来一次性添加三个值并缩放其中一个2、4 或 8。

In any case that's exactly how it should be: we write nice clean code in separate functions and express the desired logic (n * 10, x % CHAR_BIT, whatever) and the compiler converts it to shifting, masking, LEAing and so on, inlines everything into the big bad parser loop and takes care of all the required messiness under the hood to make things fast. We don't even have to stick inlinein front of everything anymore. If anything then we have to do the opposite, by using __declspec(noinline)judiciously when compilers get over-eager.

在任何情况下,它都应该是这样:我们在单独的函数中编写漂亮干净的代码并表达所需的逻辑(n * 10,x % CHAR_BIT,等等),编译器将其转换为移位、屏蔽、LEA 等,内联一切都进入大的坏解析器循环,并在引擎盖下处理所有必需的混乱以加快速度。我们甚至不必再坚持inline一切。如果有的话,那么我们必须做相反的事情,__declspec(noinline)在编译器过于急切时明智地使用。

I'm using the above code in a program that reads billions of numbers from text files and pipes; it converts 115 million uints per second if the length is 9..10 digits, and 60 million/s for length 19..20 digits (gcc 4.8.1). That's more than ten times as fast as strtoull()(and just barely enough for my purposes, but I digress...). That's the timing for converting text blobs containing 10 million numbers each (100..200 MB), meaning that memory timings make these numbers appear a bit worse than they would be in a synthetic benchmark running from cache.

我在一个从文本文件和管道中读取数十亿个数字的程序中使用了上面的代码;如果长度为 9..10 位,则每秒转换 1.15 亿个单位,长度为 19..20 位(gcc 4.8.1)时转换为 6000 万/s。这比我的速度快十倍多strtoull()(而且刚好够我的目的,但我离题了......)。这是转换每个包含 1000 万个数字 (100..200 MB) 的文本 blob 的时间,这意味着内存时间使这些数字看起来比在从缓存运行的合成基准测试中要差一些。

回答by soerium

Paddy's implementation of fast_atoiisfaster than atoi- without the shadow of the doubt - however it works only for unsigned integers.

帕迪实施fast_atoi不是更快的atoi-没有疑问的阴影-但它仅适用于无符号整数

Below, I put evaluated version of Paddy's fast_atoi that also allows only unsigned integers but speeds conversion up even more by replacing costly operation *with +

下面,我放了 Paddy 的 fast_atoi 的评估版本,它也只允许无符号整数,但通过将昂贵的操作*替换为+ 来加快转换速度

long atoi(const char *str)
{
    long num = 0;
    int neg = 0;
    while (isspace(*str)) str++;
    if (*str == '-')
    {
        neg=1;
        str++;
    }
    while (isdigit(*str))
    {
        num = 10*num + (*str - '0');
        str++;
    }
    if (neg)
        num = -num;
    return num;
 }

Here, I put complete versionof fast_atoi()that i'm using sometimes which converts singed integers as well:

在这里,我放了我有时使用的fast_atoi() 的完整版本也可以转换单数整数:

int myInt; 
string myString = "1561";
stringstream ss;
ss(myString);
ss >> myInt;

回答by Joel

Here's the entirety of the atoi function in gcc:

这是 gcc 中 atoi 函数的全部内容:

#include <stringstream> 

The whitespace and negative check are superfluous in your case, but also only use nanoseconds.

在您的情况下,空格和否定检查是多余的,但也只使用纳秒。

isdigit is almost certainly inlined, so that's not costing you any time.

isdigit 几乎肯定是内联的,所以这不会花费你任何时间。

I really don't see room for improvement here.

我真的看不出这里有改进的余地。

回答by Rome_Leader

Why not use a stringstream? I'm not sure of its particular overhead, but you could define:

为什么不使用字符串流?我不确定它的特定开销,但您可以定义:

int fast_atoi( const char * str )
{
    int val = 0;
    while( *str ) {
        val = (val << 3) + (val << 1) + (*str++ - '0');
    }
    return val;
}

Of course, you'd need to

当然,你需要

int value = t1[s[n-1]];
if (n > 1) value += t10[s[n-2]]; else return value;
if (n > 2) value += t100[s[n-3]]; else return value;
if (n > 3) value += t1000[s[n-4]]; else return value;
... continuing for how many digits you need to handle ...

回答by hamSh

A faster convert function only for positive integerswithout error checking.

一个更快的转换函数,仅用于没有错误检查的正整数

Multiplication is always slower that sum and shift, therefore change multiply with shift.

乘法总是比求和和移位慢,因此随着移位改变乘法。

int val = 0;
while( *str ) 
    val = val*10 + (*str++ - '0');

回答by 6502

The only definitive answer is with checking with your compiler, your real data.

唯一确定的答案是检查您的编译器,检查您的真实数据。

Something I'd try (even if it's using memory accesses so it may be slow depending on caching) is

我会尝试的东西(即使它使用内存访问,所以它可能会根据缓存而变慢)是

#define EQ1(a,a1) (BYTE(a) == BYTE(a1))
#define EQ1(a,a1,a2) (BYTE(a) == BYTE(a1) && EQ1(a,a2))
#define EQ1(a,a1,a2,a3) (BYTE(a) == BYTE(a1) && EQ1(a,a2,a3))

// Atoi is 4x faster than atoi.  There is also an overload that takes a cb argument.
template <typename T> 
T Atoi(LPCSTR sz) {
    T n = 0;
    bool fNeg = false;  // for unsigned T, this is removed by optimizer
    const BYTE* p = (const BYTE*)sz;
    BYTE ch;
    // test for most exceptions in the leading chars.  Most of the time
    // this test is skipped.  Note we skip over leading zeros to avoid the 
    // useless math in the second loop.  We expect leading 0 to be the most 
    // likely case, so we test it first, however the cpu might reorder that.
    for ( ; (ch=*p-'1') >= 9 ; ++p) { // unsigned trick for range compare
      // ignore leading 0's, spaces, and '+'
      if (EQ1(ch, '0'-'1', ' '-'1', '+'-'1'))
        continue;
      // for unsigned T this is removed by optimizer
      if (!((T)-1 > 0) && ch==BYTE('-'-'1')) {
        fNeg = !fNeg;
        continue;
      }
      // atoi ignores these.  Remove this code for a small perf increase.
      if (BYTE(*p-9) > 4)  // \t, \n, 11, 12, \r. unsigned trick for range compare
        break;
    }
    // deal with rest of digits, stop loop on non digit.
    for ( ; (ch=*p-'0') <= 9 ; ++p) // unsigned trick for range compare
      n = n*10 + ch; 
    // for unsigned T, (fNeg) test is removed by optimizer
    return (fNeg) ? -n : n;
}

// you could go with a single template that took a cb argument, but I could not
// get the optimizer to create good code when both the cb and !cb case were combined.
// above code contains the comments.
template <typename T>
T Atoi(LPCSTR sz, BYTE cb) {
    T n = 0;
    bool fNeg = false; 
    const BYTE* p = (const BYTE*)sz;
    const BYTE* p1 = p + cb;
    BYTE ch;
    for ( ; p<p1 && (ch=*p-'1') >= 9 ; ++p) {
      if (EQ1(ch,BYTE('0'-'1'),BYTE(' '-'1'),BYTE('+'-'1')))
        continue;
      if (!((T)-1 > 0) && ch == BYTE('-'-'1')) {
        fNeg = !fNeg;
        continue;
      }
      if (BYTE(*p-9) > 4)  // \t, \n, 11, 12, \r
        break;
    }
    for ( ; p<p1 && (ch=*p-'0') <= 9 ; ++p)
      n = n*10 + ch; 
    return (fNeg) ? -n : n;
}

if t1, t10and so on are statically allocated and constant the compiler shouldn't fear any aliasing and the machine code generated should be quite decent.

如果t1t10等等都是静态分配和不断的编译器不应该惧怕任何混淆和产生的应该是相当不错的机器代码。

回答by johnnycrash

Here is mine. Atoi is the fastest I could come up with. I compiled with msvc 2010 so it might be possible to combine both templates. In msvc 2010, when I combined templates it made the case where you provide a cb argument slower.

这是我的。Atoi 是我能想到的最快的。我是用 msvc 2010 编译的,所以可以将两个模板结合起来。在 msvc 2010 中,当我组合模板时,它使您提供 cb 参数的速度变慢。

Atoi handles nearly all the special atoi cases, and is as fast or faster than this:

Atoi 处理几乎所有特殊的 atoi 情况,并且速度与此一样快:

##代码##

Here is the code:

这是代码:

##代码##