Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license and attribute it to the original authors (not me): StackOverflow
Original question: http://stackoverflow.com/questions/11598990/
Is std::stoi actually safe to use?
Asked by chris
I had a lovely conversation with someone about the downfalls of std::stoi. To put it bluntly, it uses std::strtol internally, and throws if that reports an error. According to them, though, std::strtol shouldn't report an error for an input of "abcxyz", causing stoi not to throw std::invalid_argument.
First of all, here are two programs, tested on GCC, that show the behaviour in these cases:
strtol
stoi
Both of them show success on "123" and failure on "abc".
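The two linked programs are not reproduced on this page. As a rough sketch of what such a test might look like, the helpers below report whether each function performed a conversion. The names strtol_converts and stoi_converts are hypothetical and exist only for this illustration; base 10 is an assumption.

```cpp
#include <cstdlib>
#include <stdexcept>
#include <string>

// Hypothetical helper: true if std::strtol consumed at least one character.
bool strtol_converts(const char* text) {
    char* end = nullptr;
    std::strtol(text, &end, 10);
    return end != text;  // end == text means no conversion was performed
}

// Hypothetical helper: true if std::stoi returned a value instead of
// throwing std::invalid_argument.
bool stoi_converts(const std::string& text) {
    try {
        (void)std::stoi(text);
        return true;
    } catch (const std::invalid_argument&) {
        return false;
    }
}
```

Calling both helpers with "123" and then with "abc" should reproduce the success and failure described above.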
I looked in the standard to pull more info:
§ 21.5
Throws: invalid_argument if strtol, strtoul, strtoll, or strtoull reports that
no conversion could be performed. Throws out_of_range if the converted value is
outside the range of representable values for the return type.
That sums up the behaviour of relying on strtol. Now what about strtol? I found this in the C11 draft:
§7.22.1.4
If the subject sequence is empty or does not have the expected form, no
conversion is performed; the value of nptr is stored in the object
pointed to by endptr, provided that endptr is not a null pointer.
Given the situation of passing in "abc", the C standard dictates that nptr, which points to the beginning of the string, would be stored in endptr, the pointer passed in. This seems consistent with the test. Also, 0 should be returned, as stated by this:
§7.22.1.4
If no conversion could be performed, zero is returned.
The previous reference said that no conversion would be performed, so it must return 0. These conditions now comply with the C++11 standard for stoi throwing std::invalid_argument.
The result of this matters to me because I don't want to go around recommending stoi as a better alternative to other methods of string to int conversion, or using it myself as if it worked the way you'd expect, if it doesn't catch text as an invalid conversion.
So after all of this, did I go wrong somewhere? It seems to me that I have good proof of this exception being thrown. Is my proof valid, or is std::stoi not guaranteed to throw that exception when given "abc"?
Answered by Generic Human
Does std::stoi throw an error on the input "abcxyz"?
Yes.
I think your confusion may come from the fact that strtol never reports an error except on overflow. It can report that no conversion was performed, but this is never referred to as an error condition in the C standard.
strtol is defined similarly by all three C standards, and I will spare you the boring details, but it basically defines a "subject sequence" that is a substring of the input string corresponding to the actual number. The following four conditions are equivalent:
- the subject sequence has the expected form (in plain English: it is a number)
- the subject sequence is non-empty
- a conversion has occurred
- *endptr != nptr (this only makes sense when endptr is non-null)
When there is an overflow, the conversion is still said to have occurred.
Now, it is quite clear that because "abcxyz" does not contain a number, the subject sequence of the string "abcxyz" must be empty, so that no conversion can be performed. The following C90/C99/C11 program will confirm it experimentally:
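#include <stdio.h>
#include <stdlib.h>

int main() {
    /* endptr is a one-element array so that *endptr matches the notation above */
    char *nptr = "abcxyz", *endptr[1];

    /* base 0 lets strtol detect the base from the prefix, if any */
    strtol(nptr, endptr, 0);

    /* *endptr == nptr means the subject sequence was empty */
    if (*endptr == nptr)
        printf("No conversion could be performed.\n");
    return 0;
}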
This implies that any conformant implementation of std::stoi must throw invalid_argument when given the input "abcxyz" without an optional base argument.
Does this mean that std::stoi has satisfactory error checking?
No. The person you were talking to is correct when she says that std::stoi is more lenient than performing the full check errno == 0 && end != start && *end=='\0' after std::strtol, because std::stoi silently strips away all characters starting from the first non-numeric character in the string.
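For illustration, the full check quoted above could be wrapped up roughly as follows. This is only a sketch under stated assumptions: parse_int_strict is a hypothetical name, base 10 is assumed, and the extra range test is added because std::strtol returns a long, which may be wider than int.

```cpp
#include <cerrno>
#include <climits>
#include <cstdlib>
#include <string>

// Hypothetical helper: strict base-10 parse built on std::strtol.
// Returns true and writes `out` only if the whole string is one in-range number.
bool parse_int_strict(const std::string& text, int& out) {
    const char* start = text.c_str();
    char* end = nullptr;
    errno = 0;  // strtol does not reset errno, so clear it before the call
    const long value = std::strtol(start, &end, 10);
    if (errno != 0) return false;    // overflow of long (ERANGE)
    if (end == start) return false;  // no conversion was performed
    if (*end != '\0') return false;  // trailing characters such as "42x0"
    if (value < INT_MIN || value > INT_MAX) return false;  // does not fit in int
    out = static_cast<int>(value);
    return true;
}
```

Note that std::strtol skips leading whitespace, so this sketch still accepts an input such as " 42".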
In fact, off the top of my head, the only language whose native conversion behaves somewhat like std::stoi is Javascript, and even then you have to force base 10 with parseInt(n, 10) to avoid the special case of hexadecimal numbers:
input      |  std::atoi      std::stoi      Javascript      full check
===========+=============================================================
hello      |     0             error        error(NaN)        error
0xygen     |     0               0          error(NaN)        error
0x42       |     0               0              66            error
42x0       |     42              42             42            error
42         |     42              42             42              42
-----------+-------------------------------------------------------------
languages  |  Perl, Ruby,    Javascript     Javascript      C#, Java,
           |  PHP, C...      (base 10)                      Python...
Note: there are also differences among languages in the handling of whitespace and redundant + signs.
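The std::stoi column can be checked directly, and the leniency can be narrowed with the optional second parameter of std::stoi, which receives the number of characters processed. The wrapper below is a minimal sketch; stoi_whole is a hypothetical name, and throwing std::invalid_argument for trailing characters is a choice made for this example, not standard behaviour.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: std::stoi in base 10, but rejecting input that was
// only partially consumed.
int stoi_whole(const std::string& text) {
    std::size_t pos = 0;
    // std::stoi may itself throw invalid_argument or out_of_range here
    const int value = std::stoi(text, &pos);
    if (pos != text.size()) {
        throw std::invalid_argument("trailing characters after number: " + text);
    }
    return value;
}
```

With this wrapper, "42" still yields 42, while "42x0" and "0x42" throw instead of silently yielding 42 and 0. Leading whitespace is still accepted, because std::stoi skips it before the number.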
Ok, so I want full error checking, what should I use?
I'm not aware of any built-in function that does this, but boost::lexical_cast<int> will do what you want. It is particularly strict since it even rejects surrounding whitespace, unlike Python's int() function. Note that invalid characters and overflows result in the same exception, boost::bad_lexical_cast.
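#include <boost/lexical_cast.hpp>
#include <iostream>
#include <string>

int main() {
    std::string s = "42";
    try {
        // throws boost::bad_lexical_cast on invalid characters or overflow
        int n = boost::lexical_cast<int>(s);
        std::cout << "n = " << n << std::endl;
    } catch (const boost::bad_lexical_cast&) {  // catch by const reference
        std::cout << "conversion failed" << std::endl;
    }
}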