Is there still a reason to use `int` in C++ code?
Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow
Original: http://stackoverflow.com/questions/48729384/
Asked by InsideLoop
Many style guides, such as Google's, recommend using `int` as the default integer, for instance when indexing arrays. With the rise of 64-bit platforms, an `int` is most of the time only 32 bits, which is not the natural width of the platform. As a consequence, I see no reason, apart from simplicity, to keep that choice. We can see this clearly when compiling the following code:
double get(const double* p, int k) {
return p[k];
}
which gets compiled into
movslq %esi, %rsi
vmovsd (%rdi,%rsi,8), %xmm0
ret
where the first instruction sign-extends the 32-bit integer to a 64-bit integer.
If the code is transformed into
double get(const double* p, std::ptrdiff_t k) {
return p[k];
}
the generated assembly is now
vmovsd (%rdi,%rsi,8), %xmm0
ret
which clearly shows that the CPU feels more at home with `std::ptrdiff_t` than with an `int`. Many C++ users have moved to `std::size_t`, but I don't want to use unsigned integers unless I really need modulo 2^n behaviour.
In most cases, using `int` does not hurt performance, because the undefined behaviour of signed integer overflow allows the compiler to internally promote any `int` to a `std::ptrdiff_t` in loops that deal with indices. But we clearly see from the above that the compiler does not feel at home with `int`. Also, using `std::ptrdiff_t` on a 64-bit platform would make overflows less likely: I see more and more people getting trapped by `int` overflows when they have to deal with integers larger than 2^31 - 1, which are becoming really common these days.
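As a rough illustration of the overflow concern above, the following sketch checks whether a count fits in a plain `int` and computes a byte offset in 64-bit arithmetic. The helper names `fits_in_int` and `byte_offset` are made up for this example, and the comments assume the common case of a 32-bit `int`.

```cpp
#include <cstdint>
#include <limits>

// True when `count` can be represented by a plain int on this platform.
constexpr bool fits_in_int(std::int64_t count) {
    return count >= std::numeric_limits<int>::min() &&
           count <= std::numeric_limits<int>::max();
}

// Byte offset of element k in an array of doubles, computed in 64-bit
// arithmetic so the multiplication cannot overflow a 32-bit int.
constexpr std::int64_t byte_offset(std::int64_t k) {
    return k * static_cast<std::int64_t>(sizeof(double));
}
```

With a 32-bit `int`, an array of 3 billion elements already has a count that `fits_in_int` rejects, even though such an array is within reach of a 64-bit machine.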
From what I have seen, the only thing that makes `int` stand apart seems to be the fact that literals such as `5` are `int`, but I don't see where that might cause any problem if we move to `std::ptrdiff_t` as the default integer.
I am on the verge of making `std::ptrdiff_t` the de facto standard integer for all the code written in my small company. Is there a reason why this could be a bad choice?
PS: I agree that the name `std::ptrdiff_t` is ugly, which is why I have typedef'ed it to `il::int_t`, which looks a bit better.
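A minimal sketch of what such an alias might look like. Only the name `il::int_t` comes from the question; the exact declaration is an assumption.

```cpp
#include <cstddef>
#include <type_traits>

namespace il {
// Short, readable alias for the signed type used for pointer differences.
using int_t = std::ptrdiff_t;
}

static_assert(std::is_signed<il::int_t>::value, "il::int_t is signed");

// Same function as in the question, written with the alias.
double get(const double* p, il::int_t k) {
    return p[k];
}
```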
PS: As I know that many people will recommend that I use `std::size_t` as the default integer, I really want to make it clear that I don't want to use an unsigned integer as my default integer. The use of `std::size_t` as the default integer in the STL was a mistake, as acknowledged by Bjarne Stroustrup and the standard committee in the video Interactive Panel: Ask Us Anything, at 42:38 and 1:02:50.
PS: In terms of performance, on any 64-bit platform that I know of, `+`, `-` and `*` get compiled the same way for both `int` and `std::ptrdiff_t`, so there is no difference in speed. If you divide by a compile-time constant, the speed is also the same. It's only when you compute `a/b` knowing nothing about `b` that using a 32-bit integer on a 64-bit platform gives you a slight performance advantage. But this case is so rare that I don't see it as a reason to move away from `std::ptrdiff_t`. Vectorized code is where there is a clear difference, and the smaller the type, the better, but that's a different story, and it would be no reason to stick with `int`. In those cases, I would recommend the fixed-size types of C++.
Answered by Robert Andrzejuk
There was a discussion in the C++ Core Guidelines about what to use:
https://github.com/isocpp/CppCoreGuidelines/pull/1115
Herb Sutter wrote that `gsl::index` will be added (in the future maybe `std::index`), which will be defined as `ptrdiff_t`.
hsutter commented on 26 Dec 2017
(Thanks to many WG21 experts for their comments and feedback into this note.)
Add the following typedef to GSL
namespace gsl { using index = ptrdiff_t; }
and recommend `gsl::index` for all container indexes/subscripts/sizes.
Rationale
The Guidelines recommend using a signed type for subscripts/indices. See ES.100 through ES.107. C++ already uses signed integers for array subscripts.
We want to be able to teach people to write "new clean modern code" that is simple, natural, warning-free at high warning levels, and doesn't make us write a "pitfall" footnote about simple code.
If we don't have a short adoptable word like `index` that is competitive with `int` and `auto`, people will still use `int` and `auto` and get their bugs. For example, they will write `for(int i=0; i<v.size(); ++i)` or `for(auto i=0; i<v.size(); ++i)` which have 32-bit size bugs on widely used platforms, and `for(auto i=v.size()-1; i>=0; ++i)` which just doesn't work. I don't think we can teach `for(ptrdiff_t i = ...` with a straight face, or that people would accept it.
If we had a saturating arithmetic type, we might use that. Otherwise, the best option is `ptrdiff_t` which has nearly all the advantages of a saturating arithmetic unsigned type, except only that `ptrdiff_t` still makes the pervasive loop style `for(ptrdiff_t i=0; i<v.size(); ++i)` emit signed/unsigned mismatches on `i<v.size()` (and similarly for `i!=v.size()`) for today's STL containers. (If a future STL changes its size_type to be signed, even this last drawback goes away.)
However, it would be hopeless (and embarrassing) to try to teach people to routinely write `for (ptrdiff_t i = ... ; ... ; ...)`. (Even the Guidelines currently use it in only one place, and that's a "bad" example that is unrelated to indexing.)
Therefore we should provide `gsl::index` (which can later be proposed for consideration as `std::index`) as a typedef for `ptrdiff_t`, so we can hopefully (and not embarrassingly) teach people to routinely write `for(index i = ... ; ... ; ...)`.
Why not just tell people to write `ptrdiff_t`?
Because we believe it would be embarrassing to tell people that's what you have to do in C++, and even if we did people won't do it. Writing `ptrdiff_t` is too ugly and unadoptable compared to `auto` and `int`. The point of adding the name `index` is to make it as easy and attractive as possible to use a correctly sized signed type.
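The loop styles discussed in the quoted comment can be sketched as follows. This does not use the real GSL; `sketch::index` is a hypothetical stand-in with the same definition as the proposed `gsl::index`, and the explicit casts are one possible way to avoid the signed/unsigned mismatch against `v.size()`.

```cpp
#include <cstddef>
#include <vector>

namespace sketch {
// Hypothetical stand-in for the proposed gsl::index.
using index = std::ptrdiff_t;
}

// Forward loop: converting the size once keeps the comparison signed on
// both sides, so no signed/unsigned mismatch warning is triggered.
double sum_forward(const std::vector<double>& v) {
    const auto n = static_cast<sketch::index>(v.size());
    double total = 0.0;
    for (sketch::index i = 0; i < n; ++i) {
        total += v[static_cast<std::size_t>(i)];
    }
    return total;
}

// Reverse loop: with a signed index, `i >= 0` terminates as expected,
// and an empty vector skips the loop because n - 1 == -1.
double sum_backward(const std::vector<double>& v) {
    const auto n = static_cast<sketch::index>(v.size());
    double total = 0.0;
    for (sketch::index i = n - 1; i >= 0; --i) {
        total += v[static_cast<std::size_t>(i)];
    }
    return total;
}
```

The same reverse loop written with `auto i = v.size() - 1` would use an unsigned `i`, for which `i >= 0` is always true.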
Edit: More rationale from Herb Sutter
Is `ptrdiff_t` big enough?
Yes. Standard containers are already required to have no more elements than can be represented by `ptrdiff_t`, because subtracting two iterators must fit in a difference_type.
But is `ptrdiff_t` really big enough, if I have a built-in array of `char` or `byte` that is bigger than half the size of the memory address space and so has more elements than can be represented in a `ptrdiff_t`?
Yes. C++ already uses signed integers for array subscripts. So use `index` as the default option for the vast majority of uses including all built-in arrays. (If you do encounter the extremely rare case of an array, or array-like type, that is bigger than half the address space and whose elements are `sizeof(1)`, and you're careful about avoiding truncation issues, go ahead and use a `size_t` for indexes into that very special container only. Such beasts are very rare in practice, and when they do arise often won't be indexed directly by user code. For example, they typically arise in a memory manager that takes over system allocation and parcels out individual smaller allocations that its users use, or in an MPEG or similar which provides its own interface; in both cases the `size_t` should only be needed internally within the memory manager or the MPEG class implementation.)
Answered by little_birdie
I come at this from the perspective of an old timer (pre C++)... It was understood back in the day that `int` was the native word of the platform and was likely to give the best performance.
If you needed something bigger, then you'd use it and pay the price in performance. If you needed something smaller (limited memory, or a specific need for a fixed size), same thing... otherwise use `int`. And yeah, if your value was in a range that `int` on one target platform could accommodate and `int` on another could not, then we had our compile-time, size-specific defines (before they were standardized, we made our own).
But now, present day, processors and compilers are much more sophisticated and these rules don't apply so easily. It is also harder to predict what the performance impact of your choice will be on some unknown future platform or compiler ... How do we really know that uint64_t for example will perform better or worse than uint32_t on any particular future target? Unless you're a processor/compiler guru, you don't...
So... maybe it's old fashioned, but unless I am writing code for a constrained environment like Arduino, I still use `int` for general-purpose values that I know will be within `int` size on all reasonable targets for the application I am writing. And the compiler takes it from there... These days that generally means 32 bits signed. Even if one assumes that 16 bits is the minimum integer size, it covers most use cases, and the use cases for larger numbers are easily identified and handled with appropriate types.
Answered by Eyal K.
Most programs do not live and die on the edge of a few CPU cycles, and `int` is very easy to write. However, if you are performance-sensitive, I suggest using the fixed-width integer types defined in `<cstdint>`, such as `int32_t` or `uint64_t`.
These have the benefit of being very clear in their intended behavior with regard to being signed or unsigned, as well as their size in memory. This header also includes the fast variants such as `int_fast32_t`, which are at least the stated size, but might be larger if that helps performance.
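The properties described above can be checked at compile time. This is a small sketch, assuming the implementation provides the optional exact-width types (mainstream platforms do); `fast32_bits` is a helper invented for the example.

```cpp
#include <climits>
#include <cstdint>
#include <type_traits>

// Exact-width types: size and signedness are part of the name.
static_assert(sizeof(std::int32_t) * CHAR_BIT == 32, "int32_t is exactly 32 bits");
static_assert(sizeof(std::uint64_t) * CHAR_BIT == 64, "uint64_t is exactly 64 bits");
static_assert(std::is_signed<std::int32_t>::value, "int32_t is signed");
static_assert(std::is_unsigned<std::uint64_t>::value, "uint64_t is unsigned");

// "fast" variant: at least 32 bits, possibly wider if the implementation
// considers a wider type faster.
static_assert(sizeof(std::int_fast32_t) * CHAR_BIT >= 32, "at least 32 bits");

// Width in bits that this implementation chose for int_fast32_t.
constexpr int fast32_bits() {
    return static_cast<int>(sizeof(std::int_fast32_t) * CHAR_BIT);
}
```

The value returned by `fast32_bits()` differs between platforms, which is exactly the freedom the "fast" types are meant to give.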
Answered by Uprooted
No formal reason to use `int`. It doesn't correspond to anything sane as per the standard. For indices you almost always want a signed pointer-sized integer.
That said, typing `int` feels like you just said hey to Ritchie, and typing `std::ptrdiff_t` feels like Stroustrup just kicked you in the butt. Coders are people too; don't bring too much ugliness into their lives. I would prefer to use some easily typed typedef like `long` or `index` instead of `std::ptrdiff_t`.
Answered by Damon
This is somewhat opinion-based, but alas, the question somewhat begs for it, too.
First of all, you talk about integers and indices as if they were the same thing, which is not the case. For any such thing as "integer of sorts, not sure what size", simply using `int` is of course, most of the time, still appropriate. This works fine most of the time, for most applications, and the compiler is comfortable with it. As a default, that's fine.
For array indices, it's a different story.
There is to date one single formally correct thing, and that's `std::size_t`. In the future, there may be a `std::index_t` which makes the intent clearer on the source level, but so far there is not.
`std::ptrdiff_t` as an index "works" but is just as incorrect as `int` since it allows for negative indices.
Yes, this happens to be what Mr. Sutter deems correct, but I beg to differ. Yes, on an assembly language instruction level, this is supported just fine, but I still object. The standard says:
8.3.4/6: `E1[E2]` is identical to `*((E1)+(E2))` [...] Because of the conversion rules that apply to `+`, if `E1` is an array and `E2` an integer, then `E1[E2]` refers to the `E2`-th member of `E1`.
5.7/5: [...] If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object [...] otherwise, the behavior is undefined.
An array subscript refers to the `E2`-th member of `E1`. There is no such thing as a negative-th element of an array. But more importantly, the pointer arithmetic with a negative additive expression invokes undefined behavior.
In other words: signed indices of whatever size are a wrong choice. Indices are unsigned. Yes, signed indices work, but they're still wrong.
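One practical consequence of treating indices as unsigned, sketched here with made-up helper names (`in_range`, `get_or`): a single comparison covers the whole valid range, and a negative value converted to `std::size_t` becomes a huge number that the same comparison rejects.

```cpp
#include <cstddef>

// True when i designates an element of an array with `size` elements.
constexpr bool in_range(std::size_t size, std::size_t i) {
    return i < size;
}

// Checked read: returns data[i] (the same element as *(data + i)) when
// the index is valid, and `fallback` otherwise.
int get_or(const int* data, std::size_t size, std::size_t i, int fallback) {
    return in_range(size, i) ? data[i] : fallback;
}
```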
Now, although `size_t` is by definition the correct choice (an unsigned integer type that is large enough to contain the size of any object), it may be debatable whether it is truly a good choice for the average case, or as a default.
Be honest, when was the last time you created an array with 10^19 elements?
I am personally using `unsigned int` as a default because the 4 billion elements that this allows for is way enough for (almost) every application, and it already pushes the average user's computer rather close to its limit (merely subscripting an array of integers of that size assumes 16 GB of contiguous memory allocated). I personally deem defaulting to 64-bit indices ridiculous.
If you are programming a relational database or a filesystem, then yes, you will need 64-bit indices. But for the average "normal" program, 32-bit indices are just good enough, and they only consume half as much storage.
When keeping around considerably more than a handful of indices, and if I can afford it (because arrays are not larger than 64k elements), I even go down to `uint16_t`. No, I'm not joking there.
Is storage really such a problem? It's ridiculous to be greedy about two or four bytes saved, isn't it? Well, no...
Size can be a problem for pointers, so sure enough it can be for indices as well. The x32 ABI does not exist for no reason. You will not notice the overhead of needlessly large indices if you have only a handful of them in total (just like pointers, they will be in registers anyway, nobody will notice whether they're 4 or 8 bytes in size).
But think for example of a slot map where you store an index for every element (depending on the implementation, two indices per element). Oh heck, it sure does make a bummer of a difference whether you hit L2 every time, or whether you have a cache miss on every access! Bigger is not always better.
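To put numbers on the slot-map remark, here is a sketch with a hypothetical entry type that stores two indices per element. The struct layout and field names are assumptions for illustration, not a real slot-map implementation, and the byte counts in the comments assume no padding, which holds on mainstream compilers for these layouts.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical slot-map entry storing two indices per element.
template <typename Index>
struct SlotEntry {
    Index dense_index;  // position of the element in the dense array
    Index next_free;    // next slot in the free list
};

// Bytes of index bookkeeping needed for `count` elements.
template <typename Index>
constexpr std::size_t bookkeeping_bytes(std::size_t count) {
    return count * sizeof(SlotEntry<Index>);
}
```

For a million elements this gives 4 MB of bookkeeping with `std::uint16_t` indices, 8 MB with `std::uint32_t`, and 16 MB with `std::uint64_t`, which can decide whether the table stays in cache.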
At the end of the day, you must ask yourself what you pay for, and what you get in return. With that in mind, my style recommendation would be:
If it costs you "nothing" because you only have e.g. one pointer and a few indices to keep around, then just use what's formally correct (that'd be `size_t`). Formally correct is good, correct always works, it's readable and intelligible, and correct is... never wrong.
If, however, it does cost you (you have maybe several hundred or thousand or ten thousand indices), and what you get back is worth nothing (because e.g. you cannot even store 2^20 elements, so whether you could subscript 2^32 or 2^64 makes no difference), you should think twice about being too wasteful.
Answered by jick
On most modern 64-bit architectures, `int` is 4 bytes and `ptrdiff_t` is 8 bytes. If your program uses a lot of integers, using `ptrdiff_t` instead of `int` could double your program's memory requirement.
Also consider that modern CPUs are frequently bottlenecked by memory performance. Using 8-byte integers also means your CPU cache now has half as many elements as before, so now it must wait for the slow main memory more often (which can easily take several hundred cycles).
In many cases, the cost of executing "32-to-64-bit conversion" operations is completely dwarfed by memory performance.
So this is a practical reason `int` is still popular on 64-bit machines.
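The cache argument can be made concrete with a little arithmetic. The sketch below assumes a 64-byte cache line, which is common on current x86-64 and many ARM cores but is an assumption, not a guarantee; the names are invented for the example.

```cpp
#include <cstddef>
#include <cstdint>

// Assumed cache-line size in bytes (an assumption, see the lead-in).
constexpr std::size_t kAssumedCacheLineBytes = 64;

// Number of values of type T that fit in one assumed cache line.
template <typename T>
constexpr std::size_t elements_per_cache_line() {
    return kAssumedCacheLineBytes / sizeof(T);
}
```

Under that assumption one line holds 16 four-byte integers but only 8 eight-byte ones, so a memory-bound loop over 8-byte integers touches twice as many lines for the same element count.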
Now you may argue about two dozen different integer types and portability and standard committees and everything, but the truth is that for a lot of C++ programs written out there, there's a "canonical" architecture they're thinking of, which is frequently the only architecture they're ever concerned about. (If you're writing a 3D graphics routine for a Windows game, you're sure it won't run on an IBM mainframe.) So for them, the question boils down to: "Do I need a 4-byte integer or an 8-byte one here?"
Answered by Steve Summit
My advice to you is not to look at assembly language output too much, not to worry too much about exactly what size each variable is, and not to say things like "the compiler feels at home with". (I truly don't know what you mean by that last one.)
For garden-variety integers, the ones that most programs are full of, plain `int` is supposed to be a good type to use. It's supposed to be the natural word size of the machine. It's supposed to be efficient to use, neither wasting unnecessary memory nor inducing lots of extra conversions when moving between memory and computation registers.
Now, it's true that there are plenty of more specialized uses for which plain `int` is no longer appropriate. In particular, sizes of objects, counts of elements, and indices into arrays are almost always `size_t`. But that doesn't mean all integers should be `size_t`!
It's also true that mixtures of signed and unsigned types, and mixtures of different-size types, can cause problems. But most of those are well taken care of by modern compilers and the warnings they emit for unsafe combinations. So as long as you're using a modern compiler and paying attention to its warnings, you don't need to pick an unnatural type just to try to avoid type mismatch problems.
Answered by geza
I don't think that there's a real reason for using `int`.
How to choose the integer type?
- If it is for bit operations, you can use an unsigned type; otherwise use a signed one.
- If it is for a memory-related thing (index, container size, etc.) for which you don't know the upper bound, use `std::ptrdiff_t` (the only problem is when the size is larger than `PTRDIFF_MAX`, which is rare in practice).
- Otherwise use `intXX_t` or `int(_least)/(_fast)XX_t`.
These rules cover all the possible usages for `int`, and they give a better solution:
- `int` is not good for storing memory-related things, as its range can be smaller than an index can be (this is not a theoretical thing: on 64-bit machines, `int` is usually 32-bit, so with `int` you can only handle 2 billion elements).
- `int` is not good for storing "general" integers, as its range may be smaller than needed (undefined behavior happens if the range is not enough), or on the contrary, its range may be much larger than needed (so memory is wasted).
The only reason one could use an `int` is if one does a calculation and knows that the range fits into [-32767;32767] (the standard only guarantees this range. Note, however, that implementations are free to provide bigger `int`s, and they usually do so. Currently `int` is 32-bit on a lot of platforms).
As the mentioned `std` types are a little bit tedious to write, one could `typedef` them to be shorter (I use `s8`/`u8`/.../`s64`/`u64`, and `spt`/`upt` ("(un)signed pointer sized type") for `ptrdiff_t`/`size_t`. I've been using these typedefs for 15 years, and I've never written a single `int` since...).
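A sketch of what such a set of short aliases might look like. The alias names come from the answer; their exact definitions here are an assumption, and `sum_s32` is a made-up function that shows them in use.

```cpp
#include <cstddef>
#include <cstdint>

using s8  = std::int8_t;
using u8  = std::uint8_t;
using s16 = std::int16_t;
using u16 = std::uint16_t;
using s32 = std::int32_t;
using u32 = std::uint32_t;
using s64 = std::int64_t;
using u64 = std::uint64_t;
using spt = std::ptrdiff_t;  // signed pointer-sized type
using upt = std::size_t;     // unsigned pointer-sized type

// Sum of the first n elements; the 64-bit accumulator avoids overflow
// when the 32-bit inputs add up to more than 2^31 - 1.
s64 sum_s32(const s32* data, spt n) {
    s64 total = 0;
    for (spt i = 0; i < n; ++i) {
        total += data[i];
    }
    return total;
}
```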
Answered by Davislor
Pro
Easier to type, I guess? But you can always `typedef`.
Many APIs use int, including parts of the standard library. This has historically caused problems, for example during the transition to 64-bit file sizes.
Because of the default type promotion rules, types narrower than int could be widened to int or unsigned int unless you add explicit casts in a lot of places, and a lot of different types could be narrower than int on some implementation somewhere. So, if you care about portability, it's a minor headache.
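The promotion rule mentioned above can be seen with two 8-bit operands. This is a small sketch with illustrative function names: the addition is carried out in `int`, so the result does not wrap at 8 bits unless it is cast back explicitly.

```cpp
#include <cstdint>
#include <type_traits>

// Both operands are promoted to int before the addition.
static_assert(std::is_same<decltype(std::uint8_t{} + std::uint8_t{}), int>::value,
              "uint8_t + uint8_t yields int");

// The sum is computed in int, so 200 + 100 gives 300 rather than wrapping.
int add_bytes(std::uint8_t a, std::uint8_t b) {
    return a + b;
}

// An explicit cast is needed to get 8-bit wraparound back.
std::uint8_t add_bytes_wrapping(std::uint8_t a, std::uint8_t b) {
    return static_cast<std::uint8_t>(a + b);
}
```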
Con
I also use `ptrdiff_t` for indices, most of the time. (I agree with Google that unsigned indices are a bug attractor.) For other kinds of math, there's `int_fast64_t`, `int_fast32_t`, and so on, which will also be as good as or better than `int`. Almost no real-world systems, with the exception of a few defunct Unices from last century, use ILP64, but there are plenty of CPUs where you would want 64-bit math. And a compiler is technically allowed, by the standard, to break your program if your `int` is greater than 32,767.
That said, any C compiler worth its salt will be tested on a lot of code that adds an `int` to a pointer within an inner loop. So it can't do anything too dumb. The worst-case scenario on present-day hardware is that it needs an extra instruction to sign-extend a 32-bit signed value to 64 bits. But if what you really want is the fastest pointer math, the fastest math for values with magnitude between 32 kibi and 2 gibi, or the least wasted memory, you should say what you mean, not make the compiler guess.
Answered by ead
I guess in 99% of cases there is no reason to use `int` (or a signed integer of another size). However, there are still situations where using `int` is a good option.
A) Performance:
One difference between `int` and `size_t` is that `i++` can be undefined behavior for `int`, if `i` is `INT_MAX`. This actually might be a good thing, because the compiler could use this undefined behavior to speed things up.
For example, in this question the difference was about a factor of 2 between exploiting the undefined behavior and using the compiler flag `-fwrapv`, which prohibits this exploit.
If my workhorse for-loop becomes twice as fast by using `int`s, sure I will use it.
B) Less error prone code
Reversed for-loops with `size_t` look strange and are a source of errors (I hope I got it right):
for(size_t i = N-1; i < N; i--){...}
By using
for(int i = N-1; i >= 0; i--){...}
you will deserve the gratitude of less experienced C++-programmers, who will have to manage your code some day.
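The two reversed loops can be put side by side in a compilable form. This is a sketch with made-up function names; the signed version assumes that `v.size()` fits in an `int`.

```cpp
#include <cstddef>
#include <vector>

// Unsigned version: when i reaches 0, i-- wraps around to the largest
// size_t value, which is not < n, so the loop ends. Correct, but easy
// to misread. For an empty vector, n - 1 also wraps and the loop is skipped.
std::vector<int> reversed_unsigned(const std::vector<int>& v) {
    std::vector<int> out;
    const std::size_t n = v.size();
    for (std::size_t i = n - 1; i < n; i--) {
        out.push_back(v[i]);
    }
    return out;
}

// Signed version: the condition reads the way the loop is meant.
std::vector<int> reversed_signed(const std::vector<int>& v) {
    std::vector<int> out;
    const int n = static_cast<int>(v.size());  // assumes the size fits in int
    for (int i = n - 1; i >= 0; i--) {
        out.push_back(v[static_cast<std::size_t>(i)]);
    }
    return out;
}
```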
C) Design using signed indices
By using `int` as indices, one could signal wrong or out-of-range values with negative values, something that comes in handy and can lead to clearer code.
- "find index of an element in array" could return `-1` if the element is not present. For detecting this "error" you don't have to know the size of the array.
- binary search could return a positive index if the element is in the array, and `-index` for the position where the element would be inserted into the array (when it is not in the array).
Clearly, the same information could be encoded with positive index-values, but the code becomes somewhat less intuitive.
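Both conventions can be sketched as follows. Note one swapped-in detail: a plain `-index` cannot distinguish "found at 0" from "insert at 0", so this sketch encodes a miss as `-(insertion point) - 1`, the convention used by Java's `Arrays.binarySearch`. The alias `index_t` and the function names are hypothetical.

```cpp
#include <cstddef>
#include <vector>

using index_t = std::ptrdiff_t;  // hypothetical alias for this sketch

// Linear search: returns the index of `value`, or -1 if it is absent.
index_t find_index(const std::vector<int>& v, int value) {
    const auto n = static_cast<index_t>(v.size());
    for (index_t i = 0; i < n; ++i) {
        if (v[static_cast<std::size_t>(i)] == value) return i;
    }
    return -1;
}

// Binary search on a sorted vector: returns the index if found,
// otherwise -(insertion point) - 1, which is always negative.
index_t binary_search_index(const std::vector<int>& sorted, int value) {
    index_t lo = 0;
    index_t hi = static_cast<index_t>(sorted.size());
    while (lo < hi) {
        const index_t mid = lo + (hi - lo) / 2;
        const int x = sorted[static_cast<std::size_t>(mid)];
        if (x == value) return mid;
        if (x < value) {
            lo = mid + 1;
        } else {
            hi = mid;
        }
    }
    return -lo - 1;
}
```

A caller can recover the insertion point from a negative result `r` as `-(r + 1)`.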
Clearly, there are also reasons to choose `int` over `std::ptrdiff_t`; one of them is memory bandwidth. There are a lot of memory-bound algorithms, and for them it is important to reduce the amount of memory transferred from RAM to cache.
If you know that all numbers are less than 2^31, it is an advantage to use `int`, because otherwise half of the memory transfer would consist only of `0`s that you already know are there.
An example is compressed sparse row (CRS) matrices: their indices are stored as `int`s and not `long long`. Because many operations on sparse matrices are memory bound, there really is a difference between using 32 or 64 bits.