Disclaimer: This page is a Chinese-English side-by-side translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/27122610/

Time: 2020-11-02 11:16:38  Source: igfitidea

Why does the Java API use int instead of short or byte?

Tags: java, optimization, types, java-api

Asked by Willi Mentzel

Why does the Java API use int, when short or even byte would be sufficient?

Example: The DAY_OF_WEEK field in class Calendar uses int.

If the difference is too minimal, then why do those datatypes (short, int) exist at all?

Answer by Marco13

Some of the reasons have already been pointed out. For example, the fact that "...(Almost) All operations on byte, short will promote these primitives to int". However, the obvious next question would be: WHY are these types promoted to int?

So to go one level deeper: The answer may simply be related to the Java Virtual Machine Instruction Set. As summarized in the Table in the Java Virtual Machine Specification, all integral arithmetic operations, like adding, dividing and others, are only available for the type int and the type long, and not for the smaller types.
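
One way to see this in practice is to compile a small method and disassemble it. The following is only an illustrative sketch: the class and method names are made up here, and the instruction sequence in the comment is what javac typically emits for such a method when inspected with javap -c.

```java
class ByteAddDemo {
    // Disassembling this method with `javap -c` typically shows:
    //   iload_0, iload_1, iadd, i2b, ireturn
    // There is no dedicated byte addition: the sum is computed on ints (iadd)
    // and only afterwards narrowed back to a byte (i2b).
    static byte add(byte a, byte b) {
        return (byte) (a + b);
    }

    public static void main(String[] args) {
        // 100 + 100 = 200 does not fit in a byte, so the narrowed result wraps to -56.
        System.out.println(add((byte) 100, (byte) 100));
    }
}
```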

(An aside: The smaller types (byte and short) are basically only intended for arrays. An array like new byte[1000] will take 1000 bytes, and an array like new int[1000] will take 4000 bytes.)
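
The arithmetic behind those two numbers can be sketched as follows. This is a rough illustration with made-up helper names; it counts only the element storage and deliberately ignores the array header and alignment padding, which depend on the JVM.

```java
class ArrayElementSizes {
    // Bytes used by the elements of a byte[] of the given length (header not included).
    static long byteArrayPayload(int length) {
        return (long) length * Byte.BYTES;
    }

    // Bytes used by the elements of an int[] of the given length (header not included).
    static long intArrayPayload(int length) {
        return (long) length * Integer.BYTES;
    }

    public static void main(String[] args) {
        System.out.println(byteArrayPayload(1000)); // 1000
        System.out.println(intArrayPayload(1000));  // 4000
    }
}
```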

Now, of course, one could say that "...the obvious next question would be: WHY are these instructions only offered for int (and long)?".

One reason is mentioned in the JVM Spec mentioned above:

If each typed instruction supported all of the Java Virtual Machine's run-time data types, there would be more instructions than could be represented in a byte

Additionally, the Java Virtual Machine can be considered as an abstraction of a real processor. And introducing a dedicated Arithmetic Logic Unit for smaller types would not be worth the effort: It would need additional transistors, but it still could only execute one addition in one clock cycle. The dominant architecture when the JVM was designed was 32 bits, just right for a 32-bit int. (The operations that involve a 64-bit long value are implemented as a special case.)

(Note: The last paragraph is a bit oversimplified, considering possible vectorization etc., but should give the basic idea without diving too deep into processor design topics)



EDIT: A short addendum, focusing on the example from the question, but in a more general sense: One could also ask whether it would not be beneficial to store fields using the smaller types. For example, one might think that memory could be saved by storing Calendar.DAY_OF_WEEK as a byte. But here, the Java Class File Format comes into play: All the Fields in a Class File occupy at least one "slot", which has the size of one int (32 bits). (The "wide" fields, double and long, occupy two slots.) So explicitly declaring a field as short or byte would not save any memory either.

Answer by Maroun

(Almost) All operations on byte, short will promote them to int; for example, you cannot write:

short x = 1;
short y = 2;

short z = x + y; // error: x + y has type int and cannot be assigned to a short without a cast

Arithmetic is easier and more straightforward when using int; there is no need to cast.
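
For comparison, here is a minimal sketch of what does compile. The class and method names are invented for this illustration; the point is only to show where a narrowing cast is needed and where it is not.

```java
class PromotionDemo {
    // x + y has type int, so the result must be narrowed back explicitly.
    static short addWithCast(short x, short y) {
        return (short) (x + y);
    }

    // A compound assignment contains an implicit narrowing cast:
    // x += y behaves like x = (short) (x + y).
    static short addCompound(short x, short y) {
        x += y;
        return x;
    }

    // With int, no cast is involved at all.
    static int addAsInt(int x, int y) {
        return x + y;
    }

    public static void main(String[] args) {
        System.out.println(addWithCast((short) 1, (short) 2)); // 3
        System.out.println(addCompound((short) 1, (short) 2)); // 3
        System.out.println(addAsInt(1, 2));                    // 3
    }
}
```

Note that the narrowing cast silently wraps around on overflow, which is one more thing the caller has to keep in mind when using the smaller types.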

In terms of space, it makes very little difference. byte and short would complicate things, and I don't think this micro-optimization is worth it, since we are talking about a fixed number of variables.

byte is relevant and useful when you program for embedded devices or deal with files/networks. Also, these primitives are limited; what if the calculations might exceed their limits in the future? Try to think about an extension of the Calendar class that might involve bigger numbers.

Also note that on 64-bit processors, locals will be saved in registers and won't use any resources, so using int, short and other primitives won't make any difference at all. Moreover, many Java implementations align variables* (and objects).



* byte and short occupy the same space as int if they are local variables, class variables or even instance variables. Why? Because in (most) computer systems, variable addresses are aligned, so for example if you use a single byte, you'll actually end up with two bytes - one for the variable itself and another for the padding.

On the other hand, in arrays, byte takes 1 byte, short takes 2 bytes and int takes four bytes, because in arrays only the start and maybe the end has to be aligned. This will make a difference if you want to use, for example, System.arraycopy(); then you'll really notice a performance difference.

Answer by Rafael Winterhalter

Because arithmetic operations are easier when using integers compared to shorts. Assume that the constants were indeed modeled by short values. Then you would have to use the API in this manner:

short month = Calendar.JUNE;
month = (short) (month + 1); // is July; month + 1 has type int, so the whole sum must be cast

Notice the explicit casting. Short values are implicitly promoted to int values when they are used in arithmetic operations. (On the operand stack, shorts are even expressed as ints.) This would be quite cumbersome to use, which is why int values are often preferred for constants.
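
The difference between the two styles can be sketched with the real Calendar constants, which are ints. The class and method names below are made up for the illustration; the short variant only mimics what callers would have to write if the constants were shorts.

```java
import java.util.Calendar;

class MonthArithmetic {
    // With int constants, arithmetic needs no cast.
    static int nextMonthAsInt() {
        int month = Calendar.JUNE;
        return month + 1;
    }

    // Holding the value in a short forces a cast on every arithmetic result.
    // (The initial assignment compiles because Calendar.JUNE is a
    // compile-time constant that fits into a short.)
    static short nextMonthAsShort() {
        short month = Calendar.JUNE;
        return (short) (month + 1);
    }

    public static void main(String[] args) {
        System.out.println(nextMonthAsInt() == Calendar.JULY);   // true
        System.out.println(nextMonthAsShort() == Calendar.JULY); // true
    }
}
```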

Compared to that, the gain in storage efficiency is minimal because there only exists a fixed number of such constants. We are talking about 40 constants. Changing their storage from int to short would save you 40 * 16 bits = 80 bytes. See this answer for further reference.

Answer by Rafael Winterhalter

If you used the philosophy where integral constants are stored in the smallest type that they fit in, then Java would have a serious problem: whenever programmers write code using integral constants, they have to pay careful attention to their code to check if the type of the constants matters, and if so look up the type in the documentation and/or do whatever type conversions are needed.

So now that we've outlined a serious problem, what benefits could you hope to achieve with that philosophy? I would be unsurprised if the only runtime-observable effect of that change would be what type you get when you look the constant up via reflection. (And, of course, whatever errors are introduced by lazy/unwitting programmers not correctly accounting for the types of the constants.)

Weighing the pros and the cons is very easy: it's a bad philosophy.

Answer by supercat

The design complexity of a virtual machine is a function of how many kinds of operations it can perform. It's easier to have four implementations of an instruction like "multiply"--one each for 32-bit integer, 64-bit integer, 32-bit floating-point, and 64-bit floating-point--than to have, in addition to the above, versions for the smaller numerical types as well. A more interesting design question is why there should be four types, rather than fewer (performing all integer computations with 64-bit integers and/or doing all floating-point computations with 64-bit floating-point values). The reason for using 32-bit integers is that Java was expected to run on many platforms where 32-bit types could be acted upon just as quickly as 16-bit or 8-bit types, but operations on 64-bit types would be noticeably slower. Even on platforms where 16-bit types would be faster to work with, the extra cost of working with 32-bit quantities would be offset by the simplicity afforded by only having 32-bit types.

As for performing floating-point computations on 32-bit values, the advantages are a bit less clear. There are some platforms where a computation like float a=b+c+d; could be performed most quickly by converting all operands to a higher-precision type, adding them, and then converting the result back to a 32-bit floating-point number for storage. There are other platforms where it would be more efficient to perform all computations using 32-bit floating-point values. The creators of Java decided that all platforms should be required to do things the same way, and that they should favor the hardware platforms for which 32-bit floating-point computations are faster than longer ones, even though this severely degraded both the speed and precision of floating-point math on a typical PC, as well as on many machines without floating-point units. Note, btw, that depending upon the values of b, c, and d, using higher-precision intermediate computations when computing expressions like the aforementioned float a=b+c+d; will sometimes yield results which are significantly more accurate than would be achieved if all intermediate operands were computed at float precision, but will sometimes yield a value which is a tiny bit less accurate. In any case, Sun decided everything should be done the same way, and they opted for using minimal-precision float values.
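
The accuracy effect of higher-precision intermediates can be illustrated with a small sketch. The class name, method names, and input values are chosen here purely for demonstration; the inputs are picked so that the small term is lost when the intermediate sum is rounded to float.

```java
class FloatSumDemo {
    // All intermediate results are rounded to float, as Java specifies for float operands.
    static float sumInFloat(float b, float c, float d) {
        return b + c + d;
    }

    // Intermediates are kept in double; only the final result is rounded to float.
    static float sumViaDouble(float b, float c, float d) {
        return (float) ((double) b + c + d);
    }

    public static void main(String[] args) {
        float b = 1e8f, c = 1f, d = -1e8f;
        // 1e8f + 1f rounds back to 1e8f in float, so the 1 is lost.
        System.out.println(sumInFloat(b, c, d));   // 0.0
        // In double, 100000001 is exact, so the 1 survives.
        System.out.println(sumViaDouble(b, c, d)); // 1.0
    }
}
```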

Note that the primary advantages of smaller data types become apparent when large numbers of them are stored together in an array; even if there were no advantage to having individual variables of types smaller than 64 bits, it's worthwhile to have arrays which can store smaller values more compactly; having a local variable be a byte rather than a long saves seven bytes; having an array of 1,000,000 numbers hold each number as a byte rather than a long saves 7,000,000 bytes. Since each array type only needs to support a few operations (most notably read one item, store one item, copy a range of items within an array, or copy a range of items from one array to another), the added complexity of having more array types is not as severe as the complexity of having more types of directly-usable discrete numerical values.

Answer by maaartinus

Actually, there'd be a small advantage. If you have a

class MyTimeAndDayOfWeek {
    byte dayOfWeek;
    byte hour;
    byte minute;
    byte second;
}

then on a typical JVM it needs as much space as a class containing a single int. The memory consumption gets rounded up to the next multiple of 8 or 16 bytes (IIRC, that's configurable), so the cases where there are real savings are rather rare.

This class would be slightly easier to use if the corresponding Calendar methods returned a byte. But there are no such Calendar methods, only get(int), which must return an int because of other fields. Each operation on smaller types promotes to int, so you need a lot of casting.

Most probably, you'll either give up and switch to an int or write setters like

void setDayOfWeek(int dayOfWeek) {
    this.dayOfWeek = checkedCastToByte(dayOfWeek);
}

Then the type of DAY_OF_WEEK doesn't matter, anyway.
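
The setter above relies on a checkedCastToByte helper that is not shown in the answer. One possible shape for it, purely as an assumption about what such a helper might do (range check, then narrowing cast), is sketched below together with the class it would live in.

```java
class MyTimeAndDayOfWeek {
    byte dayOfWeek;
    byte hour;
    byte minute;
    byte second;

    // Hypothetical helper: rejects values that do not fit into a byte
    // instead of silently wrapping them around.
    static byte checkedCastToByte(int value) {
        if (value < Byte.MIN_VALUE || value > Byte.MAX_VALUE) {
            throw new IllegalArgumentException("Value out of byte range: " + value);
        }
        return (byte) value;
    }

    // The setter accepts an int, so callers can pass Calendar values directly.
    void setDayOfWeek(int dayOfWeek) {
        this.dayOfWeek = checkedCastToByte(dayOfWeek);
    }

    public static void main(String[] args) {
        MyTimeAndDayOfWeek t = new MyTimeAndDayOfWeek();
        t.setDayOfWeek(java.util.Calendar.MONDAY);
        System.out.println(t.dayOfWeek); // 2
    }
}
```

Because the setter takes an int, the caller never has to cast, which is why the declared type of the Calendar constants stops mattering at the call site.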
