Java: is there any way to create a primitive array without initialization?

Note: this page is based on a popular Stack Overflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/13780350/

Time: 2020-10-31 13:57:25  Source: igfitidea

Is there any way to create a primitive array without initialization?

Tags: java, arrays

Asked by Evgeniy Dorofeev

As we know, Java always initialises arrays upon creation; i.e., new int[1000000] always returns an array with all elements set to 0. I understand that this is a must for Object arrays, but for primitive arrays (except maybe boolean) in most cases we don't care about the initial values.
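To make the behaviour in question concrete, here is a minimal sketch showing that freshly created primitive arrays always hold the default value of their element type. The class name DefaultValuesDemo and the helper allZero are made up for this illustration.

```java
import java.util.Arrays;

public class DefaultValuesDemo {
    // Hypothetical helper: returns true if every element of a freshly
    // allocated int array of length n is zero.
    static boolean allZero(int n) {
        int[] a = new int[n];
        for (int v : a) {
            if (v != 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(new int[3]));     // [0, 0, 0]
        System.out.println(Arrays.toString(new boolean[2])); // [false, false]
        System.out.println(Arrays.toString(new double[2]));  // [0.0, 0.0]
        System.out.println(allZero(1_000_000));              // true
    }
}
```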

Does anybody know a way to avoid this initialization?

Answered by Evgeniy Dorofeev

I've done some investigation. There is no legal way to create an uninitialized array in Java. Even JNI NewXxxArray creates initialized arrays. So it is impossible to know exactly what array zeroing costs. Nevertheless, I've done some measurements:

1) Creating 1000 byte arrays with different array sizes

        long t0 = System.currentTimeMillis();
        for(int i = 0; i < 1000; i++) {
//          byte[] a1 = new byte[1];
            byte[] a1 = new byte[1000000];
        }
        System.out.println(System.currentTimeMillis() - t0);

On my PC it takes < 1 ms for byte[1] and ~500 ms for byte[1000000]. Sounds impressive to me.
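A slightly more careful version of the measurement above might look like the sketch below. It is an illustration, not the original benchmark: the class name AllocationTiming, the sink field, and the warm-up count are assumptions added here so that the JIT compiler is less likely to discard the unused allocations, and System.nanoTime replaces currentTimeMillis for finer resolution. The numbers will differ between machines and JVMs.

```java
public class AllocationTiming {
    // Assumption for this sketch: a static sink keeps the allocations
    // observable so the JIT cannot trivially remove them.
    static long sink;

    // Allocates `count` byte arrays of `size` elements (size must be >= 1)
    // and returns the elapsed time in nanoseconds.
    static long timeAllocations(int count, int size) {
        long t0 = System.nanoTime();
        for (int i = 0; i < count; i++) {
            byte[] a = new byte[size];
            a[0] = (byte) i;          // touch the array
            sink += a[0] + a.length;  // publish something derived from it
        }
        return System.nanoTime() - t0;
    }

    public static void main(String[] args) {
        // Warm-up rounds so the loop is likely compiled before measuring.
        for (int i = 0; i < 5; i++) {
            timeAllocations(1000, 1);
            timeAllocations(1000, 1_000_000);
        }
        System.out.println("byte[1]:       "
                + timeAllocations(1000, 1) / 1_000_000.0 + " ms");
        System.out.println("byte[1000000]: "
                + timeAllocations(1000, 1_000_000) / 1_000_000.0 + " ms");
    }
}
```

Even with these precautions this remains a rough measurement; a harness such as JMH would be the usual choice for results meant to be compared.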

2) The JDK has no fast (native) method for filling arrays, and Arrays.fill is too slow, so let's at least see how long it takes to copy a 1,000,000-element array 1000 times with the native System.arraycopy:

    byte[] a1 = new byte[1000000];
    byte[] a2 = new byte[1000000];
    for(int i = 0; i < 1000; i++) {
        System.arraycopy(a1, 0, a2, 0, 1000000);
    }

It takes 700 ms.
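For readers who want to check the Arrays.fill versus System.arraycopy comparison on their own machine, a self-contained sketch could look like this. The class and method names are invented for the example, and no particular ratio between the two timings is implied; it depends on the JVM and hardware.

```java
import java.util.Arrays;

public class FillVsCopy {
    // Fills `target` with a changing value `rounds` times; returns elapsed ns.
    static long timeFill(byte[] target, int rounds) {
        long t0 = System.nanoTime();
        for (int i = 0; i < rounds; i++) {
            Arrays.fill(target, (byte) i);
        }
        return System.nanoTime() - t0;
    }

    // Copies `src` into `dst` `rounds` times; returns elapsed ns.
    static long timeCopy(byte[] src, byte[] dst, int rounds) {
        long t0 = System.nanoTime();
        for (int i = 0; i < rounds; i++) {
            System.arraycopy(src, 0, dst, 0, src.length);
        }
        return System.nanoTime() - t0;
    }

    public static void main(String[] args) {
        byte[] a1 = new byte[1_000_000];
        byte[] a2 = new byte[1_000_000];
        // Warm up both paths before measuring.
        timeFill(a1, 200);
        timeCopy(a1, a2, 200);
        System.out.println("Arrays.fill:      " + timeFill(a1, 1000) / 1_000_000.0 + " ms");
        System.out.println("System.arraycopy: " + timeCopy(a1, a2, 1000) / 1_000_000.0 + " ms");
    }
}
```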

This gives me reason to believe that a) creating long arrays is expensive, and b) it seems to be expensive because of useless initialization.

3) Let's take sun.misc.Unsafe (http://www.javasourcecode.org/html/open-source/jdk/jdk-6u23/sun/misc/Unsafe.html). It is protected from external usage, but not very strongly:

    Field f = Unsafe.class.getDeclaredField("theUnsafe");
    f.setAccessible(true);
    Unsafe unsafe = (Unsafe)f.get(null);

Here is the memory allocation cost test:

    for(int i = 0; i < 1000; i++) {
        long m = unsafe.allocateMemory(1000000);
    }

It takes < 1 ms; if you remember, new byte[1000000] took 500 ms.
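The allocation loop above never releases what it allocates, so it leaks roughly 1 GB of native memory. A self-contained variant that frees each block could be sketched as follows. The class name OffHeapAllocation and its helpers are assumptions for this example, and whether sun.misc.Unsafe is usable at all (and without warnings) depends on the JDK version.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

public class OffHeapAllocation {
    // Obtains the Unsafe singleton through reflection, as in the answer.
    static Unsafe loadUnsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not accessible", e);
        }
    }

    // Allocates and immediately frees `count` blocks of `bytes` bytes;
    // returns the elapsed time in nanoseconds.
    static long timeOffHeap(Unsafe unsafe, int count, long bytes) {
        long t0 = System.nanoTime();
        for (int i = 0; i < count; i++) {
            long address = unsafe.allocateMemory(bytes); // contents are not zeroed
            unsafe.freeMemory(address);                  // avoid leaking native memory
        }
        return System.nanoTime() - t0;
    }

    public static void main(String[] args) {
        Unsafe unsafe = loadUnsafe();
        System.out.println(timeOffHeap(unsafe, 1000, 1_000_000) / 1_000_000.0 + " ms");
    }
}
```

Note that this measures only the allocator call. The memory is never touched, so the comparison with new byte[1000000] is not entirely like for like.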

4) Unsafe has no direct methods to work with arrays. It needs to know class fields, but reflection shows no fields in an array. There is not much information about array internals; I guess they are JVM/platform specific. Nevertheless, an array is, like any other Java object, header + fields. On my PC/JVM it looks like this:

header - 8 bytes
int length - 4 bytes
long bufferAddress - 8 bytes
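Since the offsets are JVM and platform specific, one hedged alternative to hard-coding values such as 8 and 12 is to ask the JVM. sun.misc.Unsafe provides arrayBaseOffset and arrayIndexScale, which report where the first element of an array type starts and how many bytes each element occupies. The sketch below (class name ArrayLayout is made up) prints them and reads one element through that offset; the printed values vary between JVMs, so none are shown.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

public class ArrayLayout {
    static Unsafe loadUnsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not accessible", e);
        }
    }

    // Offset in bytes from the start of the array object to element 0.
    static long baseOffset(Class<?> arrayClass) {
        return loadUnsafe().arrayBaseOffset(arrayClass);
    }

    // Size in bytes of one element of the given array type.
    static long indexScale(Class<?> arrayClass) {
        return loadUnsafe().arrayIndexScale(arrayClass);
    }

    // Reads element `index` of a byte[] via its raw offset.
    static byte readRaw(byte[] array, int index) {
        long offset = baseOffset(byte[].class) + index * indexScale(byte[].class);
        return loadUnsafe().getByte(array, offset);
    }

    public static void main(String[] args) {
        System.out.println("byte[] base offset: " + baseOffset(byte[].class));
        System.out.println("byte[] index scale: " + indexScale(byte[].class));
        System.out.println("long[] index scale: " + indexScale(long[].class));
    }
}
```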

Now, using Unsafe, I will create byte[10], allocate a 10 byte memory buffer and use it as my array's elements:

    byte[] a = new byte[10];
    System.out.println(Arrays.toString(a));
    long mem = unsafe.allocateMemory(10);
    unsafe.putLong(a, 12, mem);
    System.out.println(Arrays.toString(a));

It prints:

[0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[8, 15, -114, 24, 0, 0, 0, 0, 0, 0]

You can see that the array's data is not initialized.

Now I'll change our array's length (though it still points to 10 bytes of memory):

    unsafe.putInt(a, 8, 1000000);
    System.out.println(a.length);

It shows 1000000. This was just to prove that the idea works.

Now the performance test. I will create an empty byte array a1, allocate a buffer of 1000000 bytes, assign this buffer to a1 and set a1.length = 1000000:

    long t0 = System.currentTimeMillis();
    for(int i = 0; i < 1000; i++) {
        byte[] a1 = new byte[0];
        long mem1 = unsafe.allocateMemory(1000000);
        unsafe.putLong(a1, 12, mem1);
        unsafe.putInt(a1, 8, 1000000);
    }
    System.out.println(System.currentTimeMillis() - t0);

It takes 10 ms.

5) There are malloc and calloc in C++: malloc just allocates a memory block, while calloc also initializes it with zeroes.

cpp

...
JNIEXPORT void JNICALL Java_Test_malloc(JNIEnv *env, jobject obj, jint n) {
     malloc(n);
} 

java

private native static void malloc(int n);

for (int i = 0; i < 500; i++) {
    malloc(1000000);
}

Results: malloc - 78 ms; calloc - 468 ms

Conclusions

  1. It seems that Java array creation is slow because of useless element zeroing.
  2. We cannot change it, but Oracle can. No need to change anything in JLS, just add native methods to java.lang.reflect.Array like

    public static native xxx[] newUninitializedXxxArray(int size);

for all primitive numeric types (byte through double) and the char type. It could be used all over the JDK, for example in java.util.Arrays:

    public static int[] copyOf(int[] original, int newLength) {
        int[] copy = Array.newUninitializedIntArray(newLength);
        System.arraycopy(original, 0, copy, 0, Math.min(original.length, newLength));
        ...

or java.lang.String

   public String concat(String str) {
        ...   
        char[] buf = Array.newUninitializedCharArray(count + otherLen);
        getChars(0, count, buf, 0);
        ...

Answered by Brian Roach

I'm going to move this to an answer because it probably should be.

An "Array" in Java is not what you think it is. It's not just a pointer to a chunk of contiguous memory on the stack or heap.

An Array in Java is an Object just like everything else (except primitives) and lives on the heap. When you call new int[100000] you're creating a new object just like every other object, and it gets initialized, etc.

The JLS provides all the specific info about this:

http://docs.oracle.com/javase/specs/jls/se5.0/html/arrays.html

So, no. You can't avoid "initializing" an array. That's just not how Java works. There's simply no such thing as uninitialized heap memory; many people call that a "feature" as it prevents you from accessing uninitialized memory.

Answered by Nat

Java 9 actually starts to expose this via the jdk.internal.misc.Unsafe.allocateUninitializedArray method. Using it would actually require a jdk.unsupported module declaration.
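As a hedged illustration of how one might probe for this method, the sketch below calls it reflectively and falls back to null when the internal class cannot be reached. The class name UninitializedArrayProbe and the null fallback are assumptions of this example. On a default JVM the call is expected to be rejected, because the jdk.internal.misc package is not exported to application code; in that case a launcher option along the lines of --add-exports java.base/jdk.internal.misc=ALL-UNNAMED is needed.

```java
import java.lang.reflect.Method;

public class UninitializedArrayProbe {
    // Tries to obtain a byte[] whose contents have not been zeroed.
    // Returns null when the internal API is missing or inaccessible.
    static byte[] tryAllocate(int length) {
        try {
            Class<?> unsafeClass = Class.forName("jdk.internal.misc.Unsafe");
            Method getUnsafe = unsafeClass.getMethod("getUnsafe");
            Object unsafe = getUnsafe.invoke(null);
            Method allocate = unsafeClass.getMethod(
                    "allocateUninitializedArray", Class.class, int.class);
            return (byte[]) allocate.invoke(unsafe, byte.class, length);
        } catch (ReflectiveOperationException | RuntimeException e) {
            // Typical causes: class not present (pre-9 JDK) or package not exported.
            return null;
        }
    }

    public static void main(String[] args) {
        byte[] a = tryAllocate(1_000_000);
        System.out.println(a == null
                ? "internal API not accessible"
                : "allocated " + a.length + " bytes");
    }
}
```

Any array obtained this way may contain arbitrary leftover data, so every element must be written before it is read.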

Answered by 0kcats

I can imagine that the O(n) cost of new int[n] could be a burden in some data structures or algorithms.

One way to get an amortized O(1) cost of memory allocation in Java for a primitive array of size n is to recycle allocated arrays with an object pool or some other strategy. A recycled array can be considered "uninitialized" for the next allocation.
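The recycling idea can be sketched with a very small pool. IntArrayPool and its methods are hypothetical names for this illustration. The pool is not thread-safe and never shrinks, which a real implementation would probably need to address.

```java
import java.util.ArrayDeque;

public class IntArrayPool {
    private final int arrayLength;
    private final ArrayDeque<int[]> free = new ArrayDeque<>();

    public IntArrayPool(int arrayLength) {
        this.arrayLength = arrayLength;
    }

    // Returns a recycled array when one is available. Its contents are
    // whatever the previous user left behind, i.e. effectively "uninitialized".
    // Only when the pool is empty is a new (zeroed) array allocated.
    public int[] acquire() {
        int[] a = free.poll();
        return (a != null) ? a : new int[arrayLength];
    }

    // Hands an array back to the pool for later reuse.
    public void release(int[] a) {
        if (a.length != arrayLength) {
            throw new IllegalArgumentException("wrong array length: " + a.length);
        }
        free.push(a);
    }
}
```

A caller would typically acquire an array, overwrite the part it needs, and release it when done, so the zeroing cost is paid only on the first allocation of each pooled array.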