Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/1071858/

Date: 2020-08-11 23:06:57  Source: igfitidea

Java creating byte array whose size is represented by a long

Tags: java, arrays, byte, long-integer

Asked by jbu

I'm trying to create a byte array whose size is of type long. For example, think of it as:

long x = _________;
byte[] b = new byte[x]; 

Apparently you can only specify an int for the size of a byte array.

Before anyone asks why I would need a byte array so large, I'll say I need to encapsulate data of message formats that I am not writing, and one of these message types has a length field that is an unsigned int (a long in Java).

Is there a way to create this byte array?

I am thinking that if there's no way around it, I can create a byte array output stream and keep feeding it bytes, but I don't know if there's any restriction on the size of a byte array...

Answered by Mehrdad Afshari

A byte[] whose size is the maximum 32-bit signed integer would require 2 GB of contiguous address space. You shouldn't try to create such an array. Otherwise, if the size is not really that large (and it's just held in a larger type), you could safely cast it to an int and use it to create the array.
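
As a minimal sketch of the "safely cast" case: the helper below is hypothetical (the names ArraySizes and allocate are not from the answer) and assumes Java 8 or later, where Math.toIntExact throws instead of silently truncating a value that does not fit in an int.

```java
final class ArraySizes {
    /** Allocates a byte[] for a length held in a long, failing loudly if it cannot fit in an int. */
    static byte[] allocate(long length) {
        if (length < 0) {
            throw new IllegalArgumentException("negative length: " + length);
        }
        // Math.toIntExact throws ArithmeticException on overflow,
        // whereas a plain (int) cast would silently wrap around.
        return new byte[Math.toIntExact(length)];
    }
}
```

Calling ArraySizes.allocate(1024L) yields a 1024-byte array, while a length such as 1L << 32 is rejected with an ArithmeticException rather than producing an array of the wrong size.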

Answered by Bill K

You should probably be using one stream to read your data in and another to write it out. If you are going to need access to data later on in the file, save it. If you need access to something you haven't run into yet, you need a two-pass system where you run through once and store the stuff you'll need for the second pass, then run through again.

Compilers work this way.
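
One way the streaming idea might look in practice is sketched below. The class and method names (StreamCopy, copyExactly) and the 8 KiB buffer size are assumptions for illustration; the point is only that a message body whose length is a long can be moved through a small fixed buffer without ever allocating a byte[] of that length.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

final class StreamCopy {
    /** Copies exactly {@code length} bytes from in to out using a small reusable buffer. */
    static void copyExactly(InputStream in, OutputStream out, long length) throws IOException {
        byte[] buffer = new byte[8192]; // arbitrary buffer size
        long remaining = length;
        while (remaining > 0) {
            // Never ask for more than what is left of the message.
            int wanted = (int) Math.min(buffer.length, remaining);
            int read = in.read(buffer, 0, wanted);
            if (read < 0) {
                throw new EOFException("stream ended with " + remaining + " bytes still expected");
            }
            out.write(buffer, 0, read);
            remaining -= read;
        }
    }
}
```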

The only case for loading in the entire array at once is if you have to repeatedly randomly access many locations throughout the array. If this is the case, I suggest you load it into multiple byte arrays all stored in a single container class.

The container class would hold an array of byte arrays, but from the outside all accesses would seem contiguous. You would just ask for byte 4987432912871439183, and your class would divide that long index by the size of each byte array to calculate which array to access, then use the remainder to locate the byte within it.
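
A rough sketch of such a container follows. The class name ChunkedByteArray and the 1 MiB chunk size are assumptions made for illustration, not details from the answer; the index arithmetic is the divide-then-remainder scheme described above.

```java
final class ChunkedByteArray {
    private static final int CHUNK_SIZE = 1 << 20; // 1 MiB per chunk; an arbitrary choice

    private final byte[][] chunks;
    private final long length;

    ChunkedByteArray(long length) {
        if (length < 0) {
            throw new IllegalArgumentException("negative length: " + length);
        }
        this.length = length;
        // Round up so a partially filled last chunk is still allocated.
        int chunkCount = Math.toIntExact((length + CHUNK_SIZE - 1) / CHUNK_SIZE);
        chunks = new byte[chunkCount][];
        long remaining = length;
        for (int i = 0; i < chunkCount; i++) {
            chunks[i] = new byte[(int) Math.min(CHUNK_SIZE, remaining)];
            remaining -= chunks[i].length;
        }
    }

    long length() {
        return length;
    }

    byte get(long index) {
        checkIndex(index);
        // Quotient selects the chunk, remainder selects the byte inside it.
        return chunks[(int) (index / CHUNK_SIZE)][(int) (index % CHUNK_SIZE)];
    }

    void set(long index, byte value) {
        checkIndex(index);
        chunks[(int) (index / CHUNK_SIZE)][(int) (index % CHUNK_SIZE)] = value;
    }

    private void checkIndex(long index) {
        if (index < 0 || index >= length) {
            throw new IndexOutOfBoundsException("index " + index + " out of range for length " + length);
        }
    }
}
```

Callers only ever see long indices, so the backing storage could later be swapped for something else without touching them. The total size is still bounded by the available heap.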

It could also have methods to store and retrieve "chunks" that span byte-array boundaries, which would require creating a temporary copy. But the cost of creating a few temporary arrays would be more than made up for by the fact that you don't have a locked 2 GB space allocated, which I think could just destroy your performance.

Edit: P.S. If you really need the random access and can't use streams, then implementing a containing class is a Very Good Idea. It will let you change the implementation on the fly from a single byte array to a group of byte arrays to a file-based system without any change to the rest of your code.

Answered by Kathy Van Stone

One way to "store" the array is to write it to a file and then access it (if you need to access it like an array) using a RandomAccessFile. The API for that class uses a long as the index into the file instead of an int. It will be slower, but much easier on memory.

This applies when you can't extract what you need during the initial input scan.
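
A small sketch of the file-backed approach, assuming a hypothetical wrapper class named FileBackedBytes. It relies only on the standard RandomAccessFile methods seek, readByte, writeByte, setLength and length, all of which work with long positions.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

final class FileBackedBytes implements AutoCloseable {
    private final RandomAccessFile file;

    FileBackedBytes(File path, long length) throws IOException {
        file = new RandomAccessFile(path, "rw");
        // Sets the file size; the contents of any newly extended region are not specified.
        file.setLength(length);
    }

    byte get(long index) throws IOException {
        file.seek(index); // the file pointer is a long, so indices beyond 2^31 - 1 are fine
        return file.readByte();
    }

    void set(long index, byte value) throws IOException {
        file.seek(index);
        file.writeByte(value);
    }

    long length() throws IOException {
        return file.length();
    }

    @Override
    public void close() throws IOException {
        file.close();
    }
}
```

Every access is a seek plus a one-byte I/O call, which is where the slowdown mentioned above comes from; reading or writing larger blocks per call would reduce that overhead.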

Answered by Brian Agnew

It's not of immediate help, but creating arrays with larger sizes (via longs) is a proposed language change for Java 7. Check out the Project Coin proposals for more info.

Answered by thkala

(It is probably a bit late for the OP, but it might still be useful for others)

Unfortunately Java does not support arrays with more than 2^31 − 1 elements. The maximum consumption is 2 GiB of space for a byte[] array, or 16 GiB of space for a long[] array.

While it is probably not applicable in this case, if the array is going to be sparse, you might be able to get away with using an associative data structure like a Map to match each used offset to the appropriate value. In addition, Trove provides a more memory-efficient implementation for storing primitive values than the standard Java collections.
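
For the sparse case, a sketch using only the standard collections might look like this. The class name SparseByteArray is hypothetical, the code assumes Java 8+ (for Map.getOrDefault), and unset offsets are treated as zero, which is a design choice rather than something the answer specifies.

```java
import java.util.HashMap;
import java.util.Map;

final class SparseByteArray {
    // Only offsets holding a non-zero value are stored.
    private final Map<Long, Byte> cells = new HashMap<>();

    byte get(long offset) {
        return cells.getOrDefault(offset, (byte) 0);
    }

    void set(long offset, byte value) {
        if (value == 0) {
            cells.remove(offset); // keep the map small: zero is the implicit default
        } else {
            cells.put(offset, value);
        }
    }

    int storedCells() {
        return cells.size();
    }
}
```

Each stored byte costs a boxed Long key plus a map entry, so this only pays off when very few offsets are used; that overhead is the reason a primitive-specialized collection is suggested above.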

If the array is not sparse and you really, really do need the whole blob in memory, you will probably have to use a two-dimensional structure, e.g. a Map that matches each offset divided by 1024 to the proper 1024-byte array, with the offset modulo 1024 giving the position inside that array. This approach might be more memory efficient even for sparse arrays, since adjacent filled cells can share the same Map entry.
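
The two-dimensional idea could be sketched as follows, assuming Java 8+ and a hypothetical class name PagedByteStore. In this sketch the map key is the offset divided by 1024 and the position inside the 1024-byte array is the offset modulo 1024; pages are created lazily on first write.

```java
import java.util.HashMap;
import java.util.Map;

final class PagedByteStore {
    private static final int PAGE_SIZE = 1024;

    // Key: offset / PAGE_SIZE. Value: the 1024-byte page covering that range.
    private final Map<Long, byte[]> pages = new HashMap<>();

    byte get(long offset) {
        checkOffset(offset);
        byte[] page = pages.get(offset / PAGE_SIZE);
        return page == null ? 0 : page[(int) (offset % PAGE_SIZE)];
    }

    void set(long offset, byte value) {
        checkOffset(offset);
        // Allocate the page the first time any byte in its range is written.
        byte[] page = pages.computeIfAbsent(offset / PAGE_SIZE, key -> new byte[PAGE_SIZE]);
        page[(int) (offset % PAGE_SIZE)] = value;
    }

    int pageCount() {
        return pages.size();
    }

    private static void checkOffset(long offset) {
        if (offset < 0) {
            throw new IndexOutOfBoundsException("negative offset: " + offset);
        }
    }
}
```

Writes to neighbouring offsets land in the same page, so the per-entry map overhead is shared across up to 1024 bytes.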
