Convert 4 bytes to an unsigned 32-bit integer and storing it in a long

Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me): Stack Overflow, original question at http://stackoverflow.com/questions/13203426/

Time: 2020-10-31 11:51:57  Source: igfitidea

Tags: java, bit-manipulation

Asked by simon

I'm trying to read a binary file in Java. I need methods to read unsigned 8-bit, unsigned 16-bit, and unsigned 32-bit values. What would be the best way (fastest, nicest-looking code) to do this? I've done this in C++ with something like this:

uint8_t *buffer;
uint32_t value = buffer[0] | buffer[1] << 8 | buffer[2] << 16 | buffer[3] << 24;

But in Java this causes a problem if, for example, buffer[1] contains a value with its sign bit set, because the result of a left shift is an int (?). Instead of ORing in only 0xA5 at the specific place, it ORs in 0xFFFFA500 or something like that, which "damages" the two top bytes.

My code right now looks like this:

public long getUInt32() throws EOFException, IOException {
    byte[] bytes = getBytes(4);
    long value = bytes[0] | (bytes[1] << 8) | (bytes[2] << 16) | (bytes[3] << 24);
    return value & 0x00000000FFFFFFFFL;
}

If I want to convert the four bytes 0x67 0xA5 0x72 0x50 the result is 0xFFFFA567 instead of 0x5072A567.
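To make the failure concrete, here is a minimal sketch that reproduces it with the four bytes above. The class and method names (`SignExtensionDemo`, `buggy`, `masked`) are made up for illustration and are not from the original post. Java's `byte` is signed, so `0xA5` becomes `0xFFFFFFA5` when it is promoted to `int` for the shift, and those extra one-bits overwrite the upper bytes.

```java
public class SignExtensionDemo {
    // The expression from the question: each byte is promoted to int
    // with sign extension before it is shifted.
    static long buggy(byte[] b) {
        long value = b[0] | (b[1] << 8) | (b[2] << 16) | (b[3] << 24);
        return value & 0x00000000FFFFFFFFL;
    }

    // Masking each byte with 0xFFL first discards the sign-extended bits
    // and keeps the arithmetic in long.
    static long masked(byte[] b) {
        return (b[0] & 0xFFL)
             | ((b[1] & 0xFFL) << 8)
             | ((b[2] & 0xFFL) << 16)
             | ((b[3] & 0xFFL) << 24);
    }

    public static void main(String[] args) {
        byte[] bytes = {0x67, (byte) 0xA5, 0x72, 0x50};
        System.out.printf("promoted bytes[1] = 0x%08X%n", (int) bytes[1]); // 0xFFFFFFA5
        System.out.printf("buggy  = 0x%08X%n", buggy(bytes));              // 0xFFFFA567
        System.out.printf("masked = 0x%08X%n", masked(bytes));             // 0x5072A567
    }
}
```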


Edit: This works great:


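public long getUInt32() throws EOFException, IOException {
    byte[] bytes = getBytes(4);
    long value = bytes[0] & 0xFF;
    value |= (bytes[1] << 8) & 0xFFFF;
    value |= (bytes[2] << 16) & 0xFFFFFF;
    // The mask must be a long literal: the int literal 0xFFFFFFFF is -1, so it
    // would leave the sign extension in place when bytes[3] is 0x80 or above.
    value |= (bytes[3] << 24) & 0xFFFFFFFFL;
    return value;
}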

But isn't there a better way to do this? 10 bit operations seem a "bit" much for a simple thing like this. (See what I did there?) =)

Accepted answer by Keith Randall

You've got the right idea; I don't think there's any obvious improvement. If you look at the java.io.DataInput.readInt spec, they have code for the same thing. They switch the order of << and &, but otherwise it's standard.

There is no way to read an int in one go from a byte array, unless you use a memory-mapped region, which is way overkill for this.

Of course, you could use a DataInputStream directly instead of reading into a byte[] first:

DataInputStream d = new DataInputStream(new FileInputStream("myfile"));
d.readInt();

DataInputStream works with the opposite endianness to the one you are using, so you'll also need some Integer.reverseBytes calls. It won't be any faster, but it's cleaner.
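A minimal sketch of that approach, assuming the file is little-endian as in the question. The class `DataInputDemo` and the helpers `readUInt32LE` and `fromBytes` are hypothetical names, and a `ByteArrayInputStream` stands in for the file so the example is self-contained. The final mask with `0xFFFFFFFFL` is what turns the signed int into an unsigned value held in a long.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class DataInputDemo {
    // Reads four bytes as a little-endian unsigned 32-bit value.
    static long readUInt32LE(DataInputStream in) throws IOException {
        int bigEndian = in.readInt();                      // readInt() is big-endian
        int littleEndian = Integer.reverseBytes(bigEndian); // swap to little-endian
        return littleEndian & 0xFFFFFFFFL;                  // drop the sign extension
    }

    // Convenience wrapper for the demo: reads from an in-memory array.
    static long fromBytes(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            return readUInt32LE(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] sample = {0x67, (byte) 0xA5, 0x72, 0x50};
        System.out.printf("0x%08X%n", fromBytes(sample)); // 0x5072A567
    }
}
```

`readInt()` throws `EOFException` if fewer than four bytes remain, which matches the exceptions declared by the question's `getUInt32()`.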

Answer by starblue

A more regular version converts the bytes to their unsigned values as integers first:


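public long getUInt32() throws EOFException, IOException {
    byte[] bytes = getBytes(4);
    // 0xFFL on the top byte keeps the whole expression in long; with int
    // arithmetic a set high bit would sign-extend when widened to long.
    long value =
        ((bytes[0] & 0xFF) <<  0) |
        ((bytes[1] & 0xFF) <<  8) |
        ((bytes[2] & 0xFF) << 16) |
        ((bytes[3] & 0xFFL) << 24);
    return value;
}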
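The question also asks for unsigned 8-bit and 16-bit reads, and the same mask-then-shift pattern extends to them. The following is only a sketch: the class `UnsignedReads` and the `offset` parameter are assumptions made so the example is self-contained, and it assumes little-endian order as in the question. Both results fit in an int without touching the sign bit, so no long is needed here.

```java
public class UnsignedReads {
    // Unsigned 8-bit value in the range 0..255.
    static int getUInt8(byte[] b, int offset) {
        return b[offset] & 0xFF;
    }

    // Unsigned 16-bit little-endian value in the range 0..65535.
    static int getUInt16(byte[] b, int offset) {
        return (b[offset] & 0xFF) | ((b[offset + 1] & 0xFF) << 8);
    }

    public static void main(String[] args) {
        byte[] bytes = {0x67, (byte) 0xA5};
        System.out.println(getUInt8(bytes, 1));               // 165
        System.out.printf("0x%04X%n", getUInt16(bytes, 0));   // 0xA567
    }
}
```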

Don't get hung up on the number of bit operations; most likely the compiler will optimize those to byte operations.

Also, you shouldn't be using long for 32-bit values just to avoid the sign; you can use int and ignore the fact that it is signed most of the time. See this answer.
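As a rough illustration of keeping the value in an int, the sketch below relies on the unsigned helper methods on `Integer`, which were added in Java 8 and so postdate some setups; treat their availability as an assumption about your Java version. The class name `UnsignedIntDemo` and the method `getInt32` are hypothetical. The bit pattern is identical either way, and only printing, comparison, division, and widening need the unsigned helpers.

```java
public class UnsignedIntDemo {
    // Assembles a little-endian 32-bit value; the result may be negative
    // when viewed as a signed int, but the bits are the unsigned value.
    static int getInt32(byte[] b) {
        return (b[0] & 0xFF)
             | ((b[1] & 0xFF) << 8)
             | ((b[2] & 0xFF) << 16)
             | ((b[3] & 0xFF) << 24);
    }

    public static void main(String[] args) {
        byte[] max = {(byte) 0xFF, (byte) 0xFF, (byte) 0xFF, (byte) 0xFF};
        int v = getInt32(max);

        System.out.println(v);                                // -1 (signed view)
        System.out.println(Integer.toUnsignedString(v));      // 4294967295
        System.out.println(Integer.toUnsignedLong(v));        // 4294967295
        System.out.println(Integer.compareUnsigned(v, 1) > 0); // true
    }
}
```

Addition, subtraction, and multiplication give the same bits for signed and unsigned operands, which is why the sign can be ignored most of the time.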