C++: How to perform a bitwise operation on floating point numbers

Note: this page is derived from a popular Stack Overflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL, and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/1723575/

Time: 2020-08-27 21:01:06  Source: igfitidea

How to perform a bitwise operation on floating point numbers

c++ floating-point genetic-algorithm bitwise-operators

Asked by Rohit Banga

I tried this:

float a = 1.4123;
a = a & (1 << 3);

I get a compiler error saying that the operand of & cannot be of type float.

When I do:

float a = 1.4123;
a = (int)a & (1 << 3);

The program now compiles and runs. The only catch is that the bitwise operation is performed on the integer obtained by converting the number (the cast truncates it toward zero), not on the bits of the float itself.

The following is also not allowed.

float a = 1.4123;
a = (void*)a & (1 << 3);

I don't understand why int can be cast to void* but float cannot.

I am doing this to solve the problem described in Stack Overflow question How to solve linear equations using a genetic algorithm?.

Answer by AnT

At the language level, there's no such thing as "bitwise operation on floating-point numbers". Bitwise operations in C/C++ work on value-representation of a number. And the value-representation of floating point numbers is not defined in C/C++. Floating point numbers don't have bits at the level of value-representation, which is why you can't apply bitwise operations to them.

All you can do is analyze the bit content of the raw memory occupied by the floating-point number. For that you need to either use a union as suggested below or (equivalently, and only in C++) reinterpret the floating-point object as an array of unsigned char objects, as in

float f = 5;
unsigned char *c = reinterpret_cast<unsigned char *>(&f);
// inspect memory from c[0] to c[sizeof f - 1]

And please, don't try to reinterpret a float object as an int object, as other answers suggest. That doesn't make much sense, it is illegal, and it is not guaranteed to work in compilers that rely on strict-aliasing rules for optimization. The only legal way to inspect memory content in C++ is by reinterpreting it as an array of [signed/unsigned] char.
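
As a rough sketch of the unsigned char approach described above, the helper below (float_bytes_hex is a hypothetical name, not part of the original answer) formats the bytes of a float in memory order. The output depends on the platform's endianness and floating-point format, so no particular byte pattern is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: returns the object representation of f as hex digits,
// one pair per byte, in memory order (so the result is endianness-dependent).
std::string float_bytes_hex(const float& f)
{
    // Accessing any object through unsigned char* is permitted by the
    // aliasing rules, unlike accessing it through int*.
    const unsigned char* c = reinterpret_cast<const unsigned char*>(&f);
    std::string out;
    char buf[3];
    for (std::size_t i = 0; i < sizeof f; ++i) {
        std::snprintf(buf, sizeof buf, "%02x", static_cast<unsigned>(c[i]));
        out += buf;
    }
    assert(out.size() == 2 * sizeof f);
    return out;
}
```

Calling float_bytes_hex(5.0f) yields one pair of hex digits per byte of the float.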

Also note that you technically aren't guaranteed that floating-point representation on your system is IEEE754 (although in practice it is unless you explicitly allow it not to be, and then only with respect to -0.0, ±infinity and NaN).
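
One way to make that assumption explicit, sketched here with a hypothetical constant name (float_is_ieee754), is to ask the standard library whether float conforms to IEC 60559 (IEEE 754) and refuse to compile otherwise:

```cpp
#include <climits>
#include <limits>

// True when the implementation reports that float follows IEC 60559 (IEEE 754).
constexpr bool float_is_ieee754 = std::numeric_limits<float>::is_iec559;

// Guard for code that depends on the bit layout: compilation fails on
// platforms where the assumptions do not hold.
static_assert(float_is_ieee754, "float is not IEEE 754 on this platform");
static_assert(sizeof(float) * CHAR_BIT == 32, "float is not 32 bits wide");
```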

Answer by mob

If you are trying to change the bits in the floating-point representation, you could do something like this:

union fp_bit_twiddler {
    float f;
    int i;
} q;
q.f = a;
q.i &= (1 << 3);
a = q.f;

As AndreyT notes, accessing a union like this invokes undefined behavior, and the compiler could grow arms and strangle you. Do what he suggests instead.

Answer by Chap

float a = 1.4123;
unsigned int* inta = reinterpret_cast<unsigned int*>(&a);
*inta = *inta & (1 << 3);

Answer by Justin

Have a look at the following. Inspired by fast inverse square root:

#include <iostream>
using namespace std;

int main()
{
    float x, td = 2.0;
    int ti = *(int*) &td;
    cout << "Cast int: " << ti << endl;
    ti = ti>>4;
    x = *(float*) &ti;
    cout << "Recast float: " << x << endl;
    return 0; 
}

Answer by Tim Schaeffer

@mobrule:

Better:

#include <stdint.h>
...
union fp_bit_twiddler {
    float f;
    uint32_t u;
} q;

/* mutatis mutandis ... */

For these values int will likely be OK, but generally you should use unsigned ints for bit shifting to avoid the effects of arithmetic shifts. And uint32_t will work even on systems whose ints are not 32 bits.
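
A minimal sketch of the difference, using two hypothetical helper names: right-shifting an unsigned value always fills the vacated high bits with zeros, whereas right-shifting a negative signed value is implementation-defined before C++20 and sign-extending from C++20 on.

```cpp
#include <cstdint>

// Logical right shift: vacated high bits are always filled with zeros.
constexpr std::uint32_t shift_right_unsigned(std::uint32_t bits, unsigned n)
{
    return bits >> n;   // well defined for n < 32
}

// Signed right shift: for negative values the result is implementation-defined
// before C++20 and arithmetic (sign-extending) from C++20 on.
constexpr std::int32_t shift_right_signed(std::int32_t bits, unsigned n)
{
    return bits >> n;
}

// The top bit is shifted down and replaced by zeros.
static_assert(shift_right_unsigned(0x80000000u, 4) == 0x08000000u,
              "unsigned shift fills with zeros");
```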

Answer by djulien

FWIW, there is a real use case for bit-wise operations on floating point (I just ran into it recently) - shaders written for OpenGL implementations that only support older versions of GLSL (1.2 and earlier did not have support for bit-wise operators), and where there would be loss of precision if the floats were converted to ints.

The bit-wise operations can be implemented on floating point numbers using remainders (modulo) and inequality checks. For example:

float A = 0.625; //value to check; ie, 160/256
float mask = 0.25; //bit to check; ie, 1/4
bool result = (mod(A, 2.0 * mask) >= mask); //non-zero if bit 0.25 is on in A

The above assumes that A is between [0..1) and that there is only one "bit" in mask to check, but it could be generalized for more complex cases.
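
For readers working in C++ rather than GLSL, the same test can be sketched with std::fmod, which matches mod for non-negative operands. The helper name bit_is_set is an assumption for illustration, and the function expects a non-negative value and a mask that is a single power of two.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: true if the binary digit with weight `mask` is set in a.
// Assumes a >= 0 and that mask is a single power of two (e.g. 0.5f, 0.25f).
bool bit_is_set(float a, float mask)
{
    assert(a >= 0.0f && mask > 0.0f);
    // The remainder modulo 2*mask keeps the digit of weight `mask` and all
    // lower digits; it reaches `mask` only if that digit is 1.
    return std::fmod(a, 2.0f * mask) >= mask;
}
```

With A = 0.625 (binary 0.101), the bits with weights 0.5 and 0.125 are on and the bit with weight 0.25 is off.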

This idea is based on some of the info found in the Stack Overflow question "Is it possible to implement bitwise operators using integer arithmetic?".

If there is not even a built-in mod function, then that can also be implemented fairly easily. For example:

float mod(float num, float den)
{
    return num - den * floor(num / den);
}

Answer by Patrick Roberts

You can work around the strict-aliasing rule and perform bitwise operations on a float type-punned as a uint32_t (if your implementation defines it, which most do) without undefined behavior by using memcpy():

float a = 1.4123f;
uint32_t b;

std::memcpy(&b, &a, 4);
// perform bitwise operation
b &= 1u << 3;
std::memcpy(&a, &b, 4);
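
The same memcpy technique can be wrapped in small helpers. This is only a sketch: the names float_to_bits, bits_to_float and clear_sign_bit are hypothetical, and the code assumes a 32-bit IEEE 754 float, which the static_asserts check at compile time.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

static_assert(sizeof(float) == sizeof(std::uint32_t), "float must be 32 bits");
static_assert(std::numeric_limits<float>::is_iec559, "IEEE 754 float assumed");

// Copy the object representation of a float into an integer of the same size.
std::uint32_t float_to_bits(float f)
{
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    return bits;
}

// Copy a 32-bit pattern back into a float.
float bits_to_float(std::uint32_t bits)
{
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}

// Example bitwise operation: clear the sign bit (bit 31), giving |f|.
float clear_sign_bit(float f)
{
    const float result = bits_to_float(float_to_bits(f) & 0x7FFFFFFFu);
    assert(result >= 0.0f || result != result);  // non-negative, or NaN
    return result;
}
```

Using sizeof instead of a literal 4 keeps the copy size tied to the types involved.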

Answer by Pyry Pakkanen

The Python implementation of floating point bitwise operations in "Floating point bitwise operations (Python recipe)" works by representing numbers in binary that extends infinitely to the left as well as to the right of the binary point. Because floating point numbers have a signed zero on most architectures, it uses ones' complement for representing negative numbers (well, actually it just pretends to do so and uses a few tricks to achieve the appearance).

I'm sure it can be adapted to work in C++, but care must be taken so as to not let the right shifts overflow when equalizing the exponents.

Answer by Kit10

Bitwise operators should NOT be used on floats, as float representations are hardware specific, however similar they happen to be across the hardware you have. Which project/job do you want to risk on "well, it worked on my machine"? Instead, for C++, you can get a similar "feel" for the bit shift operators by overloading the shift operators (<< and >>) on an "object" wrapper for a float:

// Simple object wrapper for float type as templates want classes.
class Float
{
float m_f;
public:
    Float( const float & f )
    : m_f( f )
    {
    }

    operator float() const
    {
        return m_f;
    }
};

float operator>>( const Float & left, int right )
{
    float temp = left;
    for( right; right > 0; --right )
    {
        temp /= 2.0f;
    }
    return temp;
}

float operator<<( const Float & left, int right )
{
    float temp = left;
    for( right; right > 0; --right )
    {
        temp *= 2.0f;
    }
    return temp;
}

int main( int argc, char ** argv )
{
    int a1 = 40 >> 2; 
    int a2 = 40 << 2;
    int a3 = 13 >> 2;
    int a4 = 256 >> 2;
    int a5 = 255 >> 2;

    float f1 = Float( 40.0f ) >> 2; 
    float f2 = Float( 40.0f ) << 2;
    float f3 = Float( 13.0f ) >> 2;
    float f4 = Float( 256.0f ) >> 2;
    float f5 = Float( 255.0f ) >> 2;
}

You will have a remainder, which you can throw away based on your desired implementation.
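
As an alternative to the loops above (a swapped-in technique, not the original author's code), scaling by a power of two can be sketched with std::ldexp, which computes x * 2^n directly. The helper names are hypothetical, and std::floor is one possible way to throw the remainder away for non-negative values.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helpers mirroring the shift operators above without a loop.
// std::ldexp(x, n) computes x * 2^n.
float shift_left(float value, int count)
{
    assert(count >= 0);
    return std::ldexp(value, count);
}

float shift_right(float value, int count)
{
    assert(count >= 0);
    return std::ldexp(value, -count);
}

// Drops the fractional remainder, matching integer >> for non-negative values.
float shift_right_truncated(float value, int count)
{
    return std::floor(shift_right(value, count));
}
```

For example, shift_right(13.0f, 2) gives 3.25f, while shift_right_truncated(13.0f, 2) gives 3.0f, the same as 13 >> 2 on integers.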
