Why does Python's hash of infinity have the digits of π?

Notice: this page is adapted from a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverFlow original: http://stackoverflow.com/questions/56227419/

Time: 2020-08-19 20:36:53  Source: igfitidea

Why does Python's hash of infinity have the digits of π?

Tags: python, math, hash, floating-point, pi

Asked by wim

The hash of infinity in Python has digits matching pi:

>>> import math
>>> inf = float('inf')
>>> hash(inf)
314159
>>> int(math.pi*1e5)
314159

Is that just a coincidence or is it intentional?

Accepted answer by Patrick Haugh

_PyHASH_INF is defined as a constant equal to 314159.

I can't find any discussion about this, or comments giving a reason. I think it was chosen more or less arbitrarily. I imagine that as long as they don't use the same meaningful value for other hashes, it shouldn't matter.

Answer by ShreevatsaR

Summary: It's not a coincidence; _PyHASH_INF is hardcoded as 314159 in the default CPython implementation of Python, and was picked as an arbitrary value (obviously from the digits of π) by Tim Peters in 2000.

The value of hash(float('inf')) is one of the system-dependent parameters of the built-in hash function for numeric types, and is also available as sys.hash_info.inf in Python 3:

>>> import sys
>>> sys.hash_info
sys.hash_info(width=64, modulus=2305843009213693951, inf=314159, nan=0, imag=1000003, algorithm='siphash24', hash_bits=64, seed_bits=128, cutoff=0)
>>> sys.hash_info.inf
314159

(Same results with PyPy too.)
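As a quick check of the relationship described above, the following sketch compares hash() of both infinities against sys.hash_info.inf. The variable names pos_matches and neg_matches are illustrative, and the negative case assumes a Python 3 interpreter, where the hash of negative infinity is the negated constant.

```python
import sys

inf = float("inf")

# On Python 3, hash(inf) should equal the published constant...
pos_matches = hash(inf) == sys.hash_info.inf
# ...and hash(-inf) should be its negation (assumption: Python 3 behaviour).
neg_matches = hash(-inf) == -sys.hash_info.inf

print(pos_matches, neg_matches)
```

Reading the constant from sys.hash_info rather than hardcoding 314159 keeps the check meaningful on any implementation that exposes the attribute.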

In terms of code, hash is a built-in function. Calling it on a Python float object invokes the function whose pointer is given by the tp_hash attribute of the built-in float type (PyTypeObject PyFloat_Type), which is the float_hash function, defined as return _Py_HashDouble(v->ob_fval), which in turn has

    if (Py_IS_INFINITY(v))
        return v > 0 ? _PyHASH_INF : -_PyHASH_INF;

where _PyHASH_INF is defined as 314159:

#define _PyHASH_INF 314159
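For readers who prefer Python to C, the infinity branch quoted above can be mirrored roughly as follows. This is only a sketch: hash_infinity_branch is a hypothetical helper name, it covers infinite inputs only (not the rest of _Py_HashDouble), and it reads the constant from sys.hash_info.inf instead of repeating the #define.

```python
import math
import sys


def hash_infinity_branch(v: float) -> int:
    """Mirror only the infinity branch of the C code (hypothetical helper)."""
    if not math.isinf(v):
        raise ValueError("this sketch only covers infinite values")
    # Plays the role of _PyHASH_INF in the C source.
    py_hash_inf = sys.hash_info.inf
    return py_hash_inf if v > 0 else -py_hash_inf


print(hash_infinity_branch(float("inf")) == hash(float("inf")))
print(hash_infinity_branch(float("-inf")) == hash(float("-inf")))
```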


In terms of history, the first mention of 314159 in this context in the Python code (you can find this with git bisect or git log -S 314159 -p) was added by Tim Peters in August 2000, in what is now commit 39dce293 in the cpython git repository.

The commit message says:

Fix for http://sourceforge.net/bugs/?func=detailbug&bug_id=111866&group_id=5470. This was a misleading bug -- the true "bug" was that hash(x) gave an error return when x is an infinity. Fixed that. Added new Py_IS_INFINITY macro to pyport.h. Rearranged code to reduce growing duplication in hashing of float and complex numbers, pushing Trent's earlier stab at that to a logical conclusion. Fixed exceedingly rare bug where hashing of floats could return -1 even if there wasn't an error (didn't waste time trying to construct a test case, it was simply obvious from the code that it could happen). Improved complex hash so that hash(complex(x, y)) doesn't systematically equal hash(complex(y, x)) anymore.

In particular, in this commit he ripped out the code of static long float_hash(PyFloatObject *v) in Objects/floatobject.c and made it just return _Py_HashDouble(v->ob_fval);, and in the definition of long _Py_HashDouble(double v) in Objects/object.c he added the lines:

        if (Py_IS_INFINITY(intpart))
            /* can't convert to long int -- arbitrary */
            v = v < 0 ? -271828.0 : 314159.0;

So as mentioned, it was an arbitrary choice. Note that 271828 is formed from the first few decimal digits of e.
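To make the origin of the two constants concrete, here is a small sketch that recovers them from math.pi and math.e and mirrors the ternary from the commit. Both leading_digits and legacy_infinity_value are hypothetical names introduced for illustration, and leading_digits assumes its argument lies between 1 and 10.

```python
import math


def leading_digits(x: float, count: int = 6) -> int:
    """Return the first `count` decimal digits of x (assumes 1 <= x < 10)."""
    return int(x * 10 ** (count - 1))


def legacy_infinity_value(v: float) -> float:
    """Mirror `v = v < 0 ? -271828.0 : 314159.0` from the 2000 commit."""
    return -271828.0 if v < 0 else 314159.0


pi_digits = leading_digits(math.pi)
e_digits = leading_digits(math.e)

# The constants in the commit match the leading digits of pi and e.
print(pi_digits == legacy_infinity_value(float("inf")))
print(-e_digits == legacy_infinity_value(float("-inf")))
```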

Related later commits:

Answer by Alec Alameddine

Indeed,

sys.hash_info.inf

returns 314159. The value is not generated; it's built into the source code. In fact,

hash(float('-inf'))

returns -271828, the leading digits of e negated, in Python 2 (it's -314159 now).

The fact that the two most famous irrational numbers of all time are used as the hash values makes it very unlikely to be a coincidence.