Difference between np.int, np.int_, int, and np.int_t in Cython?

Source: Stack Overflow, http://stackoverflow.com/questions/21851985/ (CC BY-SA 4.0; attribution belongs to the original authors)

Tags: python, c, numpy, cython

Asked by colinfang

I am struggling a bit with the many int data types in Cython:

np.int, np.int_, np.int_t, int

I guess int in pure Python is equivalent to np.int_, so where does np.int come from? I cannot find it in the NumPy documentation. Also, why does np.int_ exist given that we already have int?

In Cython, I guess int becomes a C type when used as cdef int or ndarray[int], and stays the Python caster when used as int()?

Is np.int_ equivalent to long in C? So is cdef long identical to cdef np.int_?

Under what circumstances should I use np.int_t instead of np.int? E.g. cdef np.int_t, ndarray[np.int_t] ...

Can someone briefly explain how the wrong use of those types would affect the performance of compiled cython code?

Accepted answer by MSeifert

It's a bit complicated because the names have different meanings depending on the context.

int

  1. In Python

    int is normally just a Python type. It is of arbitrary precision, meaning that you can store any conceivable integer in it (as long as you have enough memory).

    >>> int(10**50)
    100000000000000000000000000000000000000000000000000
    
  2. However, when you use it as dtype for a NumPy array, it is interpreted as np.int_ [1], which is not of arbitrary precision; it has the same size as C's long:

    >>> np.array(10**50, dtype=int)
    OverflowError: Python int too large to convert to C long
    

    That also means the following two are equivalent:

    np.array([1,2,3], dtype=int)
    np.array([1,2,3], dtype=np.int_)
    
  3. As a Cython type identifier it has another meaning: here it stands for the C type int. It is of limited precision (typically 32 bits). You can use it as a Cython type, for example when defining variables with cdef:

    cdef int value = 100    # variable
    cdef int[:] arr = ...   # memoryview
    

    As return type or argument type for cdef or cpdef functions:

    cdef int my_function(int argument1, int argument2):
        # ...
    

    As "generic" for ndarray:

    cimport numpy as cnp
    cdef cnp.ndarray[int, ndim=1] val = ...
    

    For type casting:

    avalue = <int>(another_value)
    

    And probably many more.

  4. In Cython, but as a Python type. You can still call int and you'll get a "Python int" (of arbitrary precision), or use it for isinstance or as the dtype argument for np.array. The context matters here: converting to a Python int is different from converting to a C int:

    cdef object py_val = int(10)  # Python int
    cdef int c_val = <int>(10)    # C int
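The difference between the arbitrary-precision Python int and the fixed-width C types can be sketched in plain Python with the standard-library ctypes module. This is only an illustration: fits_in_signed is a hypothetical helper written for this example, not part of Cython or NumPy, and the printed sizes depend on the platform.

```python
import ctypes


def fits_in_signed(value, c_type):
    """Return True if value fits in the signed C integer type c_type.

    Hypothetical helper for illustration only.
    """
    bits = 8 * ctypes.sizeof(c_type)
    return -(1 << (bits - 1)) <= value < (1 << (bits - 1))


big = 10**50

# A Python int simply grows: 10**50 needs 167 bits.
print(big.bit_length())

# C int and C long have a fixed, platform-dependent width.
print(ctypes.sizeof(ctypes.c_int), ctypes.sizeof(ctypes.c_long))

# A small value fits into a C int; 10**50 does not even fit into a C long.
print(fits_in_signed(100, ctypes.c_int))
print(fits_in_signed(big, ctypes.c_long))
```

A value for which the range check fails is the kind of value that triggers the OverflowError shown in item 2 when it is stored with a fixed-width dtype.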
    
np.int

Actually this is very easy. It's just an alias for int:

>>> int is np.int
True

So everything above applies to np.int as well. However, you can't use it as a type identifier, except when you access it on the cimported package. In that case it represents the Python integer type.

cimport numpy as cnp

cpdef func(cnp.int obj):
    return obj

This expects obj to be a Python integer, not a NumPy type:

>>> func(np.int_(10))
TypeError: Argument 'obj' has incorrect type (expected int, got numpy.int32)
>>> func(10)
10

My advice regarding np.int: avoid it whenever possible. In Python code it's equivalent to int, and in Cython code it's also equivalent to Python's int, but if used as a type identifier it will probably confuse you and everyone who reads the code! It certainly confused me... (NumPy itself deprecated the np.int alias in version 1.20 and removed it in 1.24.)

np.int_

Actually it has only one meaning: it's a Python type that represents a scalar NumPy type. You use it like Python's int:

>>> np.int_(10)        # looks like a normal Python integer
10
>>> type(np.int_(10))  # but isn't (output may vary depending on your system!)
numpy.int32

Or you use it to specify the dtype, for example with np.array:

>>> np.array([1,2,3], dtype=np.int_)
array([1, 2, 3])

But you cannot use it as type-identifier in Cython.
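A short sketch of how np.int_ behaves on the Python side, assuming NumPy is installed and Python 3 is used. The printed item size is platform dependent, so no value is given for it.

```python
import numpy as np

scalar = np.int_(10)

# A NumPy scalar compares equal to the Python integer ...
print(scalar == 10)
# ... but it belongs to NumPy's integer hierarchy, not to Python's int.
print(isinstance(scalar, np.integer))  # True
print(isinstance(scalar, int))         # False on Python 3

# Passing the Python type int as dtype selects the same dtype as np.int_.
arr = np.array([1, 2, 3], dtype=int)
print(arr.dtype == np.dtype(np.int_))  # True
print(arr.dtype.itemsize)              # width in bytes, platform dependent

# Unlike a Python int, the fixed-width dtype cannot hold 10**50.
try:
    np.array(10**50, dtype=int)
except OverflowError as exc:
    print("OverflowError:", exc)
```

The isinstance results are why a function typed to accept a Python int rejects an np.int_ scalar, as in the func example earlier.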

cnp.int_t

It's the type-identifier version of np.int_. That means you can't use it as a dtype argument, but you can use it as the type for cdef declarations:

cimport numpy as cnp
import numpy as np

cdef cnp.int_t[:] arr = np.array([1,2,3], dtype=np.int_)
     |---TYPE---|                         |---DTYPE---|

This example (hopefully) shows that the type identifier with the trailing _t represents the element type of an array whose dtype is the name without the trailing t. You can't interchange them in Cython code!

Notes

There are several more numeric types in NumPy. Here is a list of each NumPy dtype, its Cython type identifier, and the C type identifier that could also be used in Cython. It's basically taken from the NumPy documentation and the Cython NumPy pxd file:

NumPy dtype          Cython NumPy type         C type identifier (usable in Cython)

np.bool_             None                      None
np.int_              cnp.int_t                 long
np.intc              None                      int       
np.intp              cnp.intp_t                ssize_t
np.int8              cnp.int8_t                signed char
np.int16             cnp.int16_t               signed short
np.int32             cnp.int32_t               signed int
np.int64             cnp.int64_t               signed long long
np.uint8             cnp.uint8_t               unsigned char
np.uint16            cnp.uint16_t              unsigned short
np.uint32            cnp.uint32_t              unsigned int
np.uint64            cnp.uint64_t              unsigned long long
np.float_            cnp.float64_t             double
np.float32           cnp.float32_t             float
np.float64           cnp.float64_t             double
np.complex_          cnp.complex128_t          double complex
np.complex64         cnp.complex64_t           float complex
np.complex128        cnp.complex128_t          double complex
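The C column of this table names types whose width the C standard does not fix for every platform. A minimal sketch with the standard-library ctypes module shows the widths on the machine it runs on. The C_TYPES dictionary is a hypothetical lookup table made up for this example.

```python
import ctypes

# Hypothetical lookup table for illustration: C type name -> ctypes equivalent.
C_TYPES = {
    "signed char": ctypes.c_byte,
    "short": ctypes.c_short,
    "int": ctypes.c_int,
    "long": ctypes.c_long,
    "long long": ctypes.c_longlong,
    "ssize_t": ctypes.c_ssize_t,
    "float": ctypes.c_float,
    "double": ctypes.c_double,
}

# Size in bytes of each C type on the current platform.
sizes = {name: ctypes.sizeof(c_type) for name, c_type in C_TYPES.items()}

for name, size in sizes.items():
    print(f"{name:12s} {8 * size:3d} bits")
```

On most 64-bit Linux and macOS builds long is 64 bits wide, while on 64-bit Windows it is 32 bits wide, so the fixed-width identifiers such as cnp.int64_t are the safer choice when the width matters.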

Actually there are Cython types for np.bool_: cnp.npy_bool and bint, but neither can currently be used for NumPy arrays. For scalars, cnp.npy_bool is just an unsigned integer while bint is a boolean. Not sure what's going on there...



[1] Taken from the NumPy documentation, "Data type objects":

Built-in Python types

Several python types are equivalent to a corresponding array scalar when used to generate a dtype object:

int           np.int_
bool          np.bool_
float         np.float_
complex       np.cfloat
bytes         np.bytes_
str           np.bytes_ (Python2) or np.unicode_ (Python3)
unicode       np.unicode_
buffer        np.void
(all others)  np.object_

Answer by Matti Lyra

np.int_ is the default integer type (as defined in the NumPy docs); on a 64-bit system this would be a C long. np.intc is the default C int, either int32 or int64. np.int is an alias for the built-in int function:

>>> np.int(2.4)
2
>>> np.int is int  # object id equality
True

The Cython datatypes should reflect C datatypes, so cdef int a is a C int, and so on.

As for np.int_t, that is the Cython compile-time equivalent of the NumPy np.int_ datatype; np.int64_t is the Cython compile-time equivalent of np.int64.