Python numpy array: row major and column major

Question

Asked by jmlopez

I'm having trouble understanding how numpy stores its data. Consider the following:

>>> import numpy as np
>>> a = np.ndarray(shape=(2,3), order='F')
>>> for i in range(6): a.itemset(i, i+1)
... 
>>> a
array([[ 1.,  2.,  3.],
       [ 4.,  5.,  6.]])
>>> a.flags
  C_CONTIGUOUS : False
  F_CONTIGUOUS : True
  OWNDATA : True
  WRITEABLE : True
  ALIGNED : True
  UPDATEIFCOPY : False

This says that a is column major (F_CONTIGUOUS); thus, internally, a should look like the following:

[1, 4, 2, 5, 3, 6]

This is just what is stated in this glossary. What confuses me is that if I try to access the data of a in a linear fashion, I instead get:

>>> for i in range(6): print(a.item(i))
... 
1.0
2.0
3.0
4.0
5.0
6.0

At this point I'm not sure what the F_CONTIGUOUS flag tells us, since linear access does not honor the ordering. Apparently everything in Python is row major, and when we want to iterate in a linear fashion we can use the iterator flat.

The question is the following: given that we have a list of numbers, say 1, 2, 3, 4, 5, 6, how can we create a numpy array of shape (2, 3) in column major order? That is, how can I get a matrix that looks like this:

array([[ 1.,  3.,  5.],
       [ 2.,  4.,  6.]])

I would really like to be able to iterate linearly over the list and place the numbers into the newly created ndarray. The reason is that I will be reading files of multidimensional arrays stored in column major order.

Answer 1

Accepted answer by Kill Console

By default, numpy stores data in row major order.

>>> a = np.array([[1,2,3,4], [5,6,7,8]])
>>> a.shape
(2, 4)
>>> a.shape = 4,2
>>> a
array([[1, 2],
       [3, 4],
       [5, 6],
       [7, 8]])

If you change the shape, the order of the data does not change.

If you pass order='F' to reshape, you get what you want.

>>> b = np.array([1, 2, 3, 4, 5, 6])
>>> b
array([1, 2, 3, 4, 5, 6])
>>> c = b.reshape(2,3,order='F')
>>> c
array([[1, 3, 5],
       [2, 4, 6]])
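
For the file-reading use case in the question, the same idea extends to more than two dimensions. The following is only a sketch: the list `values` stands in for numbers read from a column-major file, and the shapes are made up for illustration.

```python
import numpy as np

# Stand-in for flat data read from a column-major (Fortran-order) file.
values = [1, 2, 3, 4, 5, 6]

# Fill a (2, 3) array column by column.
m = np.reshape(values, (2, 3), order='F')
print(m)

# Multidimensional case: with order='F' the first index varies fastest,
# so flat position p lands at index (i, j, k) with p = i + 2*j + 6*k.
cube = np.arange(24).reshape((2, 3, 4), order='F')
print(cube[1, 0, 0], cube[0, 1, 0], cube[0, 0, 1])
```

Indexing afterwards is the usual `m[row, col]`; only the way the flat data is distributed over the indices changes.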

Answer 2

Answered by Bi Rico

In general, numpy uses order to describe the memory layout, but the Python-level behavior of an array should be the same regardless of its memory layout. I think you can get the behavior you want using views. A view is an array that shares memory with another array. For example:

import numpy as np

a = np.arange(1, 6 + 1)
b = a.reshape(3, 2).T

a[1] = 99
print(b)
# [[ 1  3  5]
#  [99  4  6]]

Hope that helps.
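
To check that such a transposed reshape really is a view and not a copy, one can inspect it as below. This is a small sketch using the same `a` and `b` as above; the stride values are written in terms of `a.itemsize` because the default integer size depends on the platform.

```python
import numpy as np

a = np.arange(1, 6 + 1)
b = a.reshape(3, 2).T  # transposed view of a, shape (2, 3)

# b owns no data of its own; it reinterprets the buffer of a.
print(np.shares_memory(a, b))

# Moving down a column of b advances one element in memory, moving along
# a row advances two elements, which is the column-major pattern.
print(b.strides, a.itemsize)
print(b.flags.f_contiguous)

# Writing through a is visible through b.
a[1] = 99
print(b[1, 0])
```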

Answer 3

Answered by Matt Hancock

Your question has been answered, but I thought I would add this to explain your observation: "At this point I'm not sure what the F_CONTIGUOUS flag tells us since it does not honor the ordering."

The item method doesn't access the data directly, as you assume it does; it takes a logical (row major) index. To see the memory itself, you should access the data attribute, which exposes the array's raw buffer.

An example:

import numpy as np

c = np.array([[1,2,3],
              [4,6,7]], order='C')

f = np.array([[1,2,3],
              [4,6,7]], order='F')

Observe

print(c.flags.c_contiguous, f.flags.f_contiguous)
# True True

and

print(c.nbytes == c.data.nbytes)
# True

Now let's print the contiguous data for both:

raw = c.data.tobytes(order='A') # Exact copy of the physical memory (Python 3.8+).
nelements = c.size
bsize = c.dtype.itemsize # should be 8 bytes for 'int64'
for i in range(nelements):
    bnum = raw[i*bsize : (i+1)*bsize] # The element as a byte string.
    print(np.frombuffer(bnum, dtype=c.dtype)[0], end=' ') # Convert to number.

This prints:

1 2 3 4 6 7

which is what we expect since c has order 'C', i.e., its data is stored contiguously in row major order.

On the other hand,

raw = f.data.tobytes(order='A') # Exact copy of the physical memory (Python 3.8+).
nelements = f.size
bsize = f.dtype.itemsize # should be 8 bytes for 'int64'
for i in range(nelements):
    bnum = raw[i*bsize : (i+1)*bsize] # The element as a byte string.
    print(np.frombuffer(bnum, dtype=f.dtype)[0], end=' ') # Convert to number.

prints

1 4 2 6 3 7

which, again, is what we expect to see since f's data is stored contiguously in column major order.
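
Another way to see why item(i) returns the same values for both layouts is to look at the strides. The sketch below is my own illustration, not part of numpy's API beyond the calls shown: it converts each logical flat index into a multi-index and then uses the strides to compute where that element sits in memory.

```python
import numpy as np

f = np.array([[1, 2, 3],
              [4, 6, 7]], order='F')

positions = []
for i in range(f.size):
    # item(i) interprets i as a row-major (logical) flat index.
    idx = np.unravel_index(i, f.shape)
    # Strides give the byte step per axis; the dot product with the
    # multi-index is the byte offset of the element in the buffer.
    offset = sum(int(k) * s for k, s in zip(idx, f.strides))
    positions.append(offset // f.itemsize)
    print(i, idx, positions[-1], f.item(i))
```

The logical indices 0..5 map to memory positions 0, 2, 4, 1, 3, 5, so item walks the rows even though the buffer is laid out by columns.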

Answer 4

Answered by cfh

Here is a simple way to print the data in memory order, using the ravel() function:

>>> import numpy as np
>>> a = np.ndarray(shape=(2,3), order='F')
>>> for i in range(6): a.itemset(i, i+1)

>>> print(a.ravel(order='K'))
[ 1.  4.  2.  5.  3.  6.]

This confirms that the array is stored in Fortran order.
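
For comparison, here is a sketch that prints the same array with each ravel order side by side. It fills the array through `a.flat` rather than `itemset`, since I believe newer numpy releases no longer provide `itemset`; treat that substitution as my assumption.

```python
import numpy as np

a = np.empty((2, 3), order='F')
# Assignment through flat follows the logical (row-major) order,
# just like itemset(i, value) did in the session above.
a.flat[:] = np.arange(1, 7)

row_major = a.ravel(order='C')     # logical order
column_major = a.ravel(order='F')  # column by column
memory_order = a.ravel(order='K')  # as laid out in memory

print(row_major)
print(column_major)
print(memory_order)
```

For this Fortran-contiguous array, 'K' and 'F' agree, while 'C' gives the order you see when iterating with item or flat.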

Answer 5

Answered by KamKam

Wanted to add this in the comments but my rep is too low:

While Kill Console's answer gave the OP's required solution, I think it's important to note that, as stated in the numpy.reshape() documentation (https://docs.scipy.org/doc/numpy/reference/generated/numpy.reshape.html):

Note there is no guarantee of the memory layout (C- or Fortran- contiguous) of the returned array.

So even if the view is column-wise, the data itself may not be, which can make calculations that benefit from column-wise storage in memory inefficient. Perhaps:

a = np.array(np.array([1, 2, 3, 4, 5, 6]).reshape(2,3,order='F'), order='F')

provides more of a guarantee that the data is stored column-wise (see the order argument description at https://docs.scipy.org/doc/numpy-1.15.1/reference/generated/numpy.array.html).
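
If you would rather check than assume, the flags tell you what layout you actually got. A minimal sketch, using np.asfortranarray as an alternative to the nested np.array call (my substitution, not something the reshape docs prescribe):

```python
import numpy as np

data = np.array([1, 2, 3, 4, 5, 6])
r = data.reshape(2, 3, order='F')

# Inspect what reshape returned instead of relying on it.
print(r.flags.f_contiguous, np.shares_memory(r, data))

# asfortranarray returns a Fortran-contiguous array, copying only if needed.
g = np.asfortranarray(r)
print(g.flags.f_contiguous)
print(g)
```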
