How to add names to a numpy array without changing its dimension?
Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original source: http://stackoverflow.com/questions/24168569/
Asked by Josh O'Brien
I have an existing two-column numpy array to which I need to add column names. Passing those in via dtype works in the toy example shown in Block 1 below. With my actual array, though, as shown in Block 2, the same approach has an unexpected (to me!) side effect of changing the array dimensions.
How can I convert my actual array, the one named Y in the second block below, to an array having named columns, like I did for array A in the first block?
Block 1: (Columns of A named without reshaping dimension)
import numpy as np
A = np.array(((1,2),(3,4),(50,100)))
A
# array([[ 1, 2],
# [ 3, 4],
# [ 50, 100]])
dt = {'names':['ID', 'Ring'], 'formats':[np.int32, np.int32]}
A.dtype=dt
A
# array([[(1, 2)],
# [(3, 4)],
# [(50, 100)]],
# dtype=[('ID', '<i4'), ('Ring', '<i4')])
Block 2: (Naming columns of my actual array, Y, reshapes its dimension)
import numpy as np
from functools import reduce  # built-in on Python 2, must be imported on Python 3
## Code to reproduce Y, the array I'm actually dealing with
nRings = 3
nn = [[nRings+1-n] * n for n in range(nRings+1)]
RING = reduce(lambda x, y: x+y, nn)
ID = range(1,len(RING)+1)
X = np.array([ID, RING])
Y = X.T
Y
# array([[1, 3],
# [2, 2],
# [3, 2],
# [4, 1],
# [5, 1],
# [6, 1]])
## My unsuccessful attempt to add names to the array's columns
dt = {'names':['ID', 'Ring'], 'formats':[np.int32, np.int32]}
Y.dtype=dt
Y
# array([[(1, 2), (3, 2)],
# [(3, 4), (2, 1)],
# [(5, 6), (1, 1)]],
# dtype=[('ID', '<i4'), ('Ring', '<i4')])
## What I'd like instead of the results shown just above
# array([[(1, 3)],
# [(2, 2)],
# [(3, 2)],
# [(4, 1)],
# [(5, 1)],
# [(6, 1)]],
# dtype=[('ID', '<i4'), ('Ring', '<i4')])
Accepted answer by Bi Rico
First, because your question asks about giving names to arrays, I feel obligated to point out that using "structured arrays" for the purpose of giving names is probably not the best approach. We often like to give names to rows/columns when we're working with tables; if this is the case, I suggest you try something like pandas, which is awesome. If you simply want to organize some data in your code, a dictionary of arrays is often much better than a structured array, so for example you can do:
Y = {'ID':X[0], 'Ring':X[1]}
With that out of the way, if you want to use a structured array, here is the clearest way to do it in my opinion:
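import numpy as np
from functools import reduce  # built-in on Python 2, must be imported on Python 3
nRings = 3
nn = [[nRings+1-n] * n for n in range(nRings+1)]
RING = reduce(lambda x, y: x+y, nn)
ID = range(1,len(RING)+1)
X = np.array([ID, RING])
# np.int was removed from recent NumPy releases; np.int32 matches the question's dtype
dt = {'names':['ID', 'Ring'], 'formats':[np.int32, np.int32]}
Y = np.zeros(len(RING), dtype=dt)
Y['ID'] = X[0]
Y['Ring'] = X[1]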
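As a brief illustration of what such a structured array offers, here is a minimal sketch. It assumes np.int32 for both fields and writes the ID and RING values out literally instead of computing them; the name Y2 is introduced only for this example. It shows access to a column by field name, access to a single record, and a reshape to the (6, 1) layout the question asked for.

```python
import numpy as np

# Literal values standing in for the ID and RING lists computed above.
ID = [1, 2, 3, 4, 5, 6]
RING = [3, 2, 2, 1, 1, 1]

dt = np.dtype([('ID', np.int32), ('Ring', np.int32)])
Y = np.zeros(len(RING), dtype=dt)   # 1-D structured array, shape (6,)
Y['ID'] = ID
Y['Ring'] = RING

print(Y['Ring'])         # every value of the 'Ring' field
print(Y[0])              # the first record, holding both fields
Y2 = Y.reshape(-1, 1)    # hypothetical name: one record per row, shape (6, 1)
print(Y2.shape)
```

Filling the fields one at a time does not depend on how the source data is laid out in memory, which is probably why this approach avoids the reshaping surprise described in the question.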
Answer by jonnybazookatone
Try re-writing the definition of X:
X = np.array(list(zip(ID, RING)))  # list() is needed on Python 3, where zip returns an iterator
and then you don't need to define Y = X.T
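A quick way to see why this can help is sketched below. It assumes the data is stored as 32-bit integers, which is forced here with dtype=np.int32 because the platform default may differ. Building X row by row yields a C-contiguous (6, 2) array, so the two values of each row sit next to each other in memory and can be viewed as one two-field record.

```python
import numpy as np

ID = [1, 2, 3, 4, 5, 6]
RING = [3, 2, 2, 1, 1, 1]

# Each (ID, RING) pair becomes one row, so no transpose is needed.
X = np.array(list(zip(ID, RING)), dtype=np.int32)
print(X.flags['C_CONTIGUOUS'])

dt = np.dtype([('ID', np.int32), ('Ring', np.int32)])
Y = X.view(dt)   # reinterprets each 8-byte row as one (ID, Ring) record
print(Y.shape)
```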
Answer by hgazibara
Are you completely sure about the outputs for A and Y? I get something different using Python 2.7.6 and numpy 1.8.1.
My initial output for A is the same as yours, as it should be. After running the following code for the first example
dt = {'names':['ID', 'Ring'], 'formats':[np.int32, np.int32]}
A.dtype=dt
the contents of array A are actually
array([[(1, 0), (3, 0)],
[(2, 0), (2, 0)],
[(3, 0), (2, 0)],
[(4, 0), (1, 0)],
[(5, 0), (1, 0)],
[(6, 0), (1, 0)]],
dtype=[('ID', '<i4'), ('Ring', '<i4')])
This makes somewhat more sense to me than the output you added, because dtype determines the data type of every element in the array, and the new definition states that every element should contain two fields. So it does, but the value of the second field is set to 0 because there was no preexisting value for the second field.
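One way to probe this behaviour is sketched below. This is an interpretation, not something the answer itself states: the zeros are consistent with the array holding 64-bit integers, in which case each 8-byte element is reinterpreted on its own as two 4-byte fields and the second field receives the upper half of a small number. The little-endian dtypes '<i8' and '<i4' are chosen explicitly for the sketch, and the names A64, A32, V64 and V32 are illustrative only.

```python
import numpy as np

dt = np.dtype([('ID', '<i4'), ('Ring', '<i4')])

# 64-bit storage: each 8-byte element becomes one record by itself.
A64 = np.array([[1, 2], [3, 4], [50, 100]], dtype='<i8')
V64 = A64.view(dt)
print(V64.shape)      # (3, 2)
print(V64['Ring'])    # upper halves of the 64-bit values

# 32-bit storage: two 4-byte elements fill one 8-byte record.
A32 = A64.astype('<i4')
V32 = A32.view(dt)
print(V32.shape)      # (3, 1)
```

Under that assumption, the difference between the two reported outputs would come down to the default integer size on each machine.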
However, if you would like to make numpy group columns of your existing array so that every row contains only one element, but with each element having two fields, you could introduce a small code change.
Since a tuple is needed to make numpy group elements into a more complex data-type, you could make this happen by creating a new array and turning every row of the existing array into a tuple. Here is a simple working example
import numpy as np
A = np.array(((1,2),(3,4),(50,100)))
dt = np.dtype([('ID', np.int32), ('Ring', np.int32)])
B = np.array(list(map(tuple, A)), dtype=dt)
Using this short piece of code, array B becomes
array([(1, 2), (3, 4), (50, 100)],
dtype=[('ID', '<i4'), ('Ring', '<i4')])
To make B a 2D array, it is enough to write
B.reshape(len(B), 1) # in this case, even B.size would work instead of len(B)
For the second example, the similar thing needs to be done to make Y a structured array:
Y = np.array(list(map(tuple, X.T)), dtype=dt)
After doing this for your second example, array Y looks like this
array([(1, 3), (2, 2), (3, 2), (4, 1), (5, 1), (6, 1)],
dtype=[('ID', '<i4'), ('Ring', '<i4')])
You can notice that the output is not the same as the one you expect it to be, but this one is simpler because instead of writing Y[0,0] to get the first element, you can just write Y[0]. To also make this array 2D, you can use reshape, just as with B.
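As a possible alternative to converting every row into a tuple, newer NumPy releases ship a helper for this conversion. The sketch below assumes NumPy 1.16 or later, where numpy.lib.recfunctions.unstructured_to_structured is available. It is not part of the original answer, and the name Y2 is used only for illustration.

```python
import numpy as np
from numpy.lib import recfunctions as rfn

X = np.array([[1, 2, 3, 4, 5, 6],
              [3, 2, 2, 1, 1, 1]])
dt = np.dtype([('ID', np.int32), ('Ring', np.int32)])

# The last axis of X.T (length 2) is mapped onto the two fields of dt.
Y = rfn.unstructured_to_structured(X.T, dtype=dt)
print(Y.shape)

# Same reshape as with B, to get one record per row.
Y2 = Y.reshape(len(Y), 1)
print(Y2.shape)
```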
Answer by HYRY
This is because Y is not C_CONTIGUOUS; you can check it with Y.flags:
C_CONTIGUOUS : False
F_CONTIGUOUS : True
OWNDATA : False
WRITEABLE : True
ALIGNED : True
UPDATEIFCOPY : False
You can call Y.copy() or Y.ravel() first:
dt = {'names':['ID', 'Ring'], 'formats':[np.int32, np.int32]}
# The shapes below assume Y holds 32-bit integers
print(Y.ravel().view(dt))  # the result shape is (6, )
print(Y.copy().view(dt))   # the result shape is (6, 1)
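The contiguity point can be checked end to end with a small sketch. It assumes 32-bit storage, forced here with dtype=np.int32, and uses np.ascontiguousarray as one more way to obtain a row-major copy; the names Yc and named are illustrative only.

```python
import numpy as np

X = np.array([[1, 2, 3, 4, 5, 6],
              [3, 2, 2, 1, 1, 1]], dtype=np.int32)
Y = X.T   # a transposed view: same memory, column-major from Y's point of view
print(Y.flags['C_CONTIGUOUS'], Y.flags['F_CONTIGUOUS'])

dt = np.dtype([('ID', np.int32), ('Ring', np.int32)])
Yc = np.ascontiguousarray(Y)   # row-major copy, so each row's values are adjacent
named = Yc.view(dt)            # each 8-byte row becomes one (ID, Ring) record
print(named.shape)
```

Once each row occupies adjacent bytes, the view pairs up the values that belong to the same row instead of neighbouring values from the same column.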
Answer by lX-Xl
store-different-datatypes-in-one-numpy-array is another page that includes a nice solution for adding names to an array so that they can be used as columns. Example:
r = np.rec.fromarrays([x1, x2, x3], names='a,b,c')  # np.rec is the public alias of np.core.records
# x1, x2, x3 are flattened (1-D) arrays
# a, b, c are the field names