Numpy reshape 1d to 2d array with 1 column
Original question: http://stackoverflow.com/questions/36009907/
Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow
Asked by DevShark
In numpy, the dimensions of the resulting array vary at run time. There is often confusion between a 1d array and a 2d array with 1 column. In one case I can iterate over the columns, in the other case I cannot.
How do you solve that problem elegantly? To avoid littering my code with if statements that check the dimensionality, I use this function:
def reshape_to_vect(ar):
    if len(ar.shape) == 1:
        return ar.reshape(ar.shape[0], 1)
    return ar
However, this feels inelegant and costly. Is there a better solution?
Answer by KaaPex
The simplest way:
ar.reshape(-1, 1)
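One caveat is worth checking before adopting this: reshape(-1, 1) always yields exactly one column, so a 2D input gets flattened rather than left alone, which differs from the function in the question. A small sketch (the array names are made up for illustration):

```python
import numpy as np

vec = np.arange(5)                  # 1D input, shape (5,)
mat = np.arange(12).reshape(3, 4)   # 2D input, shape (3, 4)

col_from_vec = vec.reshape(-1, 1)   # 1D input becomes a single column
col_from_mat = mat.reshape(-1, 1)   # 2D input is flattened into one column too

print(col_from_vec.shape)  # (5, 1)
print(col_from_mat.shape)  # (12, 1)
```

If 2D inputs must pass through unchanged, this form is probably not a drop-in replacement.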
Answer by Divakar
You could do:
ar.reshape(ar.shape[0],-1)
The second argument to reshape, -1, lets numpy work out the number of elements along the second axis. Thus, for a 2D input, nothing changes. For a 1D input, it creates a 2D array with all elements "pushed" onto the first axis, because ar.shape[0] is the total number of elements.
Sample runs
1D case:
In [87]: ar
Out[87]: array([ 0.80203158, 0.25762844, 0.67039516, 0.31021513, 0.80701097])
In [88]: ar.reshape(ar.shape[0],-1)
Out[88]:
array([[ 0.80203158],
[ 0.25762844],
[ 0.67039516],
[ 0.31021513],
[ 0.80701097]])
2D case:
In [82]: ar
Out[82]:
array([[ 0.37684126, 0.16973899, 0.82157815, 0.38958523],
[ 0.39728524, 0.03952238, 0.04153052, 0.82009233],
[ 0.38748174, 0.51377738, 0.40365096, 0.74823535]])
In [83]: ar.reshape(ar.shape[0],-1)
Out[83]:
array([[ 0.37684126, 0.16973899, 0.82157815, 0.38958523],
[ 0.39728524, 0.03952238, 0.04153052, 0.82009233],
[ 0.38748174, 0.51377738, 0.40365096, 0.74823535]])
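As a rough sketch of how this could be wrapped for reuse (as_2d is a hypothetical name, not something from the answer), the same call keeps the first axis and folds any remaining axes into the second one. That means inputs with more than two dimensions are flattened along their trailing axes:

```python
import numpy as np

def as_2d(ar):
    """Hypothetical helper: keep the first axis, fold the rest into the second."""
    ar = np.asarray(ar)
    return ar.reshape(ar.shape[0], -1)

print(as_2d(np.zeros(5)).shape)          # (5, 1)
print(as_2d(np.zeros((3, 4))).shape)     # (3, 4)
print(as_2d(np.zeros((2, 3, 4))).shape)  # (2, 12)
```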
Answer by Luca Citi
A variant of Divakar's answer is x = np.reshape(x, (len(x),-1)), which also handles the case where the input is a 1d or 2d list.
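A short illustration of the list case, with made-up sample data. np.reshape accepts any array-like input, and len() works on plain lists, so no prior conversion is needed:

```python
import numpy as np

flat_list = [1.0, 2.0, 3.0]
nested_list = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

a = np.reshape(flat_list, (len(flat_list), -1))      # list of 3 numbers
b = np.reshape(nested_list, (len(nested_list), -1))  # list of 3 rows

print(a.shape)  # (3, 1)
print(b.shape)  # (3, 2)
```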
Answer by Yuval Atzmon
To avoid the need to reshape in the first place: if you slice a row or column with a list, or with a "running" slice (a range such as 0:1), you get a 2D array with one row or column:
import numpy as np
x = np.random.normal(size=(4, 4))
print(x, '\n')
Result:
[[ 0.01360395 1.12130368 0.95429414 0.56827029]
 [-0.66592215 1.04852182 0.20588886 0.37623406]
 [ 0.9440652 0.69157556 0.8252977 -0.53993904]
 [ 0.6437994 0.32704783 0.52523173 0.8320762 ]]
# Select a column with a list index
y = x[:, [0]]
print(y, 'col vector \n')
Result:
[[ 0.01360395]
 [-0.66592215]
 [ 0.9440652 ]
 [ 0.6437994 ]] col vector
# Select a row with a list index
y = x[[0], :]
print(y, 'row vector \n')
Result:
[[ 0.01360395 1.12130368 0.95429414 0.56827029]] row vector
# Slice with "running" index on a column
y = x[:, 0:1]
print(y, '\n')
Result:
[[ 0.01360395]
 [-0.66592215]
 [ 0.9440652 ]
 [ 0.6437994 ]]
If you instead use a single integer to select the row or column, the result is a 1D array, which is the root cause of your issue:
y = x[:, 0]
print(y, '\n')
Result:
[ 0.01360395 -0.66592215 0.9440652 0.6437994 ]
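The three indexing styles can be compared side by side by looking at shapes only. The sketch below uses a deterministic array instead of random data, and adds one detail the answer does not mention: a list index goes through advanced indexing and copies the data, while a range slice returns a view.

```python
import numpy as np

x = np.arange(16.0).reshape(4, 4)

col_list = x[:, [0]]    # list index: 2D column, but a copy of the data
col_slice = x[:, 0:1]   # range slice: 2D column, a view into x
col_int = x[:, 0]       # integer index: the axis is dropped, result is 1D

print(col_list.shape, col_slice.shape, col_int.shape)  # (4, 1) (4, 1) (4,)
print(np.shares_memory(x, col_list))   # False
print(np.shares_memory(x, col_slice))  # True
```

So when the column is large or is modified in place, the range slice is likely the better fit.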
Answer by Murtaza Chawala
y = np.array(12)
y = y.reshape(-1,1)
print(y.shape)
Output: (1, 1)
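This example uses a 0-d array, which appears to be a case where the two reshape forms in this thread behave differently. A minimal check, assuming the input really is 0-d (shape ()):

```python
import numpy as np

scalar = np.array(12)                  # 0-d array, shape ()
as_column = scalar.reshape(-1, 1)      # works: one element, one column
print(as_column.shape)                 # (1, 1)

shape0_failed = False
try:
    scalar.reshape(scalar.shape[0], -1)   # shape is an empty tuple
except IndexError as exc:
    shape0_failed = True
    print("shape[0] does not exist for a 0-d array:", exc)
```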
Answer by hpaulj
I asked about dtype because your example is puzzling.
I can make a structured array with 3 elements (1d) and 3 fields:
In [1]: A = np.ones((3,), dtype='i,i,i')
In [2]: A
Out[2]:
array([(1, 1, 1), (1, 1, 1), (1, 1, 1)],
dtype=[('f0', '<i4'), ('f1', '<i4'), ('f2', '<i4')])
I can access one field by name (adding brackets doesn't change the shape):
In [3]: A['f0'].shape
Out[3]: (3,)
But if I access 2 fields, I still get a 1d array:
In [4]: A[['f0','f1']].shape
Out[4]: (3,)
In [5]: A[['f0','f1']]
Out[5]:
array([(1, 1), (1, 1), (1, 1)],
dtype=[('f0', '<i4'), ('f1', '<i4')])
Actually, those extra brackets do matter if I look at the values:
In [22]: A['f0']
Out[22]: array([1, 1, 1], dtype=int32)
In [23]: A[['f0']]
Out[23]:
array([(1,), (1,), (1,)],
dtype=[('f0', '<i4')])
If the array is a simple 2d one, I still don't get your shapes:
In [24]: A=np.ones((3,3),int)
In [25]: A[0].shape
Out[25]: (3,)
In [26]: A[[0]].shape
Out[26]: (1, 3)
In [27]: A[[0,1]].shape
Out[27]: (2, 3)
But as to the question of making sure an array is 2d, regardless of whether the indexing returns 1d or 2d, your function is basically ok:
def reshape_to_vect(ar):
    if len(ar.shape) == 1:
        return ar.reshape(ar.shape[0], 1)
    return ar
You could test ar.ndim instead of len(ar.shape). Either way it is not costly: the execution time is minimal because no big array operations are involved. reshape doesn't copy data (unless your strides are weird), so the only cost is creating a new array object with a shared data pointer.
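The no-copy claim can be checked with np.shares_memory. The sketch below also shows one case of "weird" strides, a transposed array, where reshape has to fall back to a copy (the variable names are illustrative only):

```python
import numpy as np

ar = np.arange(6)
col = ar.reshape(ar.shape[0], 1)
print(np.shares_memory(ar, col))     # True: col is a view on ar's buffer

col[0, 0] = 99                       # writing through the view...
print(ar[0])                         # 99: ...changes the original

odd = np.arange(6).reshape(2, 3).T   # non-contiguous (3, 2) view
flat = odd.reshape(-1)               # cannot be expressed with strides alone
print(np.shares_memory(odd, flat))   # False: reshape made a copy here
```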
Look at the code for np.atleast_2d; it tests for 0d and 1d. In the 1d case it returns result = ary[newaxis,:]. It adds the extra axis first, which is the more natural numpy location for adding an axis. You add it at the end.
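To make the difference in axis placement concrete, here is a small comparison of np.atleast_2d with adding the axis at the end:

```python
import numpy as np

ar = np.arange(3)

row = np.atleast_2d(ar)        # new axis goes first: one row
col = ar[:, np.newaxis]        # new axis goes last: one column

print(row.shape)                             # (1, 3)
print(col.shape)                             # (3, 1)
print(np.atleast_2d(7).shape)                # (1, 1), the 0d case
print(np.atleast_2d(np.ones((2, 3))).shape)  # (2, 3), 2d passes through
```

So np.atleast_2d is only a direct substitute when a row vector is acceptable; for a column, a transpose or an explicit trailing axis is still needed.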
ar.reshape(ar.shape[0],-1) is a clever way of bypassing the if test. In small timing tests it is faster, but we are talking about microseconds, roughly the cost of one function call layer.
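One way such a timing comparison might be set up is with the standard timeit module. This is only a sketch: the array size and repeat count are arbitrary choices, and the numbers will vary by machine and numpy version, so no expected output is shown.

```python
import timeit
import numpy as np

def reshape_to_vect(ar):
    # Same logic as the function in the question, using ndim for the test
    if ar.ndim == 1:
        return ar.reshape(ar.shape[0], 1)
    return ar

ar = np.arange(1000.0)
n = 10_000  # arbitrary number of repetitions

t_if = timeit.timeit(lambda: reshape_to_vect(ar), number=n)
t_reshape = timeit.timeit(lambda: ar.reshape(ar.shape[0], -1), number=n)

print(f"if-test helper:  {t_if / n * 1e6:.3f} us per call")
print(f"reshape with -1: {t_reshape / n * 1e6:.3f} us per call")
```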
np.column_stack is another function that creates column arrays if needed. It uses:
if arr.ndim < 2:
    arr = array(arr, copy=False, subok=True, ndmin=2).T
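A brief usage sketch with made-up inputs: passing a single 1d array to np.column_stack gives a one-column 2d array, and the ndmin=2 plus transpose idiom from the snippet above produces the same shape.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

one_col = np.column_stack((a,))     # a single 1d input becomes one column
two_cols = np.column_stack((a, b))  # each 1d input becomes its own column
manual = np.array(a, ndmin=2).T     # (1, 3) after ndmin=2, then transposed

print(one_col.shape)   # (3, 1)
print(two_cols.shape)  # (3, 2)
print(manual.shape)    # (3, 1)
```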