Original source: http://stackoverflow.com/questions/15815854/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
How to add column to numpy array
Asked by user2130951
I am trying to add one column to the array created by recfromcsv. In this case it is an array of shape [210, 8] (rows, cols).
I want to add a ninth column. It doesn't matter whether it is empty or filled with zeros.
from numpy import genfromtxt
from numpy import recfromcsv
import numpy as np
import time
if __name__ == '__main__':
    print("testing")
    my_data = recfromcsv('LIAB.ST.csv', delimiter='\t')
    array_size = my_data.size
    #my_data = np.append(my_data[:array_size],my_data[9:],0)
    new_col = np.sum(x,1).reshape((x.shape[0],1))
    np.append(x,new_col,1)
Accepted answer by askewchan
I think that your problem is that you are expecting np.append to add the column in-place, but what it does, because of how numpy data is stored, is create a copy of the joined arrays:
Returns
-------
append : ndarray
    A copy of `arr` with `values` appended to `axis`. Note that `append`
    does not occur in-place: a new array is allocated and filled. If
    `axis` is None, `out` is a flattened array.
So you need to save the output, as in all_data = np.append(...):
my_data = np.random.random((210,8)) #recfromcsv('LIAB.ST.csv', delimiter='\t')
new_col = my_data.sum(1)[...,None] # None keeps (n, 1) shape
new_col.shape
#(210,1)
all_data = np.append(my_data, new_col, 1)
all_data.shape
#(210,9)
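To make the copy behaviour concrete, here is a minimal sketch with small made-up arrays (the names a and col are only for illustration and do not come from the question). It shows that calling np.append without keeping the return value leaves the original array untouched:

```python
import numpy as np

a = np.zeros((3, 2))
col = np.ones((3, 1))

# The result is discarded here, so `a` keeps its original shape.
np.append(a, col, 1)
print(a.shape)  # (3, 2)

# Rebinding the name to the returned copy is what "adds" the column.
a = np.append(a, col, 1)
print(a.shape)  # (3, 3)
```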
Alternative ways:
all_data = np.hstack((my_data, new_col))
#or
all_data = np.concatenate((my_data, new_col), 1)
I believe that the only difference between these three functions (as well as np.vstack) is their default behavior when axis is unspecified:
- concatenate assumes axis = 0
- hstack assumes axis = 1 unless inputs are 1d, then axis = 0
- vstack assumes axis = 0 after adding an axis if inputs are 1d
- append flattens array
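These defaults can be checked on small arrays. The following is a sketch using made-up inputs (a, b, v, w are illustrative names), comparing the output shapes when no axis is passed:

```python
import numpy as np

a = np.zeros((2, 3))
b = np.ones((2, 3))

# 2-D inputs, no axis argument given
print(np.concatenate((a, b)).shape)  # (4, 3): joined along axis 0
print(np.hstack((a, b)).shape)       # (2, 6): joined along axis 1
print(np.vstack((a, b)).shape)       # (4, 3): joined along axis 0
print(np.append(a, b).shape)         # (12,): both inputs are flattened first

# 1-D inputs
v = np.arange(3)
w = np.arange(3)
print(np.hstack((v, w)).shape)  # (6,): axis 0 for 1-D inputs
print(np.vstack((v, w)).shape)  # (2, 3): each input becomes a row
```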
Based on your comment, and looking more closely at your example code, I now believe that what you are probably looking to do is add a field to a record array. You imported both genfromtxt, which returns a structured array, and recfromcsv, which returns the subtly different record array (recarray). You used recfromcsv, so right now my_data is actually a recarray, which means that most likely my_data.shape = (210,), since recarrays are 1d arrays of records, where each record is a tuple with the given dtype.
So you could try this:
import numpy as np
from numpy.lib.recfunctions import append_fields
x = np.random.random(10)
y = np.random.random(10)
z = np.random.random(10)
data = np.array( list(zip(x,y,z)), dtype=[('x',float),('y',float),('z',float)])
data = np.recarray(data.shape, data.dtype, buf=data)
data.shape
#(10,)
tot = data['x'] + data['y'] + data['z'] # sum(axis=1) won't work on recarray
tot.shape
#(10,)
all_data = append_fields(data, 'total', tot, usemask=False)
all_data
#array([(0.4374783740738456 , 0.04307289878861764, 0.021176067323686598, 0.5017273401861498),
# (0.07622262416466963, 0.3962146058689695 , 0.27912715826653534 , 0.7515643883001745),
# (0.30878532523061153, 0.8553768789387086 , 0.9577415585116588 , 2.121903762680979 ),
# (0.5288343561208022 , 0.17048864443625933, 0.07915689716226904 , 0.7784798977193306),
# (0.8804269791375121 , 0.45517504750917714, 0.1601389248542675 , 1.4957409515009568),
# (0.9556552723429782 , 0.8884504475901043 , 0.6412854758843308 , 2.4853911958174133),
# (0.0227638618687922 , 0.9295332854783015 , 0.3234597575660103 , 1.275756904913104 ),
# (0.684075052174589 , 0.6654774682866273 , 0.5246593820025259 , 1.8742119024637423),
# (0.9841793718333871 , 0.5813955915551511 , 0.39577520705133684 , 1.961350170439875 ),
# (0.9889343795296571 , 0.22830104497714432, 0.20011292764078448 , 1.4173483521475858)],
# dtype=[('x', '<f8'), ('y', '<f8'), ('z', '<f8'), ('total', '<f8')])
all_data.shape
#(10,)
all_data.dtype.names
#('x', 'y', 'z', 'total')
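Applied to a case closer to the question, the same idea might look like the sketch below. It assumes a small in-memory tab-delimited text as a stand-in for the real file, reads it with genfromtxt rather than recfromcsv, and uses a made-up field name "extra" for the new zero-filled column; none of these details come from the original post.

```python
import io
import numpy as np
from numpy.lib.recfunctions import append_fields

# Hypothetical stand-in for a tab-delimited file with a header row
csv_text = "a\tb\tc\n1.0\t2.0\t3.0\n4.0\t5.0\t6.0\n"

# names=True takes the field names from the first line,
# so the result is a 1-D structured array with one record per data row
data = np.genfromtxt(io.StringIO(csv_text), delimiter="\t", names=True)
print(data.shape)  # (2,)

# One zero per record, appended as a new field called "extra" (assumed name)
zeros = np.zeros(data.shape[0])
all_data = append_fields(data, "extra", zeros, usemask=False)
print(all_data.dtype.names)  # ('a', 'b', 'c', 'extra')
```

Note that the result is still 1-D: the new "column" is a field of each record, not a second array axis.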
Answer by atomh33ls
If you have an array a of, say, 210 rows by 8 columns:
import numpy
a = numpy.empty([210,8])
and want to add a ninth column of zeros, you can do this:
b = numpy.append(a,numpy.zeros([len(a),1]),1)
Answer by Tomas
I add a new column of ones to a matrix in this way:
Z = append([[1 for _ in range(0,len(Z))]], Z.T,0).T
Maybe it is not that efficient?
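As a possible simplification, the transpose round trip can likely be avoided by stacking a column of ones directly. The sketch below uses a small made-up Z and compares the two approaches; it is one option among several, not a benchmark.

```python
import numpy as np

Z = np.arange(6.0).reshape(3, 2)

# Approach from the answer: prepend a row of ones to Z.T, then transpose back
via_append = np.append([[1 for _ in range(0, len(Z))]], Z.T, 0).T

# Possible alternative: build the (rows, 1) column of ones and stack it on the left
via_hstack = np.hstack((np.ones((Z.shape[0], 1)), Z))

print(via_hstack.shape)                       # (3, 3)
print(np.array_equal(via_append, via_hstack)) # True
```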
Answer by aderchox
It can be done like this:
import numpy as np
# create a random matrix:
A = np.random.normal(size=(5,2))
# add a column of zeros to it:
print(np.hstack((A,np.zeros((A.shape[0],1)))))
In general, if A is an m*n matrix and you need to add a column, you have to create an m*1 matrix of zeros, then use hstack to append that matrix of zeros to the right of A.
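The general rule can be wrapped in a small helper. This is only a sketch: add_zero_column is a hypothetical name, not a numpy function, and it assumes A is a 2-D array.

```python
import numpy as np

def add_zero_column(A):
    """Return a copy of the 2-D array A with one column of zeros added on the right."""
    m = A.shape[0]  # number of rows; the new column must have the same count
    return np.hstack((A, np.zeros((m, 1), dtype=A.dtype)))

A = np.random.normal(size=(5, 2))
B = add_zero_column(A)
print(B.shape)  # (5, 3)
```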

