Python: How to add a column to a numpy array

Question

Asked by user2130951

I am trying to add one column to the array created from recfromcsv. In this case it's an array of shape [210, 8] (rows, cols).

I want to add a ninth column. Empty or with zeroes doesn't matter.

from numpy import genfromtxt
from numpy import recfromcsv
import numpy as np
import time

if __name__ == '__main__':
    print("testing")
    my_data = recfromcsv('LIAB.ST.csv', delimiter='\t')
    array_size = my_data.size
    #my_data = np.append(my_data[:array_size],my_data[9:],0)

    new_col = np.sum(x,1).reshape((x.shape[0],1))
    np.append(x,new_col,1)

Answer 1

Accepted answer by askewchan

I think your problem is that you are expecting np.append to add the column in place. Because of how numpy data is stored, what it actually does is create a copy of the joined arrays:

Returns
-------
append : ndarray
    A copy of `arr` with `values` appended to `axis`.  Note that `append`
    does not occur in-place: a new array is allocated and filled.  If
    `axis` is None, `out` is a flattened array.

So you need to save the output, all_data = np.append(...):

my_data = np.random.random((210,8)) #recfromcsv('LIAB.ST.csv', delimiter='\t')
new_col = my_data.sum(1)[...,None] # None keeps (n, 1) shape
new_col.shape
#(210,1)
all_data = np.append(my_data, new_col, 1)
all_data.shape
#(210,9)

Alternative ways:

all_data = np.hstack((my_data, new_col))
#or
all_data = np.concatenate((my_data, new_col), 1)

I believe that the only difference between these three functions (as well as np.vstack) is their default behavior when axis is unspecified:

concatenate assumes axis = 0
hstack assumes axis = 1 unless inputs are 1d, then axis = 0
vstack assumes axis = 0 after adding an axis if inputs are 1d
append flattens array
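
The defaults listed above can be checked on small arrays. This is only an illustrative sketch; the array names (a, b, u, v and the result names) are made up for the example and do not come from the question.

```python
import numpy as np

a = np.zeros((3, 2))
b = np.ones((3, 1))

# concatenate defaults to axis=0, so joining columns needs axis=1 spelled out
by_concat = np.concatenate((a, b), axis=1)
# hstack joins 2-D inputs along axis 1
by_hstack = np.hstack((a, b))
# append without an axis flattens both inputs before joining them
flat = np.append(a, b)

# with 1-D inputs, hstack joins end to end and vstack stacks them as rows
u = np.array([1, 2, 3])
v = np.array([4, 5, 6])
h1d = np.hstack((u, v))
v1d = np.vstack((u, v))

print(by_concat.shape, by_hstack.shape, flat.shape, h1d.shape, v1d.shape)
```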

Based on your comment, and looking more closely at your example code, I now believe that what you are probably looking to do is add a field to a record array. You imported both genfromtxt, which returns a structured array, and recfromcsv, which returns the subtly different record array (recarray). You used recfromcsv, so right now my_data is actually a recarray, which means that most likely my_data.shape = (210,), since recarrays are 1d arrays of records, where each record is a tuple with the given dtype.

So you could try this:

import numpy as np
from numpy.lib.recfunctions import append_fields
x = np.random.random(10)
y = np.random.random(10)
z = np.random.random(10)
data = np.array( list(zip(x,y,z)), dtype=[('x',float),('y',float),('z',float)])
data = np.recarray(data.shape, data.dtype, buf=data)
data.shape
#(10,)
tot = data['x'] + data['y'] + data['z'] # sum(axis=1) won't work on recarray
tot.shape
#(10,)
all_data = append_fields(data, 'total', tot, usemask=False)
all_data
#array([(0.4374783740738456 , 0.04307289878861764, 0.021176067323686598, 0.5017273401861498),
#       (0.07622262416466963, 0.3962146058689695 , 0.27912715826653534 , 0.7515643883001745),
#       (0.30878532523061153, 0.8553768789387086 , 0.9577415585116588  , 2.121903762680979 ),
#       (0.5288343561208022 , 0.17048864443625933, 0.07915689716226904 , 0.7784798977193306),
#       (0.8804269791375121 , 0.45517504750917714, 0.1601389248542675  , 1.4957409515009568),
#       (0.9556552723429782 , 0.8884504475901043 , 0.6412854758843308  , 2.4853911958174133),
#       (0.0227638618687922 , 0.9295332854783015 , 0.3234597575660103  , 1.275756904913104 ),
#       (0.684075052174589  , 0.6654774682866273 , 0.5246593820025259  , 1.8742119024637423),
#       (0.9841793718333871 , 0.5813955915551511 , 0.39577520705133684 , 1.961350170439875 ),
#       (0.9889343795296571 , 0.22830104497714432, 0.20011292764078448 , 1.4173483521475858)], 
#      dtype=[('x', '<f8'), ('y', '<f8'), ('z', '<f8'), ('total', '<f8')])
all_data.shape
#(10,)
all_data.dtype.names
#('x', 'y', 'z', 'total')
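
For intuition, adding a field amounts to building a new array with a wider dtype and copying each field across. The sketch below does this by hand on a tiny structured array. It is not how append_fields is implemented, it assumes a simple flat dtype without nested or padded fields, and the names new_dtype and out are made up for the example.

```python
import numpy as np

# a small structured array standing in for the loaded data
data = np.array([(1.0, 2.0), (3.0, 4.0)], dtype=[('x', float), ('y', float)])

# extend the dtype description with one extra field
new_dtype = np.dtype(data.dtype.descr + [('total', float)])

# allocate a new array and copy the existing fields over by name
out = np.zeros(data.shape, dtype=new_dtype)
for name in data.dtype.names:
    out[name] = data[name]

# fill the new field
out['total'] = data['x'] + data['y']

print(out.dtype.names, out.shape)
```

As with np.append, the result is a new array; the original data is left unchanged.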

Answer 2

Answer by atomh33ls

If you have an array a of, say, 210 rows by 8 columns:

import numpy
a = numpy.empty([210,8])

and want to add a ninth column of zeros, you can do this:

b = numpy.append(a,numpy.zeros([len(a),1]),1)

Answer 3

Answer by Tomas

I add a new column of ones to a matrix array in this way:

from numpy import append
Z = append([[1 for _ in range(0,len(Z))]], Z.T,0).T  # the ones become the first column

Maybe it is not that efficient?
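
For comparison, here is a sketch that builds the same result with np.ones and np.hstack, which avoids the Python-level list and the two transposes. The sample matrix Z is made up for the example, and the sketch only checks that the results match; it does not measure speed.

```python
import numpy as np

# a small stand-in for the matrix Z
Z = np.arange(6.0).reshape(3, 2)

# the transpose-based approach from this answer
via_transpose = np.append([[1 for _ in range(0, len(Z))]], Z.T, 0).T

# the same result: a column of ones stacked to the left of Z
via_hstack = np.hstack((np.ones((len(Z), 1)), Z))

same = np.array_equal(via_transpose, via_hstack)
print(same)
```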

Answer 4

Answer by aderchox

It can be done like this:

import numpy as np

# create a random matrix:
A = np.random.normal(size=(5,2))

# add a column of zeros to it:
print(np.hstack((A,np.zeros((A.shape[0],1)))))

In general, if A is an m*n matrix and you need to add a column, you have to create an m*1 matrix of zeros (one entry per row of A), then use "hstack" to add the matrix of zeros to the right of the matrix A.
