Python: changing a numpy array to float
Note: this page is a translated copy of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me): StackOverflow
Original source: http://stackoverflow.com/questions/32207474/
changing numpy array to float
Asked by MAS
I have a numpy array of type object. I want to find the columns with numerical values and cast them to float. I also want to find the indices of the columns with object values. This is my attempt:
import numpy as np
import pandas as pd

df = pd.DataFrame({'A' : [1,2,3,4,5],'B' : ['A', 'A', 'C', 'D','B']})
X = df.values.copy()
obj_ind = []
for ind in range(X.shape[1]):
    try:
        # try to convert the column in place
        X[:,ind] = X[:,ind].astype(np.float32)
    except:
        # remember the index of a column that could not be converted
        obj_ind = np.append(obj_ind,ind)
print(obj_ind)
print(X.dtype)
And this is the output I get:
[ 1.]
object
Accepted answer by hpaulj
Generally your idea of trying to apply astype to each column is fine.
In [590]: X[:,0].astype(int)
Out[590]: array([1, 2, 3, 4, 5])
But you have to collect the results in a separate list. You can't just put them back in X: because X has dtype object, anything assigned into it is stored as an object again. That list can then be concatenated.
In [601]: numlist=[]; obj_ind=[]
In [602]: for ind in range(X.shape[1]):
   .....:     try:
   .....:         x = X[:,ind].astype(np.float32)
   .....:         numlist.append(x)
   .....:     except:
   .....:         obj_ind.append(ind)
In [603]: numlist
Out[603]: [array([ 1., 2., 3., 4., 5.], dtype=float32)]
In [604]: np.column_stack(numlist)
Out[604]:
array([[ 1.],
       [ 2.],
       [ 3.],
       [ 4.],
       [ 5.]], dtype=float32)
In [606]: obj_ind
Out[606]: [1]
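The session above can be collected into a small runnable script. This is only a sketch, assuming the DataFrame from the question. The helper name split_numeric_columns is made up for illustration, and catching ValueError/TypeError instead of using a bare except is an assumption, not part of the original answer. It also shows that writing a converted column back into X leaves X with dtype object.

```python
import numpy as np
import pandas as pd


def split_numeric_columns(X):
    """Return (float32 array of the convertible columns, indices of the rest).

    Hypothetical helper, not from the original answer.
    """
    numlist, obj_ind = [], []
    for ind in range(X.shape[1]):
        try:
            numlist.append(X[:, ind].astype(np.float32))
        except (ValueError, TypeError):
            # the column holds something that cannot be parsed as a number
            obj_ind.append(ind)
    # column_stack needs at least one array, so handle the all-object case
    if numlist:
        numeric = np.column_stack(numlist)
    else:
        numeric = np.empty((X.shape[0], 0), dtype=np.float32)
    return numeric, obj_ind


df = pd.DataFrame({'A': [1, 2, 3, 4, 5], 'B': ['A', 'A', 'C', 'D', 'B']})
X = df.values.copy()

# Writing the converted column back leaves X as an object array.
X[:, 0] = X[:, 0].astype(np.float32)
print(X.dtype)

numeric, obj_ind = split_numeric_columns(X)
print(numeric.dtype, numeric.shape)
print(obj_ind)
```

Returning the numeric block and the index list together keeps the two results in sync, so obj_ind can later be used to pull the remaining columns out of X with X[:, obj_ind].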
X is a numpy array with dtype object:
In [582]: X
Out[582]:
array([[1, 'A'],
       [2, 'A'],
       [3, 'C'],
       [4, 'D'],
       [5, 'B']], dtype=object)
You could use the same conversion logic to create a structured array with a mix of int and object fields.
In [616]: ytype=[]
In [617]: for ind in range(X.shape[1]):
   .....:     try:
   .....:         x = X[:,ind].astype(np.float32)
   .....:         ytype.append('i4')
   .....:     except:
   .....:         ytype.append('O')
In [618]: ytype
Out[618]: ['i4', 'O']
In [620]: Y=np.zeros(X.shape[0],dtype=','.join(ytype))
In [621]: for i in range(X.shape[1]):
   .....:     Y[Y.dtype.names[i]] = X[:,i]
In [622]: Y
Out[622]:
array([(1, 'A'), (2, 'A'), (3, 'C'), (4, 'D'), (5, 'B')],
      dtype=[('f0', '<i4'), ('f1', 'O')])
Y['f0'] gives the numeric field.
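As a self-contained sketch of the structured-array idea, assuming the same DataFrame as in the question: the variable names mirror the session above, and the 'i4' field type for convertible columns follows the answer, while the ValueError/TypeError catch is an assumption.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 4, 5], 'B': ['A', 'A', 'C', 'D', 'B']})
X = df.values.copy()

# Pick a field type per column: 'i4' if it converts to a number, 'O' otherwise.
ytype = []
for ind in range(X.shape[1]):
    try:
        X[:, ind].astype(np.float32)
        ytype.append('i4')
    except (ValueError, TypeError):
        ytype.append('O')

# A comma-separated dtype string gives default field names f0, f1, ...
Y = np.zeros(X.shape[0], dtype=','.join(ytype))
for i, name in enumerate(Y.dtype.names):
    Y[name] = X[:, i]

print(Y.dtype.names)
print(Y['f0'])
```

Note that 'i4' truncates any fractional values; if the numeric columns may contain floats, a field type such as 'f4' would be the safer choice.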
Answer by shanmuga
df.dtypes returns a pandas Series that can be operated on further:
# find columns of type int
mask = df.dtypes == int
# select the names of those columns
cols = df.dtypes[mask].index
# select these columns and convert them to float
new_cols_df = df[cols].astype(float)
# replace these columns in the original df
df[new_cols_df.columns] = new_cols_df
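A related sketch, not from the original answer: pandas also provides DataFrame.select_dtypes, which picks up every numeric column regardless of integer or float width, so the result does not depend on what the built-in int maps to on a given platform. The example assumes the DataFrame from the question and also recovers the positions of the non-numeric columns, which the question asked for.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 4, 5], 'B': ['A', 'A', 'C', 'D', 'B']})

# names of all numeric columns (any integer or float width)
num_cols = df.select_dtypes(include='number').columns

# convert just those columns to float; the others are left untouched
df = df.astype({col: float for col in num_cols})

# positions of the remaining, non-numeric columns
obj_ind = [i for i, col in enumerate(df.columns) if col not in num_cols]

print(df.dtypes)
print(obj_ind)
```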
Answer by shanmuga
I think this might help
def func(x):
    a = None
    try:
        a = x.astype(float)
    except:
        # x.name represents the current index value
        # which is column name in this case
        obj.append(x.name)
        a = x
    return a

obj = []
new_df = df.apply(func, axis=0)
This keeps the object columns unchanged and records their names in obj, which you can use later.
Note: When working with a pandas.DataFrame, avoid iterating with an explicit loop, as this is much slower than performing the same operation using apply.
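To check what the apply-based approach produces, here is a small self-contained run, assuming the question's DataFrame. The function name to_float_or_keep, the narrower exception types, and passing the list in as an argument (rather than relying on a global obj) are variations introduced for this sketch, not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 4, 5], 'B': ['A', 'A', 'C', 'D', 'B']})


def to_float_or_keep(col, skipped):
    """Convert a column to float, or record its name and return it unchanged."""
    try:
        return col.astype(float)
    except (ValueError, TypeError):
        skipped.append(col.name)
        return col


obj = []
# axis=0 passes each column to the function as a Series
new_df = df.apply(to_float_or_keep, axis=0, args=(obj,))

print(new_df.dtypes)
print(obj)
```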