Python: How to convert a string into a float value in a DataFrame
Original URL: http://stackoverflow.com/questions/30121181/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
How to convert string into float value in the dataframe
Asked by Ashim Sinha
We get an error when a column has the string datatype and holds values like these:
col1  col2
1     .89
So, when we run the following code:
def azureml_main(dataframe1 = None, dataframe2 = None):
    # Execution logic goes here
    print('Input pandas.DataFrame #1:')
    import pandas as pd
    import numpy as np
    from sklearn.kernel_approximation import RBFSampler
    # Select the feature columns by position
    x = dataframe1.iloc[:, 2:1080]
    print(x)
    # Flatten a single column into a 1-D array
    df1 = dataframe1[['colname']]
    change = np.array(df1)
    b = change.ravel()
    print(b)
    rbf_feature = RBFSampler(gamma=1, n_components=100, random_state=1)
    print(rbf_feature)
    print("test")
    # Fit the RBF sampler on the selected columns
    X_features = rbf_feature.fit_transform(x)
After this, we get an error saying that a non-int value cannot be converted into type float.
Answered by EdChum
Use astype(float), e.g.:
df['col'] = df['col'].astype(float)
or convert_objects:
df = df.convert_objects(convert_numeric=True)
Example:
In [379]:
df = pd.DataFrame({'a':['1.23', '0.123']})
df.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 2 entries, 0 to 1
Data columns (total 1 columns):
a 2 non-null object
dtypes: object(1)
memory usage: 32.0+ bytes
In [380]:
df['a'].astype(float)
Out[380]:
0 1.230
1 0.123
Name: a, dtype: float64
In [382]:
df = df.convert_objects(convert_numeric=True)
df.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 2 entries, 0 to 1
Data columns (total 1 columns):
a 2 non-null float64
dtypes: float64(1)
memory usage: 32.0 bytes
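To tie this back to the question, here is a minimal sketch of astype(float) on data shaped like the asker's. The frame contents and the second row are made up for illustration. Only the column names col1 and col2 and the value .89 come from the question.

```python
import pandas as pd

# Hypothetical frame resembling the question's data: numbers stored as strings.
df = pd.DataFrame({'col1': ['1', '2'], 'col2': ['.89', '1.5']})

# Convert a single column.
df['col2'] = df['col2'].astype(float)

# Convert several columns in one call by passing a column-to-dtype mapping.
cols = ['col1', 'col2']
df = df.astype({c: float for c in cols})

print(df.dtypes)
```

Note that astype(float) raises an error if any value in the column cannot be parsed as a number, so it fits best when the strings are known to be clean.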
UPDATE
If you're running version 0.17.0 or later, then convert_objects has been deprecated in favor of the functions to_numeric, to_datetime, and to_timedelta, so instead of:
df['col'] = df['col'].astype(float)
you can do:
df['col'] = pd.to_numeric(df['col'])
Note that by default any non-convertible value will raise an error. If you want such values to be forced to NaN instead, then do:
df['col'] = pd.to_numeric(df['col'], errors='coerce')
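The difference between the two modes is easy to check on a small example. This is only a sketch: the column contents, including the bad value 'abc', are invented for illustration.

```python
import pandas as pd

# Hypothetical column with one value that cannot be parsed as a number.
df = pd.DataFrame({'col': ['1.23', '.89', 'abc']})

# Default behaviour: the unparseable string raises a ValueError.
try:
    pd.to_numeric(df['col'])
    raised = False
except ValueError:
    raised = True

# errors='coerce': the unparseable string becomes NaN instead.
converted = pd.to_numeric(df['col'], errors='coerce')
n_bad = int(converted.isna().sum())

print(raised, n_bad)
print(converted)
```

Counting the NaN values after coercion is a quick way to see how many entries failed to convert. Those NaN values will typically still need to be dropped or filled before the data is passed to a scikit-learn transformer such as the RBFSampler in the question.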