pandas: how to convert all string values to float
Source: http://stackoverflow.com/questions/32792955/
License: this page is a translation of a popular StackOverflow question, provided under CC BY-SA 4.0. You are free to use and share it, but you must keep the same license and attribute it to the original authors on StackOverflow.
Asked by GoingMyWay
I want to convert all the string values in a Pandas DataFrame to float. I could define a short function to do this, but that is not a Pythonic way to do it. My DataFrame looks like this:
>>> df = pd.DataFrame(np.array([['1', '2', '3'], ['4', '5', '6']]))
>>> df
   0  1  2
0  1  2  3
1  4  5  6
>>> df.dtypes
0    object
1    object
2    object
dtype: object
>>> type(df[0][0])
<type 'str'>
I just wonder whether Pandas DataFrame has a built-in function to convert all the string values to float. If you know of one in the Pandas docs, please post the link.
Answered by Anand S Kumar
Assuming all values can be correctly converted to float, you can use the DataFrame.astype() function to convert the whole DataFrame to float. Example:
df = df.astype(float)
Demo:
In [5]: df = pd.DataFrame(np.array([['1', '2', '3'], ['4', '5', '6']]))
In [6]: df.astype(float)
Out[6]:
   0  1  2
0  1  2  3
1  4  5  6
In [7]: df = df.astype(float)
In [8]: df.dtypes
Out[8]:
0    float64
1    float64
2    float64
dtype: object
The .astype() function also has a raise_on_error argument (which defaults to True) that you can set to False to make it ignore errors. In that case the original values are kept in the DataFrame. (pandas 0.20 and later deprecate this argument in favor of errors='ignore'.)
In [10]: df = pd.DataFrame([['1', '2', '3'], ['4', '5', '6'],['blah','bloh','bleh']])
In [11]: df.astype(float,raise_on_error=False)
Out[11]:
      0     1     2
0     1     2     3
1     4     5     6
2  blah  bloh  bleh
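The raise_on_error behaviour above applies to the whole frame. As a rough sketch of a related idea that does not depend on that argument, the following converts each column separately and leaves a column unchanged when any of its values cannot be parsed. The helper name to_float_where_possible and the sample frame are hypothetical and are not part of the original answer.

```python
import pandas as pd


def to_float_where_possible(frame):
    """Hypothetical helper: return a copy with each fully numeric column cast to float."""
    out = frame.copy()
    for col in out.columns:
        try:
            out[col] = out[col].astype(float)
        except (ValueError, TypeError):
            # At least one value in this column is not numeric; keep the column as-is.
            pass
    return out


# Assumed sample data: column 'a' is fully numeric, column 'b' is not.
df = pd.DataFrame({'a': ['1', '2'], 'b': ['3', 'blah']})
result = to_float_where_possible(df)
print(result.dtypes)
```

Working on a copy keeps the caller's frame untouched, which makes it easy to compare the frame before and after conversion.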
To convert just a single Series/column to float, again assuming all values can be converted, you can use Series.astype(). Example:
df['somecol'] = df['somecol'].astype(<type>)
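The line above uses a placeholder for the target type. A minimal concrete sketch, assuming a made-up frame in which 'somecol' holds numeric strings and a second, hypothetical 'label' column holds text, might look like this:

```python
import pandas as pd

# Assumed sample data; only 'somecol' contains numeric strings.
df = pd.DataFrame({'somecol': ['1', '2.5', '3'], 'label': ['a', 'b', 'c']})

# Convert only the one column; the 'label' column is left untouched.
df['somecol'] = df['somecol'].astype(float)

print(df.dtypes)
```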
Answered by unutbu
Another option is to use df.convert_objects(convert_numeric=True). It attempts to convert numeric strings to numbers, with unconvertible values becoming NaN:
import pandas as pd
df = pd.DataFrame([['1', '2', '3'], ['4', '5', 'foo'], ['bar', 'baz', 'quux']])
df = df.convert_objects(convert_numeric=True)
print(df)
yields:
    0   1   2
0   1   2   3
1   4   5 NaN
2 NaN NaN NaN
In contrast, df.astype(float) would raise ValueError: could not convert string to float: quux, since some strings in the above DataFrame (such as 'quux') are not numeric.
Note: in future versions of pandas (after 0.16.2) the function argument will be numeric=True instead of convert_numeric=True.
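convert_objects was deprecated in later pandas releases and removed in pandas 1.0, so the call above does not run on current versions. A similar result can be sketched with pd.to_numeric applied column by column. This is a substitute technique, not the call used in the answer, and the variable name converted is only illustrative.

```python
import pandas as pd

df = pd.DataFrame([['1', '2', '3'], ['4', '5', 'foo'], ['bar', 'baz', 'quux']])

# Apply pd.to_numeric to each column; errors='coerce' turns strings
# that cannot be parsed into NaN instead of raising an error.
converted = df.apply(pd.to_numeric, errors='coerce')

print(converted)
print(converted.dtypes)
```

Because every column ends up containing at least one NaN, all three columns come out as float64 here.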

