Note: this page is taken from a popular Stack Overflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license and attribute it to the original authors (not me), citing the original address. Stack Overflow original: http://stackoverflow.com/questions/23748842/

understanding math errors in pandas dataframes

Tags: python, pandas, ipython

Asked by user3654387

I'm trying to generate a new column in a pandas dataframe from other columns and am getting some math errors that I don't understand. Here is a snapshot of the problem and some simplifying diagnostics...

I can generate a data frame that looks pretty good:

import pandas
import math as m

data = {'loc':['1','2','3','4','5'],
        'lat':[61.3850,32.7990,34.9513,14.2417,33.7712],
        'lng':[-152.2683,-86.8073,-92.3809,-170.7197,-111.3877]}
frame = pandas.DataFrame(data)

frame

Out[15]:
       lat       lng loc
0  61.3850 -152.2683   1
1  32.7990  -86.8073   2
2  34.9513  -92.3809   3
3  14.2417 -170.7197   4
4  33.7712 -111.3877   5
5 rows × 3 columns

I can do simple math (e.g. degrees to radians):

In [32]:
m.pi*frame.lat/180.

Out[32]:
0    1.071370
1    0.572451
2    0.610015
3    0.248565
4    0.589419
Name: lat, dtype: float64

But I can't convert from degrees to radians using Python's math library:

In [33]:
m.radians(frame.lat)

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-33-99a986252f80> in <module>()
----> 1 m.radians(frame.lat)

/Users/user/anaconda/lib/python2.7/site-packages/pandas/core/series.pyc in wrapper(self)
     72             return converter(self.iloc[0])
     73         raise TypeError(
---> 74             "cannot convert the series to {0}".format(str(converter)))
     75     return wrapper
     76 

TypeError: cannot convert the series to <type 'float'>

And I can't even convert the values to floats to try to force it to work:

In [34]:
float(frame.lat)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-34-3311aee92f31> in <module>()
----> 1 float(frame.lat)

/Users/user/anaconda/lib/python2.7/site-packages/pandas/core/series.pyc in wrapper(self)
     72             return converter(self.iloc[0])
     73         raise TypeError(
---> 74             "cannot convert the series to {0}".format(str(converter)))
     75     return wrapper
     76 

TypeError: cannot convert the series to <type 'float'>

I'm sure there must be a simple explanation and would appreciate your help in finding it. Thanks!

Answered by unutbu

math functions such as math.radians expect a single numeric value such as a float, not a sequence such as a pandas.Series.

Instead, you could use numpy.radians, since numpy.radians can accept an array as input:

In [95]: np.radians(frame['lat'])
Out[95]: 
0    1.071370
1    0.572451
2    0.610015
3    0.248565
4    0.589419
Name: lat, dtype: float64
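
Since the question is about building a new column, here is a minimal sketch of how the numpy.radians call above might be used for that. The frame is the one from the question; the column names lat_rad and lng_rad are hypothetical, chosen only for illustration.

```python
import math

import numpy as np
import pandas as pd

# Same data as in the question.
frame = pd.DataFrame({
    "loc": ["1", "2", "3", "4", "5"],
    "lat": [61.3850, 32.7990, 34.9513, 14.2417, 33.7712],
    "lng": [-152.2683, -86.8073, -92.3809, -170.7197, -111.3877],
})

# np.radians is vectorized: it takes the whole Series and returns a Series.
# lat_rad / lng_rad are hypothetical column names for this sketch.
frame["lat_rad"] = np.radians(frame["lat"])
frame["lng_rad"] = np.radians(frame["lng"])

# The manual pi/180 conversion from the question, for comparison.
manual = math.pi * frame["lat"] / 180.0
print(np.allclose(frame["lat_rad"], manual))
```

The result keeps the original index, so assigning it back to the frame lines up row by row.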


Only Series of length 1 can be converted to a float. So while this works,

In [103]: math.radians(pd.Series([1]))
Out[103]: 0.017453292519943295

in general it does not:

In [104]: math.radians(pd.Series([1,2]))
TypeError: cannot convert the series to <type 'float'>


math.radians is calling float on its argument. Note that you get the same error when calling float on pd.Series([1,2]):

In [105]: float(pd.Series([1,2]))
TypeError: cannot convert the series to <type 'float'>
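
If a scalar function from the math module really is needed, one option is to hand it one element at a time. The sketch below is an illustration rather than part of the original answer: it selects a single value explicitly with .iloc, and applies math.radians per element with Series.map.

```python
import math

import pandas as pd

s = pd.Series([1, 2])

# Select one element explicitly: this is a plain number, so math.radians accepts it.
first = math.radians(s.iloc[0])

# Or call the scalar function once per element; the result is a new Series.
per_element = s.map(math.radians)

# float() on a Series with more than one element raises TypeError.
try:
    float(s)
    raised = False
except TypeError:
    raised = True

print(first, list(per_element), raised)
```

For large Series the per-element approach will likely be slower than a vectorized numpy function, since it makes one Python call per value.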

Answered by AnthonySCaldera

I had a similar issue but was using a custom function. The solution was to use the apply function:

def monthdiff(x):
    z = (int(x/100) * 12) + (x - int(x/100) * 100)
    return z

series['age'].apply(monthdiff)

Now, I have a new column with my simple (yet beautiful) calculation applied to every line in the data frame!
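
A self-contained sketch of the apply approach follows. It assumes, purely for illustration, that the age values encode years * 100 + months (so 205 means 2 years and 5 months); the sample data and the column name age_months are hypothetical and not from the original answer.

```python
import pandas as pd


def monthdiff(x):
    # Assumed encoding: x = years * 100 + months, e.g. 205 -> 2 years, 5 months.
    years = int(x / 100)
    return years * 12 + (x - years * 100)


# Hypothetical sample data for this sketch.
df = pd.DataFrame({"age": [205, 1011, 7]})

# apply calls monthdiff once per element; assigning the result creates the column.
df["age_months"] = df["age"].apply(monthdiff)

# For non-negative integers the same arithmetic can be written without apply.
vectorized = (df["age"] // 100) * 12 + df["age"] % 100

print(df)
```

Note that apply returns a new Series; the column only appears in the frame once that result is assigned back.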

Answered by ANNE ANGELINA

Try:

pd.to_numeric()

When I got the same error, this is what worked for me.
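
The answer does not say how pd.to_numeric was called, so the following is only a guess at the kind of situation where it helps: a column whose numbers are stored as strings. The sample values are hypothetical. It would not by itself make math.radians accept a multi-element Series, which still needs a vectorized function such as numpy.radians.

```python
import numpy as np
import pandas as pd

# Hypothetical case: latitudes that were read in as text.
lat_text = pd.Series(["61.3850", "32.7990", "34.9513"])

# Convert the strings to a numeric dtype first.
lat = pd.to_numeric(lat_text)

# The numeric Series can then be passed to a vectorized function.
lat_rad = np.radians(lat)

print(lat.dtype)
print(lat_rad)
```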