Note: this page reproduces a popular Stack Overflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original: http://stackoverflow.com/questions/48614440/

Time: 2020-09-14 05:07:53  Source: igfitidea

Cannot cast array data from dtype('<M8[ns]') to dtype('float64') according to the rule 'safe'

Tags: python, pandas, numpy, interpolation

Asked by Xiaoli Zhang

I am using np.interp to interpolate data points, but I get the error: Cannot cast array data from dtype('<M8[ns]') to dtype('float64') according to the rule 'safe'

Code snippet:

import pandas as pd
import numpy as np
def interpolate_fwd_price(row, fx):
    res = np.interp(row['SA_M'], fx['TENOR_DT'], fx['RATE'])
    return res

df = pd.DataFrame({'SA_M': ['2018-02-28','2018-03-10']})
df['SA_M'] = pd.to_datetime(df['SA_M'])
data = pd.DataFrame({'TENOR_DT': ['2017-02-09','2017-03-02','2017-04-03','2017-05-02'], 'RATE':[1.0, 1.2, 1.5, 1.8]})
data['TENOR_DT'] = pd.to_datetime(data['TENOR_DT'])
df['PRICE'] = df.apply(interpolate_fwd_price, fx=data, axis=1)

I searched but could not figure out what is causing the error. I would appreciate your input.

Update: after a change, it works when I interpolate on the datetime difference instead of on the datetime directly. I would still like to know why interpolating on the datetime directly does not work.

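def interpolate_fwd_price(row, fx):
    # Days from the target date to each tenor date; row['SA_M'] is indexed, not called
    fx['DT'] = (fx['TENOR_DT'] - row['SA_M']).dt.days
    # The target date sits at offset 0 on this day axis
    res = np.interp(0, fx['DT'], fx['RATE'])
    return res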

Answered by hpaulj

In [92]: data = pd.DataFrame({'TENOR_DT': ['2017-02-09','2017-03-02','2017-04-03','2017-05-02'], 'RATE':[1.0, 1.2, 1.5, 1.8]})
In [93]: data        # object dtype with strings
Out[93]: 
   RATE    TENOR_DT
0   1.0  2017-02-09
1   1.2  2017-03-02
2   1.5  2017-04-03
3   1.8  2017-05-02
In [94]: data['TENOR_DT'] = pd.to_datetime(data['TENOR_DT'])
In [95]: data
Out[95]: 
   RATE   TENOR_DT
0   1.0 2017-02-09
1   1.2 2017-03-02
2   1.5 2017-04-03
3   1.8 2017-05-02
In [96]: data['TENOR_DT']
Out[96]: 
0   2017-02-09
1   2017-03-02
2   2017-04-03
3   2017-05-02
Name: TENOR_DT, dtype: datetime64[ns]

The array version of the dates:

In [98]: dt = data['TENOR_DT'].values
In [99]: dt
Out[99]: 
array(['2017-02-09T00:00:00.000000000', '2017-03-02T00:00:00.000000000',
       '2017-04-03T00:00:00.000000000', '2017-05-02T00:00:00.000000000'],
      dtype='datetime64[ns]')

It can be cast to float with astype's default casting rule, 'unsafe', but not with 'safe':

In [100]: dt.astype(float)
Out[100]: array([1.4865984e+18, 1.4884128e+18, 1.4911776e+18, 1.4936832e+18])
In [101]: dt.astype(float, casting='safe')
TypeError: Cannot cast array from dtype('<M8[ns]') to dtype('float64') according to the rule 'safe'

My guess is that np.interp uses 'safe' casting when it converts those datetime values to floats.
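The casting rules themselves can be checked directly with np.can_cast, which supports the guess above without depending on np.interp's internals. This is a minimal sketch; the names dates, safe_ok, unsafe_ok and raised are illustrative and not from the original post, and the exact exception text may vary between NumPy versions.

```python
import numpy as np

dates = np.array(['2017-02-09', '2017-03-02'], dtype='datetime64[ns]')

# Ask NumPy whether datetime64[ns] -> float64 is allowed under each casting rule
safe_ok = np.can_cast(dates.dtype, np.float64, casting='safe')
unsafe_ok = np.can_cast(dates.dtype, np.float64, casting='unsafe')
print(safe_ok, unsafe_ok)

# Passing datetimes straight to np.interp is what triggered the error in the question
raised = False
try:
    np.interp(dates[0], dates, np.array([1.0, 1.2]))
except TypeError as exc:
    raised = True
    print(exc)
```

If the 'safe' check is False while the 'unsafe' one is True, any function that insists on safe casting to float64 will reject datetime input, while an explicit astype will accept it.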

I haven't tried to use interp with dates before, so I can only suggest some fixes. First, your dates only differ by day, so we don't need the full ns resolution:

In [107]: dt.astype('datetime64[D]')
Out[107]: 
array(['2017-02-09', '2017-03-02', '2017-04-03', '2017-05-02'],
      dtype='datetime64[D]')

It still won't allow 'safe' casting, but the 'unsafe' casting produces reasonable-looking numbers (days since 1970-01-01). You might be able to use those in the interpolation.

In [108]: dt.astype('datetime64[D]').astype(int)
Out[108]: array([17206, 17227, 17259, 17288])
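Putting the suggestion together, here is a minimal sketch of interpolating on integer day counts. It assumes day resolution is enough for the data; to_days is a hypothetical helper introduced for illustration, and the date '2017-02-20' is added here only to show a value inside the tenor range.

```python
import numpy as np
import pandas as pd

def to_days(series):
    # Hypothetical helper: whole days since 1970-01-01 as int64
    return series.values.astype('datetime64[D]').astype('int64')

data = pd.DataFrame({'TENOR_DT': ['2017-02-09', '2017-03-02', '2017-04-03', '2017-05-02'],
                     'RATE': [1.0, 1.2, 1.5, 1.8]})
data['TENOR_DT'] = pd.to_datetime(data['TENOR_DT'])

# '2017-02-20' is an added illustrative date; the other two are the question's dates
df = pd.DataFrame({'SA_M': ['2017-02-20', '2018-02-28', '2018-03-10']})
df['SA_M'] = pd.to_datetime(df['SA_M'])

# Interpolate all rows at once on the integer day axis (xp must be increasing)
df['PRICE'] = np.interp(to_days(df['SA_M']), to_days(data['TENOR_DT']), data['RATE'].values)
print(df)
```

Note that np.interp does not extrapolate by default: the question's 2018 dates fall after the last tenor date, so they receive the last RATE value rather than an extended trend.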