Python: Convert a float Series into an integer Series in pandas

Notice: this page is based on a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/19026684/

Date: 2020-08-19 12:35:10  Source: igfitidea

Convert float Series into an integer Series in pandas

Tags: python, pandas, time-series

Asked by Geekster

I have the following data frame:

In [31]: rise_p
Out[31]: 
         time    magnitude
0  1379945444   156.627598
1  1379945447  1474.648726
2  1379945448  1477.448999
3  1379945449  1474.886202
4  1379945699  1371.454224

Now I want to group rows that fall within a minute of each other, so I divide the time column by 100 and get this:

In [32]: rise_p/100
Out[32]: 
          time  magnitude
0  13799454.44   1.566276
1  13799454.47  14.746487
2  13799454.48  14.774490
3  13799454.49  14.748862
4  13799456.99  13.714542

As explained above, I want to create groups based on time, so the expected subgroups would be the rows with times 13799454 and 13799456. I do this:

In [37]: ts = rise_p['time']/100

In [38]: s = rise_p/100

In [39]: new_re_df = [s.iloc[np.where(int(ts) == int(i))] for i in ts]
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-39-5ea498cf32b2> in <module>()
----> 1 new_re_df = [s.iloc[np.where(int(ts) == int(i))] for i in ts]

TypeError: only length-1 arrays can be converted to Python scalars

How do I convert ts into an integer Series, given that int() doesn't accept a Series or a list as an argument? Is there a method in pandas that does this?

Accepted answer by drexiya

Try converting with astype:

new_re_df = [s.iloc[np.where(ts.astype(int) == int(i))] for i in ts]
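
To make the cast itself concrete, here is a minimal sketch that rebuilds the question's `ts` from its sample timestamps and applies `astype(int)`. The name `ts_int` is made up for this illustration and does not appear in the original answer.

```python
import pandas as pd

# Sample timestamps from the question, divided by 100 to give a float Series.
ts = pd.Series([1379945444, 1379945447, 1379945448, 1379945449, 1379945699]) / 100

# astype(int) casts the whole Series at once and drops the fractional part.
ts_int = ts.astype(int)  # ts_int is an illustrative name

print(ts_int.tolist())
```

One thing to keep in mind: `astype(int)` raises an error if the Series contains NaN, so missing values would need to be filled or dropped first.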

Edit

As @Rutger Kassies suggested, a nicer way is to cast the series to integers and then use groupby:

rise_p['ts'] = (rise_p.time / 100).astype('int')

ts_grouped = rise_p.groupby('ts')

...
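
The elided part might look something like the sketch below, which rebuilds the question's data frame and aggregates each group. The names `group_sizes` and `group_means` are assumptions for illustration; the original answer does not say what is done with `ts_grouped`.

```python
import pandas as pd

# Data frame from the question.
rise_p = pd.DataFrame({
    "time": [1379945444, 1379945447, 1379945448, 1379945449, 1379945699],
    "magnitude": [156.627598, 1474.648726, 1477.448999, 1474.886202, 1371.454224],
})

# Cast the scaled time to integers so rows in the same bucket share a key.
rise_p["ts"] = (rise_p["time"] / 100).astype(int)
ts_grouped = rise_p.groupby("ts")

# Illustrative aggregations (not part of the original answer).
group_sizes = ts_grouped.size()
group_means = ts_grouped["magnitude"].mean()

print(group_sizes)
print(group_means)
```

Note that dividing by 100 produces 100-second buckets rather than calendar minutes; dividing by 60 would be the closer match if one-minute groups are what is wanted.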

Answer by Jeff

Here's a different way to solve your problem:

In [3]: df
Out[3]: 
         time    magnitude
0  1379945444   156.627598
1  1379945447  1474.648726
2  1379945448  1477.448999
3  1379945449  1474.886202
4  1379945699  1371.454224

In [4]: df.dtypes
Out[4]: 
time           int64
magnitude    float64
dtype: object

Convert your epoch timestamps (given in seconds) to datetimes

In [7]: df['time'] = pd.to_datetime(df['time'],unit='s')
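
As a standalone sketch of this step, the snippet below converts two of the question's timestamps. The two-row frame is a shortened stand-in for the original data, chosen only to keep the example small.

```python
import pandas as pd

# Shortened stand-in for the question's data frame (first and last rows).
df = pd.DataFrame({
    "time": [1379945444, 1379945699],
    "magnitude": [156.627598, 1371.454224],
})

# unit="s" tells pandas the integers count seconds since 1970-01-01 UTC.
df["time"] = pd.to_datetime(df["time"], unit="s")

print(df["time"].tolist())
```

Without `unit="s"`, pandas would interpret the integers as nanoseconds, which would place every row within a couple of seconds of the epoch.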

Set the index

In [8]: df.set_index('time',inplace=True)

In [9]: df
Out[9]: 
                       magnitude
time                            
2013-09-23 14:10:44   156.627598
2013-09-23 14:10:47  1474.648726
2013-09-23 14:10:48  1477.448999
2013-09-23 14:10:49  1474.886202
2013-09-23 14:14:59  1371.454224

Group by 1 minute and take the mean of each group (how= can be an arbitrary function as well)

In [10]: df.resample('1Min',how=np.mean)
Out[10]: 
                       magnitude
time                            
2013-09-23 14:10:00  1145.902881
2013-09-23 14:11:00          NaN
2013-09-23 14:12:00          NaN
2013-09-23 14:13:00          NaN
2013-09-23 14:14:00  1371.454224
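
The `how=` keyword in the call above comes from older pandas releases; newer releases no longer accept it and expect the aggregation to be called on the resampler object instead. A minimal sketch of that form, assuming a recent pandas version (the names `per_minute` and `per_minute_max` are illustrative):

```python
import pandas as pd

# Rebuild the indexed frame from the steps above.
df = pd.DataFrame(
    {"magnitude": [156.627598, 1474.648726, 1477.448999, 1474.886202, 1371.454224]},
    index=pd.to_datetime(
        [1379945444, 1379945447, 1379945448, 1379945449, 1379945699], unit="s"
    ),
)
df.index.name = "time"

# Call the aggregation on the resampler rather than passing how=.
per_minute = df.resample("1min").mean()

# Other reductions go through agg(); "max" is just an example here.
per_minute_max = df.resample("1min").agg("max")

print(per_minute)
```

Minutes with no rows still appear in the result and hold NaN, which matches the output shown above.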

Answer by winni2k

Here's another, quite general way to convert ts to a Series of type int:

rise_p['ts'] = (rise_p.time / 100).apply(lambda val: int(val))

apply allows you to apply an arbitrary function to your Series object, value by value. apply also works on the columns of a DataFrame object.
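
For comparison, the sketch below runs the `apply` approach next to `astype(int)` and plain integer floor division on the question's timestamps. The floor-division variant is an addition for illustration and is not part of any of the answers; the variable names are likewise made up.

```python
import pandas as pd

rise_p = pd.DataFrame({
    "time": [1379945444, 1379945447, 1379945448, 1379945449, 1379945699],
})
scaled = rise_p["time"] / 100

via_apply = scaled.apply(int)          # calls int() once per element
via_astype = scaled.astype(int)        # casts the whole Series at once
via_floordiv = rise_p["time"] // 100   # never leaves integer arithmetic

print(via_apply.tolist())
print(via_astype.tolist())
print(via_floordiv.tolist())
```

All three agree on this data. For negative values they would not: `int()` and `astype(int)` truncate toward zero, while `//` rounds down. On large Series the element-by-element `apply` is usually the slowest of the three, so it is mainly worth reaching for when the conversion function is something a vectorised cast cannot express.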
