Original source: http://stackoverflow.com/questions/43400331/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Remove 'seconds' and 'minutes' from a Pandas dataframe column
Asked by Dustin Helliwell
Given a dataframe like:
import numpy as np
import pandas as pd
df = pd.DataFrame(
    {'Date': pd.date_range('1/1/2011', periods=5, freq='3675S'),
     'Num': np.random.rand(5)})
                 Date       Num
0 2011-01-01 00:00:00  0.580997
1 2011-01-01 01:01:15  0.407332
2 2011-01-01 02:02:30  0.786035
3 2011-01-01 03:03:45  0.821792
4 2011-01-01 04:05:00  0.807869
I would like to remove the 'minutes' and 'seconds' information.
The following (mostly stolen from: How to remove the 'seconds' of Pandas dataframe index?) works okay,
df = df.assign(Date = lambda x: pd.to_datetime(x['Date'].dt.strftime('%Y-%m-%d %H')))
                 Date       Num
0 2011-01-01 00:00:00  0.580997
1 2011-01-01 01:00:00  0.407332
2 2011-01-01 02:00:00  0.786035
3 2011-01-01 03:00:00  0.821792
4 2011-01-01 04:00:00  0.807869
but it feels strange to convert a datetime to a string then back to a datetime. Is there a way to do this more directly?
Answered by piRSquared
dt.round
This is how it should be done: use dt.round
df.assign(Date=df.Date.dt.round('H'))
                 Date       Num
0 2011-01-01 00:00:00  0.577957
1 2011-01-01 01:00:00  0.995748
2 2011-01-01 02:00:00  0.864013
3 2011-01-01 03:00:00  0.468762
4 2011-01-01 04:00:00  0.866827
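One caveat is worth sketching here. dt.round rounds to the nearest hour, so it only reproduces the question's string-based result when every timestamp falls in the first half of its hour, as in the sample data. The example below uses made-up timestamps (not from the original post) and the lowercase 'h' alias, which is an assumption about the installed pandas version; recent releases deprecate the uppercase 'H'. It compares dt.round with dt.floor, which always truncates.

```python
import pandas as pd

# Hypothetical timestamps chosen to fall on both sides of the half hour.
dates = pd.Series(pd.to_datetime([
    '2011-01-01 01:01:15',
    '2011-01-01 01:45:00',
    '2011-01-01 23:59:59',
]))

rounded = dates.dt.round('h')   # nearest hour: can move forward, even to the next day
floored = dates.dt.floor('h')   # start of the hour: always drops minutes and seconds

print(pd.DataFrame({'Date': dates, 'round': rounded, 'floor': floored}))
```

If the goal is strictly to drop the minutes and seconds, dt.floor is probably the closer match; dt.round suits cases where the nearest hour is wanted.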
OLD ANSWER
One approach is to set the index and use resample
df.set_index('Date').resample('H').last().reset_index()
                 Date       Num
0 2011-01-01 00:00:00  0.577957
1 2011-01-01 01:00:00  0.995748
2 2011-01-01 02:00:00  0.864013
3 2011-01-01 03:00:00  0.468762
4 2011-01-01 04:00:00  0.866827
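Keep in mind that resample aggregates rather than relabels: rows that share an hour collapse into one, and hours with no rows still appear, filled with NaN. The sample data hides this because it has exactly one row per hour. A small sketch with made-up data (the values and the lowercase 'h' alias are assumptions for illustration) makes the difference visible:

```python
import pandas as pd

# Hypothetical frame: two rows in hour 00, none in hours 01 and 02, one in hour 03.
df = pd.DataFrame({
    'Date': pd.to_datetime(['2011-01-01 00:10:00',
                            '2011-01-01 00:50:00',
                            '2011-01-01 03:05:00']),
    'Num': [1.0, 2.0, 3.0],
})

# resample builds one row per hourly bin and keeps the last value in each bin.
resampled = df.set_index('Date').resample('h').last().reset_index()

# floor keeps every original row and only truncates the timestamp.
floored = df.assign(Date=df['Date'].dt.floor('h'))

print(resampled)   # one row per hour from 00:00 to 03:00, empty hours are NaN
print(floored)     # the original three rows with truncated timestamps
```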
Another alternative is to extract the date and hour components and add them back together
df.assign(
    Date=pd.to_datetime(df.Date.dt.date) +
         pd.to_timedelta(df.Date.dt.hour, unit='H'))
                 Date       Num
0 2011-01-01 00:00:00  0.577957
1 2011-01-01 01:00:00  0.995748
2 2011-01-01 02:00:00  0.864013
3 2011-01-01 03:00:00  0.468762
4 2011-01-01 04:00:00  0.866827
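As a hedged variation on the same idea, dt.normalize() resets each timestamp to midnight while keeping the datetime dtype, which avoids the detour through Python date objects. The sketch below rebuilds the question's frame with placeholder Num values (an assumption, so the output is reproducible) and checks that the result agrees with dt.floor('h') for this timezone-naive data.

```python
import pandas as pd

df = pd.DataFrame({
    'Date': pd.date_range('1/1/2011', periods=5, freq='3675s'),
    'Num': range(5),   # placeholder values instead of random numbers
})

# Midnight of each day plus the whole hours elapsed since then.
truncated = (df['Date'].dt.normalize()
             + pd.to_timedelta(df['Date'].dt.hour, unit='h'))

result = df.assign(Date=truncated)
print(result)

# For this data the component-based result matches direct truncation.
matches_floor = bool((truncated == df['Date'].dt.floor('h')).all())
print(matches_floor)
```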