Original URL: http://stackoverflow.com/questions/31406059/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Find end date of quarter given date, pandas
Asked by Tingiskhan
Assume we have a table like this:
import datetime
import pandas as pd

table = [[datetime.datetime(2015, 1, 1), 1, 0.5],
         [datetime.datetime(2015, 1, 27), 1, 0.5],
         [datetime.datetime(2015, 1, 31), 1, 0.5],
         [datetime.datetime(2015, 2, 1), 1, 2],
         [datetime.datetime(2015, 2, 3), 1, 2],
         [datetime.datetime(2015, 2, 15), 1, 2],
         [datetime.datetime(2015, 2, 28), 1, 2],
         [datetime.datetime(2015, 3, 1), 1, 3],
         [datetime.datetime(2015, 3, 17), 1, 3],
         [datetime.datetime(2015, 3, 31), 1, 3]]
df = pd.DataFrame(table, columns=['Date', 'Id', 'Value'])
Is there a way to get the end date of the actual quarter for each date in the column Date? E.g., I would like to add the column Qdate to df such that
        Date  Id  Value      Qdate
0 2015-01-01   1    0.5 2015-03-31
1 2015-01-27   1    0.5 2015-03-31
2 2015-01-31   1    0.5 2015-03-31
3 2015-02-01   1    2.0 2015-03-31
4 2015-02-03   1    2.0 2015-03-31
5 2015-02-15   1    2.0 2015-03-31
6 2015-02-28   1    2.0 2015-03-31
7 2015-03-01   1    3.0 2015-03-31
8 2015-03-17   1    3.0 2015-03-31
9 2015-03-31   1    3.0 2015-03-31
For simplicity I've only considered the first quarter, since I know its end date.
Answered by Jianxun Li
You can use pd.tseries.offsets.QuarterEnd() to achieve your goal here.
import pandas as pd
import datetime

# your data
# ================================
table = [[datetime.datetime(2015, 1, 1), 1, 0.5],
         [datetime.datetime(2015, 1, 27), 1, 0.5],
         [datetime.datetime(2015, 1, 31), 1, 0.5],
         [datetime.datetime(2015, 2, 1), 1, 2],
         [datetime.datetime(2015, 2, 3), 1, 2],
         [datetime.datetime(2015, 2, 15), 1, 2],
         [datetime.datetime(2015, 2, 28), 1, 2],
         [datetime.datetime(2015, 3, 1), 1, 3],
         [datetime.datetime(2015, 3, 17), 1, 3],
         [datetime.datetime(2015, 3, 31), 1, 3]]
df = pd.DataFrame(table, columns=['Date', 'Id', 'Value'])

# processing
# ================================
# Adding QuarterEnd() to a date that is already a quarter end (e.g. 2015-03-31)
# rolls forward to the next quarter, so subtract one day first to make it robust.
df['Qdate'] = [date - pd.tseries.offsets.DateOffset(days=1) + pd.tseries.offsets.QuarterEnd()
               for date in df.Date]
print(df)

        Date  Id  Value      Qdate
0 2015-01-01   1    0.5 2015-03-31
1 2015-01-27   1    0.5 2015-03-31
2 2015-01-31   1    0.5 2015-03-31
3 2015-02-01   1    2.0 2015-03-31
4 2015-02-03   1    2.0 2015-03-31
5 2015-02-15   1    2.0 2015-03-31
6 2015-02-28   1    2.0 2015-03-31
7 2015-03-01   1    3.0 2015-03-31
8 2015-03-17   1    3.0 2015-03-31
9 2015-03-31   1    3.0 2015-03-31
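The list comprehension applies the offsets one row at a time. As a minimal sketch, assuming the dates are midnight timestamps as in the question, the same offset arithmetic can be applied to a whole column at once. The scalar line at the top illustrates why one day is subtracted first. The names rolled, dates and qdate are illustrative and not part of the original answer.

```python
import pandas as pd

# A date that already sits on a quarter end rolls forward to the NEXT quarter end.
rolled = pd.Timestamp("2015-03-31") + pd.offsets.QuarterEnd()

# Column-wise version of the same trick: step back one day, then roll forward.
dates = pd.Series(pd.to_datetime(["2015-01-01", "2015-03-31", "2015-04-01"]))
qdate = dates - pd.offsets.Day(1) + pd.offsets.QuarterEnd()
```

If the dates carry a time of day, the one-day subtraction may need adjusting; that case is not covered here.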
Answered by Juan A. Navarro
An easier way is to convert the date to a (quarterly) period and then back to a date, e.g.:
df['Qdate'] = df['Date'].dt.to_period("Q").dt.end_time
Note there is also .start_time for the start of the quarter.
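One detail worth knowing: end_time is the last instant of the period, so it carries a time of day just before midnight rather than 00:00:00. If a plain date at midnight is wanted, one option (a sketch, not part of the original answer) is to normalize the result. The names dates, quarter_end and quarter_start are illustrative.

```python
import pandas as pd

dates = pd.Series(pd.to_datetime(["2015-01-01", "2015-03-31", "2015-04-01"]))
periods = dates.dt.to_period("Q")

# end_time is the last instant of the quarter; normalize() resets
# the time component to midnight so only the date remains.
quarter_end = periods.dt.end_time.dt.normalize()

# start_time is already at midnight on the first day of the quarter.
quarter_start = periods.dt.start_time
```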
Answered by Colonel Beauvel
Really great @Jianxun! Here is an alternative approach:
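import calendar
import datetime

def f(x):
    # last month of the quarter containing x['Date'] (3, 6, 9 or 12)
    q = ((x['Date'].month - 1) // 3 + 1) * 3
    # number of days in that month
    last = calendar.monthrange(x['Date'].year, q)[1]
    return datetime.date(x['Date'].year, q, last)

df['QDate'] = df.apply(f, axis=1)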
In [24]: df
Out[24]:
        Date  Id  Value      QDate
0 2015-01-01   1    0.5 2015-03-31
1 2015-01-27   1    0.5 2015-03-31
2 2015-01-31   1    0.5 2015-03-31
3 2015-02-01   1    2.0 2015-03-31
4 2015-02-03   1    2.0 2015-03-31
5 2015-02-15   1    2.0 2015-03-31
6 2015-02-28   1    2.0 2015-03-31
7 2015-03-01   1    3.0 2015-03-31
8 2015-03-17   1    3.0 2015-03-31
9 2015-03-31   1    3.0 2015-03-31
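The month arithmetic in this approach can be checked in isolation with the standard library only. The sketch below wraps it in a hypothetical helper named quarter_end (not from the original answer) that takes a single date instead of a DataFrame row.

```python
import calendar
import datetime

def quarter_end(d):
    """Return the last day of the calendar quarter containing d."""
    # Months 1-3 map to 3, 4-6 to 6, 7-9 to 9, 10-12 to 12.
    last_month = ((d.month - 1) // 3 + 1) * 3
    # monthrange returns (weekday of first day, number of days in month).
    last_day = calendar.monthrange(d.year, last_month)[1]
    return datetime.date(d.year, last_month, last_day)

print(quarter_end(datetime.date(2015, 2, 15)))  # 2015-03-31
print(quarter_end(datetime.date(2015, 10, 1)))  # 2015-12-31
```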
Answered by unutbu
Using searchsorted is another option:
import datetime
import pandas as pd

table = [[datetime.datetime(2015, 1, 1), 1, 0.5],
         [datetime.datetime(2015, 1, 27), 1, 0.5],
         [datetime.datetime(2015, 1, 31), 1, 0.5],
         [datetime.datetime(2015, 2, 1), 1, 2],
         [datetime.datetime(2015, 2, 3), 1, 2],
         [datetime.datetime(2015, 2, 15), 1, 2],
         [datetime.datetime(2015, 2, 28), 1, 2],
         [datetime.datetime(2015, 3, 1), 1, 3],
         [datetime.datetime(2015, 3, 17), 1, 3],
         [datetime.datetime(2015, 3, 31), 1, 3],
         [datetime.datetime(2015, 4, 1), 1, 3],
         ]
df = pd.DataFrame(table, columns=['Date', 'Id', 'Value'])

# All quarter ends spanning the data, extended past the latest date.
# Note: newer pandas releases (2.2+) deprecate the 'Q' alias in favor of 'QE'.
quarters = pd.date_range(
    df['Date'].min(),
    df['Date'].max() + pd.tseries.offsets.QuarterEnd(), freq='Q')
# For each date, look up the first quarter end that is not earlier than it.
df['Qdate'] = quarters[quarters.searchsorted(df['Date'].values)]
print(df)
yields
         Date  Id  Value      Qdate
0  2015-01-01   1    0.5 2015-03-31
1  2015-01-27   1    0.5 2015-03-31
2  2015-01-31   1    0.5 2015-03-31
3  2015-02-01   1    2.0 2015-03-31
4  2015-02-03   1    2.0 2015-03-31
5  2015-02-15   1    2.0 2015-03-31
6  2015-02-28   1    2.0 2015-03-31
7  2015-03-01   1    3.0 2015-03-31
8  2015-03-17   1    3.0 2015-03-31
9  2015-03-31   1    3.0 2015-03-31
10 2015-04-01   1    3.0 2015-06-30
By avoiding row-by-row computation, using searchsorted as above can be orders of magnitude faster for moderately large DataFrames.
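To see how the lookup works, here is a small sketch with hand-written quarter ends instead of date_range. The names quarters, dates, positions and qdate are illustrative. searchsorted returns, for each date, the position of the first quarter end that is not earlier than it, and indexing with those positions yields the end dates.

```python
import pandas as pd

quarters = pd.DatetimeIndex(["2015-03-31", "2015-06-30"])
dates = pd.to_datetime(["2015-01-01", "2015-03-31", "2015-04-01"])

# side='left' (the default) gives the first position i with quarters[i] >= date,
# so a date that is itself a quarter end maps to its own quarter.
positions = quarters.searchsorted(dates.values)
qdate = quarters[positions]
```

This relies on quarters being sorted and extending at least to the quarter end of the latest date, which is why the answer adds QuarterEnd() to the maximum date before building the range.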

