pandas: get the last date of each month in a time series

Question

Asked by ikemblem

Currently I'm generating a DatetimeIndex with the function zipline.utils.tradingcalendar.get_trading_days. The time series is roughly daily, with some gaps.

My goal is to get the last date in the DatetimeIndex for each month.

.to_period('M') and .to_timestamp('M') don't work, since they give the last calendar day of the month rather than the last date that actually appears in the index for each month.

As an example, if this is my time series I would want to select '2015-05-29' while the last day of the month is '2015-05-31'.

['2015-05-18', '2015-05-19', '2015-05-20', '2015-05-21', '2015-05-22', '2015-05-26', '2015-05-27', '2015-05-28', '2015-05-29', '2015-06-01']

Answer 1

Accepted answer by ikemblem

Condla's answer came closest to what I needed. However, since my time index spans more than a year, I needed to group by both year and month and then select the maximum date in each group. Below is the code I ended up with.

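import pandas as pd

# tempTradeDays is the initial DatetimeIndex
dateRange = []
# Group by year first, then by month within each year
dictYears = tempTradeDays.groupby(tempTradeDays.year)
for yr in dictYears.keys():
    yearDays = pd.DatetimeIndex(dictYears[yr])
    tempYear = yearDays.groupby(yearDays.month)
    for m in tempYear.keys():
        # The latest date present in this (year, month) group
        dateRange.append(max(tempYear[m]))
# .order() was removed from pandas; sort_values() is its replacement
dateRange = pd.DatetimeIndex(dateRange).sort_values()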
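The nested loops above can probably be collapsed into a single grouped aggregation. The following is only a sketch of that idea, not the code the answer's author used: `trade_days` is a hypothetical stand-in built from the sample dates in the question, and `last_date_per_month` is an illustrative helper name.

```python
import pandas as pd

# Hypothetical stand-in for the trading-day index (sample dates from the question).
trade_days = pd.DatetimeIndex([
    "2015-05-18", "2015-05-19", "2015-05-20", "2015-05-21", "2015-05-22",
    "2015-05-26", "2015-05-27", "2015-05-28", "2015-05-29", "2015-06-01",
])


def last_date_per_month(index: pd.DatetimeIndex) -> pd.DatetimeIndex:
    """Return the latest date present in each (year, month) of the index."""
    dates = index.to_series()
    # Grouping on year and month together keeps e.g. May 2015 apart from May 2016.
    latest = dates.groupby([index.year, index.month]).max()
    return pd.DatetimeIndex(latest.to_numpy())


month_ends = last_date_per_month(trade_days)
print(month_ends)
```

Because the result is a maximum per group, this variant should not depend on the index being sorted beforehand.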

Answer 2

Answer by Condla

My strategy would be to group by month and then select the "maximum" of each group:

If "dt" is your DatetimeIndex object:

last_dates_of_the_month = []
dt_month_group_dict = dt.groupby(dt.month)
for month in dt_month_group_dict:
    last_date = max(dt_month_group_dict[month])
    last_dates_of_the_month.append(last_date)

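The list "last_dates_of_the_month" contains the last date that occurs in each month of your dataset. You can use this list to create a new DatetimeIndex in pandas (or whatever else you want to do with it). Note that grouping by dt.month alone merges the same month from different years, so this works as written only when the index spans at most one year.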

Answer 3

Answer by Maxim

This is an old question, but none of the existing answers is perfect. This is the solution I came up with (assuming the date index is sorted). It could be written in one line, but I split it for readability:

month1 = pd.Series(apple.index.month)
month2 = pd.Series(apple.index.month).shift(-1)
mask = (month1 != month2)
apple[mask.values].head(10)

A few notes:

Shifting a datetime series requires another pd.Series instance.
Boolean mask indexing requires .values, because the mask's index does not match the date index of apple.

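By the way, when the dates are business days, it may be easier to use resampling: apple.resample('BM'). In pandas 0.18 and later, resample returns a Resampler object, so an aggregation such as .last() has to follow. The resulting labels are business month ends, which are not necessarily dates present in the data.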
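The shift-and-compare mask in this answer looks only at the month number. A variant that compares monthly periods (year and month together) might look like the sketch below. Here `apple` is a hypothetical Series built on the question's sample dates, `last_row_per_month` is an illustrative helper name, and the index is assumed to be sorted.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for `apple`: a Series on the sample dates from the question.
dates = pd.DatetimeIndex([
    "2015-05-18", "2015-05-19", "2015-05-20", "2015-05-21", "2015-05-22",
    "2015-05-26", "2015-05-27", "2015-05-28", "2015-05-29", "2015-06-01",
])
apple = pd.Series(np.arange(len(dates)), index=dates)


def last_row_per_month(obj):
    """Keep rows whose successor falls in a different (year, month).

    Assumes obj has a sorted DatetimeIndex.
    """
    if len(obj) == 0:
        return obj
    periods = obj.index.to_period("M")
    # Compare each monthly period with the next one, position by position.
    changes = np.asarray(periods[:-1] != periods[1:])
    # The final row has no successor, so it always closes its month.
    mask = np.append(changes, True)
    return obj[mask]


month_end_rows = last_row_per_month(apple)
print(month_end_rows)
```

Building the mask as a plain NumPy array sidesteps the index-alignment issue that the .values note refers to.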

Answer 4

Answer by MMCM_

The answer may no longer be needed, but while searching for a solution to the same question I found what may be a simpler approach:

import pandas as pd 

sample_dates = pd.date_range(start='2010-01-01', periods=100, freq='B')
month_end_dates = sample_dates[sample_dates.is_month_end]

Answer 5

Answer by user3570984

Suppose your data frame looks like this

[Image: original dataframe]

Then the following code will give you the last available row of each month.

df_monthly = df.reset_index().groupby([df.index.year,df.index.month],as_index=False).last().set_index('index')

[Image: transformed dataframe]

This one-liner does the job :)
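Since the data frames in this answer were posted as images, here is a small self-contained sketch of the same goal on made-up data. It uses groupby(...).tail(1) on monthly periods, which is a different route from the one-liner above and keeps the original dates as the index without a reset_index/set_index round trip. The column name "close" and all values are hypothetical.

```python
import pandas as pd

# Hypothetical data frame; the column name "close" and its values are made up.
df = pd.DataFrame(
    {"close": [10.0, 10.5, 11.0, 10.8, 11.2]},
    index=pd.DatetimeIndex(
        ["2015-05-28", "2015-05-29", "2015-06-01", "2015-06-30", "2016-05-02"]
    ),
)

# Sort first so that "last row" means "latest date" within each month.
df = df.sort_index()

# Grouping by monthly period keeps May 2015 and May 2016 in separate groups;
# tail(1) returns the final row of each group with its original index label.
df_monthly = df.groupby(df.index.to_period("M")).tail(1)

print(df_monthly)
```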
