Original question: http://stackoverflow.com/questions/23840797/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me):
Stack Overflow
Convert a column of timestamps into periods in pandas
Asked by user3576212
I have a column of timestamps that needs to be converted into periods ('Month'), e.g.
1985-12-31 00:00:00 to 1985-12
pandas has a .to_period function, but it only works on a timestamp index, not on a column. So you can only have a period index, but not a period column?
And it only works if the timestamps are the only index. That is, if the timestamps are part of a MultiIndex, the .to_period() function doesn't work either.
It seems that pandas assumes people will always use timestamps and periods as an index, never as an ordinary column, which is clearly not the case.
Is there any way I can get around this? Or, if not in pandas, can it be done in NumPy?
Thanks!
Answered by mattvivier
I came across this thread today, and after further digging found that pandas 0.15 offers an easier option: with the .dt accessor you can skip the step of creating an index and create the column directly. You can use the following to get the same result:
df[1] = df[0].dt.to_period('M')
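A minimal self-contained sketch of the .dt approach, assuming pandas 0.15 or later. The sample data and the column names ts and month are illustrative and do not come from the original answer:

```python
import pandas as pd

# Hypothetical sample data: a DataFrame with a datetime64 column named "ts".
df = pd.DataFrame(
    {"ts": pd.to_datetime(["1985-12-31", "1986-01-15", "1986-02-01"])}
)

# .dt exposes datetime methods on a Series, so no intermediate
# DatetimeIndex has to be built by hand.
df["month"] = df["ts"].dt.to_period("M")

print(df["month"].astype(str).tolist())  # ['1985-12', '1986-01', '1986-02']
```

The result is an ordinary column of monthly periods, so it can be used directly in groupby or comparisons without touching the index.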
Answered by Andy Hayden
You're right, you need to do this on DatetimeIndex objects rather than just columns of datetimes. However, this is pretty easy: just wrap the column in a DatetimeIndex constructor:
In [10]: import pandas as pd
In [11]: df = pd.DataFrame(pd.date_range('2014-01-01', freq='2W', periods=12))
In [12]: df
Out[12]:
            0
0  2014-01-05
1  2014-01-19
2  2014-02-02
3  2014-02-16
4  2014-03-02
5  2014-03-16
6  2014-03-30
7  2014-04-13
8  2014-04-27
9  2014-05-11
10 2014-05-25
11 2014-06-08
In [13]: pd.DatetimeIndex(df[0]).to_period('M')
Out[13]:
<class 'pandas.tseries.period.PeriodIndex'>
freq: M
[2014-01, ..., 2014-06]
length: 12
This is a PeriodIndex, but you can make it a column:
In [14]: df[1] = pd.DatetimeIndex(df[0]).to_period('M')
In [15]: df
Out[15]:
            0        1
0  2014-01-05  2014-01
1  2014-01-19  2014-01
2  2014-02-02  2014-02
3  2014-02-16  2014-02
4  2014-03-02  2014-03
5  2014-03-16  2014-03
6  2014-03-30  2014-03
7  2014-04-13  2014-04
8  2014-04-27  2014-04
9  2014-05-11  2014-05
10 2014-05-25  2014-05
11 2014-06-08  2014-06
You can do a similar trick if the timestamps are part of a MultiIndex: extract that "column" with df.index.get_level_values and pass it to DatetimeIndex as above.
For example:
df[2] = 2
df.set_index([0, 1], inplace=True)
df.index.get_level_values(0) # returns a DatetimeIndex
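To round out the MultiIndex case, here is a hedged, self-contained sketch of taking the extracted level all the way to periods. The names when, group, value and month are hypothetical, and rebuilding the index in the second option is one possible approach rather than something the answer itself prescribes:

```python
import pandas as pd

# Hypothetical frame with a (timestamp, label) MultiIndex.
dates = pd.date_range("2014-01-05", freq="2W", periods=4)
df = pd.DataFrame(
    {"when": dates, "group": ["a", "b", "a", "b"], "value": [1, 2, 3, 4]}
).set_index(["when", "group"])

# get_level_values returns a DatetimeIndex, which supports to_period.
months = df.index.get_level_values("when").to_period("M")

# Option 1: keep the periods as an ordinary column.
df["month"] = months

# Option 2: rebuild the MultiIndex with periods in place of timestamps.
df_periods = df.drop(columns="month")
df_periods.index = pd.MultiIndex.from_arrays(
    [months, df.index.get_level_values("group")], names=["month", "group"]
)

print(list(months.astype(str)))  # ['2014-01', '2014-01', '2014-02', '2014-02']
```

Option 1 leaves the original timestamps in the index, which is handy if the exact dates are still needed; option 2 replaces them, so the daily resolution is lost.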

