Python 熊猫重置系列上的索引以删除多索引
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/18624039/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
Pandas reset index on series to remove multiindex
提问by dartdog
I created a Series
from a DataFrame
, when I resampled some data with a count
like so: where H2
is a DataFrame
:
我创建了一个Series
从DataFrame
,当我重新采样一些数据,象这样一个数:其中H2
是DataFrame
:
H3=H2[['SOLD_PRICE']]
H5=H3.resample('Q',how='count')
H6=pd.rolling_mean(H5,4)
This yielded a series that looks like this:
这产生了一个如下所示的系列:
1999-03-31 SOLD_PRICE NaN
1999-06-30 SOLD_PRICE NaN
1999-09-30 SOLD_PRICE NaN
1999-12-31 SOLD_PRICE 3.00
2000-03-31 SOLD_PRICE 3.00
with an index that looks like:
具有如下所示的索引:
MultiIndex
[(1999-03-31 00:00:00, u'SOLD_PRICE'), (1999-06-30 00:00:00, u'SOLD_PRICE'), (1999-09-30 00:00:00, u'SOLD_PRICE'), (1999-12-31 00:00:00, u'SOLD_PRICE'),.....
I don't want the second column as an index. Ideally I'd have a DataFrame
with column 1 as "Date" and column 2 as "Sales" (dropping the second level of the index). I don't quite see how to reconfigure the index.
我不希望第二列作为索引。理想情况下,我将第DataFrame
1 列作为“日期”,第 2 列作为“销售额”(删除索引的第二级)。我不太明白如何重新配置索引。
采纳答案by Phillip Cloud
Just call reset_index()
:
只需致电reset_index()
:
In [130]: s
Out[130]:
0 1
1999-03-31 SOLD_PRICE NaN
1999-06-30 SOLD_PRICE NaN
1999-09-30 SOLD_PRICE NaN
1999-12-31 SOLD_PRICE 3
2000-03-31 SOLD_PRICE 3
Name: 2, dtype: float64
In [131]: s.reset_index()
Out[131]:
0 1 2
0 1999-03-31 SOLD_PRICE NaN
1 1999-06-30 SOLD_PRICE NaN
2 1999-09-30 SOLD_PRICE NaN
3 1999-12-31 SOLD_PRICE 3
4 2000-03-31 SOLD_PRICE 3
There are many ways to drop columns:
有很多方法可以删除列:
Call reset_index()
twice and specify a column:
调用reset_index()
两次并指定一列:
In [136]: s.reset_index(0).reset_index(drop=True)
Out[136]:
0 2
0 1999-03-31 NaN
1 1999-06-30 NaN
2 1999-09-30 NaN
3 1999-12-31 3
4 2000-03-31 3
Delete the column after resetting the index:
重置索引后删除列:
In [137]: df = s.reset_index()
In [138]: df
Out[138]:
0 1 2
0 1999-03-31 SOLD_PRICE NaN
1 1999-06-30 SOLD_PRICE NaN
2 1999-09-30 SOLD_PRICE NaN
3 1999-12-31 SOLD_PRICE 3
4 2000-03-31 SOLD_PRICE 3
In [139]: del df[1]
In [140]: df
Out[140]:
0 2
0 1999-03-31 NaN
1 1999-06-30 NaN
2 1999-09-30 NaN
3 1999-12-31 3
4 2000-03-31 3
Call drop()
after resetting:
drop()
重置后调用:
In [144]: s.reset_index().drop(1, axis=1)
Out[144]:
0 2
0 1999-03-31 NaN
1 1999-06-30 NaN
2 1999-09-30 NaN
3 1999-12-31 3
4 2000-03-31 3
Then, after you've reset your index, just rename the columns
然后,在您重置索引后,只需重命名列
In [146]: df.columns = ['Date', 'Sales']
In [147]: df
Out[147]:
Date Sales
0 1999-03-31 NaN
1 1999-06-30 NaN
2 1999-09-30 NaN
3 1999-12-31 3
4 2000-03-31 3
回答by unutbu
When you use double brackets, such as
当您使用双括号时,例如
H3 = H2[['SOLD_PRICE']]
H3 becomes a DataFrame. If you use single brackets,
H3 成为一个 DataFrame。如果使用单括号,
H3 = H2['SOLD_PRICE']
then H3 becomes a Series. If H3 is a Series, then the result you desire follows naturally:
然后 H3 成为一个系列。如果 H3 是一个系列,那么你想要的结果自然如下:
import pandas as pd
import numpy as np
rng = pd.date_range('1/1/2011', periods=72, freq='M')
H2 = pd.DataFrame(np.arange(len(rng)), index=rng, columns=['SOLD_PRICE'])
H3 = H2['SOLD_PRICE']
H5 = H3.resample('Q', how='count')
H6 = pd.rolling_mean(H5,4)
print(H6.head())
yields
产量
2011-03-31 NaN
2011-06-30 NaN
2011-09-30 NaN
2011-12-31 3
2012-03-31 3
dtype: float64