How to select rows in the last key of first level in a MultiIndex DataFrame?
Original source: http://stackoverflow.com/questions/15952586/
Note: this content comes from Stack Overflow and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on Stack Overflow.
Asked by Andy Hayden
I have a Pandas DataFrame that looks like the following:
                   data
date       signal
2012-11-01 a       0.04
           b       0.03
2012-12-01 a      -0.01
           b       0.00
2013-01-01 a      -0.00
           b      -0.01
I am trying to get only the last row based on the first level of the multiindex, which is date in this case.
2013-01-01 a      -0.00
           b      -0.01
The first level index is datetime. What would be the most elegant way to select the last row?
Answered by Andy Hayden
One way is to access the MultiIndex's levels directly (and use the last one):
In [11]: df.index.levels
Out[11]: [Index([bar, baz, foo, qux], dtype=object), Index([one, two], dtype=object)]
In [12]: df.index.levels[0][-1]
Out[12]: 'qux'
And select these rows with ix:
In [13]: df.ix[df.index.levels[0][-1]]
Out[13]:
            0         1         2         3
one  1.225973 -0.703952  0.265889  1.069345
two -1.521503  0.024696  0.109501 -1.584634
In [14]: df.ix[df.index.levels[0][-1]:]
Out[14]:
                0         1         2         3
qux one  1.225973 -0.703952  0.265889  1.069345
    two -1.521503  0.024696  0.109501 -1.584634
(Using @Jeff's example DataFrame.)
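Since ix was removed in pandas 1.0, the same lookup can be sketched with loc and xs. This is a minimal illustration rather than part of the original answer; the frame is rebuilt with random values, and the names last_key, without_level and with_level are only illustrative.

```python
import numpy as np
import pandas as pd

# Rebuild a frame shaped like the example above (values are random).
arrays = [
    np.array(["bar", "bar", "baz", "baz", "foo", "foo", "qux", "qux"]),
    np.array(["one", "two", "one", "two", "one", "two", "one", "two"]),
]
df = pd.DataFrame(np.random.randn(8, 4), index=arrays)

# Last label stored in the first level (the levels are sorted for this index).
last_key = df.index.levels[0][-1]

# Partial indexing with a scalar drops the first level from the result.
without_level = df.loc[last_key]

# xs with drop_level=False keeps the full MultiIndex on the result.
with_level = df.xs(last_key, level=0, drop_level=False)
```

One caveat: levels can still hold labels that no longer occur after the frame has been filtered. In that case df.index.get_level_values(0)[-1] reads the key of the actual last row instead.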
Perhaps a more elegant way is to use tail (if you know there will always be two rows for the last key):
In [15]: df.tail(2)
Out[15]:
                0         1         2         3
qux one  1.225973 -0.703952  0.265889  1.069345
    two -1.521503  0.024696  0.109501 -1.584634
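If the number of rows under the last key is not known in advance, it can be counted first instead of hard-coding 2. The sketch below assumes the rows for the last key sit together at the end of the frame; the sample frame and the names first_level, mask and n_rows are illustrative, not from the original answer.

```python
import numpy as np
import pandas as pd

# Illustrative frame with the same index layout as the example above.
index = pd.MultiIndex.from_product([["bar", "baz", "foo", "qux"], ["one", "two"]])
df = pd.DataFrame(np.arange(32).reshape(8, 4), index=index)

# First-level label of every row, and the label of the final row.
first_level = df.index.get_level_values(0)
last_key = first_level[-1]

# Count how many rows share that label, then take that many from the end.
mask = first_level == last_key
n_rows = int(mask.sum())
last_block = df.tail(n_rows)

# Equivalent selection that does not rely on the rows being contiguous.
same_block = df[mask]
```

The boolean mask variant is the safer of the two when the index is not sorted.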
Answered by Jeff
In pandas 0.11 (coming this week), this is a reasonable way to do it:
In [50]: arrays = [np.array(['bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux']),
.....: np.array(['one', 'two', 'one', 'two', 'one', 'two', 'one', 'two'])]
In [51]: df = pd.DataFrame(np.random.randn(8, 4), index=arrays)
In [52]: df
Out[52]:
                0         1         2         3
bar one -1.798562  0.852583 -0.148094 -2.107990
    two -1.091486 -0.748130  0.519758  2.621751
baz one -1.257548  0.210936 -0.338363 -0.141486
    two -0.810674  0.323798 -0.030920 -0.510224
foo one -0.427309  0.933469 -1.259559 -0.771702
    two -2.060524  0.795388 -1.458060 -1.762406
qux one -0.574841  0.023691 -1.567137  0.462715
    two  0.936323  0.346049 -0.709112  0.045066
In [53]: df.loc['qux'].iloc[[-1]]
Out[53]:
            0         1         2         3
two  0.936323  0.346049 -0.709112  0.045066
This will work in 0.10.1:
In [63]: df.ix['qux'].ix[-1]
Out[63]:
0    0.936323
1    0.346049
2   -0.709112
3    0.045066
Name: two, dtype: float64
Another way (this also works in 0.10.1):
In [59]: df.xs(('qux','two'))
Out[59]:
0    0.936323
1    0.346049
2   -0.709112
3    0.045066
Name: (qux, two), dtype: float64
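The snippets above pick a single row under a known key. For the date-indexed frame in the question, where the goal is every row under the latest date, a sketch along the following lines may be closer to what was asked. The frame is rebuilt from the values shown in the question, and the names latest and last_rows are illustrative.

```python
import pandas as pd

# Rebuild the frame from the question: dates on level 0, signals on level 1.
index = pd.MultiIndex.from_product(
    [pd.to_datetime(["2012-11-01", "2012-12-01", "2013-01-01"]), ["a", "b"]],
    names=["date", "signal"],
)
df = pd.DataFrame({"data": [0.04, 0.03, -0.01, 0.00, -0.00, -0.01]}, index=index)

# Latest date present in the first level of the index.
dates = df.index.get_level_values("date")
latest = dates.max()

# Keep every row whose date equals the latest one; both index levels survive.
last_rows = df[dates == latest]
```

Using max() on the level values selects the latest date even if the rows are not in chronological order.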
Answered by bdiamante
If you have a DataFrame df with a MultiIndex already defined, then:
df2 = df.ix[df.index[len(df.index)-1][0]]
would also work.
Answered by arch
You can get the last row with iloc:
df.iloc[-1]

