Python: How to get the last N rows of a pandas DataFrame?

Disclaimer: this page is a Chinese-English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/14663004/

Date: 2020-08-18 12:06:14  Source: igfitidea

How to get the last N rows of a pandas DataFrame?

Tags: python, pandas, dataframe

Asked by bigbug

I have pandas dataframes df1 and df2 (df1 is a vanilla dataframe, df2 is indexed by 'STK_ID' & 'RPT_Date'):

>>> df1
    STK_ID  RPT_Date  TClose   sales  discount
0   000568  20060331    3.69   5.975       NaN
1   000568  20060630    9.14  10.143       NaN
2   000568  20060930    9.49  13.854       NaN
3   000568  20061231   15.84  19.262       NaN
4   000568  20070331   17.00   6.803       NaN
5   000568  20070630   26.31  12.940       NaN
6   000568  20070930   39.12  19.977       NaN
7   000568  20071231   45.94  29.269       NaN
8   000568  20080331   38.75  12.668       NaN
9   000568  20080630   30.09  21.102       NaN
10  000568  20080930   26.00  30.769       NaN

>>> df2
                 TClose   sales  discount  net_sales    cogs
STK_ID RPT_Date                                             
000568 20060331    3.69   5.975       NaN      5.975   2.591
       20060630    9.14  10.143       NaN     10.143   4.363
       20060930    9.49  13.854       NaN     13.854   5.901
       20061231   15.84  19.262       NaN     19.262   8.407
       20070331   17.00   6.803       NaN      6.803   2.815
       20070630   26.31  12.940       NaN     12.940   5.418
       20070930   39.12  19.977       NaN     19.977   8.452
       20071231   45.94  29.269       NaN     29.269  12.606
       20080331   38.75  12.668       NaN     12.668   3.958
       20080630   30.09  21.102       NaN     21.102   7.431

I can get the last 3 rows of df2 by:

>>> df2.ix[-3:]
                 TClose   sales  discount  net_sales    cogs
STK_ID RPT_Date                                             
000568 20071231   45.94  29.269       NaN     29.269  12.606
       20080331   38.75  12.668       NaN     12.668   3.958
       20080630   30.09  21.102       NaN     21.102   7.431

while df1.ix[-3:] gives all the rows:

>>> df1.ix[-3:]
    STK_ID  RPT_Date  TClose   sales  discount
0   000568  20060331    3.69   5.975       NaN
1   000568  20060630    9.14  10.143       NaN
2   000568  20060930    9.49  13.854       NaN
3   000568  20061231   15.84  19.262       NaN
4   000568  20070331   17.00   6.803       NaN
5   000568  20070630   26.31  12.940       NaN
6   000568  20070930   39.12  19.977       NaN
7   000568  20071231   45.94  29.269       NaN
8   000568  20080331   38.75  12.668       NaN
9   000568  20080630   30.09  21.102       NaN
10  000568  20080930   26.00  30.769       NaN

Why? How can I get the last 3 rows of df1 (a dataframe without an index)? I am using pandas 0.10.1.

Accepted answer by Wes McKinney

Don't forget DataFrame.tail! e.g. df1.tail(10)
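As a minimal sketch of this suggestion, the snippet below calls tail on a frame shaped like df1. The values are a shortened, illustrative subset of the question's data, and the variable name last_three is hypothetical.

```python
import pandas as pd

# A small frame with the default integer index, similar in shape to df1.
df1 = pd.DataFrame({
    "STK_ID": ["000568"] * 5,
    "RPT_Date": ["20070930", "20071231", "20080331", "20080630", "20080930"],
    "TClose": [39.12, 45.94, 38.75, 30.09, 26.00],
})

# tail(n) returns the last n rows by position, whatever the index labels are.
last_three = df1.tail(3)
print(last_three)
```

If n is larger than the number of rows, as df1.tail(10) would be here, tail returns the whole frame rather than raising an error.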

Answer by Andy Hayden

This is because df1 has an integer index: ix selects by label rather than by position, so -3 is treated as a label and not as an offset from the end. This is by design: see integer indexing in the pandas "gotchas"*.

*In newer versions of pandas prefer loc or iloc to remove the ambiguity of ix as position or label:

df.iloc[-3:]

See the docs.

As Wes points out, in this specific case you should just use tail!
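To illustrate the label-versus-position distinction this answer describes, here is a small sketch comparing iloc and loc on a frame with the default integer index. The frame and the names by_position and by_label are illustrative assumptions, not part of the original answer; ix itself was removed in pandas 1.0, so it is not used here.

```python
import pandas as pd

# Default integer index: the labels 0..4 happen to coincide with positions.
df = pd.DataFrame({"sales": [5.975, 10.143, 13.854, 19.262, 6.803]})

# Position-based slicing: the last three rows.
by_position = df.iloc[-3:]

# Label-based slicing: every row whose label sorts at or after -3,
# which on this index means all rows.
by_label = df.loc[-3:]

print(len(by_position))
print(len(by_label))
```

The label-based slice reproduces the behaviour seen with df1.ix[-3:] in the question, while iloc gives the positional result that was wanted.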

Answer by cs95

How to get the last N rows of a pandas DataFrame?

If you are slicing by position, __getitem__ (i.e., slicing with []) works well, and is the most succinct solution I've found for this problem.

import numpy as np
import pandas as pd

pd.__version__
# '0.24.2'

df = pd.DataFrame({'A': list('aaabbbbc'), 'B': np.arange(1, 9)})
df

   A  B
0  a  1
1  a  2
2  a  3
3  b  4
4  b  5
5  b  6
6  b  7
7  c  8

df[-3:]

   A  B
5  b  6
6  b  7
7  c  8

This is the same as calling df.iloc[-3:], for instance (iloc internally delegates to __getitem__).
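The equivalence can be checked directly. The sketch below rebuilds the example frame and compares the two slices; it also repeats the slice on a copy indexed by column 'A' to show that an integer slice in [] is positional when the index holds strings. The names via_getitem, via_iloc and relabeled are hypothetical.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': list('aaabbbbc'), 'B': np.arange(1, 9)})

# Both expressions slice rows by position here.
via_getitem = df[-3:]
via_iloc = df.iloc[-3:]
print(via_getitem.equals(via_iloc))

# With a string index, an integer slice in [] still counts rows from the end.
relabeled = df.set_index('A')
print(relabeled[-3:])
```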

As an aside, if you want to find the last N rows for each group, use groupby and GroupBy.tail:

df.groupby('A').tail(2)

   A  B
1  a  2
2  a  3
5  b  6
6  b  7
7  c  8