Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/28964495/

Time: 2020-09-13 23:02:02  Source: igfitidea

how to slice a pandas data frame according to column values?

Tags: python, python-2.7, pandas

Asked by MJP

I have a pandas data frame with the following format:

year    col1 
y1      val_1 
y1      val_2
y1      val_3
y2      val_4
y2      val_5
y2      val_6
y3      val_7
y3      val_8
y3      val_9

How do I select only the rows up to and including year y2, and omit year y3?

I need a new data frame as follows:

   year      col1 
    y1      val_1 
    y1      val_2
    y1      val_3
    y2      val_4
    y2      val_5
    y2      val_6

y1, y2, y3 represent year values.

Answered by EdChum

On your sample dataset, the following works:

In [35]:

df.iloc[0:df[df.year == 'y3'].index[0]]
Out[35]:
  year   col1
0   y1  val_1
1   y1  val_2
2   y1  val_3
3   y2  val_4
4   y2  val_5
5   y2  val_6
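
For readers who want to reproduce this, here is a minimal self-contained sketch. The DataFrame construction is an assumption rebuilt from the sample in the question, and the names first_y3 and new_data are chosen here for illustration only.

```python
import pandas as pd

# Rebuild the sample data from the question (assumed construction).
df = pd.DataFrame({
    "year": ["y1"] * 3 + ["y2"] * 3 + ["y3"] * 3,
    "col1": ["val_%d" % i for i in range(1, 10)],
})

# Index label of the first row whose year is 'y3'.
# With the default integer index this label is also the row position.
first_y3 = df[df.year == "y3"].index[0]

# Positional slice: keep every row that comes before that one.
new_data = df.iloc[0:first_y3]
print(new_data)
```

This depends on the rows being ordered by year, as they are in the sample, since everything after the first 'y3' row is dropped.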

Breaking this down, we first perform boolean indexing to find the rows whose year equals 'y3':

In [36]:

df[df.year == 'y3']
Out[36]:
  year   col1
6   y3  val_7
7   y3  val_8
8   y3  val_9

But we are interested in the index of those rows, so that we can use it for slicing:

In [37]:

df[df.year == 'y3'].index
Out[37]:
Int64Index([6, 7, 8], dtype='int64')
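
One caveat worth keeping in mind: the values in this index are row labels, and iloc slices by position. The two coincide for the default integer index used in the sample, but not in general. The sketch below assumes a hypothetical frame with non-default labels (10 to 15) and uses NumPy's flatnonzero to get the position of the first match instead. It is one possible workaround, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Hypothetical frame whose index labels do not match row positions.
df = pd.DataFrame(
    {
        "year": ["y1", "y1", "y2", "y2", "y3", "y3"],
        "col1": ["val_1", "val_2", "val_3", "val_4", "val_5", "val_6"],
    },
    index=[10, 11, 12, 13, 14, 15],
)

# Boolean mask as a plain NumPy array, one entry per row.
mask = (df.year == "y3").to_numpy()

# Position (not label) of the first True entry.
# This raises IndexError if no row matches.
first_pos = np.flatnonzero(mask)[0]

# Positional slice up to, but excluding, the first 'y3' row.
new_data = df.iloc[:first_pos]
print(new_data)
```

Calling df.reset_index(drop=True) first would be another way to make labels and positions agree again.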

We only need the first of these values for slicing, hence the call to index[0]. However, if your df is already sorted by year value, then simply performing df[df.year < 'y3'] would be simpler and would also work.
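
A minimal sketch of the comparison-based filter, again assuming the sample data from the question. The isin variant and the names by_comparison and by_membership are additions for illustration and do not come from the original answer. Note that the year values here are strings, so the comparison is lexicographic: it behaves as expected for 'y1' to 'y3', but a label such as 'y10' would sort before 'y3'.

```python
import pandas as pd

# Rebuild the sample data from the question (assumed construction).
df = pd.DataFrame({
    "year": ["y1"] * 3 + ["y2"] * 3 + ["y3"] * 3,
    "col1": ["val_%d" % i for i in range(1, 10)],
})

# Keep rows whose year string compares less than 'y3'.
by_comparison = df[df.year < "y3"]

# Alternative: list the wanted years explicitly, which avoids
# relying on how the strings happen to sort.
by_membership = df[df.year.isin(["y1", "y2"])]

print(by_comparison)
```

For the sample data, both selections give the same six rows as the iloc slice.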