Original URL: http://stackoverflow.com/questions/28964495/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
How to slice a pandas DataFrame according to column values?
Asked by MJP
I have a pandas DataFrame with the following format:
year col1
y1 val_1
y1 val_2
y1 val_3
y2 val_4
y2 val_5
y2 val_6
y3 val_7
y3 val_8
y3 val_9
How do I select only the rows up to and including year 2, and omit year 3?
I need a new DataFrame as follows:
year col1
y1 val_1
y1 val_2
y1 val_3
y2 val_4
y2 val_5
y2 val_6
y1, y2, y3 represent year values.
Answered by EdChum
On your sample dataset, the following works:
In [35]:
df.iloc[0:df[df.year == 'y3'].index[0]]
Out[35]:
year col1
0 y1 val_1
1 y1 val_2
2 y1 val_3
3 y2 val_4
4 y2 val_5
5 y2 val_6
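For anyone who wants to reproduce this outside the interactive session, here is a minimal self-contained sketch. It assumes the frame has the default integer index (0, 1, 2, ...), so that the label returned by index[0] coincides with the row position that iloc expects. The names first_y3 and new_data are only illustrative.

```python
import pandas as pd

# Rebuild the sample data from the question (default index 0..8).
df = pd.DataFrame({
    "year": ["y1"] * 3 + ["y2"] * 3 + ["y3"] * 3,
    "col1": [f"val_{i}" for i in range(1, 10)],
})

# Index label of the first row whose year is 'y3'.
# With the default index, this label equals the row position.
first_y3 = df[df.year == "y3"].index[0]

# Positional slice: every row before the first 'y3' row.
new_data = df.iloc[0:first_y3]
print(new_data)
```

If the index is not the default one (for example after filtering or sorting without resetting it), the label and the position can differ, and this slice would no longer be reliable.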
Breaking this down, we first perform boolean indexing to find the rows whose year equals 'y3':
In [36]:
df[df.year == 'y3']
Out[36]:
year col1
6 y3 val_7
7 y3 val_8
8 y3 val_9
But we are interested in the index of those rows, so that we can use it for slicing:
In [37]:
df[df.year == 'y3'].index
Out[37]:
Int64Index([6, 7, 8], dtype='int64')
We only need the first of these values for slicing, hence the call to index[0]. However, if your df is already sorted by year value, then simply using df[df.year < 'y3'] would be simpler and would also work.
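A sketch of that simpler alternative follows. The comparison relies on the year labels being strings that sort lexicographically in chronological order ('y1' < 'y2' < 'y3'), which holds for the sample data but is an assumption about real data. The isin variant is shown as another option when you would rather list the years to keep explicitly. The names before_y3 and keep are illustrative.

```python
import pandas as pd

# Same sample data as in the question.
df = pd.DataFrame({
    "year": ["y1"] * 3 + ["y2"] * 3 + ["y3"] * 3,
    "col1": [f"val_{i}" for i in range(1, 10)],
})

# Boolean mask: keep rows whose year label sorts before 'y3'.
before_y3 = df[df.year < "y3"]

# Alternative: name the years to keep explicitly.
keep = df[df.year.isin(["y1", "y2"])]

print(before_y3)
```

Both versions select rows by value rather than by position, so they do not depend on the index being the default one.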

