Get the first row of a dataframe in Python Pandas based on a condition

Question

Asked by Tasos

Let's say that I have a dataframe like this one

import pandas as pd
df = pd.DataFrame([[1, 2, 1], [1, 3, 2], [4, 6, 3], [4, 3, 4], [5, 4, 5]], columns=['A', 'B', 'C'])

>> df
   A  B  C
0  1  2  1
1  1  3  2
2  4  6  3
3  4  3  4
4  5  4  5

The original table is more complicated with more columns and rows.

I want to get the first row that fulfils some criteria. Examples:

Get first row where A > 3 (returns row 2)
Get first row where A > 4 AND B > 3 (returns row 4)
Get first row where A > 3 AND (B > 3 OR C > 2) (returns row 2)

But if no row fulfils the criteria, I want the first row after sorting the dataframe in descending order by A (or, in other cases, by B, C, etc.):

Get first row where A > 6 (returns row 4: no row matches, so sort by A descending and take the first row)

I was able to do it by iterating over the dataframe (I know that's crap :P), so I'd prefer a more Pythonic way to solve it.

Answer 1

Answered by Tgsmith61591

This tutorial is a very good one for pandas slicing. Make sure you check it out. On to some snippets... To slice a dataframe with a condition, you use this format:

>>> df[condition]

This will return a slice of your dataframe which you can index using iloc. Here are your examples:

Get first row where A > 3 (returns row 2)

>>> df[df.A > 3].iloc[0]
A    4
B    6
C    3
Name: 2, dtype: int64

If what you actually want is the row's index label rather than the row itself, use df[df.A > 3].index[0] instead of iloc. With the default index, as here, that label is also the row number.
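
The index label and the positional row number only coincide when the frame has the default integer index. The following is a minimal sketch of the difference, assuming a copy of the example frame with made-up letter labels (the `relabelled` name and the labels are hypothetical, not part of the question):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[1, 2, 1], [1, 3, 2], [4, 6, 3], [4, 3, 4], [5, 4, 5]],
                  columns=['A', 'B', 'C'])

# Hypothetical re-labelled copy, so that labels and positions differ.
relabelled = df.copy()
relabelled.index = ['a', 'b', 'c', 'd', 'e']

mask = relabelled.A > 3

# Index label of the first matching row.
first_label = relabelled[mask].index[0]

# Zero-based position of the first matching row:
# np.flatnonzero returns the positions where the mask is True.
first_position = int(np.flatnonzero(mask.to_numpy())[0])

print(first_label, first_position)
```

Both lookups raise an IndexError when no row matches, so the no-match case still needs separate handling.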

Get first row where A > 4 AND B > 3:

>>> df[(df.A > 4) & (df.B > 3)].iloc[0]
A    5
B    4
C    5
Name: 4, dtype: int64

Get first row where A > 3 AND (B > 3 OR C > 2) (returns row 2)

>>> df[(df.A > 3) & ((df.B > 3) | (df.C > 2))].iloc[0]
A    4
B    6
C    3
Name: 2, dtype: int64
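
The parentheses around each comparison matter: in Python, `&` and `|` bind more tightly than `>`, so leaving them out changes how the expression is parsed. One way to keep compound conditions readable is to name the individual masks first. A small sketch, where the mask names `a_ok` and `b_or_c_ok` are only illustrative:

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 1], [1, 3, 2], [4, 6, 3], [4, 3, 4], [5, 4, 5]],
                  columns=['A', 'B', 'C'])

# Each mask is a boolean Series aligned with df's index.
a_ok = df.A > 3
b_or_c_ok = (df.B > 3) | (df.C > 2)

# Combine the named masks and take the first matching row.
first = df[a_ok & b_or_c_ok].iloc[0]

# first.name holds the index label of the selected row.
print(first.name)
```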

Now, for your last case, we can write a function that handles the default case by returning the first row of the descending-sorted frame:

>>> def series_or_default(X, condition, default_col, ascending=False):
...     sliced = X[condition]
...     if sliced.shape[0] == 0:
...         return X.sort_values(default_col, ascending=ascending).iloc[0]
...     return sliced.iloc[0]
>>> 
>>> series_or_default(df, df.A > 6, 'A')
A    5
B    4
C    5
Name: 4, dtype: int64

As expected, it returns row 4.

Answer 2

Answered by Boud

For existing matches, use query:

df.query('A > 3').head(1)
Out[33]: 
   A  B  C
2  4  6  3

df.query('A > 4 and B > 3').head(1)
Out[34]: 
   A  B  C
4  5  4  5

df.query('A > 3 and (B > 3 or C > 2)').head(1)
Out[35]: 
   A  B  C
2  4  6  3
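
query on its own only covers the case where a match exists. One possible way to add the sort-based fallback from the question is to check whether the result is empty. This is a sketch only; the helper name `first_match_or_top` and its parameters are hypothetical:

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 1], [1, 3, 2], [4, 6, 3], [4, 3, 4], [5, 4, 5]],
                  columns=['A', 'B', 'C'])


def first_match_or_top(frame, expr, sort_col):
    """Hypothetical helper: first row matching expr, else the top row by sort_col."""
    matches = frame.query(expr)
    if matches.empty:
        # No row matched: fall back to the row with the largest sort_col value.
        return frame.sort_values(sort_col, ascending=False).head(1)
    return matches.head(1)


print(first_match_or_top(df, 'A > 3', 'A'))  # a match exists
print(first_match_or_top(df, 'A > 6', 'A'))  # no match, falls back to sorting
```

Like head(1), the helper returns a one-row DataFrame rather than a Series.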

Answer 3

Answered by PabTorre

You can take care of the first 3 items with slicing and head:

df[df.A > 3].head(1)
df[(df.A > 4) & (df.B > 3)].head(1)
df[(df.A > 3) & ((df.B > 3) | (df.C > 2))].head(1)

You can handle the case where nothing comes back with a try or an if:

try:
    output = df[df.A > 6].head(1)
    # An empty result means no row matched, so trigger the fallback.
    assert len(output) == 1
except AssertionError:
    output = df.sort_values('A', ascending=False).head(1)
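
The if variant could look like the sketch below. Testing `.empty` avoids using an exception for control flow, and unlike assert it keeps working when Python runs with the -O flag, which strips assertions:

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 1], [1, 3, 2], [4, 6, 3], [4, 3, 4], [5, 4, 5]],
                  columns=['A', 'B', 'C'])

output = df[df.A > 6].head(1)
if output.empty:
    # No row has A > 6, so take the row with the largest A instead.
    output = df.sort_values('A', ascending=False).head(1)

print(output)
```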
