pandas: find the row with the maximum difference between two columns

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/40952450/

Time: 2020-09-14 02:34:12  Source: igfitidea

Find the row which has the maximum difference between two columns

python pandas

Asked by ayhan

I have a DataFrame with columns Gold and Gold.1. I want to find the row where the difference between these two columns is largest.

For the following DataFrame, this should return row 6.

df
Out: 
   Gold  Gold.1
0     2       1
1     1       4
2     6       9
3     4       4
4     4       8
5     5       5
6     5       2 ---> The difference is maximum (3)
7     5       9
8     5       3
9     5       6

I tried using the following:

df.where(max(df['Gold']-df['Gold.1']))

However, that raised a ValueError:

df.where(max(df['Gold']-df['Gold.1']))
Traceback (most recent call last):

  File "", line 1, in 
    df.where(max(df['Gold']-df['Gold.1']))

  File "../python3.5/site-packages/pandas/core/generic.py", line 5195, in where
    raise_on_error)

  File "../python3.5/site-packages/pandas/core/generic.py", line 4936, in _where
    raise ValueError('Array conditional must be same shape as '

ValueError: Array conditional must be same shape as self

How can I find the row that satisfies this condition?

Answered by ayhan

Instead of .where, you can use .idxmax:

(df['Gold'] - df['Gold.1']).idxmax()
Out: 6

This will return the index where the difference is maximum.
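
As a minimal sketch of the above, the example frame from the question can be rebuilt and the full row fetched with .loc. The variable names diff, best_label and best_row are illustrative and not part of the original answer. Note that max(...) in the question evaluates to a single number rather than a boolean mask shaped like df, which is consistent with the "same shape as self" error.

```python
import pandas as pd

# Rebuild the example DataFrame from the question.
df = pd.DataFrame({
    "Gold":   [2, 1, 6, 4, 4, 5, 5, 5, 5, 5],
    "Gold.1": [1, 4, 9, 4, 8, 5, 2, 9, 3, 6],
})

diff = df["Gold"] - df["Gold.1"]   # signed difference, one value per row
best_label = diff.idxmax()         # index label of the largest difference
best_row = df.loc[best_label]      # the full row, looked up by that label

print(best_label)
print(best_row)
```

Since idxmax returns an index label, .loc (label-based lookup) is the matching accessor here; .iloc would only agree when the index is the default 0..n-1 range.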

If you want to find the row with the maximum absolute difference, you can call .abs() first.

(df['Gold'] - df['Gold.1']).abs().idxmax()
Out: 4
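
In this example data the largest absolute difference occurs in more than one row, and idxmax reports only the first of them. The sketch below, which is an addition and not part of the original answer, shows one way to list every tied row with a boolean mask. The names abs_diff, first_label and tied_rows are illustrative.

```python
import pandas as pd

# Rebuild the example DataFrame from the question.
df = pd.DataFrame({
    "Gold":   [2, 1, 6, 4, 4, 5, 5, 5, 5, 5],
    "Gold.1": [1, 4, 9, 4, 8, 5, 2, 9, 3, 6],
})

abs_diff = (df["Gold"] - df["Gold.1"]).abs()

# idxmax returns the first label at which the maximum occurs.
first_label = abs_diff.idxmax()

# A boolean mask keeps every row whose absolute difference equals the maximum.
tied_rows = df[abs_diff == abs_diff.max()]

print(first_label)
print(tied_rows)
```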

Answered by Loochie

Though my method is longer than the one above, people who are comfortable working with lists may find this useful.

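# col1 and col2 stand for the two columns being compared (Gold and Gold.1 in the question)
x = list((df['col1'] - df['col2']).abs())
x.index(max(x))  # position of the first maximum in the list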
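
One caveat with the list approach: list.index returns a position, not an index label. The two coincide for a default 0..n-1 index but differ otherwise. The sketch below uses a small hypothetical frame with string labels (the data and the names pos, label and row are assumptions for illustration) to show how the position can be mapped back to a label or row.

```python
import pandas as pd

# Hypothetical frame with string labels, so position and label differ.
df = pd.DataFrame(
    {"col1": [2, 1, 6], "col2": [1, 4, 9]},
    index=["a", "b", "c"],
)

x = list((df["col1"] - df["col2"]).abs())
pos = x.index(max(x))    # position of the first maximum in the list

label = df.index[pos]    # translate the position into an index label
row = df.iloc[pos]       # or fetch the row directly by position

print(pos, label)
print(row)
```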