Note: this page is a translation of a popular Stack Overflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/15741759/

Time: 2020-08-18 20:51:33  Source: igfitidea

Find maximum value of a column and return the corresponding row values using Pandas

Tags: python, pandas, dataframe, max

Asked by richie

Structure of data:

Using Python Pandas, I am trying to find the Country and Place with the maximum value.

This returns the maximum value:

data.groupby(['Country','Place'])['Value'].max()

But how do I get the corresponding Country and Place names?

Accepted answer by unutbu

Assuming df has a unique index, this gives the row with the maximum value:

In [34]: df.loc[df['Value'].idxmax()]
Out[34]: 
Country        US
Place      Kansas
Value         894
Name: 7
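The same lookup as a standalone sketch. The sample rows are hypothetical (the question's real table is not shown here), so the result differs from the session output above.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "Country": ["Spain", "UK", "US", "US"],
    "Place": ["Manchester", "London", "Michigan", "NewYork"],
    "Value": [512, 778, 854, 562],
})

label = df["Value"].idxmax()  # index label of the row with the largest Value
row = df.loc[label]           # the whole row, returned as a Series

print(row["Country"], row["Place"], row["Value"])
```

Because the index here is the default unique RangeIndex, df.loc returns exactly one row.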

Note that idxmax returns index labels. If the DataFrame has duplicates in its index, the label may not uniquely identify the row, so df.loc may return more than one row.

Therefore, if df does not have a unique index, you must make the index unique before proceeding as above. Depending on the DataFrame, you can sometimes use stack or set_index to make the index unique. Or you can simply reset the index (so the rows are renumbered, starting at 0):

df = df.reset_index()
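A small sketch of the duplicate-index pitfall, using a hypothetical frame in which the label 1 appears twice. It passes drop=True, which is an addition here, so the old index is discarded rather than kept as an extra column.

```python
import pandas as pd

# Hypothetical frame whose index contains the label 1 twice.
df = pd.DataFrame(
    {"Country": ["UK", "US", "US"], "Value": [778, 854, 562]},
    index=[0, 1, 1],
)

label = df["Value"].idxmax()  # the label 1, which is shared by two rows
ambiguous = df.loc[label]     # a DataFrame holding both rows labelled 1

# Renumber the rows so that every label identifies exactly one row.
clean = df.reset_index(drop=True)
unique_row = clean.loc[clean["Value"].idxmax()]  # a single row (Series)

print(ambiguous)
print(unique_row)
```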

Answer by waitingkuo

Use the index attribute of the grouped result. Note that I didn't type all the rows in the example.

In [14]: df = data.groupby(['Country','Place'])['Value'].max()

In [15]: df.index
Out[15]: 
MultiIndex
[Spain  Manchester, UK     London    , US     Mchigan   ,        NewYork   ]

In [16]: df.index[0]
Out[16]: ('Spain', 'Manchester')

In [17]: df.index[1]
Out[17]: ('UK', 'London')

You can also get the value by that index:

In [21]: for index in df.index:
   ....:     print(index, df[index])
   ....:
('Spain', 'Manchester') 512
('UK', 'London') 778
('US', 'Mchigan') 854
('US', 'NewYork') 562

Edit

Sorry for misunderstanding what you want. Try the following:

In [52]: s = data.loc[data['Value'].idxmax()]

In [53]: print('%s, %s, %s' % (s['Country'], s['Place'], s['Value']))
US, Mchigan, 854

Answer by HYRY

The country and place form the index of the Series. If you don't need them as the index, you can set as_index=False:

df.groupby(['country','place'], as_index=False)['value'].max()

Edit:

It seems that you want the place with the maximum value for every country. The following code does that:

df.groupby("country").apply(lambda g: g.iloc[g["value"].values.argmax()])
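An alternative sketch of the per-country lookup that avoids apply: take the index label of each group's maximum with idxmax, then select those rows with loc. The sample data is hypothetical, and the approach assumes a unique index.

```python
import pandas as pd

# Hypothetical sample data using the lowercase column names from this answer.
df = pd.DataFrame({
    "country": ["Spain", "UK", "US", "US"],
    "place": ["Manchester", "London", "Michigan", "NewYork"],
    "value": [512, 778, 854, 562],
})

# One index label per country: the row where that country's value peaks.
best_labels = df.groupby("country")["value"].idxmax()

# Select the full rows for those labels.
best_rows = df.loc[best_labels]

print(best_rows)
```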

Answer by Arpit Sharma

To print the Country and Place with the maximum value, use the following line of code:

print(df.loc[df['Value'] == df['Value'].max(), ['Country', 'Place']])

Answer by Gaurav

df[df['Value']==df['Value'].max()]

This returns every row whose Value equals the maximum, as complete rows (more than one row if the maximum is tied).
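A short sketch of how the boolean mask behaves when the maximum is tied, compared with idxmax, which reports only the first occurrence. The data is hypothetical.

```python
import pandas as pd

# Hypothetical data in which two rows share the maximum Value.
df = pd.DataFrame({
    "Country": ["UK", "US", "US"],
    "Place": ["London", "Kansas", "Texas"],
    "Value": [778, 894, 894],
})

mask = df["Value"] == df["Value"].max()  # True for every row at the maximum
all_max = df[mask]                       # both tied rows

first_max = df.loc[df["Value"].idxmax()]  # only the first row at the maximum

print(all_max)
print(first_max)
```

Which of the two is appropriate depends on whether ties should be reported or collapsed to a single row.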

Answer by sharad kakran

I think the easiest way to return a row with the maximum value is by getting its position. In current pandas versions, argmax() returns the integer position of the row with the largest value.

index = df.Value.argmax()

Now the index can be used to get the features (here, the first two columns) for that particular row:

df.iloc[df.Value.argmax(), 0:2]
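A sketch of the difference between a position and a label, assuming pandas 1.0 or later, where Series.argmax returns an integer position. The frame and its non-default index are hypothetical.

```python
import pandas as pd

# Hypothetical data with a non-default index, so positions and labels differ.
df = pd.DataFrame(
    {
        "Country": ["Spain", "US", "US"],
        "Place": ["Manchester", "Michigan", "NewYork"],
        "Value": [512, 854, 562],
    },
    index=[10, 20, 30],
)

pos = df["Value"].argmax()    # integer position of the maximum
label = df["Value"].idxmax()  # index label of the maximum

by_pos = df.iloc[pos, 0:2]                       # position-based lookup
by_label = df.loc[label, ["Country", "Place"]]   # label-based lookup

print(pos, label)
print(by_pos)
```

Pairing argmax with iloc and idxmax with loc keeps the lookup consistent whatever the index looks like.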

Answer by Marcin Lentner

My solution for finding the maximum values in columns (the columns need to be numeric):

df.loc[df.idxmax()]

and likewise for the minimum:

df.loc[df.idxmin()]
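A minimal sketch of the same idea with label-based loc, restricted to numeric columns so that text columns do not take part in the comparison. The frame and the Score column are hypothetical.

```python
import pandas as pd

# Hypothetical frame with one text column and two numeric columns.
df = pd.DataFrame({
    "Place": ["London", "Kansas", "NewYork"],
    "Value": [778, 894, 562],
    "Score": [3.5, 1.0, 9.2],
})

numeric = df.select_dtypes(include="number")  # keep only Value and Score

# One index label per numeric column, then the matching full rows.
max_rows = df.loc[numeric.idxmax()]
min_rows = df.loc[numeric.idxmin()]

print(max_rows)
print(min_rows)
```

Each result holds one row per numeric column, in column order, so a row can appear more than once if it holds the extreme value of several columns.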

Answer by saran3h

I'd recommend using nlargest for better performance and shorter code:

import pandas

df.nlargest(n=1, columns=col_name)
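A sketch of DataFrame.nlargest applied to the question, using hypothetical data with a tied maximum. By default only the first matching row is kept; keep="all" returns every tied row.

```python
import pandas as pd

# Hypothetical data; two rows share the largest Value.
df = pd.DataFrame({
    "Country": ["UK", "US", "US", "Spain"],
    "Place": ["London", "Kansas", "Texas", "Manchester"],
    "Value": [778, 894, 894, 512],
})

top = df.nlargest(n=1, columns="Value")                    # first row at the maximum
top_ties = df.nlargest(n=1, columns="Value", keep="all")   # every row at the maximum

print(top)
print(top_ties)
```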

Answer by Jefferson Sankara

I encountered a similar error while trying to import data using pandas. The first column of my dataset had spaces before the start of the words. I removed the spaces and it worked like a charm!
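One possible way to remove such leading spaces, sketched on a hypothetical frame. The answer does not say whether the spaces were in the column name or in the values, so this strips both.

```python
import pandas as pd

# Hypothetical frame: the first column's name and values start with a space.
df = pd.DataFrame({" Country": [" US", " UK"], "Value": [894, 778]})

df.columns = df.columns.str.strip()        # clean the column names
df["Country"] = df["Country"].str.strip()  # clean the text values

row = df.loc[df["Value"].idxmax()]
print(row["Country"], row["Value"])
```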

Answer by kelvinkahuro

You can use:

print(df[df['Value']==df['Value'].max()])