Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original Stack Overflow URL: http://stackoverflow.com/questions/25071937/

Date: 2020-09-13 22:19:16  Source: igfitidea

Filter pandas Dataframe based on max values in a column

python, numpy, pandas

Asked by wrcobb

I have a DataFrame with repeating values in the index. I would like to filter this dataset down to only show me one instance of each index by selecting the row within the index with the greatest value in a different column. For example, my DataFrame looks like this:

df:

Product ID     Store     Sales
    1            A         50
    1            B        200
    1            C         20
    2            A        400
    2            B         10
    3            A        200
    4            A         50
    4            B        100
    4            C        500

I would like to filter this data down to this:

df2:

Product ID     Store     Sales
    1            B        200
    2            A        400
    3            A        200
    4            C        500

Any thoughts on how best to approach this issue in pandas?

Thanks very much for your time -

Answered by EdChum

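You can perform a groupby on the product ID column (named 'Product_ID' in the output below), then apply idxmax on the 'Sales' column. This produces a Series that holds, for each product, the index label of the row with the highest 'Sales' value. We can then use those labels to select rows from the original DataFrame. Because idxmax returns index labels rather than positions, loc is the safer accessor; iloc gives the same result here only because the DataFrame has a default integer index (0, 1, 2, ...).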

In [201]: df.loc[df.groupby('Product_ID')['Sales'].idxmax()]
Out[201]:
   Product_ID Store  Sales
1           1     B    200
3           2     A    400
5           3     A    200
8           4     C    500
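
For anyone who wants to reproduce this end to end, here is a minimal sketch rather than a definitive implementation. It rebuilds the sample data from the question, assumes the column is named 'Product_ID' as in the output above, and uses variable names (best_rows, indexed, flat, df3) that are only illustrative. The second half is an assumed variant for the case the question describes, where the product ID is itself the repeating index: the labels are then ambiguous, so the index is moved into a column before calling idxmax.

```python
import pandas as pd

# Sample data copied from the question; the column name 'Product_ID'
# follows the answer's output (an assumption about the real data).
df = pd.DataFrame({
    'Product_ID': [1, 1, 1, 2, 2, 3, 4, 4, 4],
    'Store': ['A', 'B', 'C', 'A', 'B', 'A', 'A', 'B', 'C'],
    'Sales': [50, 200, 20, 400, 10, 200, 50, 100, 500],
})

# For each product, the index label of the row with the highest Sales.
best_rows = df.groupby('Product_ID')['Sales'].idxmax()

# idxmax returns labels, so select with loc rather than iloc.
df2 = df.loc[best_rows]
print(df2)

# Assumed variant: Product_ID is the (repeating) index, as in the question.
# Duplicate labels cannot identify a single row, so flatten the index first.
indexed = df.set_index('Product_ID')
flat = indexed.reset_index()
df3 = flat.loc[flat.groupby('Product_ID')['Sales'].idxmax()].set_index('Product_ID')
print(df3)
```

If two rows of the same product tie for the highest 'Sales', idxmax keeps the first one it encounters, so the result still has one row per product.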