Python: Find the column name which has the maximum value for each row

Question

Asked by markov zain

I have a DataFrame like this one:

In [7]:
frame.head()
Out[7]:
   Communications and Search  Business   General  Lifestyle
0                   0.745763  0.050847  0.118644   0.084746
0                   0.333333  0.000000  0.583333   0.083333
0                   0.617021  0.042553  0.297872   0.042553
0                   0.435897  0.000000  0.410256   0.153846
0                   0.358974  0.076923  0.410256   0.153846

How can I get the name of the column that holds the maximum value in each row? The desired output looks like this:

In [7]:
frame.head()
Out[7]:
   Communications and Search  Business   General  Lifestyle             Max
0                   0.745763  0.050847  0.118644   0.084746  Communications
0                   0.333333  0.000000  0.583333   0.083333        Business
0                   0.617021  0.042553  0.297872   0.042553  Communications
0                   0.435897  0.000000  0.410256   0.153846  Communications
0                   0.358974  0.076923  0.410256   0.153846        Business

Answer 1

Accepted answer by Alex Riley

You can use idxmax with axis=1 to find the column with the greatest value on each row:

>>> df.idxmax(axis=1)
0    Communications
1          Business
2    Communications
3    Communications
4          Business
dtype: object

To create the new column 'Max', use df['Max'] = df.idxmax(axis=1).
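
As a minimal sketch of the two steps above, the following builds a small frame and attaches the 'Max' column. The data and the three column names are hypothetical and chosen only for illustration; they are not the asker's actual frame.

```python
import pandas as pd

# Hypothetical data: values and column names are made up for illustration.
df = pd.DataFrame(
    {
        "Communications": [0.7, 0.2, 0.5],
        "Business": [0.1, 0.6, 0.3],
        "General": [0.2, 0.2, 0.2],
    }
)

# For each row, the label of the column holding the largest value.
row_max_col = df.idxmax(axis=1)

# Store the result as a new column named 'Max'.
df["Max"] = row_max_col

print(df)
```

Once 'Max' has been added, the frame contains a string column, so calling df.idxmax(axis=1) on the whole frame again may raise an error or behave differently depending on the pandas version. Selecting only the numeric columns first avoids that.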

To find the row index at which the maximum value occurs in each column, use df.idxmax() (or equivalently df.idxmax(axis=0)).
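
A short sketch of the column-wise variant, again on hypothetical data. The row labels r0, r1, r2 are made up so that the returned index labels are easy to tell apart from positions.

```python
import pandas as pd

# Hypothetical data with explicit row labels.
df = pd.DataFrame(
    {"Communications": [0.7, 0.2, 0.5], "Business": [0.1, 0.6, 0.3]},
    index=["r0", "r1", "r2"],
)

# For each column, the row label at which the maximum occurs.
col_max_row = df.idxmax()  # axis=0 is the default

print(col_max_row)
```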

Answer 2

Answer by Zero

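You could use apply on the DataFrame and take the argmax() of each row via axis=1. Note that this relies on older pandas versions, where Series.argmax() returned the label; since pandas 1.0 it returns the integer position, so use x.idxmax() inside the lambda to get the column name: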

In [144]: df.apply(lambda x: x.argmax(), axis=1)
Out[144]:
0    Communications
1          Business
2    Communications
3    Communications
4          Business
dtype: object
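
The sketch below shows how the same approach might look on pandas 1.0 or later, where argmax() yields positions and idxmax() yields labels. The frame is hypothetical, and the version behaviour is an assumption about the installed pandas.

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({"Communications": [0.7, 0.2], "Business": [0.1, 0.6]})

# On pandas >= 1.0 (assumed), argmax gives the integer position within each row...
positions = df.apply(lambda row: row.argmax(), axis=1)

# ...while idxmax gives the column label directly.
labels = df.apply(lambda row: row.idxmax(), axis=1)

# Positions can be mapped back to labels through df.columns.
labels_from_positions = df.columns[positions.to_numpy()]

print(positions.tolist(), labels.tolist())
```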

Here's a benchmark showing how much slower the apply method is than idxmax() for len(df) ~ 20K (roughly 10x):

In [146]: %timeit df.apply(lambda x: x.argmax(), axis=1)
1 loops, best of 3: 479 ms per loop

In [147]: %timeit df.idxmax(axis=1)
10 loops, best of 3: 47.3 ms per loop
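
For readers without IPython's %timeit, a rough way to reproduce the comparison in a plain script is sketched below. The random 20,000-row frame and the column names a to d are assumptions made for this sketch, and the timings will vary by machine and pandas version.

```python
import time

import numpy as np
import pandas as pd

# Hypothetical benchmark frame: 20,000 rows of random values, 4 columns.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.random((20_000, 4)), columns=["a", "b", "c", "d"])

# Row-wise apply: calls a Python function once per row.
start = time.perf_counter()
via_apply = df.apply(lambda row: row.idxmax(), axis=1)
t_apply = time.perf_counter() - start

# Built-in idxmax on the whole frame.
start = time.perf_counter()
via_idxmax = df.idxmax(axis=1)
t_idxmax = time.perf_counter() - start

print(f"apply:  {t_apply * 1000:.1f} ms")
print(f"idxmax: {t_idxmax * 1000:.1f} ms")
```

A single timed run like this is noisier than %timeit, which repeats the statement and reports the best result, so treat the numbers as indicative only.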

Answer 3

Answer by user1718097

If you want a column containing the name of the column with the maximum value, but considering only a subset of columns, you can use a variation of @ajcr's answer:

df['Max'] = df[['Communications','Business']].idxmax(axis=1)
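
To make the effect of the subset visible, here is a small sketch on hypothetical data in which a third column, General, holds the largest value in every row but is deliberately left out of the comparison.

```python
import pandas as pd

# Hypothetical data: 'General' is the overall maximum in both rows.
df = pd.DataFrame(
    {
        "Communications": [0.3, 0.1],
        "Business": [0.2, 0.4],
        "General": [0.5, 0.5],
    }
)

# Only these columns take part in the comparison.
subset = ["Communications", "Business"]
df["Max"] = df[subset].idxmax(axis=1)

print(df)
```

Because General is excluded, 'Max' names the larger of Communications and Business for each row.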
