Pandas: column name of the second largest value

Question

Asked by AtotheSiv

I am trying to find the column names associated with the largest and second largest values in each row of a DataFrame. Here's a simplified example (the real one has over 500 columns):

Date  val1  val2 val3 val4
1990   5     7    1    10
1991   2     1    10   3
1992   10    9    6    1
1993   50    10   2    15
1994   1     15   7    8

This needs to become:

Date  1larg   2larg
1990  val4    val2
1991  val3    val4
1992  val1    val2
1993  val1    val4
1994  val2    val4

I can find the column name with the largest value (i.e., 1larg above) with idxmax, but how can I find the second largest?

Answer 1

Answered by DSM

(You don't have any duplicate maximum values in your rows, so I'll guess that if a row were [1,1,2,2] you'd want val3 and val4 to be selected.)

One way would be to use the result of argsort to index into the column names.

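import numpy as np
import pandas as pd
df = df.set_index("Date")
# Column positions sorted by value within each row (ascending)
arank = df.apply(np.argsort, axis=1)
# Reverse to descending, keep the top two, and look up the column names
ranked_cols = df.columns.to_numpy()[arank.values[:, ::-1][:, :2]]
new_frame = pd.DataFrame(ranked_cols, index=df.index)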

produces

         0     1
Date            
1990  val4  val2
1991  val3  val4
1992  val1  val2
1993  val1  val4
1994  val2  val4
1995  val4  val3

(where I've added an extra 1995 row with the values [1,1,2,2].)
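To run the argsort idea end to end, here is a self-contained sketch that rebuilds the sample data from the question (plus the extra 1995 row) and labels the output columns 1larg and 2larg as the question asks. The names `order` and `top2`, the output labels, and the use of `kind="stable"` are assumptions added for illustration and are not part of the original answer.

```python
import numpy as np
import pandas as pd

# Sample data from the question, plus the extra 1995 row [1, 1, 2, 2].
df = pd.DataFrame(
    {
        "Date": [1990, 1991, 1992, 1993, 1994, 1995],
        "val1": [5, 2, 10, 50, 1, 1],
        "val2": [7, 1, 9, 10, 15, 1],
        "val3": [1, 10, 6, 2, 7, 2],
        "val4": [10, 3, 1, 15, 8, 2],
    }
).set_index("Date")

# Column positions sorted by value within each row (ascending).
# kind="stable" is an assumption added here so ties resolve deterministically.
order = np.argsort(df.to_numpy(), axis=1, kind="stable")

# Reverse to descending order and keep the two largest positions per row.
top2 = order[:, ::-1][:, :2]

# Map positions back to column names and label the result.
new_frame = pd.DataFrame(
    df.columns.to_numpy()[top2],
    index=df.index,
    columns=["1larg", "2larg"],
)
print(new_frame)
```

Because this works on the underlying NumPy array in one call, it should scale reasonably to the 500+ columns mentioned in the question, although that has not been measured here.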

Alternatively, you could probably melt into a flat (long) format, pick out the largest two values in each Date group, and then pivot it back.
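The melt route is only described in words, so here is one possible sketch of it using the question's sample data. The intermediate names (`long`, `top2`, `rank`) and the output labels are hypothetical, and this is just one way to do the "pick two per group" step. Note that ties may come out in a different order than with argsort.

```python
import pandas as pd

# Sample data from the question, with Date as an ordinary column.
df = pd.DataFrame(
    {
        "Date": [1990, 1991, 1992, 1993, 1994],
        "val1": [5, 2, 10, 50, 1],
        "val2": [7, 1, 9, 10, 15],
        "val3": [1, 10, 6, 2, 7],
        "val4": [10, 3, 1, 15, 8],
    }
)

# Long format: one row per (Date, column name, value).
long = df.melt(id_vars="Date", var_name="column", value_name="value")

# Put the largest values first within each Date, then keep two per Date.
top2 = (
    long.sort_values(["Date", "value"], ascending=[True, False])
    .groupby("Date")
    .head(2)
    .copy()
)

# Number the picks within each Date (0 = largest, 1 = second largest).
top2["rank"] = top2.groupby("Date").cumcount()

# Pivot back to one row per Date, holding the column names.
result = top2.pivot(index="Date", columns="rank", values="column")
result.columns = ["1larg", "2larg"]
print(result)
```

This version is easier to extend (for example to the top three, by changing `head(2)` and the labels), but it builds a much longer intermediate frame than the argsort approach.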
