Original source: http://stackoverflow.com/questions/26015489/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Pandas second largest value's column name
Asked by AtotheSiv
I am trying to find the column names associated with the largest and second largest values in each row of a DataFrame. Here's a simplified example (the real one has over 500 columns):
Date val1 val2 val3 val4
1990 5 7 1 10
1991 2 1 10 3
1992 10 9 6 1
1993 50 10 2 15
1994 1 15 7 8
Needs to become:
Date 1larg 2larg
1990 val4 val2
1991 val3 val4
1992 val1 val2
1993 val1 val4
1994 val2 val4
I can find the column name with the largest value (i.e., 1larg above) with idxmax, but how can I find the second largest?
Answered by DSM
(You don't have any duplicate maximum values in your rows, so I'll guess that if you have [1,1,2,2] you want val3 and val4 to be selected.)
One way would be to use the result of argsort as an index into a Series with the column names.
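import numpy as np
import pandas as pd

df = df.set_index("Date")
# Column positions in ascending order of value, computed row by row
arank = df.apply(np.argsort, axis=1)
# Reverse to descending order, keep the top two, and look up the names as a plain array
ranked_cols = df.columns.to_series().to_numpy()[arank.values[:,::-1][:,:2]]
new_frame = pd.DataFrame(ranked_cols, index=df.index)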
produces
0 1
Date
1990 val4 val2
1991 val3 val4
1992 val1 val2
1993 val1 val4
1994 val2 val4
1995 val4 val3
(where I've added an extra 1995 row with the values [1,1,2,2].)
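For reference, the argsort idea can be written as a self-contained script. This is only a sketch under a few assumptions: the frame is rebuilt by hand from the sample data in the question plus the extra 1995 row, the sort runs directly on the underlying NumPy array rather than through apply, and kind="stable" is requested so that tied values keep their column order before the reversal. The output column names 1larg and 2larg are taken from the question.

```python
import numpy as np
import pandas as pd

# Sample data from the question, plus the extra 1995 row containing ties
df = pd.DataFrame(
    {
        "Date": [1990, 1991, 1992, 1993, 1994, 1995],
        "val1": [5, 2, 10, 50, 1, 1],
        "val2": [7, 1, 9, 10, 15, 1],
        "val3": [1, 10, 6, 2, 7, 2],
        "val4": [10, 3, 1, 15, 8, 2],
    }
).set_index("Date")

# Column positions in ascending order of value, row by row.
# A stable sort keeps tied values in their original column order.
order = np.argsort(df.to_numpy(), axis=1, kind="stable")

# Reverse each row to descending order and keep the two leading positions
top2 = order[:, ::-1][:, :2]

# Map the positions back to column names
new_frame = pd.DataFrame(
    df.columns.to_numpy()[top2], index=df.index, columns=["1larg", "2larg"]
)
print(new_frame)
```

Because the ascending order is reversed, a tie such as the 1995 row resolves to the later column first (val4, then val3), which matches the output shown above.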
Alternatively, you could probably melt into a flat format, pick out the largest two values in each Date group, and then pivot it back into a wide format.
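The answer does not spell out the melt route, so the following is one possible reading of it rather than the author's code. It assumes the five sample rows from the question with Date as an ordinary column; the intermediate names long_df, col, value, and pick are made up for illustration.

```python
import pandas as pd

# Sample data from the question, with Date kept as a regular column
df = pd.DataFrame(
    {
        "Date": [1990, 1991, 1992, 1993, 1994],
        "val1": [5, 2, 10, 50, 1],
        "val2": [7, 1, 9, 10, 15],
        "val3": [1, 10, 6, 2, 7],
        "val4": [10, 3, 1, 15, 8],
    }
)

# Flatten to one row per (Date, column name, value)
long_df = df.melt(id_vars="Date", var_name="col", value_name="value")

# Put each Date's largest values first, then keep the first two rows per Date
top2 = (
    long_df.sort_values(["Date", "value"], ascending=[True, False])
    .groupby("Date")
    .head(2)
)

# Number the picks 1 and 2 within each Date so they can become columns
top2 = top2.assign(pick=top2.groupby("Date").cumcount() + 1)

# Pivot back to one row per Date, holding the column names instead of the values
result = top2.pivot(index="Date", columns="pick", values="col")
result.columns = ["1larg", "2larg"]
print(result)
```

With tied values, the order in which this version picks columns may differ from the argsort approach, so it is worth checking on real data if ties matter.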

