pandas: how to find the most frequent value of each row?
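Note: this page reproduces a popular Stack Overflow question and is provided under the CC BY-SA 4.0 license. If you use it, you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow
Original: http://stackoverflow.com/questions/36091902/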
Asked by Robin1988
How do I find the most frequent value in each row of a DataFrame? For example:
In [14]: df
Out[14]:
   a  b  c
0  2  3  3
1  1  1  2
2  7  7  8
Expected result: [3, 1, 7]
Answered by MaxU
Try the .mode() method:
In [88]: df
Out[88]:
   a  b  c
0  2  3  3
1  1  1  2
2  7  7  8

In [89]: df.mode(axis=1)
Out[89]:
   0
0  3
1  1
2  7
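The question asks for a plain list rather than a DataFrame. One way to get there, as a minimal sketch that assumes each row has a single most frequent value (as in the sample data), is to take the first column of the result. The variable name `most_frequent` is only illustrative.

```python
import pandas as pd

# The frame from the question.
df = pd.DataFrame({"a": [2, 1, 7], "b": [3, 1, 7], "c": [3, 2, 8]})

# mode(axis=1) returns a DataFrame with one column per mode;
# column 0 holds the first mode found for each row.
most_frequent = df.mode(axis=1)[0].tolist()
print(most_frequent)
```

If a row has several equally frequent values, this keeps only the first of them, which may or may not be what you want.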
From the docs:
Gets the mode(s) of each element along the axis selected. Adds a row for each mode per label, fills in gaps with nan.
Note that multiple values may be returned for the selected axis (when more than one item shares the maximum frequency), which is why a DataFrame is returned. If you want to impute missing values with the mode in a DataFrame df, you can just do this: df.fillna(df.mode().iloc[0])
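To illustrate the tie behaviour described in the docs, here is a small sketch with made-up data (the frame `tied` and the helper list `all_modes` are not from the original answer). Row 0 has two equally frequent values, so the result gains a second column, and row 1 is padded with NaN.

```python
import pandas as pd

# Hypothetical data: row 0 has a tie between 1 and 2, row 1 has a single mode (5).
tied = pd.DataFrame({"a": [1, 4], "b": [1, 5], "c": [2, 5], "d": [2, 6]})

# One column per mode; rows with fewer modes are padded with NaN.
modes = tied.mode(axis=1)
print(modes)

# Collect every mode of each row as a list, dropping the NaN padding.
all_modes = [row.dropna().tolist() for _, row in modes.iterrows()]
print(all_modes)
```

Because of the NaN padding, the values in `modes` come back as floats here even though the input was integer.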
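The imputation one-liner from the docs can be tried on a small frame. This is a sketch with invented data (`gappy`, `column_modes` and `filled` are illustrative names). Note that df.mode() without arguments works column-wise, so each gap is filled with the most frequent value of its column, not of its row.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with missing values.
gappy = pd.DataFrame({"a": [1, 1, np.nan], "b": [np.nan, 5, 5]})

# mode() runs per column by default and ignores NaN;
# .iloc[0] picks the first mode of each column as a Series indexed by column name.
column_modes = gappy.mode().iloc[0]

# fillna with a Series fills each column using the value under the matching label.
filled = gappy.fillna(column_modes)
print(filled)
```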