Pandas: get the most frequent values of a column
Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original: http://stackoverflow.com/questions/48590268/
Asked by aleale
I have this dataframe:
0 name data
1 alex asd
2 helen sdd
3 alex dss
4 helen sdsd
5 john sdadd
I am trying to get the most frequent value or values (in this case, values). What I do is:
dataframe['name'].value_counts().idxmax()
But it returns only one value, Alex, even though Helen appears two times as well.
Answered by YOBEN_S
Use mode:
df.name.mode()
Out[712]:
0 alex
1 helen
dtype: object
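As a minimal, self-contained sketch of the mode approach, the snippet below rebuilds the question's sample data (the variable name df and the way the frame is constructed are assumptions for illustration) and shows that Series.mode() returns every value tied for the highest count:

```python
import pandas as pd

# Rebuild the sample data from the question (construction is illustrative).
df = pd.DataFrame({
    "name": ["alex", "helen", "alex", "helen", "john"],
    "data": ["asd", "sdd", "dss", "sdsd", "sdadd"],
})

# mode() returns a Series holding every value tied for the highest count,
# in sorted order, so it can contain more than one entry.
most_frequent = df["name"].mode().tolist()
print(most_frequent)  # ['alex', 'helen']
```

Converting to a list is optional; it is only done here to make the tied values easy to inspect.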
Answered by Jared Wilber
To get the n most frequent values, just subset .value_counts() and grab the index:
# get top 10 most frequent names
n = 10
dataframe['name'].value_counts()[:n].index.tolist()
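One caveat with slicing to n rows is that a tie at the cutoff is broken by row order. The sketch below is an illustration only (the names series and the choice n = 1 are assumptions): it contrasts a plain head(n) with Series.nlargest(n, keep="all"), which keeps every value tied with the n-th largest count.

```python
import pandas as pd

names = pd.Series(["alex", "helen", "alex", "helen", "john"], name="name")
counts = names.value_counts()  # counts sorted from most to least frequent

n = 1
# Taking the first n rows keeps exactly n labels, so only one of the
# tied values survives the cut.
top_n = counts.head(n).index.tolist()

# nlargest(..., keep="all") keeps every label tied with the n-th largest count.
top_n_with_ties = sorted(counts.nlargest(n, keep="all").index)
print(top_n_with_ties)  # ['alex', 'helen']
```

Whether ties should be kept depends on the use case; with keep="all" the result can hold more than n labels.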
Answered by Lunar_one
You could try argmax like this:
dataframe['name'].value_counts().argmax()
Out[13]: 'alex'
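value_counts returns the counts as a pandas.core.series.Series, and argmax can be used to get the key of the maximum value. Note that this output comes from an older pandas version: since pandas 1.0, Series.argmax returns the integer position of the maximum, and idxmax is the method that returns the label.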
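A small sketch of the difference between the label and the position of the maximum, assuming pandas 1.0 or later. The sample names are made up so that there is a single most frequent value and no tie to resolve:

```python
import pandas as pd

names = pd.Series(["alex", "helen", "alex", "john", "alex"])
counts = names.value_counts()  # alex: 3, helen: 1, john: 1

label = counts.idxmax()     # index label of the largest count
position = counts.argmax()  # integer position of the largest count (pandas >= 1.0)

print(label)                   # alex
print(counts.index[position])  # alex
```

Both calls report only one entry, so neither is enough on its own when several values share the highest count.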
Answered by paul okoduwa
You can use this to get the full count of every value in a particular column; the most frequent values are listed first:
df['name'].value_counts()
Answered by Taie
df['name'].value_counts()[:5].sort_values(ascending=False)
value_counts returns the counts as a pandas.core.series.Series, and sort_values(ascending=False) puts the highest counts first.
Answered by piRSquared
Not obvious, but fast
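import numpy as np
import pandas as pd
f, u = pd.factorize(df.name.values)  # integer code per row, plus the unique values
counts = np.bincount(f)  # number of occurrences of each code
u[counts == counts.max()]  # every value tied for the highest count
array(['alex', 'helen'], dtype=object)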
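The factorize/bincount idea can be wrapped in a small function. This is only a sketch: the helper name most_frequent_values is hypothetical, and it assumes the input has no missing values (pd.factorize encodes missing values as -1, which np.bincount rejects).

```python
import numpy as np
import pandas as pd

def most_frequent_values(values):
    """Return all values tied for the highest count (hypothetical helper)."""
    # codes[i] is the integer code of values[i]; uniques lists the distinct
    # values in order of first appearance.
    codes, uniques = pd.factorize(values)
    # counts[k] is the number of occurrences of uniques[k].
    counts = np.bincount(codes)
    return uniques[counts == counts.max()]

names = np.array(["alex", "helen", "alex", "helen", "john"], dtype=object)
print(most_frequent_values(names))  # ['alex' 'helen']
```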
Answered by pault
Here's one way:
counts = df['name'].value_counts()  # compute the counts once
counts[counts == counts.max()]  # keep every name tied for the highest count
which prints:
helen 2
alex 2
Name: name, dtype: int64
Answered by Brian
You could use .apply and pd.value_counts to count the occurrences of all the names in the name column.
dataframe['name'].apply(pd.value_counts)
Answered by Naomi Fridman
To get the top 5:
dataframe['name'].value_counts()[0:5]
Answered by venergiac
My best solution to get the first one is:
df['my_column'].value_counts().sort_values(ascending=False).idxmax()  # idxmax returns the label; since pandas 1.0, argmax returns the integer position