Python Pandas: get the most frequent value in a column

Question

Asked by aleale

I have this dataframe:

0  name   data
1  alex   asd
2  helen  sdd
3  alex   dss
4  helen  sdsd
5  john   sdadd

I am trying to get the most frequent value or values (in this case there are several). What I do is:

dataframe['name'].value_counts().idxmax()

But it returns only one value, alex, even though helen also appears twice.

Answer 1

Answered by YOBEN_S

Use mode, which returns every value tied for the highest count:

df.name.mode()
Out[712]: 
0     alex
1    helen
dtype: object
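
For reference, here is a self-contained sketch of the same call that rebuilds the question's dataframe. The variable name df and the conversion to a list are assumptions made for illustration:

```python
import pandas as pd

# Rebuild the dataframe from the question.
df = pd.DataFrame({
    "name": ["alex", "helen", "alex", "helen", "john"],
    "data": ["asd", "sdd", "dss", "sdsd", "sdadd"],
})

# mode() returns a Series holding every value tied for the highest count.
most_frequent = df["name"].mode().tolist()
print(most_frequent)  # ['alex', 'helen']
```

Because mode always returns a Series, the same code works whether there is one winner or several.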

Answer 2

Answered by Jared Wilber

To get the n most frequent values, just subset .value_counts() and grab the index:

# get top 10 most frequent names
n = 10
dataframe['name'].value_counts()[:n].index.tolist()
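
Note that slicing to a fixed n can cut through a tie. If tied values should all be kept, one option (an addition here, not part of the answer above) is Series.nlargest with keep="all". A minimal sketch on the question's data, with names and counts as illustrative variable names:

```python
import pandas as pd

names = pd.Series(["alex", "helen", "alex", "helen", "john"], name="name")
counts = names.value_counts()

# keep="all" retains every entry tied with the n-th largest count,
# so the result may contain more than n entries.
top = counts.nlargest(1, keep="all")
top_names = sorted(top.index.tolist())
print(top_names)  # ['alex', 'helen']
```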

Answer 3

Answered by Lunar_one

You could try idxmax like this (in older pandas versions argmax on a Series behaved the same way; in current versions argmax returns the integer position rather than the label):

dataframe['name'].value_counts().idxmax()
Out[13]: 'alex'

value_counts returns a pandas.core.series.Series of counts, and idxmax returns the index label of the maximum count. On a tie it returns only the first such label.

Answer 4

Answered by paul okoduwa

You can use this to get the full count of each value in a particular column; the values with the highest count are the mode:

df['name'].value_counts()

Answer 5

Answered by Taie

df['name'].value_counts()[:5].sort_values(ascending=False)

value_counts returns a pandas.core.series.Series of counts, and sort_values(ascending=False) puts the highest counts first. (value_counts already sorts in descending order by default, so the explicit sort only makes the ordering explicit.)

Answer 6

Answered by piRSquared

Not Obvious, But Fast

f, u = pd.factorize(df.name.values)
counts = np.bincount(f)
u[counts == counts.max()]

array(['alex', 'helen'], dtype=object)
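
Since the approach is not obvious, here is a step-by-step sketch of what each call contributes. It is self-contained and uses the question's data; the variable names codes, uniques and modes are illustrative:

```python
import numpy as np
import pandas as pd

names = pd.Series(["alex", "helen", "alex", "helen", "john"])

# factorize maps each distinct value to an integer code,
# numbered in order of first appearance.
codes, uniques = pd.factorize(names.values)
# codes: 0, 1, 0, 1, 2    uniques: alex, helen, john

# bincount counts how often each integer code occurs.
counts = np.bincount(codes)
# counts: 2, 2, 1

# The boolean mask keeps every unique value whose count equals the maximum.
modes = uniques[counts == counts.max()]
print(list(modes))  # ['alex', 'helen']
```

One caveat: factorize assigns the code -1 to missing values, and np.bincount rejects negative input, so this sketch assumes the column contains no NaN.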

Answer 7

Answered by pault

Here's one way:

df['name'].value_counts()[df['name'].value_counts() == df['name'].value_counts().max()]

which prints:

helen    2
alex     2
Name: name, dtype: int64
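
The one-liner calls value_counts three times. A possible variant, sketched here with the illustrative variable names counts and top, computes the counts once and reuses them:

```python
import pandas as pd

names = pd.Series(["alex", "helen", "alex", "helen", "john"], name="name")

# Count once, then keep only the rows whose count equals the maximum.
counts = names.value_counts()
top = counts[counts == counts.max()]

print(sorted(top.index))  # ['alex', 'helen']
```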

Answer 8

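Answered by Brian

You could use .apply with pd.Series.value_counts to count the occurrences of every name in the name column. Note the double brackets: they select a one-column DataFrame, so value_counts is applied to the whole column rather than to each element.

dataframe[['name']].apply(pd.Series.value_counts)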

Answer 9

Answered by Naomi Fridman

To get the top 5:

dataframe['name'].value_counts()[0:5]

Answer 10

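Answered by venergiac

My best solution to get the first is the following (idxmax returns the label; in current pandas versions argmax would return the integer position instead):

df['my_column'].value_counts().sort_values(ascending=False).idxmax()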
