Note: this page is a translated copy of a popular StackOverFlow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original address and author information, and attribute it to the original authors (not me). StackOverFlow original: http://stackoverflow.com/questions/15126679/

Date: 2020-08-18 13:25:24  Source: igfitidea

Plot key count per unique value count in pandas

python plot pandas

Asked by monkut

I have a set of data from which I want to plot the number of keys per unique id count (x=unique_id_count, y=key_count), and I'm trying to learn how to take advantage of pandas.

In this case:

unique_ids 1 = key count 2 (keys "a" and "c" each have one unique id, "X")
unique_ids 2 = key count 1 (key "b" has two unique ids, "X" and "Y")

import pandas as pd

# one entry per row: the key, and the id observed for that key
key_items = ("a", "a", "a", "a", "a", "b", "b", "b", "b", "b", "c", "c", "c")
id_data = ("X", "X", "X", "X", "X", "X", "X", "Y", "Y", "Y", "X", "X", "X")

df = pd.DataFrame({'keys': key_items, 'ids': id_data})

I've managed to wrangle the data into the shape I want by pulling it out of the dataframe, restructuring it, and building a new dataframe. Done this way, it would probably be better to do it all in plain Python without pandas...

from collections import defaultdict
import pandas as pd

# collect every id seen for each key (by column name, so column order does not matter)
unique_values = defaultdict(list)
for key, v in zip(df["keys"], df["ids"]):
    unique_values[key].append(v)

# count the distinct ids per key
unique_values_count = {}
for k, values in unique_values.items():
    unique_values_count[k] = [len(set(values))]

# reformat for plotting
key_col = ("a", "b", "c")
id_col = [unique_values_count[k][0] for k in key_col]

df2 = pd.DataFrame({"keys": key_col, "unique_id_count": id_col})
# number of keys for each unique id count, drawn as a bar chart
df2.groupby("unique_id_count").size().plot(kind="bar")

Is there a better way to do this more directly using the initial dataframe?


Accepted answer by HYRY

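# number of distinct ids for each key
s = df.groupby("keys")["ids"].agg(lambda x: len(x.unique()))
# count how many keys share each distinct-id count, then plot
# (s.value_counts() replaces the deprecated top-level pd.value_counts(s))
s.value_counts().plot(kind="bar")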
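
As a possible variant of the same groupby idea, here is a minimal sketch that uses the built-in `nunique()` aggregation instead of a lambda. It rebuilds the question's sample dataframe so that it runs on its own. The names `unique_id_count` and `key_count` are chosen here for illustration, and the `sort_index()` call is an added assumption that the bars should be ordered by the unique id count rather than by frequency.

```python
import pandas as pd

key_items = ("a", "a", "a", "a", "a", "b", "b", "b", "b", "b", "c", "c", "c")
id_data = ("X", "X", "X", "X", "X", "X", "X", "Y", "Y", "Y", "X", "X", "X")
df = pd.DataFrame({"keys": key_items, "ids": id_data})

# distinct ids per key (illustrative name, not from the original post)
unique_id_count = df.groupby("keys")["ids"].nunique()

# how many keys share each distinct-id count, ordered by that count
key_count = unique_id_count.value_counts().sort_index()

# key_count.plot(kind="bar")  # uncomment to draw the chart (needs matplotlib)
```

With the sample data, `key_count` should map a unique id count of 1 to two keys and a count of 2 to one key, which matches the expected result stated in the question.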

Answer by Aziz Alto

How about just using value_counts() directly?

# bar chart of how many rows carry each id value
df['ids'].value_counts().plot.bar()
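
To see what this call counts, here is a small sketch run against the question's sample data (rebuilt here so it runs on its own; the name `rows_per_id` is illustrative). It appears to tally rows per id value, which may be a different quantity from the keys per unique id count the question asks for.

```python
import pandas as pd

key_items = ("a", "a", "a", "a", "a", "b", "b", "b", "b", "b", "c", "c", "c")
id_data = ("X", "X", "X", "X", "X", "X", "X", "Y", "Y", "Y", "X", "X", "X")
df = pd.DataFrame({"keys": key_items, "ids": id_data})

# number of rows for each id value, most frequent first
rows_per_id = df["ids"].value_counts()
print(rows_per_id)
```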
