Python Pandas 'DataFrame' object has no attribute 'unique'

Notice: this page is adapted from a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license, cite the original URL, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/29244549/

Time: 2020-08-19 04:16:25  Source: igfitidea


Tags: python, pandas, pivot-table

Asked by thesebeth

I'm working in pandas with pivot tables. When doing the groupby (to count distinct observations), aggfunc={"person":{lambda x: len(x.unique())}} gives me the following error: 'DataFrame' object has no attribute 'unique'. Any ideas how to fix it?


Answered by Alexander

DataFrames do not have that method; columns in DataFrames do:


df['A'].unique()
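The distinction can be checked directly. The following is a minimal sketch on a small made-up DataFrame (the column names A and B and their values are illustrative assumptions, not taken from the question):

```python
import pandas as pd

# Hypothetical data, used only to illustrate the point
df = pd.DataFrame({'A': [1, 2, 2, 3], 'B': ['x', 'x', 'y', 'y']})

# Selecting a single column returns a Series, which has .unique()
distinct_a = df['A'].unique()      # distinct values, in order of appearance
n_distinct_a = df['A'].nunique()   # number of distinct values

# The DataFrame itself has no .unique(), so the call raises AttributeError
try:
    df.unique()
    raised = False
except AttributeError:
    raised = True

print(distinct_a, n_distinct_a, raised)
```

If only the number of distinct values is needed, Series.nunique() avoids wrapping unique() in len().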

Or, to get the names with the number of observations (using the DataFrame given by closedloop):


>>> df.groupby('person').person.count()
Out[80]: 
person
0         2
1         3
Name: person, dtype: int64
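The output above depends on the DataFrame from closedloop's answer. A self-contained sketch of the same call, with that data copied in, might look like this (note that count() counts rows per person, not distinct values):

```python
import pandas as pd

# Same data as in closedloop's answer
data = {'c0': [0, 1, 0, 1, 1], 'c1': [0, 0, 1, 1, 1],
        'person': [0, 0, 1, 1, 1], 'c_other': [1, 2, 3, 4, 5]}
df = pd.DataFrame(data)

# Number of rows (observations) recorded for each person
counts = df.groupby('person')['person'].count()
print(counts)
```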

Answered by closedloop

Rather than removing duplicates during the pivot table process, use the df.drop_duplicates() function to selectively drop duplicates first.


For example, if you are pivoting with index='c0' and columns='c1', then this simple step yields the correct counts.


In this example, the 5th row is a duplicate of the 4th (ignoring the non-pivoted c_other column):


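import pandas as pd

# Rows 4 and 5 agree on c0, c1 and person; only c_other differs
data = {'c0': [0, 1, 0, 1, 1], 'c1': [0, 0, 1, 1, 1], 'person': [0, 0, 1, 1, 1], 'c_other': [1, 2, 3, 4, 5]}
df = pd.DataFrame(data)
# Drop rows that repeat the same (c0, c1, person) combination
df2 = df.drop_duplicates(subset=['c0', 'c1', 'person'])
# Each remaining row is now a distinct person within its (c0, c1) cell
pd.pivot_table(df2, index='c0', columns='c1', values='person', aggfunc='count')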

This correctly outputs:


c1  0  1
c0      
0   1  1
1   1  1
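As a possible alternative that does not appear in either answer, the distinct count can be requested from the pivot table itself by passing the aggregation name 'nunique'. This is a sketch that assumes a pandas version in which pivot_table accepts string aggregation names. It reuses the data above and skips the drop_duplicates() step:

```python
import pandas as pd

# Same data as above, with the duplicate row left in place
data = {'c0': [0, 1, 0, 1, 1], 'c1': [0, 0, 1, 1, 1],
        'person': [0, 0, 1, 1, 1], 'c_other': [1, 2, 3, 4, 5]}
df = pd.DataFrame(data)

# 'nunique' counts distinct person values within each (c0, c1) cell
result = pd.pivot_table(df, index='c0', columns='c1',
                        values='person', aggfunc='nunique')
print(result)
```

On this data every cell comes out as 1, matching the drop_duplicates() approach. The two differ in what stays available afterwards: drop_duplicates() leaves a deduplicated frame for further use, while 'nunique' only affects the aggregated result.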