pandas: Is there any way to ungroup data in a grouped-by pandas DataFrame?

Notice: this page is a translated copy of a popular StackOverFlow question, provided under the CC BY-SA 4.0 license. You are free to use/share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow. Original URL: http://stackoverflow.com/questions/45807794/

Time: 2020-09-14 04:17:28  Source: igfitidea

Is there any way to ungroup data in a grouped-by pandas dataframe?

python pandas dataframe group-by pandas-groupby

Asked by Omido

I have a dataset that, for simplicity, I need to group by one column and aggregate so that I can remove some rows easily. Once I am done with the calculations, I need to reverse the group-by so that I can view the dataframe easily in Excel. If I do not reverse it, I would export the whole list to Excel, which is not easy to analyse. Any help is greatly appreciated.

Example:

Col1  Col2 Col3
123   11   Yes
123   22   Yes
256   33   Yes
256   33   No
337   00   No
337   44   No

After applying groupby and aggregate:

X=dataset.groupby('Col1').agg(lambda x:set(x)).reset_index()

I get

Col1   Col2      Col3
123   {11,22}   {Yes}
256   {33}      {Yes, No}
337   {00,44}   {No}

I then remove all the rows that contain Yes using drop

X=X.reset_index(drop=True)

What I need to get before exporting to Excel is

Col1 Col2 Col3
337   00   No
337   44   No

Hope this is clear enough

Thanks in advance

Accepted answer by cs95

I don't believe converting to a set is a good idea. Here's an alternative: first sort by Col3 in ascending order (so that Yes rows come last and win when a Col2 value repeats), then create a mapping of Col2 : Yes/No and filter based on that.

In [1191]: df = df.sort_values('Col3', ascending=True)

In [1192]: mapping = dict(df[['Col2', 'Col3']].values)

In [1193]: df[df.Col2.replace(mapping) == 'No'] # or df.Col2.map(mapping)
Out[1193]: 
   Col1  Col2 Col3
4   337     0   No
5   337    44   No
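The session above can be reproduced as a standalone script. This is only a minimal sketch: the DataFrame is rebuilt by hand from the question's sample table (an assumption, since the original data source is not shown), and `zip` with `Series.map` is used here in place of `.values` with `replace`.

```python
import pandas as pd

# Sample data rebuilt from the question's table (assumed, not the real dataset)
df = pd.DataFrame({
    "Col1": [123, 123, 256, 256, 337, 337],
    "Col2": [11, 22, 33, 33, 0, 44],
    "Col3": ["Yes", "Yes", "Yes", "No", "No", "No"],
})

# Ascending sort puts 'No' rows first and 'Yes' rows last
df_sorted = df.sort_values("Col3", ascending=True)

# Later entries overwrite earlier ones in a dict, so a Col2 value that
# appears with both 'No' and 'Yes' ends up mapped to 'Yes'
mapping = dict(zip(df_sorted["Col2"], df_sorted["Col3"]))

# Keep only the rows whose Col2 value is never associated with 'Yes'
result = df[df["Col2"].map(mapping) == "No"]
print(result)
```

Note that the filter is keyed on Col2 values rather than on Col1 groups, so a group that mixes Yes and No under different Col2 values would keep its No rows.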

Answer by YOBEN_S

I agree with COLDSPEED. You do not need to convert to a set.

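df['Temp'] = df.Col3.eq('Yes')           # True where the row is a 'Yes'
DF = df.groupby('Col1')['Temp'].sum()    # number of 'Yes' rows per Col1 group
# isin keeps every group with zero 'Yes' rows, not only the first one
df[df.Col1.isin(DF.index[DF == 0])].drop('Temp', axis=1)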


Out[113]: 
   Col1  Col2 Col3
4   337     0   No
5   337    44   No
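
A variation on the same idea, which avoids the temporary column, is to broadcast a per-group flag back onto the original rows with `transform`. This is a sketch rather than either answerer's method, and the DataFrame is again rebuilt by hand from the question's sample table (an assumption).

```python
import pandas as pd

# Sample data rebuilt from the question's table (assumed, not the real dataset)
df = pd.DataFrame({
    "Col1": [123, 123, 256, 256, 337, 337],
    "Col2": [11, 22, 33, 33, 0, 44],
    "Col3": ["Yes", "Yes", "Yes", "No", "No", "No"],
})

# True on every row whose Col1 group contains at least one 'Yes';
# transform returns one value per original row, so nothing is collapsed
group_has_yes = df["Col3"].eq("Yes").groupby(df["Col1"]).transform("any")

# Keep the original, ungrouped rows of the groups with no 'Yes' at all
result = df[~group_has_yes]
print(result)
```

Because the rows are never aggregated, there is nothing to "ungroup" afterwards, and `result` should be in a shape that can be handed to `DataFrame.to_excel` directly.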