pandas groupby dropping columns

Disclaimer: this page is a translation of a popular StackOverflow question and answer, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me). Original question: http://stackoverflow.com/questions/37575944/

pandas groupby dropping columns

python · pandas · dataframe · pandas-groupby

Asked by user3334415

I'm doing a simple group by operation, trying to compare group means. As you can see below, I have selected specific columns from a larger dataframe, from which all missing values have been removed.

[screenshot: selected columns and df head]

But when I group by, I am losing a couple of columns:

[screenshot: group-by logic and resulting df]

I have never encountered this with pandas, and I'm not finding anything similar on Stack Overflow. Does anybody have any insight?

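For context, here is a hypothetical sketch (column names made up, not the asker's actual data) of the kind of frame that produces this symptom: a column holding numbers stored as strings has object dtype, and the group mean drops it.

# Hypothetical reproduction of the symptom: 'y' holds numbers but is
# stored as strings, so it has object dtype.
import pandas as pd

df = pd.DataFrame({
    'group': ['a', 'a', 'b', 'b'],
    'x': [1.0, 2.0, 3.0, 4.0],          # true numeric column
    'y': ['1.5', '2.5', '3.5', '4.5'],  # numbers stored as strings -> object dtype
})

# Older pandas silently dropped 'y' here; recent versions need
# numeric_only=True (otherwise they raise a TypeError).
print(df.groupby('group').mean(numeric_only=True))
#          x
# group
# a      1.5
# b      3.5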

Answered by jezrael

I think it is the automatic exclusion of 'nuisance' columns, as described in the pandas groupby documentation.

Sample:

import pandas as pd

df = pd.DataFrame({'C': {0: -0.91985400000000006, 1: -0.042379, 2: 1.2476419999999999, 3: -0.00992, 4: 0.290213, 5: 0.49576700000000001, 6: 0.36294899999999997, 7: 1.548106}, 'A': {0: 'foo', 1: 'bar', 2: 'foo', 3: 'bar', 4: 'foo', 5: 'bar', 6: 'foo', 7: 'foo'}, 'B': {0: 'one', 1: 'one', 2: 'two', 3: 'three', 4: 'two', 5: 'two', 6: 'one', 7: 'three'}, 'D': {0: -1.131345, 1: -0.089328999999999992, 2: 0.33786300000000002, 3: -0.94586700000000001, 4: -0.93213199999999996, 5: 1.9560299999999999, 6: 0.017587000000000002, 7: -0.016691999999999999}})
print(df)
     A      B         C         D
0  foo    one -0.919854 -1.131345
1  bar    one -0.042379 -0.089329
2  foo    two  1.247642  0.337863
3  bar  three -0.009920 -0.945867
4  foo    two  0.290213 -0.932132
5  bar    two  0.495767  1.956030
6  foo    one  0.362949  0.017587
7  foo  three  1.548106 -0.016692

print(df.groupby('A').mean())
            C         D
A                      
bar  0.147823  0.306945
foo  0.505811 -0.344944

I think you can check DataFrame.dtypes.

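A minimal sketch of that check, continuing the hypothetical frame from the question above: DataFrame.dtypes exposes the object column, and pd.to_numeric converts it so the group mean keeps it.

import pandas as pd

df = pd.DataFrame({
    'group': ['a', 'a', 'b', 'b'],
    'x': [1.0, 2.0, 3.0, 4.0],
    'y': ['1.5', '2.5', '3.5', '4.5'],   # mis-typed: numbers stored as strings
})

print(df.dtypes)
# group     object
# x        float64
# y         object

# Convert the mis-typed column; errors='coerce' turns unparseable
# values into NaN instead of raising.
df['y'] = pd.to_numeric(df['y'], errors='coerce')

print(df.groupby('group').mean(numeric_only=True))
#          x    y
# group
# a      1.5  2.0
# b      3.5  4.0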