Python: How to select columns from a groupby object in pandas?

Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and authors, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/19202093/

Date: 2020-08-19 13:08:02  Source: igfitidea

How to select columns from groupby object in pandas?

Tags: python, pandas

Asked by

I grouped my DataFrame by two columns, as shown below:

import pandas as pd

df = pd.DataFrame({'a': [1, 1, 3],
                   'b': [4.0, 5.5, 6.0],
                   'c': [7, 8, 9],  # originally Python 2 long literals (7L, 8L, 9L)
                   'name': ['hello', 'hello', 'foo']})
df.groupby(['a', 'name']).median()

The result is:

            b    c
a name            
1 hello  4.75  7.5
3 foo    6.00  9.0

How can I access the name field of the resulting median (in this case hello, foo)? This fails:

df.groupby(['a', 'name']).median().name

Accepted answer by EdChum

You need to get the index values; the grouping keys are index levels, not columns. In this case name is level 1:

df.groupby(["a", "name"]).median().index.get_level_values(1)

Out[2]:

Index([u'hello', u'foo'], dtype=object)

You can also pass the name of the index level:

df.groupby(["a", "name"]).median().index.get_level_values('name')

Passing the name is more intuitive than passing an integer position.

You can convert the index values to a list by calling tolist():

df.groupby(["a", "name"]).median().index.get_level_values(1).tolist()

Out[5]:

['hello', 'foo']
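
To see why the attribute access in the question fails and how the index lookups above fit together, here is a minimal sketch. It rebuilds the question's frame with plain integers in column c (an assumption, so that it runs on Python 3), and the variable names grouped and names are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 1, 3],
                   "b": [4.0, 5.5, 6.0],
                   "c": [7, 8, 9],
                   "name": ["hello", "hello", "foo"]})

grouped = df.groupby(["a", "name"]).median()

# The grouping keys become index levels, so they are no longer columns.
print(list(grouped.columns))      # remaining data columns
print(list(grouped.index.names))  # names of the index levels

# Look the level up by name and turn it into a plain Python list.
names = grouped.index.get_level_values("name").tolist()
print(names)
```

Because name is no longer among the columns, grouped.name has nothing to resolve to, which is why the original attempt raises an error.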

Answer by cwharland

You can also call reset_index() on your groupby result to get back a DataFrame in which the name column is accessible again.

import pandas as pd
df = pd.DataFrame({"a":[1,1,3], "b":[4,5.5,6], "c":[7,8,9], "name":["hello","hello","foo"]})
df_grouped = df.groupby(["a", "name"]).median().reset_index()
df_grouped.name
 0    hello
 1      foo
 Name: name, dtype: object

If you perform an operation on a single column, the result is a Series with a MultiIndex; you can simply apply pd.DataFrame to it and then call reset_index().
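
A minimal sketch of the single-column case, assuming the same sample frame as the question and choosing column b purely for illustration (b_median and b_frame are hypothetical names):

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 1, 3],
                   "b": [4.0, 5.5, 6.0],
                   "c": [7, 8, 9],
                   "name": ["hello", "hello", "foo"]})

# Selecting one column before aggregating yields a Series with a MultiIndex.
b_median = df.groupby(["a", "name"])["b"].median()

# Wrap it in a DataFrame, then move the index levels back into columns.
b_frame = pd.DataFrame(b_median).reset_index()
print(b_frame)
```

Calling b_median.reset_index() directly on the Series should give the same frame, so the pd.DataFrame wrapper is optional here.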

Answer by proutray

Set as_index=False in the groupby call:

import pandas as pd
df = pd.DataFrame({"a":[1,1,3], "b":[4,5.5,6], "c":[7,8,9], "name":["hello","hello","foo"]})
df.groupby(["a", "name"], as_index=False).median()
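
As a rough check of what as_index=False changes, the sketch below compares it with the reset_index() approach on the same sample data. The names flat and via_reset are illustrative, not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 1, 3],
                   "b": [4, 5.5, 6],
                   "c": [7, 8, 9],
                   "name": ["hello", "hello", "foo"]})

# Grouping keys stay as ordinary columns instead of becoming the index.
flat = df.groupby(["a", "name"], as_index=False).median()

# Same aggregation, with the keys moved out of the index afterwards.
via_reset = df.groupby(["a", "name"]).median().reset_index()

print(list(flat.columns))
print(flat["name"].tolist())
```

Both routes should leave name available as a regular column; as_index=False simply avoids the extra call.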

Answer by Mina

Calling reset_index() after the groupby will do the trick:

df = pd.DataFrame({'a': [1, 1, 3],
                   'b': [4.0, 5.5, 6.0],
                   'c': [7, 8, 9],  # numeric, so median() also works on current pandas
                   'name': ['hello', 'hello', 'foo']})
df.groupby(['a', 'name']).median().reset_index().name

Here is the result:

 0    hello
 1      foo
 Name: name, dtype: object

If you want just the values, use .values:

df = pd.DataFrame({'a': [1, 1, 3],
                   'b': [4.0, 5.5, 6.0],
                   'c': [7, 8, 9],  # numeric, so median() also works on current pandas
                   'name': ['hello', 'hello', 'foo']})

df.groupby(['a', 'name']).median().reset_index().name.values

Using .values returns a NumPy array (not a Python list) containing the values of the name column. The code above returns:

array(['hello', 'foo'], dtype=object)
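
If a plain Python list is what you are after, a small sketch along these lines may help. It assumes the same sample frame, uses bracket access for the column (a choice made here so the lookup cannot be confused with a DataFrame attribute), and the names name_col, as_array and as_list are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [1, 1, 3],
                   "b": [4.0, 5.5, 6.0],
                   "c": [7, 8, 9],
                   "name": ["hello", "hello", "foo"]})

name_col = df.groupby(["a", "name"]).median().reset_index()["name"]

as_array = name_col.to_numpy()  # NumPy array, comparable to .values
as_list = name_col.tolist()     # plain Python list

print(type(as_array).__name__)
print(as_list)
```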