Python: How do I select columns from a groupby object in pandas?

Question

I grouped my dataframe by two columns, a and name:

import pandas as pd

df = pd.DataFrame({'a': [1, 1, 3],
                   'b': [4.0, 5.5, 6.0],
                   'c': [7, 8, 9],
                   'name': ['hello', 'hello', 'foo']})
df.groupby(['a', 'name']).median()

The result is:

            b    c
a name            
1 hello  4.75  7.5
3 foo    6.00  9.0

How can I access the name field of the resulting median (in this case hello, foo)? This fails:

df.groupby(['a', 'name']).median().name

Answer 1

Accepted answer by EdChum

You need to get the index values. After the groupby, a and name are index levels, not columns. In this case name is level 1:

df.groupby(["a", "name"]).median().index.get_level_values(1)

Out[2]:

Index([u'hello', u'foo'], dtype=object)

You can also pass the index level name:

df.groupby(["a", "name"]).median().index.get_level_values('name')

This is more intuitive than passing an integer position.

You can convert the index values to a list by calling tolist():

df.groupby(["a", "name"]).median().index.get_level_values(1).tolist()

Out[5]:

['hello', 'foo']
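
Putting the steps above together, here is a minimal self-contained sketch. The variable names medians and names are illustrative and not part of the original answer. It first checks that name lives in the index rather than in the columns, then pulls it out by level name:

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 1, 3],
                   'b': [4.0, 5.5, 6.0],
                   'c': [7, 8, 9],
                   'name': ['hello', 'hello', 'foo']})

medians = df.groupby(['a', 'name']).median()

# The grouping keys become index levels, so 'name' is not a column.
print('name' in medians.columns)      # False
print(list(medians.index.names))      # ['a', 'name']

# Select the level by name and convert it to a plain Python list.
names = medians.index.get_level_values('name').tolist()
print(names)                          # ['hello', 'foo']
```

This also suggests why the attribute access in the question fails: attribute lookup on a dataframe only resolves columns, and name is not one of them after the groupby.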

Answer 2

Answer by cwharland

You can also call reset_index() on your groupby result to get back a dataframe in which name is accessible as a regular column.

import pandas as pd
df = pd.DataFrame({"a":[1,1,3], "b":[4,5.5,6], "c":[7,8,9], "name":["hello","hello","foo"]})
df_grouped = df.groupby(["a", "name"]).median().reset_index()
df_grouped.name
 0    hello
 1      foo
 Name: name, dtype: object

If you perform an operation on a single column, the result is a Series with a MultiIndex. You can simply apply pd.DataFrame to it and then call reset_index().
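
The single-column case can be sketched as follows. This is an illustration under assumed names (b_median and b_frame are not from the original answer), using the same sample data:

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 1, 3], "b": [4, 5.5, 6], "c": [7, 8, 9],
                   "name": ["hello", "hello", "foo"]})

# Selecting one column before aggregating yields a Series with a MultiIndex.
b_median = df.groupby(["a", "name"])["b"].median()
print(type(b_median).__name__)   # Series

# Wrap it in a DataFrame, then move the index levels into columns.
b_frame = pd.DataFrame(b_median).reset_index()
print(b_frame.columns.tolist())  # ['a', 'name', 'b']
print(b_frame["name"].tolist())  # ['hello', 'foo']
```

Calling b_median.reset_index() directly should give the same frame, since Series.reset_index() returns a DataFrame by default.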

Answer 3

Answer by proutray

Set as_index=False during the groupby:

import pandas as pd

df = pd.DataFrame({"a":[1,1,3], "b":[4,5.5,6], "c":[7,8,9], "name":["hello","hello","foo"]})
df.groupby(["a", "name"], as_index=False).median()
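
To see what as_index=False changes, here is a short sketch (the name result is illustrative). The grouping keys stay as ordinary columns, so name can be selected directly:

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 1, 3], "b": [4, 5.5, 6], "c": [7, 8, 9],
                   "name": ["hello", "hello", "foo"]})

# With as_index=False the grouping keys are kept as columns instead of index levels.
result = df.groupby(["a", "name"], as_index=False).median()
print(result.columns.tolist())   # ['a', 'name', 'b', 'c']
print(result["name"].tolist())   # ['hello', 'foo']
```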

Answer 4

Answer by Mina

Using reset_index() after the group by will do the trick:

import pandas as pd

df = pd.DataFrame({'a': [1, 1, 3],
                   'b': [4.0, 5.5, 6.0],
                   'c': [7, 8, 9],  # numeric, so median() can aggregate it
                   'name': ['hello', 'hello', 'foo']})
df.groupby(['a', 'name']).median().reset_index().name

Here is the result:

 0    hello
 1      foo
 Name: name, dtype: object

If you want only the values, use the values attribute:

df = pd.DataFrame({'a': [1, 1, 3],
                   'b': [4.0, 5.5, 6.0],
                   'c': [7, 8, 9],  # numeric, so median() can aggregate it
                   'name': ['hello', 'hello', 'foo']})

df.groupby(['a', 'name']).median().reset_index().name.values

The values attribute returns an array holding the values of the name column, not a Python list (call tolist() if you need a list). The code above returns:

array(['hello', 'foo'], dtype=object)
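
If you need a guaranteed NumPy array or a plain list, a hedged alternative is to use to_numpy() and tolist(), both of which are standard Series methods. The names name_col, as_array, and as_list are illustrative:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [1, 1, 3],
                   'b': [4.0, 5.5, 6.0],
                   'c': [7, 8, 9],
                   'name': ['hello', 'hello', 'foo']})

name_col = df.groupby(['a', 'name']).median().reset_index()['name']

# to_numpy() returns a NumPy array; tolist() returns a plain Python list.
as_array = name_col.to_numpy()
as_list = name_col.tolist()
print(isinstance(as_array, np.ndarray))  # True
print(as_list)                           # ['hello', 'foo']
```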
