Pandas groupby with a dict

Question

Asked by Christopher Short

Is it possible to use a dict to group on elements of a column?

For example:

In [3]: df = pd.DataFrame({'A' : ['one', 'one', 'two', 'three','two', 'two', 'one', 'three'],
   ...:          'B' : np.random.randn(8)})
In [4]: df
Out[4]: 
       A         B
0    one  0.751612
1    one  0.333008
2    two  0.395667
3  three  1.636125
4    two  0.916435
5    two  1.076679
6    one -0.992324
7  three -0.593476

In [5]: d = {'one':'Start', 'two':'Start', 'three':'End'}
In [6]: grouped = df[['A','B']].groupby(d)

This (and other variations) returns an empty groupby object. My variations using .apply all fail too.

I'd like to match the values of column A to the keys of the dictionary and put the rows into the groups defined by the dictionary's values. The output would look something like this:

Start:
       A         B
0    one  0.751612
1    one  0.333008
2    two  0.395667
4    two  0.916435
5    two  1.076679
6    one -0.992324
End:
       A         B
3  three  1.636125
7  three -0.593476

Answer 1

Accepted answer by Marius

From the docs, the dict has to map from labels to group names, so this will work if you put 'A' into the index:

# Move column 'A' into the index so the dict keys match the row labels
grouped2 = df.set_index('A').groupby(d)
for group_name, data in grouped2:
    print(group_name)
    print('---------')
    print(data)

# Output:
End
---------
              B
A              
three -1.234795
three  0.239209

Start
---------
            B
A            
one -1.924156
one  0.506046
two -1.681980
two  0.605248
two -0.861364
one  0.800431

Column names and row indices are both labels, whereas before you put 'A' into the index, the elements of 'A' are values.
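
To make the label/value distinction concrete, here is a minimal sketch. It assumes that grouping by a dict behaves roughly like looking up each index label in the dict; the fixed numbers in column B are made up for illustration and are not the question's random data.

```python
import pandas as pd

# Hypothetical fixed values in B, so the example is reproducible
df = pd.DataFrame({
    "A": ["one", "one", "two", "three", "two", "two", "one", "three"],
    "B": [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0],
})
d = {"one": "Start", "two": "Start", "three": "End"}

# With the default index the labels are 0..7, so no dict key matches
labels_before = df.index.map(d)
print(labels_before.isna().all())

# After set_index the labels are the strings from column A, so every key matches
labels_after = df.set_index("A").index.map(d)
print(list(labels_after))
```

In the first case every lookup misses, which is consistent with the empty groupby object described in the question.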

If you have other info in the index that makes doing a set_index() tricky, you can just create a grouping column with map():

df['group'] = df['A'].map(d)
grouped3 = df.groupby('group')
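
If you would rather not add a column at all, one possible variation is to pass the mapped Series straight to groupby. The sketch below uses made-up fixed values for B, and the name "group" given to the key Series is arbitrary. The original index and column A are both kept, which is close to the layout asked for in the question.

```python
import pandas as pd

# Hypothetical fixed values in B, so the example is reproducible
df = pd.DataFrame({
    "A": ["one", "one", "two", "three", "two", "two", "one", "three"],
    "B": [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0],
})
d = {"one": "Start", "two": "Start", "three": "End"}

# Map the values of A through the dict and group by the resulting Series;
# df itself is not modified
keys = df["A"].map(d).rename("group")
grouped = df.groupby(keys)

for name, rows in grouped:
    print(f"{name}:")
    print(rows)

# Individual groups can also be pulled out by name
start_rows = grouped.get_group("Start")
end_rows = grouped.get_group("End")
```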

Answer 2

Answer by David Robinson

You can group with a dictionary, but groupby matches the dictionary keys against the index labels, so you need to move column A into the index first.

grouped = df.set_index("A").groupby(d)

list(grouped)
# [('End',               B
# A              
# three -1.550727
# three  1.048730
# 
# [2 rows x 1 columns]), ('Start',             B
# A            
# one -1.552152
# one -2.018647
# two -0.968068
# two  0.449016
# two -0.374453
# one  0.116770
# 
# [6 rows x 1 columns])]
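
As a quick sanity check, the sketch below aggregates column B once by grouping on the index with the dict and once by grouping on the values of A mapped through the dict. The values in B are made up so the sums are easy to verify by hand; under that assumption both routes should give the same totals per group.

```python
import pandas as pd

# Hypothetical fixed values in B, so the sums are easy to check
df = pd.DataFrame({
    "A": ["one", "one", "two", "three", "two", "two", "one", "three"],
    "B": [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0],
})
d = {"one": "Start", "two": "Start", "three": "End"}

# Route 1: dict keys are matched against the index labels
by_index = df.set_index("A").groupby(d)["B"].sum()

# Route 2: the values of column A are mapped first, then used as group keys
by_map = df.groupby(df["A"].map(d).rename("group"))["B"].sum()

print(by_index)
print(by_map)
```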
