pandas: get_level_values for multiple columns

Question

Asked by danielhadar

Is there a way to get the result of get_level_values for more than one column?

Given the following DataFrame:

I wish to get the values (i.e. a list of tuples) of levels a and c:

[(1, 10), (1, 11), (1, 12), (2, 13), (2, 14), (3, 15)]

Notes:

It is impossible to give get_level_values more than one level (e.g. df.index.get_level_values(['a','c'])).
There's a workaround in which one could use get_level_values over each desired column and zip them together:

For example:

a_list = df.index.get_level_values('a').values
c_list = df.index.get_level_values('c').values

print([i for i in zip(a_list,c_list)])
[(1, 10), (1, 11), (1, 12), (2, 13), (2, 14), (3, 15)]

but it gets cumbersome as the number of columns grows.
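
For reference, the zip workaround above can be written once for any number of levels. This is only a minimal sketch: `level_names` is a name introduced here for illustration, and the DataFrame is the example one from this question.

```python
import pandas as pd

# Same example DataFrame as in the question.
df = pd.DataFrame({
    'a': [1, 1, 1, 2, 2, 3],
    'b': [4, 4, 5, 5, 6, 7],
    'c': [10, 11, 12, 13, 14, 15],
    'd': [16, 17, 18, 19, 20, 21],
}).set_index(['a', 'b', 'c'])

# Hypothetical name: the levels to extract, in the order they should appear.
level_names = ['a', 'c']

# One get_level_values call per requested level, zipped into tuples.
# .tolist() converts the values to plain Python ints.
pairs = list(zip(*(df.index.get_level_values(name).tolist()
                   for name in level_names)))
print(pairs)
# => [(1, 10), (1, 11), (1, 12), (2, 13), (2, 14), (3, 15)]
```

It still calls get_level_values once per level, so it only hides the repetition rather than removing it.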

The code to build the example DataFrame:

import pandas as pd
df = pd.DataFrame({'a': [1, 1, 1, 2, 2, 3], 'b': [4, 4, 5, 5, 6, 7], 'c': [10, 11, 12, 13, 14, 15], 'd': [16, 17, 18, 19, 20, 21]}).set_index(['a', 'b', 'c'])

Answer 1

Accepted answer by Alberto Garcia-Raboso

The .tolist() method of a MultiIndex gives a list of tuples for all the levels in the MultiIndex. For example, with your example DataFrame,

df.index.tolist()
# => [(1, 4, 10), (1, 4, 11), (1, 5, 12), (2, 5, 13), (2, 6, 14), (3, 7, 15)]

So here are two ideas:

Get the list of tuples from the original MultiIndex and filter the result.

[(a, c) for a, b, c in df.index.tolist()]
# => [(1, 10), (1, 11), (1, 12), (2, 13), (2, 14), (3, 15)]

The disadvantage of this simple method is that you have to manually specify the order of the levels you want. You can leverage itertools.compress to select them by name instead.

from itertools import compress

mask = [1 if name in ['a', 'c'] else 0 for name in df.index.names]
[tuple(compress(t, mask)) for t in df.index.tolist()]
# => [(1, 10), (1, 11), (1, 12), (2, 13), (2, 14), (3, 15)]

Create a MultiIndex that has exactly the levels you want and call .tolist() on it.

df.index.droplevel('b').tolist()
# => [(1, 10), (1, 11), (1, 12), (2, 13), (2, 14), (3, 15)]

If you would prefer to name the levels you want to keep, instead of those that you want to drop, you could do something like

df.index.droplevel([level for level in df.index.names
                if level not in ['a', 'c']]).tolist()
# => [(1, 10), (1, 11), (1, 12), (2, 13), (2, 14), (3, 15)]
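
The keep-by-name variant can be wrapped in a small helper so the list of levels is passed in once. This is a sketch under assumptions: `level_tuples` and its parameter names are made up here and are not part of pandas.

```python
import pandas as pd

# Same example DataFrame as in the question.
df = pd.DataFrame({
    'a': [1, 1, 1, 2, 2, 3],
    'b': [4, 4, 5, 5, 6, 7],
    'c': [10, 11, 12, 13, 14, 15],
    'd': [16, 17, 18, 19, 20, 21],
}).set_index(['a', 'b', 'c'])


def level_tuples(index, keep):
    """Hypothetical helper: index entries restricted to the levels named in `keep`."""
    # Drop every level whose name is not in the keep list.
    drop = [name for name in index.names if name not in keep]
    return index.droplevel(drop).tolist()


result = level_tuples(df.index, ['a', 'c'])
print(result)
# => [(1, 10), (1, 11), (1, 12), (2, 13), (2, 14), (3, 15)]
```

Two caveats with this sketch: the tuples follow the level order of the index, not the order of `keep`, and keeping a single level yields a flat list of values rather than one-element tuples, because droplevel then returns a plain Index.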

Answer 2

Answer by IanS

This is less cumbersome insofar as you can pass the list of index names you want to select:

df.reset_index()[['a', 'c']].to_dict(orient='split')['data']

I have not found a way of selecting levels 'a' and 'b' from the index object directly, hence the use of reset_index.

Note that to_dict returns a list of lists and not tuples:

[[1, 10], [1, 11], [1, 12], [2, 13], [2, 14], [3, 15]]
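
If tuples are needed, as in the question, the inner lists can be converted afterwards. A minimal sketch, using the example DataFrame from the question; the variable names are illustrative only. It also shows itertuples(index=False, name=None), which yields plain tuples directly.

```python
import pandas as pd

# Same example DataFrame as in the question.
df = pd.DataFrame({
    'a': [1, 1, 1, 2, 2, 3],
    'b': [4, 4, 5, 5, 6, 7],
    'c': [10, 11, 12, 13, 14, 15],
    'd': [16, 17, 18, 19, 20, 21],
}).set_index(['a', 'b', 'c'])

selected = df.reset_index()[['a', 'c']]

# Option 1: convert the list of lists from to_dict into tuples.
rows = selected.to_dict(orient='split')['data']
as_tuples = [tuple(row) for row in rows]

# Option 2: let itertuples produce plain tuples (no index, no namedtuple).
via_itertuples = list(selected.itertuples(index=False, name=None))

print(as_tuples)
# => [(1, 10), (1, 11), (1, 12), (2, 13), (2, 14), (3, 15)]
```

Because the columns are selected with a list, the tuples follow the order given in that list.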
