How to create a multi-index in Pandas
Original question: http://stackoverflow.com/questions/40236436/
Note: this content comes from Stack Overflow and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors and keep the same license.
Asked by Ray
Question
There are two questions that look similar but they're not the same question: here and here. They both call a method of GroupBy, such as count() or aggregate(), which I know returns a DataFrame. What I'm asking is how to convert the GroupBy (class pandas.core.groupby.DataFrameGroupBy) object itself into a DataFrame. I'll illustrate below.
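The distinction drawn above can be sketched as follows. This is a minimal illustration, not part of the original question; the small frame `df` and its values are made up for the purpose.

```python
import pandas as pd

# Hypothetical toy data, only to show the types involved.
df = pd.DataFrame({
    "name": ["sasha", "sasha", "asa", "asa"],
    "take": ["one", "two", "one", "two"],
    "score": [0.9, 0.7, 0.8, 0.1],
})

# groupby() alone returns a lazy grouping object, not a DataFrame.
grouped = df.groupby(["name", "take"])
print(type(grouped).__name__)

# Calling an aggregation such as count() on it does return a DataFrame.
counts = grouped.count()
print(type(counts).__name__)
```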
Example
Construct an example DataFrame as follows.
import numpy
import pandas

# Build one row per (name, take) pair with random score and ping values.
data_list = []
for name in ["sasha", "asa"]:
    for take in ["one", "two"]:
        row = {"name": name, "take": take, "score": numpy.random.rand(), "ping": numpy.random.randint(10, 100)}
        data_list.append(row)
data = pandas.DataFrame(data_list)
The above DataFrame should look like the following (with different numbers, obviously).
    name  ping     score take
0  sasha    72  0.923263  one
1  sasha    14  0.724720  two
2    asa    76  0.774320  one
3    asa    71  0.128721  two
What I want to do is group by the columns "name" and "take" (in that order), so that I get a DataFrame indexed by the multi-index constructed from the columns "name" and "take", like below.
               score  ping
name  take
sasha one   0.923263    72
      two   0.724720    14
asa   one   0.774320    76
      two   0.128721    71
How do I achieve that? If I do grouped = data.groupby(["name", "take"]), then grouped is a pandas.core.groupby.DataFrameGroupBy instance. What is the correct way of doing this?
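One possible way to get the layout shown above, offered as a sketch rather than as the accepted answer, is to skip grouping entirely and promote the two columns to the index with DataFrame.set_index. The seeded generator `rng` and the name `indexed` are introduced here for illustration and are not in the original question.

```python
import numpy
import pandas

# Assumption: a seeded generator, used only to make the sketch reproducible.
rng = numpy.random.default_rng(0)

rows = [
    {"name": name, "take": take, "score": rng.random(), "ping": int(rng.integers(10, 100))}
    for name in ["sasha", "asa"]
    for take in ["one", "two"]
]
data = pandas.DataFrame(rows)

# Move "name" and "take" out of the columns and into a two-level MultiIndex.
# Row order is preserved, so "sasha" still comes before "asa".
indexed = data.set_index(["name", "take"])
print(indexed)
```

Because each (name, take) pair occurs exactly once here, no aggregation is needed. If a grouping step were used instead, an aggregation such as first() would also produce a frame with this index, though groupby sorts the group keys by default, so "asa" would come first.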

