Pandas: Pivoting with multi-index data

Question

Asked by Brendon McLean

I have two dataframes that look like this:

rating
   BMW  Fiat  Toyota
0    7     2       3
1    8     1       8
2    9    10       7
3    8     3       9

own
   BMW  Fiat  Toyota
0    1     1       0
1    0     1       1
2    0     0       1
3    0     1       1

I'm ultimately trying to get a pivot table of mean rating for usage by brand, something like this:

            BMW  Fiat  Toyota
Usage                        
0      8.333333    10       3
1      7.000000     2       8

My approach was to merge the datasets like this:

Measure  Rating                Own              
Brand       BMW  Fiat  Toyota  BMW  Fiat  Toyota
0             7     2       3    1     1       0
1             8     1       8    0     1       1
2             9    10       7    0     0       1
3             8     3       9    0     1       1

I then attempted to create a pivot table using rating as the values, own as the rows, and brand as the columns, but I kept running into key issues. I have also tried unstacking either the measure or the brand level, but I can't seem to use row index names as pivot keys.

What am I doing wrong? Is there a better approach to this?

Answer 1

Accepted answer by Roman Pekar

I'm not an expert in Pandas, so the solution may be more clumsy than you want, but:

import pandas as pd

rating = pd.DataFrame({"BMW": [7, 8, 9, 8], "Fiat": [2, 1, 10, 3], "Toyota": [3, 8, 7, 9]})
own = pd.DataFrame({"BMW": [1, 0, 0, 0], "Fiat": [1, 1, 0, 1], "Toyota": [0, 1, 1, 1]})

# Flatten each frame to long form: one row per (brand, row number)
r = rating.unstack().reset_index(name='value')
o = own.unstack().reset_index(name='value')
# Line the two long frames up by position
res = pd.DataFrame({"Brand": r["level_0"], "Rating": r["value"], "Own": o["value"]})
# Average the ratings within each (Own, Brand) pair, then pivot
res = res.groupby(["Own", "Brand"]).mean().reset_index()
res.pivot(index="Own", columns="Brand", values="Rating")

# result
# Brand       BMW  Fiat  Toyota
# Own                          
# 0      8.333333    10       3
# 1      7.000000     2       8
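
The groupby step above matters because pivot needs each (Own, Brand) pair to appear only once. Here is a minimal sketch of that point, with the long-form data typed in by hand rather than derived; the names long, pivot_failed and table are illustrative only. pivot_table aggregates duplicates itself, so it can replace the groupby + pivot pair:

```python
import pandas as pd

# Long-form data equivalent to `res` before the groupby step
# (hand-typed here for illustration).
long = pd.DataFrame({
    "Brand": ["BMW"] * 4 + ["Fiat"] * 4 + ["Toyota"] * 4,
    "Own": [1, 0, 0, 0, 1, 1, 0, 1, 0, 1, 1, 1],
    "Rating": [7, 8, 9, 8, 2, 1, 10, 3, 3, 8, 7, 9],
})

# pivot only reshapes; repeated (Own, Brand) pairs make it raise.
try:
    long.pivot(index="Own", columns="Brand", values="Rating")
    pivot_failed = False
except ValueError:
    pivot_failed = True

# pivot_table averages the duplicates while reshaping.
table = long.pivot_table(index="Own", columns="Brand", values="Rating", aggfunc="mean")
print(pivot_failed)
print(table)
```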

Another solution, although not very generalizable (you can use a for loop, but you have to know which values appear in the own dataframe):

d = []
for o in (0, 1):
    t = rating[own == o]
    t["own"] = o
    d.append(t)

res = pd.concat(d).groupby("own").mean()
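
If the values in own are not known in advance, they can be read from the data. This is only a sketch of one way to do that: own_values is an illustrative name, and using np.unique with DataFrame.where is my substitution for the loop above, not part of the original answer.

```python
import numpy as np
import pandas as pd

rating = pd.DataFrame({"BMW": [7, 8, 9, 8], "Fiat": [2, 1, 10, 3], "Toyota": [3, 8, 7, 9]})
own = pd.DataFrame({"BMW": [1, 0, 0, 0], "Fiat": [1, 1, 0, 1], "Toyota": [0, 1, 1, 1]})

# Discover the distinct ownership values instead of hard-coding (0, 1).
own_values = np.unique(own.to_numpy())

# For each value, keep only the matching ratings (others become NaN)
# and take the per-brand mean, which skips NaN by default.
res = pd.DataFrame({o: rating.where(own == o).mean() for o in own_values}).T
res.index.name = "own"
print(res)
```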

Answer 2

Answer by Brendon McLean

I have a new answer to my own question (based on Roman's initial answer). The key is to get the index to the required dimensionality. For example:

rating.columns.names = ["Brand"]
rating.index.names = ["n"]
print(rating)

Brand  BMW  Fiat  Toyota
n                       
0        7     2       3
1        8     1       8
2        9    10       7
3        8     3       9

own.columns.names = ["Brand"]
own.index.names = ["n"]
print(own)

Brand  BMW  Fiat  Toyota
n                       
0        1     1       0
1        0     1       1
2        0     0       1
3        0     1       1

# Both long frames share the Brand and n columns, so merge joins on them
merged = pd.merge(own.unstack().reset_index(name="Own"),
                  rating.unstack().reset_index(name="Rating"))
print(merged)

     Brand  n  Own  Rating
0      BMW  0    1       7
1      BMW  1    0       8
2      BMW  2    0       9
3      BMW  3    0       8
4     Fiat  0    1       2
5     Fiat  1    1       1
6     Fiat  2    0      10
7     Fiat  3    1       3
8   Toyota  0    0       3
9   Toyota  1    1       8
10  Toyota  2    1       7
11  Toyota  3    1       9

Then it's easy to use the pivot_table command to turn this into the desired result:

print(merged.pivot_table(index="Brand", columns="Own", values="Rating"))

Own             0  1
Brand               
BMW      8.333333  7
Fiat    10.000000  2
Toyota   3.000000  8

And that is what I was looking for. Thanks again to Roman for pointing the way.
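
The merged layout from the question (Measure above Brand in the columns) can also be brought to the same long form directly. This is a minimal sketch, assuming a reasonably recent pandas; the names wide, long and result are illustrative, and newer versions may print a FutureWarning for stack.

```python
import pandas as pd

rating = pd.DataFrame({"BMW": [7, 8, 9, 8], "Fiat": [2, 1, 10, 3], "Toyota": [3, 8, 7, 9]})
own = pd.DataFrame({"BMW": [1, 0, 0, 0], "Fiat": [1, 1, 0, 1], "Toyota": [0, 1, 1, 1]})

# Rebuild the question's two-level columns: Measure on top, Brand below.
wide = pd.concat({"Rating": rating, "Own": own}, axis=1)
wide.columns.names = ["Measure", "Brand"]

# Move Brand from the columns into the rows, leaving one Rating and one
# Own column, then turn the index levels back into ordinary columns.
long = wide.stack("Brand").reset_index()

# Mean rating per ownership value and brand.
result = long.pivot_table(index="Own", columns="Brand", values="Rating")
print(result)
```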
