Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL, and attribute it to the original authors (not me): Stack Overflow. Original URL: http://stackoverflow.com/questions/42806398/

Time: 2020-09-14 03:11:59  Source: igfitidea

Create adjacency matrix for two columns in pandas dataframe

python pandas dataframe

Asked by The Ref

I have a dataframe of the form:

index  Name_A  Name_B
  0    Adam    Ben
  1    Chris   David
  2    Adam    Chris
  3    Ben     Chris

And I'd like to obtain the adjacency matrix for Name_A and Name_B, i.e.:

      Adam Ben Chris David
Adam   0    1    1     0
Ben    0    0    1     0
Chris  0    0    0     1
David  0    0    0     0

What is the most Pythonic/scalable way of tackling this?

EDIT: Also, I know that if the row Adam, Ben is in the dataset, then at some other point, Ben, Adam will also be in the dataset.

Answered by jezrael

You can use crosstab and then reindex by the union of the column and index values:

import pandas as pd

# Intermediate result: crosstab alone is not square (no Adam column, no David row)
print(pd.crosstab(df.Name_A, df.Name_B))
Name_B  Ben  Chris  David
Name_A                   
Adam      1      1      0
Ben       0      1      0
Chris     0      0      1

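# Count each (Name_A, Name_B) pair
df = pd.crosstab(df.Name_A, df.Name_B)
# Union of the row and column labels, so the result is square
idx = df.columns.union(df.index)
# Names missing from either axis are added and filled with 0
df = df.reindex(index=idx, columns=idx, fill_value=0)
print(df)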
       Adam  Ben  Chris  David
Adam      0    1      1      0
Ben       0    0      1      0
Chris     0    0      0      1
David     0    0      0      0
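
For reference, here is a minimal end-to-end sketch of the approach above, assuming the sample data from the question. The helper name adjacency_matrix and its source/target parameters are hypothetical, introduced only for illustration; the pandas calls are the same crosstab, union and reindex used in the answer.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Name_A": ["Adam", "Chris", "Adam", "Ben"],
    "Name_B": ["Ben", "David", "Chris", "Chris"],
})


def adjacency_matrix(frame, source="Name_A", target="Name_B"):
    """Hypothetical helper: square, directed matrix of pair counts."""
    # Count how often each (source, target) pair occurs.
    counts = pd.crosstab(frame[source], frame[target])
    # Every name that appears on either axis, in sorted order.
    labels = counts.index.union(counts.columns)
    # Add the missing rows/columns and fill them with 0.
    return counts.reindex(index=labels, columns=labels, fill_value=0)


adj = adjacency_matrix(df)
print(adj)

# Assumption: if repeated pairs should count only once, cap the counts at 1.
binary = adj.clip(upper=1)
```

Note that crosstab counts occurrences, so a pair that appears twice yields 2 rather than 1; the clip step is only needed if a strict 0/1 matrix is wanted. If every pair also appears in reverse order, as the EDIT in the question describes, the resulting matrix should come out symmetric without extra work.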