Original source: http://stackoverflow.com/questions/42806398/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Create adjacency matrix for two columns in pandas dataframe
Asked by The Ref
I have a dataframe of the form:
index  Name_A  Name_B
0      Adam    Ben
1      Chris   David
2      Adam    Chris
3      Ben     Chris
And I'd like to obtain the adjacency matrix for Name_A and Name_B, i.e.:
       Adam  Ben  Chris  David
Adam      0    1      1      0
Ben       0    0      1      0
Chris     0    0      0      1
David     0    0      0      0
What is the most pythonic/scalable way of tackling this?
EDIT: Also, I know that if the row Adam, Ben is in the dataset, then at some other point, Ben, Adam will also be in the dataset.
Answered by jezrael
You can use crosstab and then reindex by the union of column and index values:
import pandas as pd
# Count how often each (Name_A, Name_B) pair occurs
ct = pd.crosstab(df.Name_A, df.Name_B)
print(ct)
Name_B  Ben  Chris  David
Name_A
Adam      1      1      0
Ben       0      1      0
Chris     0      0      1
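The table above is not yet a square adjacency matrix: David never appears in Name_A and Adam never appears in Name_B, so each name is missing from one axis. The following is a minimal sketch of how to see which labels are missing. The `edges` DataFrame is an assumed reconstruction of the sample data, since the question shows only the printed table, and the variable names are illustrative.

```python
import pandas as pd

# Sample data reconstructed from the question (assumed construction).
edges = pd.DataFrame({
    "Name_A": ["Adam", "Chris", "Adam", "Ben"],
    "Name_B": ["Ben", "David", "Chris", "Chris"],
})

ct = pd.crosstab(edges["Name_A"], edges["Name_B"])

# Labels that occur on one axis but not on the other.
missing_rows = ct.columns.difference(ct.index)  # names never seen in Name_A
missing_cols = ct.index.difference(ct.columns)  # names never seen in Name_B

print(ct.shape)
print(list(missing_rows), list(missing_cols))
```

Reindexing both axes by the union of the two label sets adds these missing rows and columns.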
df = pd.crosstab(df.Name_A, df.Name_B)
# All names that occur on either axis
idx = df.columns.union(df.index)
# Make the table square, filling absent pairs with 0
df = df.reindex(index=idx, columns=idx, fill_value=0)
print(df)
       Adam  Ben  Chris  David
Adam      0    1      1      0
Ben       0    0      1      0
Chris     0    0      0      1
David     0    0      0      0
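For convenience, the crosstab and reindex steps can be wrapped in a small helper. This is only a sketch under some assumptions: the function name `adjacency_matrix`, its parameters, and the sample `edges` DataFrame are hypothetical and not part of the original answer. It also adds one step the answer does not have: because crosstab counts repeated pairs, the result is clipped so that every entry is 0 or 1.

```python
import pandas as pd

def adjacency_matrix(edges, source="Name_A", target="Name_B"):
    """Return a square 0/1 adjacency matrix for the directed pairs in `edges`."""
    counts = pd.crosstab(edges[source], edges[target])
    # Every name that appears in either column becomes a row and a column.
    labels = counts.index.union(counts.columns)
    square = counts.reindex(index=labels, columns=labels, fill_value=0)
    # crosstab counts repeated pairs; clip so each entry is 0 or 1 (added step).
    return square.clip(upper=1)

# Sample data reconstructed from the question (assumed construction).
edges = pd.DataFrame({
    "Name_A": ["Adam", "Chris", "Adam", "Ben"],
    "Name_B": ["Ben", "David", "Chris", "Chris"],
})

adj = adjacency_matrix(edges)
print(adj)
```

If the dataset contains every pair in both directions, as the question's EDIT states, the resulting matrix should be symmetric, which can be checked with `(adj.values == adj.values.T).all()`.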