Compute Euclidean distance between rows of two pandas dataframes
Note: this page reproduces a popular Stack Overflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL, and attribute it to the original authors (not me): Stack Overflow
Original URL: http://stackoverflow.com/questions/47782104/
Asked by j1897
I have two pandas dataframes, d1 and d2, that look like these:
d1 looks like:
output value1 value2 value2
1 100 103 87
1 201 97.5 88.9
1 144 54 85
d2 looks like:
output value1 value2 value2
0 100 103 87
0 201 97.5 88.9
0 144 54 85
0 100 103 87
0 201 97.5 88.9
0 144 54 85
The column output has a value of 1 for all rows in d1 and 0 for all rows in d2. It's a grouping variable. I need to find the Euclidean distance between each row of d1 and each row of d2 (not within d1 or d2). If d1 has m rows and d2 has n rows, then the distance matrix will have m rows and n columns.
Answered by YOBEN_S
By using scipy.spatial.distance.cdist:
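import pandas as pd
from scipy.spatial.distance import cdist  # import the submodule function explicitly

# Skip the first column (output) and compare every row of d1 with every row of d2
ary = cdist(d1.iloc[:, 1:], d2.iloc[:, 1:], metric='euclidean')
pd.DataFrame(ary)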
Out[1274]:
0 1 2 3 4 5
0 0.000000 101.167485 65.886266 0.000000 101.167485 65.886266
1 101.167485 0.000000 71.808495 101.167485 0.000000 71.808495
2 65.886266 71.808495 0.000000 65.886266 71.808495 0.000000
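For anyone who wants to reproduce this end to end, here is a minimal self-contained sketch. It rebuilds d1 and d2 from the values shown in the question, runs cdist, and cross-checks the result against the Euclidean definition using NumPy broadcasting. The column name value3 is an assumption (the header in the question repeats value2), and the variable names a, b, dist and manual are only illustrative.

```python
import numpy as np
import pandas as pd
from scipy.spatial.distance import cdist

# Assumption: the third value column is named value3 here,
# because the header in the question repeats value2.
cols = ["output", "value1", "value2", "value3"]
rows = [[100, 103, 87], [201, 97.5, 88.9], [144, 54, 85]]

d1 = pd.DataFrame([[1, *r] for r in rows], columns=cols)      # m = 3 rows
d2 = pd.DataFrame([[0, *r] for r in rows * 2], columns=cols)  # n = 6 rows

# Drop the grouping column before measuring distances.
a = d1.drop(columns="output").to_numpy(dtype=float)
b = d2.drop(columns="output").to_numpy(dtype=float)

# cdist returns an (m, n) array: one row per d1 row, one column per d2 row.
dist = pd.DataFrame(cdist(a, b, metric="euclidean"),
                    index=d1.index, columns=d2.index)

# Cross-check against the definition with NumPy broadcasting.
diff = a[:, None, :] - b[None, :, :]       # shape (m, n, k)
manual = np.sqrt((diff ** 2).sum(axis=2))  # shape (m, n)

print(dist.shape)
print(np.allclose(dist.to_numpy(), manual))
```

Passing index=d1.index and columns=d2.index keeps the original row labels on the distance matrix, which can make it easier to trace a distance back to the pair of rows it came from. The broadcasting version materialises an (m, n, k) intermediate array, so for large frames cdist is likely the more memory-friendly choice.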