Original question: http://stackoverflow.com/questions/47782104/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Compute Euclidean distance between rows of two pandas dataframes
Asked by j1897
I have two pandas dataframes, d1 and d2, that look like this:
d1 looks like:
output  value1  value2  value3
1       100     103     87
1       201     97.5    88.9
1       144     54      85
d2 looks like:
output  value1  value2  value3
0       100     103     87
0       201     97.5    88.9
0       144     54      85
0       100     103     87
0       201     97.5    88.9
0       144     54      85
The column output has a value of 1 for all rows in d1 and 0 for all rows in d2; it is a grouping variable. I need to find the Euclidean distance between each row of d1 and each row of d2 (not within d1 or within d2). If d1 has m rows and d2 has n rows, then the distance matrix will have m rows and n columns.
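For reference, the m-by-n matrix described here can be written out explicitly. This is only a sketch of the standard Euclidean distance; the symbols D, a_i, b_j and p are introduced for illustration and do not appear in the question, and it assumes the output column is treated as a label and excluded from the distance:

```latex
D_{ij} = \lVert a_i - b_j \rVert_2
       = \sqrt{\sum_{k=1}^{p} \left(a_{ik} - b_{jk}\right)^2},
\qquad i = 1,\dots,m, \quad j = 1,\dots,n
```

Here a_i is the i-th row of d1 and b_j is the j-th row of d2, each restricted to the p value columns (p = 3 in the sample data). For the first row of d1 and the second row of d2, for example, this gives sqrt(101^2 + 5.5^2 + 1.9^2), which is approximately 101.1675.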
Answered by YOBEN_S
By using scipy.spatial.distance.cdist:
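import pandas as pd
from scipy.spatial.distance import cdist  # import the submodule explicitly
# Skip the first column (output) and compare only the value columns.
ary = cdist(d1.iloc[:, 1:], d2.iloc[:, 1:], metric='euclidean')
pd.DataFrame(ary)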
Out[1274]:
            0           1          2           3           4          5
0    0.000000  101.167485  65.886266    0.000000  101.167485  65.886266
1  101.167485    0.000000  71.808495  101.167485    0.000000  71.808495
2   65.886266   71.808495   0.000000   65.886266   71.808495   0.000000
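To try this end to end, here is a minimal self-contained sketch that rebuilds the sample frames from the question and cross-checks the cdist result against plain NumPy broadcasting. The column names (in particular value3 for the third value column) and the variable names a, b, features, dist and dist_np are assumptions made for illustration; they are not from the original answer.

```python
import numpy as np
import pandas as pd
from scipy.spatial.distance import cdist

# Sample frames rebuilt from the question; the column names are placeholders.
cols = ["output", "value1", "value2", "value3"]
d1 = pd.DataFrame(
    [[1, 100, 103, 87], [1, 201, 97.5, 88.9], [1, 144, 54, 85]],
    columns=cols,
)
d2 = pd.DataFrame(
    [[0, 100, 103, 87], [0, 201, 97.5, 88.9], [0, 144, 54, 85]] * 2,
    columns=cols,
)

# Select the value columns by name instead of by position.
features = ["value1", "value2", "value3"]
a = d1[features].to_numpy(dtype=float)  # shape (m, 3)
b = d2[features].to_numpy(dtype=float)  # shape (n, 3)

# cdist returns an (m, n) array; entry [i, j] is the distance
# between row i of d1 and row j of d2.
dist = pd.DataFrame(
    cdist(a, b, metric="euclidean"), index=d1.index, columns=d2.index
)

# Cross-check with broadcasting: (m, 1, 3) - (1, n, 3) -> (m, n, 3).
diff = a[:, None, :] - b[None, :, :]
dist_np = np.sqrt((diff ** 2).sum(axis=-1))

print(dist.round(6))
print(np.allclose(dist.to_numpy(), dist_np))
```

Passing index=d1.index and columns=d2.index keeps the original row labels on the result, which may help if the frames do not use a default 0..n-1 index. The broadcasting version builds an (m, n, 3) intermediate array, so for large frames cdist is likely the more memory-friendly choice.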

