C++: Get 3D coordinates from 2D image pixel if extrinsic and intrinsic parameters are known
Original question: http://stackoverflow.com/questions/7836134/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me) on StackOverflow.
Asked by YAHOOOOO
I am doing camera calibration using Tsai's algorithm. I have the intrinsic and extrinsic matrices, but how can I reconstruct the 3D coordinates from that information?
1) I can use Gaussian elimination to find X, Y, Z, W, and then the point will be X/W, Y/W, Z/W, treating it as a homogeneous system.
2) I can use the approach from the OpenCV documentation: since I know u, v, R and t, I can compute X, Y, Z.
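For reference, a minimal C++/OpenCV sketch of what this second approach amounts to (not from the original question; K, R and t are assumed to be CV_64F cv::Mat objects, and the function name is illustrative). Note that s, the depth along the viewing ray, is exactly what a single image does not provide, so some extra constraint is needed to obtain a unique X, Y, Z:

#include <opencv2/core.hpp>

// Back-project pixel (u, v): X_world = R^T * (s * K^-1 * [u v 1]^T - t).
// One pixel only fixes the viewing ray; the scale s (the depth) has to
// come from an extra constraint such as a known plane in the scene.
cv::Mat backProject(double u, double v, double s,
                    const cv::Mat& K, const cv::Mat& R, const cv::Mat& t)
{
    cv::Mat uv1 = (cv::Mat_<double>(3, 1) << u, v, 1.0);
    cv::Mat camPoint = s * (K.inv() * uv1);  // point in camera coordinates
    return R.t() * (camPoint - t);           // point in world coordinates
}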
However, both methods end up with different results, and neither is correct.
What am I doing wrong?
Accepted answer by Jav_Rock
If you have the extrinsic parameters then you have everything. That means you can get a homography from the extrinsics (also called the camera pose). The pose is a 3x4 matrix; the homography is a 3x3 matrix H, defined as
H = K*[r1, r2, t], //eqn 8.1, Hartley and Zisserman
with K being the camera intrinsic matrix, r1 and r2 being the first two columns of the rotation matrix R, and t being the translation vector.
Then normalize by dividing everything by t3 (the third component of t).
What happens to column r3; don't we use it? No, it is redundant: it can be recovered as the cross product of the first two columns of the pose.
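A minimal C++/OpenCV sketch of this construction (not from the original answer; K, R and t are assumed to be 3x3, 3x3 and 3x1 CV_64F cv::Mat objects, and the function name is illustrative):

#include <opencv2/core.hpp>

// Build the plane homography H = K * [r1 r2 t] and normalize it so that
// H(2,2) == 1; with a standard K (last row [0 0 1]), H(2,2) equals t3.
cv::Mat planeHomography(const cv::Mat& K, const cv::Mat& R, const cv::Mat& t)
{
    cv::Mat Rt(3, 3, CV_64F);
    R.col(0).copyTo(Rt.col(0));   // r1
    R.col(1).copyTo(Rt.col(1));   // r2
    t.copyTo(Rt.col(2));          // t
    cv::Mat H = K * Rt;           // eqn 8.1, Hartley and Zisserman
    H /= H.at<double>(2, 2);      // divide everything by t3
    return H;
}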
Now that you have the homography, project the points. Your 2D points are (x, y). Append z = 1 so they become 3D, and project them as follows:
p = [x y 1];
projection = H * p;                    //project
projnorm = projection / projection(3); //normalize by the third (homogeneous) component
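The same two steps in C++ with OpenCV might look like this (a sketch only; H comes from a construction such as the one above, and the pixel coordinates are placeholders):

double x = 320.0, y = 240.0;                              // example pixel
cv::Mat p = (cv::Mat_<double>(3, 1) << x, y, 1.0);        // p = [x y 1]^T
cv::Mat projection = H * p;                               // project
cv::Mat projnorm = projection / projection.at<double>(2); // normalize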
Hope this helps.
Answered by DerekG
As nicely stated in the comments above, projecting 2D image coordinates into 3D "camera space" inherently requires making up the z coordinate, as this information is totally lost in the image. One solution is to assign a dummy value (z = 1) to each of the 2D image-space points before projection, as in Jav_Rock's answer.
p = [x y 1];
projection = H * p;                    //project
projnorm = projection / projection(3); //normalize by the third (homogeneous) component
One interesting alternative to this dummy solution is to train a model to predict the depth of each point prior to reprojection into 3D camera space. I tried this method and had a high degree of success using a PyTorch CNN trained on 3D bounding boxes from the KITTI dataset. I would be happy to provide code, but it would be a bit lengthy to post here.