C++: Get 3D coordinates from a 2D image pixel if the extrinsic and intrinsic parameters are known

Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use or share it, but you must do so under the same CC BY-SA license and attribute it to the original authors (not me). Original question: http://stackoverflow.com/questions/7836134/

Date: 2020-08-28 17:33:46  Source: igfitidea

Get 3D coordinates from 2D image pixel if extrinsic and intrinsic parameters are known

c++, opencv, camera-calibration, homogenous-transformation, pose-estimation

Asked by YAHOOOOO

I am doing camera calibration with Tsai's algorithm. I have the intrinsic and extrinsic matrices, but how can I reconstruct the 3D coordinates from that information?

[images: the projection equations obtained from calibration, omitted]

1) I can treat it as a homogeneous system and use Gaussian elimination to find X, Y, Z, W; the point is then (X/W, Y/W, Z/W).

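Both options start from the 3x4 projection matrix P = K·[R|t]. For reference, a minimal C++/OpenCV sketch of assembling it (this assumes K, R and t are already available as CV_64F cv::Mat; the function name is only illustrative):

    #include <opencv2/core.hpp>

    // Assemble the 3x4 projection matrix P = K * [R | t] from the calibration
    // results: K (3x3 intrinsics), R (3x3 rotation), t (3x1 translation).
    cv::Mat buildProjectionMatrix(const cv::Mat& K, const cv::Mat& R, const cv::Mat& t)
    {
        cv::Mat Rt;
        cv::hconcat(R, t, Rt);   // [R | t], 3x4
        return K * Rt;           // P, 3x4
    }
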
2) I can use the approach from the OpenCV documentation:

        s · [u, v, 1]ᵀ = K · [R | t] · [X, Y, Z, 1]ᵀ        (the OpenCV pinhole projection model)

Since I know u, v, R and t, I can compute X, Y, Z.

However, the two methods give different results, and neither is correct.

What am I doing wrong?

Accepted answer by Jav_Rock

If you have the extrinsic parameters then you have everything. That means you can compute the homography from the extrinsics (also called the camera pose). The pose is a 3x4 matrix; the homography is a 3x3 matrix H, defined as

                   H = K*[r1, r2, t],       //eqn 8.1, Hartley and Zisserman

with K being the camera intrinsic matrix, r1 and r2 being the first two columns of the rotation matrix R, and t the translation vector.

Then normalize by dividing everything by t3 (the third element of t).

What about column r3? Don't we use it? No, because it is redundant: it is the cross product of the first two columns of the pose.

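For reference, a minimal C++/OpenCV sketch of building H this way (assuming K, R and t are CV_64F cv::Mat; the helper name is only illustrative):

    #include <opencv2/core.hpp>

    // Build H = K * [r1, r2, t] (eqn 8.1) from intrinsics K (3x3),
    // rotation R (3x3) and translation t (3x1), then normalize.
    cv::Mat homographyFromPose(const cv::Mat& K, const cv::Mat& R, const cv::Mat& t)
    {
        cv::Mat M(3, 3, CV_64F);
        R.col(0).copyTo(M.col(0));          // r1
        R.col(1).copyTo(M.col(1));          // r2
        t.copyTo(M.col(2));                 // t
        cv::Mat H = K * M;
        double t3 = H.at<double>(2, 2);     // with a standard K this is exactly t3
        return H / t3;                      // normalize by t3
    }
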
Now that you have the homography, project the points. Your 2D points are (x, y). Append z = 1 so they become 3D homogeneous points, then project them as follows:

        p          = [x y 1];
        projection = H * p;                        //project
        projnorm   = projection / projection(3);   //normalize
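
The same step in C++/OpenCV might look like this (a sketch only; H is the 3x3 homography from above, stored as a CV_64F cv::Mat):

    #include <opencv2/core.hpp>

    // Apply a 3x3 homography to one point and dehomogenize the result.
    cv::Point2d applyHomography(const cv::Mat& H, double x, double y)
    {
        cv::Mat p = (cv::Mat_<double>(3, 1) << x, y, 1.0);   // homogeneous point
        cv::Mat q = H * p;                                    // project
        double w = q.at<double>(2, 0);
        return cv::Point2d(q.at<double>(0, 0) / w,            // normalize by the
                           q.at<double>(1, 0) / w);           // third component
    }

For many points at once, cv::perspectiveTransform performs the same multiplication and division in a single call.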

Hope this helps.

Answer by DerekG

As nicely stated in the comments above, projecting 2D image coordinates into 3D "camera space" inherently requires making up the z coordinate, as this information is completely lost in the image. One solution is to assign a dummy value (z = 1) to each of the 2D image-space points before projection, as in Jav_Rock's answer.

p          = [x y 1];
projection = H * p;                        //project
projnorm   = projection / projection(3);   //normalize

One interesting alternative to this dummy solution is to train a model to predict the depth of each point prior to reprojection into 3D camera space. I tried this method and had a high degree of success using a PyTorch CNN trained on 3D bounding boxes from the KITTI dataset. Would be happy to provide code, but it'd be a bit lengthy for posting here.

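For completeness, once a depth estimate d is available for a pixel (u, v) (from such a network or any other source), the back-projection into 3D camera space only needs the intrinsics. A minimal sketch, assuming a standard pinhole model with focal lengths fx, fy and principal point (cx, cy):

    #include <opencv2/core.hpp>

    // Back-project a pixel (u, v) with known depth d (distance along the optical axis)
    // into 3D camera coordinates, using pinhole intrinsics fx, fy, cx, cy.
    cv::Point3d backprojectWithDepth(double u, double v, double d,
                                     double fx, double fy, double cx, double cy)
    {
        return cv::Point3d((u - cx) * d / fx,   // X
                           (v - cy) * d / fy,   // Y
                           d);                  // Z
    }

To express the result in world coordinates, apply the inverse extrinsics afterwards, i.e. X_world = Rᵀ · (X_cam − t).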