C++: Camera position in world coordinates from cv::solvePnP

Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me): StackOverflow, original question: http://stackoverflow.com/questions/18637494/

Published: 2020-08-27 22:05:00  Source: igfitidea

Camera position in world coordinate from cv::solvePnP

c++ opengl opencv computer-vision pose-estimation

Asked by nkint

I have a calibrated camera (intrinsic matrix and distortion coefficients) and I want to know the camera position knowing some 3d points and their corresponding points in the image (2d points).


I know that cv::solvePnP could help me, and after reading this and this I understand that the outputs of solvePnP, rvec and tvec, are the rotation and translation of the object in the camera coordinate system.


So I need to find out the camera rotation/translation in the world coordinate system.


From the links above it seems that the code is straightforward, in python:


found,rvec,tvec = cv2.solvePnP(object_3d_points, object_2d_points, camera_matrix, dist_coefs)
rotM = cv2.Rodrigues(rvec)[0]
cameraPosition = -np.matrix(rotM).T * np.matrix(tvec)

I don't know python/numpy stuffs (I'm using C++) but this does not make a lot of sense to me:


  • rvec, tvec output from solvePnP are 3x1 matrices, 3-element vectors
  • cv2.Rodrigues(rvec) is a 3x3 matrix
  • cv2.Rodrigues(rvec)[0] is a 3x1 matrix, a 3-element vector
  • cameraPosition is a 3x1 * 1x3 matrix multiplication that is a.. 3x3 matrix. How can I use this in OpenGL with simple glTranslatef and glRotate calls?

Answered by ChronoTrigger

If with "world coordinates" you mean "object coordinates", you have to get the inverse transformation of the result given by the pnp algorithm.


There is a trick to invert transformation matrices that allows you to save the inversion operation, which is usually expensive, and that explains the code in Python. Given a transformation [R|t], we have that inv([R|t]) = [R'|-R'*t], where R' is the transpose of R. So, you can code (not tested):


cv::Mat rvec, tvec;
solvePnP(..., rvec, tvec, ...);
// rvec is 3x1, tvec is 3x1

cv::Mat R;
cv::Rodrigues(rvec, R); // R is 3x3

R = R.t();  // rotation of inverse
tvec = -R * tvec; // translation of inverse

cv::Mat T = cv::Mat::eye(4, 4, R.type()); // T is 4x4
T( cv::Range(0,3), cv::Range(0,3) ) = R * 1; // copies R into T
T( cv::Range(0,3), cv::Range(3,4) ) = tvec * 1; // copies tvec into T

// T is a 4x4 matrix with the pose of the camera in the object frame
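The inversion trick can be checked numerically. A minimal numpy sketch, with an arbitrary rotation and translation chosen purely for illustration:

```python
import numpy as np

# Verify the trick: for a rigid transform T = [R|t] with R a rotation,
# inv(T) = [R.T | -R.T @ t], with no general matrix inversion needed.

theta = 0.7  # some rotation about Z, values chosen for illustration
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0, 0.0, 1.0]])
t = np.array([1.0, -2.0, 0.5])

T = np.eye(4)
T[:3, :3] = R
T[:3, 3] = t

T_inv = np.eye(4)
T_inv[:3, :3] = R.T      # rotation of inverse
T_inv[:3, 3] = -R.T @ t  # translation of inverse
```

Composing the two gives the identity, confirming T_inv really is the inverse.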

Update: Later, to use T with OpenGL you have to keep in mind that the axes of the camera frame differ between OpenCV and OpenGL.


OpenCV uses the reference usually used in computer vision: X points to the right, Y down, Z to the front (as in this image). The frame of the camera in OpenGL is: X points to the right, Y up, Z to the back (as in the left hand side of this image). So, you need to apply a rotation around X axis of 180 degrees. The formula of this rotation matrix is in wikipedia.


// T is your 4x4 matrix in the OpenCV frame
cv::Mat RotX = ...; // 4x4 matrix with a 180 deg rotation around X
cv::Mat Tgl = T * RotX; // OpenGL camera in the object frame
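From the standard axis-rotation formula the answer points to, a 180-degree rotation around X flips Y and Z, so its homogeneous matrix is diag(1, -1, -1, 1). A numpy sketch checking that against the general formula:

```python
import numpy as np

# 180-degree rotation around X: flips the Y and Z axes, which is exactly
# the difference between the OpenCV and OpenGL camera frames.
RotX = np.diag([1.0, -1.0, -1.0, 1.0])

# Sanity check against the general rotation-about-X formula Rx(a)
a = np.pi
Rx = np.array([[1.0, 0.0, 0.0, 0.0],
               [0.0, np.cos(a), -np.sin(a), 0.0],
               [0.0, np.sin(a),  np.cos(a), 0.0],
               [0.0, 0.0, 0.0, 1.0]])
```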

These transformations are always confusing and I may be wrong at some step, so take this with a grain of salt.


Finally, take into account that matrices in OpenCV are stored in row-major order in memory, and OpenGL ones, in column-major order.


Answered by Hammer

If you want to turn it into a standard 4x4 pose matrix specifying the position of your camera, use rotM as the top-left 3x3 block, tvec as the 3 elements on the right, and 0, 0, 0, 1 as the bottom row:


pose = [rotation   tvec(0)
        matrix     tvec(1)
        here       tvec(2)
        0  , 0, 0,  1]

then invert it (to get the pose of the camera instead of the pose of the world).

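A numpy sketch of this recipe, with illustrative rotM and tvec values standing in for the real solvePnP outputs:

```python
import numpy as np

# Assemble the 4x4 pose from rotM and tvec as described above, then
# invert it to get the camera pose instead of the world pose.
# rotM and tvec are made-up stand-ins for solvePnP/Rodrigues output.
rotM = np.array([[0.0, -1.0, 0.0],
                 [1.0,  0.0, 0.0],
                 [0.0,  0.0, 1.0]])  # e.g. a 90-degree rotation about Z
tvec = np.array([0.5, 1.0, 2.0])

pose = np.eye(4)
pose[:3, :3] = rotM  # rotation matrix in the top-left 3x3 block
pose[:3, 3] = tvec   # tvec down the right column; bottom row stays 0,0,0,1

camera_pose = np.linalg.inv(pose)  # pose of the camera in the world
```

For a rigid transform, this full inverse agrees with the [R'|-R'*t] trick from the previous answer, so the translation part of camera_pose is -rotM.T @ tvec.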