Original source: http://stackoverflow.com/questions/14177744/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me):
StackOverFlow
How does perspective transformation work in PIL?
Asked by Hedge
PIL's Image.transform has a perspective mode which requires an 8-tuple of data, but I can't figure out how to convert, let's say, a right tilt of 30 degrees to that tuple.
Can anyone explain it?
Accepted answer by mmgp
To apply a perspective transformation you first have to know four points in a plane A that will be mapped to four points in a plane B. With those points, you can derive the homographic transform. By doing this, you obtain your 8 coefficients and the transformation can take place.
The site http://xenia.media.mit.edu/~cwren/interpolator/ (mirror: WebArchive), as well as many other texts, describes how those coefficients can be determined. To make things easy, here is a direct implementation based on the mentioned link:
import numpy

def find_coeffs(pa, pb):
    # Build two rows of the 8x8 linear system per point correspondence.
    matrix = []
    for p1, p2 in zip(pa, pb):
        matrix.append([p1[0], p1[1], 1, 0, 0, 0, -p2[0]*p1[0], -p2[0]*p1[1]])
        matrix.append([0, 0, 0, p1[0], p1[1], 1, -p2[1]*p1[0], -p2[1]*p1[1]])

    # numpy.float was removed in recent NumPy releases; the builtin float is equivalent.
    A = numpy.matrix(matrix, dtype=float)
    B = numpy.array(pb).reshape(8)

    res = numpy.dot(numpy.linalg.inv(A.T * A) * A.T, B)
    return numpy.array(res).reshape(8)
where pb contains the four vertices in the current plane, and pa contains the four vertices in the resulting plane.
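The direction matters because Image.transform evaluates the coefficients at each output pixel to find where to sample the input image. One way to sanity-check it is to apply the coefficients to each point of pa and confirm that the matching point of pb comes back. The sketch below is self-contained: it rebuilds the same system with plain arrays and solves it with numpy.linalg.solve, and apply_coeffs is a hypothetical helper introduced only for illustration. The sample coordinates are made up.

```python
import numpy as np

def find_coeffs(pa, pb):
    """Solve for the 8 coefficients sending each point of pa to the matching point of pb."""
    rows = []
    for (x, y), (X, Y) in zip(pa, pb):
        rows.append([x, y, 1, 0, 0, 0, -X * x, -X * y])
        rows.append([0, 0, 0, x, y, 1, -Y * x, -Y * y])
    A = np.array(rows, dtype=float)
    B = np.array(pb, dtype=float).reshape(8)
    return np.linalg.solve(A, B)

def apply_coeffs(coeffs, x, y):
    """Hypothetical helper: evaluate the perspective mapping at one point."""
    a, b, c, d, e, f, g, h = coeffs
    w = g * x + h * y + 1.0
    return (a * x + b * y + c) / w, (d * x + e * y + f) / w

# pa: corners in the resulting (output) image; pb: where they come from in the current image.
pa = [(0, 0), (256, 0), (256, 256), (0, 256)]
pb = [(0, 0), (256, 0), (384, 256), (128, 256)]

coeffs = find_coeffs(pa, pb)
mapped = [apply_coeffs(coeffs, x, y) for x, y in pa]
print(mapped)  # each entry should match the corresponding point of pb
```

If the two lists are swapped, the transform runs in the opposite direction, which is a common source of confusion when calling Image.transform.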
So, suppose we transform an image as in:
import sys
from PIL import Image
img = Image.open(sys.argv[1])
width, height = img.size
m = -0.5
xshift = abs(m) * width
new_width = width + int(round(xshift))
img = img.transform((new_width, height), Image.AFFINE,
                    (1, m, -xshift if m > 0 else 0, 0, 1, 0), Image.BICUBIC)
img.save(sys.argv[2])
Here is a sample input and output with the code above:




Continuing from the code above, we can perform a perspective transformation to revert the shear:
coeffs = find_coeffs(
    [(0, 0), (256, 0), (256, 256), (0, 256)],
    [(0, 0), (256, 0), (new_width, height), (xshift, height)])

img.transform((width, height), Image.PERSPECTIVE, coeffs,
              Image.BICUBIC).save(sys.argv[3])
Resulting in:


You can also have some fun with the destination points:




Answer by David Wolever
I'm going to hijack this question just a tiny bit because it's the only thing on Google pertaining to perspective transformations in Python. Here is some slightly more general code based on the above which creates a perspective transform matrix and generates a function which will run that transform on arbitrary points:
import numpy as np

def create_perspective_transform_matrix(src, dst):
    """ Creates a perspective transformation matrix which transforms points
        in quadrilateral ``src`` to the corresponding points on quadrilateral
        ``dst``.

        Will raise a ``np.linalg.LinAlgError`` on invalid input.
        """
    # See:
    # * http://xenia.media.mit.edu/~cwren/interpolator/
    # * http://stackoverflow.com/a/14178717/71522
    in_matrix = []
    for (x, y), (X, Y) in zip(src, dst):
        in_matrix.extend([
            [x, y, 1, 0, 0, 0, -X * x, -X * y],
            [0, 0, 0, x, y, 1, -Y * x, -Y * y],
        ])

    # np.float was removed in recent NumPy releases; the builtin float is equivalent.
    A = np.matrix(in_matrix, dtype=float)
    B = np.array(dst).reshape(8)
    af = np.dot(np.linalg.inv(A.T * A) * A.T, B)
    return np.append(np.array(af).reshape(8), 1).reshape((3, 3))


def create_perspective_transform(src, dst, round=False, splat_args=False):
    """ Returns a function which will transform points in quadrilateral
        ``src`` to the corresponding points on quadrilateral ``dst``::

            >>> transform = create_perspective_transform(
            ...     [(0, 0), (10, 0), (10, 10), (0, 10)],
            ...     [(50, 50), (100, 50), (100, 100), (50, 100)],
            ... )
            >>> transform((5, 5))
            (74.99999999999639, 74.999999999999957)

        If ``round`` is ``True`` then points will be rounded to the nearest
        integer and integer values will be returned.

            >>> transform = create_perspective_transform(
            ...     [(0, 0), (10, 0), (10, 10), (0, 10)],
            ...     [(50, 50), (100, 50), (100, 100), (50, 100)],
            ...     round=True,
            ... )
            >>> transform((5, 5))
            (75, 75)

        If ``splat_args`` is ``True`` the function will accept two arguments
        instead of a tuple.

            >>> transform = create_perspective_transform(
            ...     [(0, 0), (10, 0), (10, 10), (0, 10)],
            ...     [(50, 50), (100, 50), (100, 100), (50, 100)],
            ...     splat_args=True,
            ... )
            >>> transform(5, 5)
            (74.99999999999639, 74.999999999999957)

        If the input values yield an invalid transformation matrix an identity
        function will be returned and the ``error`` attribute will be set to a
        description of the error::

            >>> transform = create_perspective_transform(
            ...     np.zeros((4, 2)),
            ...     np.zeros((4, 2)),
            ... )
            >>> transform((5, 5))
            (5.0, 5.0)
            >>> transform.error
            'invalid input quads (...): Singular matrix'
        """
    try:
        transform_matrix = create_perspective_transform_matrix(src, dst)
        error = None
    except np.linalg.LinAlgError as e:
        transform_matrix = np.identity(3, dtype=float)
        error = "invalid input quads (%s and %s): %s" %(src, dst, e)
        error = error.replace("\n", "")

    # Build the source of the transform function as a string, then exec it.
    to_eval = "def perspective_transform(%s):\n" %(
        splat_args and "*pt" or "pt",
    )
    to_eval += "    res = np.dot(transform_matrix, ((pt[0], ), (pt[1], ), (1, )))\n"
    to_eval += "    res = res / res[2]\n"
    if round:
        to_eval += "    return (int(round(res[0][0])), int(round(res[1][0])))\n"
    else:
        to_eval += "    return (res[0][0], res[1][0])\n"
    locals = {
        "transform_matrix": transform_matrix,
    }
    locals.update(globals())
    # Python 3 form of the original Python 2 statement ``exec to_eval in locals, locals``.
    exec(to_eval, locals, locals)
    res = locals["perspective_transform"]
    res.matrix = transform_matrix
    res.error = error
    return res
Answer by Karim Bahgat
Here is a pure-Python version of generating the transform coefficients (as I've seen this requested by several people). I made and used it for the PyDraw pure-Python image drawing package.
If using it for your own project, note that the calculations require several advanced matrix operations, which means that this function depends on another (luckily pure-Python) matrix library called matfunc, originally written by Raymond Hettinger, which you can download here or here.
import matfunc as mt

def perspective_coefficients(self, oldplane, newplane):
    """
    Calculates and returns the transform coefficients needed for a perspective
    transform, ie tilting an image in 3D.
    Note: it is not very obvious how to set the oldplane and newplane arguments
    in order to tilt an image the way one wants. Need to make the arguments more
    user-friendly and handle the oldplane/newplane behind the scenes.
    Some hints on how to do that at http://www.cs.utexas.edu/~fussell/courses/cs384g/lectures/lecture20-Z_buffer_pipeline.pdf

    | **option** | **description**
    | --- | ---
    | oldplane | a list of four old xy coordinate pairs
    | newplane | four points in the new plane corresponding to the old points
    """
    # first find the transform coefficients, thanks to http://stackoverflow.com/questions/14177744/how-does-perspective-transformation-work-in-pil
    pb,pa = oldplane,newplane
    grid = []
    for p1,p2 in zip(pa, pb):
        grid.append([p1[0], p1[1], 1, 0, 0, 0, -p2[0]*p1[0], -p2[0]*p1[1]])
        grid.append([0, 0, 0, p1[0], p1[1], 1, -p2[1]*p1[0], -p2[1]*p1[1]])

    # then do some matrix magic
    A = mt.Matrix(grid)
    B = mt.Vec([xory for xy in pb for xory in xy])
    AT = A.tr()
    ATA = AT.mmul(A)
    gridinv = ATA.inverse()
    invAT = gridinv.mmul(AT)
    res = invAT.mmul(B)
    a,b,c,d,e,f,g,h = res.flatten()

    # finito
    return a,b,c,d,e,f,g,h
Answer by Amir
The 8 transform coefficients (a, b, c, d, e, f, g, h) correspond to the following transformation:
x' = (ax + by + c) / (gx + hy + 1)
y' = (dx + ey + f) / (gx + hy + 1)
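As a small worked example of these two formulas, the sketch below evaluates them at the corners of a 100x100 square. The function name perspective_map and the coefficient values are made up for illustration; the point is only that a nonzero g makes the divisor depend on x, so points further to the right are pulled in more strongly.

```python
def perspective_map(coeffs, x, y):
    """Evaluate x' and y' from the 8 coefficients (illustrative helper)."""
    a, b, c, d, e, f, g, h = coeffs
    w = g * x + h * y + 1.0  # shared divisor of both formulas
    return (a * x + b * y + c) / w, (d * x + e * y + f) / w

corners = [(0, 0), (100, 0), (100, 100), (0, 100)]

# With g = h = 0 and an identity linear part, every point maps to itself.
identity = (1, 0, 0, 0, 1, 0, 0, 0)
same = [perspective_map(identity, x, y) for x, y in corners]

# A small nonzero g divides by 1.1 at x = 100 but by 1.0 at x = 0.
tilted = (1, 0, 0, 0, 1, 0, 0.001, 0)
squeezed = [perspective_map(tilted, x, y) for x, y in corners]

print(same)
print(squeezed)
```

The corners at x = 0 stay where they are, while the corners at x = 100 are divided by 1.1, so the square no longer maps to a parallelogram.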
These 8 coefficients can in general be found by solving 8 linear equations that define how 4 points on the plane transform (4 points in 2D give 8 equations). See the answer by mmgp for code that solves this, although you might find it a tad more accurate to change the line
res = numpy.dot(numpy.linalg.inv(A.T * A) * A.T, B)
to
res = numpy.linalg.solve(A, B)
i.e., there is no real reason to actually invert the A matrix there, or to multiply it by its transpose and lose a bit of accuracy, in order to solve the equations.
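To compare the two formulations side by side, the sketch below builds the 8x8 system for a made-up pair of quadrilaterals and solves it both ways. build_system and the sample points are illustrative assumptions, not part of either answer; the printed residuals give a rough sense of how closely each solution satisfies the equations.

```python
import numpy as np

def build_system(pa, pb):
    """Return (A, B) for the 8 linear equations in the 8 coefficients."""
    rows = []
    for (x, y), (X, Y) in zip(pa, pb):
        rows.append([x, y, 1, 0, 0, 0, -X * x, -X * y])
        rows.append([0, 0, 0, x, y, 1, -Y * x, -Y * y])
    return np.array(rows, dtype=float), np.array(pb, dtype=float).reshape(8)

# Made-up sample quadrilaterals.
pa = [(0, 0), (256, 0), (256, 256), (0, 256)]
pb = [(10, 20), (250, 5), (300, 240), (-20, 260)]
A, B = build_system(pa, pb)

# Direct solve of the square system.
direct = np.linalg.solve(A, B)

# Normal-equations form used in the original line.
normal = np.linalg.inv(A.T @ A) @ A.T @ B

# Largest absolute violation of A x = B for each solution.
residual_direct = np.abs(A @ direct - B).max()
residual_normal = np.abs(A @ normal - B).max()
print(residual_direct, residual_normal)
```

With exactly four point pairs A is square, so the least-squares form is not needed; it would only become relevant if more than four correspondences were supplied.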
As for your question, for a simple tilt by an angle theta around (x0, y0) (theta in radians if cos and sin come from math or numpy), the coefficients you are looking for are:
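from math import cos, sin
import numpy as np

def find_rotation_coeffs(theta, x0, y0):
    # theta is in radians, as expected by math.cos and math.sin
    ct = cos(theta)
    st = sin(theta)
    return np.array([ct, -st, x0*(1-ct) + y0*st, st, ct, y0*(1-ct)-x0*st,0,0])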
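A quick way to check these coefficients is to confirm that the pivot (x0, y0) maps to itself and that another point keeps its distance from the pivot. The sketch below is a pure-Python restatement under the assumption that theta is given in radians; apply_coeffs is a hypothetical helper and the sample values are arbitrary.

```python
from math import cos, sin, radians, hypot

def find_rotation_coeffs(theta, x0, y0):
    """Rotation by theta (radians) about (x0, y0) as 8 perspective coefficients."""
    ct, st = cos(theta), sin(theta)
    return (ct, -st, x0 * (1 - ct) + y0 * st,
            st, ct, y0 * (1 - ct) - x0 * st,
            0.0, 0.0)

def apply_coeffs(coeffs, x, y):
    """Hypothetical helper: evaluate the mapping at one point."""
    a, b, c, d, e, f, g, h = coeffs
    w = g * x + h * y + 1.0  # equals 1 here because g = h = 0
    return (a * x + b * y + c) / w, (d * x + e * y + f) / w

coeffs = find_rotation_coeffs(radians(30), 50, 50)

center = apply_coeffs(coeffs, 50, 50)  # the pivot should not move
moved = apply_coeffs(coeffs, 60, 50)   # a point 10 units to the right of the pivot
distance = hypot(moved[0] - 50, moved[1] - 50)
print(center, moved, distance)
```

Because g and h are zero the divisor is always 1, so only the first six coefficients matter and the same values could be passed to the affine mode as a 6-tuple.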
And in general any Affine transformation must have (g, h) equal to zero. Hope that helps!

