How to read a raw image using PIL?

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/3397157/

Date: 2020-08-18 10:50:31  Source: igfitidea

Tags: python, image, image-processing, python-imaging-library

Asked by Alceu Costa

I have a raw image where each pixel corresponds to a 16-bit unsigned integer. I am trying to read it using the PIL Image.fromstring() function, as in the following code:

import sys

from PIL import Image  # classic PIL also allowed a bare "import Image"

if __name__ == "__main__":
    if len(sys.argv) != 4:
        print('Error: missing input argument')
        sys.exit()

    # Read the whole file as raw bytes.
    with open(sys.argv[1], 'rb') as f:
        rawData = f.read()

    imgSize = (int(sys.argv[2]), int(sys.argv[3]))

    # Use the PIL raw decoder to read the data.
    #   - the 'F;16' informs the raw decoder that we are reading little-endian, unsigned 16-bit integer data.
    # Note: newer Pillow releases renamed fromstring() to frombytes().
    img = Image.fromstring('L', imgSize, rawData, 'raw', 'F;16')

    img.save('out.png')

The PIL documentation says that the first argument of the fromstring() function is 'mode'. However, after reading the documentation and googling, I wasn't able to find details about what that argument really means (I believe it is related to the color space or something like that). Does anyone know where I can find a more detailed reference for the fromstring() function and the meaning of the mode argument?

Accepted answer by Katriel

The specific documentation is at http://effbot.org/imagingbook/concepts.htm:

Mode

The mode of an image defines the type and depth of a pixel in the image. The current release supports the following standard modes:

  • 1 (1-bit pixels, black and white, stored with one pixel per byte)
  • L (8-bit pixels, black and white)
  • P (8-bit pixels, mapped to any other mode using a colour palette)
  • RGB (3x8-bit pixels, true colour)
  • RGBA (4x8-bit pixels, true colour with transparency mask)
  • CMYK (4x8-bit pixels, colour separation)
  • YCbCr (3x8-bit pixels, colour video format)
  • I (32-bit signed integer pixels)
  • F (32-bit floating point pixels)

PIL also provides limited support for a few special modes, including LA (L with alpha), RGBX (true colour with padding) and RGBa (true colour with premultiplied alpha).
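
The bit depths in the list above imply a fixed number of bytes per pixel for each mode, which gives a quick sanity check on a raw file before handing it to PIL. The following is a minimal sketch in plain Python that does not call PIL; the names `BYTES_PER_PIXEL`, `expected_raw_size` and `guess_bytes_per_pixel` are made up for illustration, and mode "1" is left out because its byte layout is not simply one value per pixel in every context.

```python
# Bytes per pixel implied by the bit depths in the mode list above.
# Illustrative table only; it is not taken from PIL's source.
BYTES_PER_PIXEL = {
    "L": 1,      # 8-bit greyscale
    "P": 1,      # 8-bit palette index
    "RGB": 3,    # 3x8 bits
    "RGBA": 4,   # 4x8 bits
    "CMYK": 4,   # 4x8 bits
    "YCbCr": 3,  # 3x8 bits
    "I": 4,      # 32-bit signed integer
    "F": 4,      # 32-bit floating point
}


def expected_raw_size(mode, size):
    """Return the byte count of an unpadded raw buffer for this mode and size."""
    width, height = size
    return BYTES_PER_PIXEL[mode] * width * height


def guess_bytes_per_pixel(data_length, size):
    """Return bytes per pixel if the buffer divides evenly, else None."""
    width, height = size
    pixels = width * height
    if pixels == 0 or data_length % pixels:
        return None
    return data_length // pixels


# A 640x480 file of 16-bit pixels works out to 2 bytes per pixel.
print(guess_bytes_per_pixel(640 * 480 * 2, (640, 480)))
```

Two bytes per pixel matches none of the standard modes in the table, which is consistent with the question needing an extra raw-mode argument such as 'F;16' to describe the on-disk layout.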

Answer by Wayne Werner

Image.frombuffer(mode, size, data) => image

(New in PIL 1.1.4). Creates an image memory from pixel data in a string or buffer object, using the standard "raw" decoder. For some modes, the image memory will share memory with the original buffer (this means that changes to the original buffer object are reflected in the image). Not all modes can share memory; supported modes include "L", "RGBX", "RGBA", and "CMYK". For other modes, this function behaves like a corresponding call to the fromstring function.

I'm not sure what "L" stands for (most likely luminance, i.e. greyscale), but "RGBA" stands for Red-Green-Blue-Alpha. I presumed RGBX was equivalent to RGB, but testing shows that isn't the case. CMYK is Cyan-Magenta-Yellow-Key (black), which is another type of colorspace. Of course, I assume that if you know about PIL you also know about colorspaces. If not, Wikipedia has a great article.

As for what it really means (if that's not enough): pixel values are encoded differently for each colorspace. In regular RGB you have 3 bytes per pixel, each in the range 0-255. For alpha you add another byte to each pixel. If you decode an RGB image as RGBA, you'll end up reading the R value of the second pixel as the alpha of the first, which means the second pixel gets the original G value as its R value. The shift grows with every pixel, so the larger your image is, the more it will make your colors go wonky. Similarly, trying to read a CMYK-encoded image as RGB (or RGBA) will make your image look very much unlike what it's supposed to. For instance, try this with an image:

from PIL import Image

# Note: newer Pillow releases renamed tostring()/fromstring() to tobytes()/frombytes().
i = Image.open('image.png')
imgSize = i.size
rawData = i.tostring()

# Reinterpret the same raw bytes under each mode and save the result.
img = Image.fromstring('L', imgSize, rawData)
img.save('lmode.png')
img = Image.fromstring('RGB', imgSize, rawData)
img.save('rgbmode.png')
img = Image.fromstring('RGBX', imgSize, rawData)
img.save('rgbxmode.jfif')
img = Image.fromstring('RGBA', imgSize, rawData)
img.save('rgbamode.png')
img = Image.fromstring('CMYK', imgSize, rawData)
img.save('cmykmode.tiff')

And you'll see what the different modes do - try it with a variety of input images: png with alpha, png without alpha, bmp, gif, and jpeg. It's kinda a fun experiment, actually.
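
The channel shift described above can also be seen without any image file. This is a small sketch in plain Python 3 that only regroups bytes; it does not call PIL, and the helper name `regroup` and the four sample pixels are made up for illustration.

```python
def regroup(raw, channels):
    """Split a flat byte string into pixel tuples of `channels` bytes each."""
    usable = len(raw) - len(raw) % channels  # drop any incomplete trailing pixel
    return [tuple(raw[i:i + channels]) for i in range(0, usable, channels)]


# Four RGB pixels: red, green, blue, white (3 bytes each, 12 bytes in total).
rgb_bytes = bytes([
    255, 0, 0,
    0, 255, 0,
    0, 0, 255,
    255, 255, 255,
])

as_rgb = regroup(rgb_bytes, 3)   # the intended interpretation
as_rgba = regroup(rgb_bytes, 4)  # the same bytes read 4 at a time

print(as_rgb)   # [(255, 0, 0), (0, 255, 0), (0, 0, 255), (255, 255, 255)]
print(as_rgba)  # [(255, 0, 0, 0), (255, 0, 0, 0), (255, 255, 255, 255)]
```

Read four bytes at a time, the green and blue pixels disappear entirely and the first pixel picks up an alpha of 0 from its neighbour's red channel, which is the kind of distortion the experiment above produces on real images.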

Answer by martineau

If all else fails, you can always read the source code. For PIL, the downloads are here.

You never said exactly what format the pixel data in the 16-bit unsigned integers was in, but I'd guess it's something like RRRRRGGGGGGBBBBB (5-bit Red, 6-bit Green, 5-bit Blue), or RRRRRGGGGGBBBBBA (5-bit Red, 5-bit Green, 5-bit Blue, 1-bit Alpha or Transparency). I didn't see support for those formats after a very quick peek at some of the sources myself, but I can't say for sure one way or the other.
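
If the data did turn out to be packed colour in one of those two layouts (which is only a guess about the file, not something the question confirms), each 16-bit value could be split with shifts and masks. The sketch below is plain Python; the function names `unpack_rgb565` and `unpack_rgba5551` are hypothetical and the bit order follows the patterns written above, with red in the highest bits.

```python
def _expand5(v):
    """Expand a 5-bit value to 8 bits by replicating its high bits."""
    return (v << 3) | (v >> 2)


def _expand6(v):
    """Expand a 6-bit value to 8 bits by replicating its high bits."""
    return (v << 2) | (v >> 4)


def unpack_rgb565(value):
    """Split RRRRRGGGGGGBBBBB into 8-bit (r, g, b)."""
    r5 = (value >> 11) & 0x1F
    g6 = (value >> 5) & 0x3F
    b5 = value & 0x1F
    return _expand5(r5), _expand6(g6), _expand5(b5)


def unpack_rgba5551(value):
    """Split RRRRRGGGGGBBBBBA into 8-bit (r, g, b, a)."""
    r5 = (value >> 11) & 0x1F
    g5 = (value >> 6) & 0x1F
    b5 = (value >> 1) & 0x1F
    a = 255 if value & 1 else 0  # single alpha bit: fully opaque or fully transparent
    return _expand5(r5), _expand5(g5), _expand5(b5), a


print(unpack_rgb565(0xF800))    # (255, 0, 0)
print(unpack_rgba5551(0xFFFF))  # (255, 255, 255, 255)
```

The resulting tuples could then be flattened into an ordinary 3- or 4-byte-per-pixel buffer, which is the layout the 'RGB' and 'RGBA' modes describe.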

On the same web page where the PIL downloads are, they mention that one can send questions to the Python Image SIG mailing list and provide a link for it. That might be a better source than asking here.

Hope this helps.

Answer by matiasg

This is an old question, but this might help someone in the future. One of the problems with the original code snippet is that in Image.fromstring('L', imgSize, rawData, 'raw', 'F;16'), the 'F;16' part works for 'F' mode, not 'L'.

This works for me:

image = Image.fromstring('F', imgSize, rawData, 'raw', 'F;16')
image.convert('L').save('out.png')
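
Going from 16-bit values to an 8-bit 'L' image means deciding how the larger range maps onto 0-255. As a rough sketch of one option, assuming the data is greyscale and uses the full 0-65535 range, the values can be decoded and scaled in plain Python before building the image. Nothing here calls PIL, and the names `decode_uint16` and `to_8bit` are made up for illustration.

```python
import struct


def decode_uint16(raw, little_endian=True):
    """Decode raw bytes into a list of unsigned 16-bit pixel values."""
    count = len(raw) // 2  # ignore a trailing odd byte, if any
    fmt = ("<" if little_endian else ">") + str(count) + "H"
    return list(struct.unpack(fmt, raw[:count * 2]))


def to_8bit(pixels):
    """Scale 0..65535 down to 0..255 by keeping only the high byte."""
    return bytes(p >> 8 for p in pixels)


# Three little-endian pixels: 0, 65535 and 32768.
raw = bytes([0x00, 0x00, 0xFF, 0xFF, 0x00, 0x80])
pixels = decode_uint16(raw)
print(pixels)                 # [0, 65535, 32768]
print(list(to_8bit(pixels)))  # [0, 255, 128]
```

The output of `to_8bit` has one byte per pixel, which is the layout an 8-bit greyscale buffer uses. If the file only occupies part of the 16-bit range, a different scale factor would be needed to avoid a very dark result.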