Note: this page is a Chinese-English side-by-side translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use/share it, but you must do so under the same license and attribute it to the original authors (not me): StackOverflow. Original question: http://stackoverflow.com/questions/46107348/

Time: 2020-08-19 17:29:01  Source: igfitidea

How to display image stored in pandas dataframe?

Tags: python, pandas, csv, matplotlib

Asked by PiccolMan

import pandas as pd
from scipy import misc
import numpy as np
import matplotlib.pyplot as plt

W = {'img':[misc.imread('pic.jpg')]}
df = pd.DataFrame(W)

# This displays the image
plt.imshow(df.img[0])
plt.show()

df.to_csv('mypic.csv')
new_df= pd.read_csv('mypic.csv')

# This does not display the image
plt.imshow(new_df.img[0])
plt.show()

When I try to display the image as loaded by the csv file I obtain the error: Image data can not convert to float. However, I was able to correctly display the image when using the dataframe df.

I suspect that something went wrong with the data type when I stored df onto a csv file. How would I fix this issue?
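
The suspicion about the data type can be checked with a small sketch. It assumes a tiny synthetic array in place of the real picture and writes the CSV to an in-memory buffer rather than to mypic.csv; the variable names are illustrative only. The cell holds a numpy array before the round trip and a plain string afterwards, which is why imshow can no longer interpret it.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical stand-in for the image: a tiny 2x2 RGB array
img = np.arange(12, dtype=np.uint8).reshape(2, 2, 3)
df = pd.DataFrame({"img": [img]})

# Round-trip through CSV, using an in-memory buffer instead of a file
buffer = io.StringIO()
df.to_csv(buffer, index=False)
buffer.seek(0)
new_df = pd.read_csv(buffer)

original_cell = df["img"][0]
restored_cell = new_df["img"][0]

print(type(original_cell))  # the array that was stored
print(type(restored_cell))  # the text that CSV turned it into
```

CSV is a text format, so each cell is written as the printed form of its value and read back as text.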

edit: I should add that my main objective is to

  1. Write a pandas dataframe that contains images onto a csv file
  2. Read the csv file from disk as opposed to storing the entire dataframe on RAM

Answered by ImportanceOfBeingErnest

It is not clear from the question why you would want to use pandas dataframes to store the image. I think this makes things unnecessarily complicated. You may instead directly store the numpy array in binary format and load it again at some point later.

import numpy as np
import matplotlib.pyplot as plt

#create an image
imar = np.array([[[1.,0.],[0.,0.]],
                 [[0.,1.],[0.,1.]],
                 [[0.,0.],[1.,1.]]]).transpose()
plt.imsave('pic.jpg', imar)

# read the image
im = plt.imread('pic.jpg')
# show the image
plt.imshow(im)
plt.show()

#save the image array to binary file
np.save('mypic', im)
# load the image from binary file
new_im= np.load('mypic.npy')
# show the loaded image
plt.imshow(new_im)
plt.show()
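
Since the stated goal is to read from disk rather than hold everything in RAM, one option worth considering is memory-mapping the saved .npy file. The following is a minimal sketch, assuming a small synthetic array and a temporary directory (both hypothetical stand-ins for the real image and path); np.load with mmap_mode="r" returns an array-like object whose data is read from disk only when it is sliced.

```python
import os
import tempfile

import numpy as np

# Hypothetical stand-in for the image array
im = np.arange(24, dtype=np.uint8).reshape(2, 4, 3)

# Save to a temporary location (assumed path, for illustration only)
path = os.path.join(tempfile.mkdtemp(), "mypic.npy")
np.save(path, im)

# Memory-map the file instead of loading it fully into RAM
lazy_im = np.load(path, mmap_mode="r")

# Only the requested slice is copied into memory here
top_row = np.array(lazy_im[0])
print(lazy_im.shape, top_row.shape)
```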

In response to the comments below, which take the question in a somewhat different direction: you can of course store the path/name of the image in the dataframe instead.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

#create an image
imar = np.array([[[1.,0.],[0.,0.]],
                 [[0.,1.],[0.,1.]],
                 [[0.,0.],[1.,1.]]]).transpose()
plt.imsave('pic.jpg', imar)

#create dataframe

df = pd.DataFrame([[0,""]], columns=["Feature1","Feature2"])

# read the image
im = plt.imread('pic.jpg')

plt.imshow(im)
plt.show()

#save the image array to binary file
np.save('mypic.npy', im)
# store name of image in dataframe
df.iloc[0,1] = 'mypic.npy'
#save dataframe
df.to_csv("mydf.csv")
del df

#read dataframe from csv
df = pd.read_csv("mydf.csv")
# load the image from binary file, given the path from the Dataframe
new_im= np.load(df["Feature2"][0])
# show the loaded image
plt.imshow(new_im)
plt.show()
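
With several images, the same idea extends to reading the CSV in pieces so that neither the whole dataframe nor all images need to be in memory at once. This is only a sketch under assumptions: the four synthetic arrays, the temporary directory, and the chunk size of 2 are made up for illustration; the column names follow the example above.

```python
import os
import tempfile

import numpy as np
import pandas as pd

workdir = tempfile.mkdtemp()  # assumed scratch location

# Save a few hypothetical images and record their paths
rows = []
for i in range(4):
    npy_path = os.path.join(workdir, f"pic_{i}.npy")
    np.save(npy_path, np.full((2, 2, 3), i, dtype=np.uint8))
    rows.append({"Feature1": i, "Feature2": npy_path})

csv_path = os.path.join(workdir, "mydf.csv")
pd.DataFrame(rows).to_csv(csv_path, index=False)

# Read the CSV two rows at a time and load each image on demand
loaded_shapes = []
for chunk in pd.read_csv(csv_path, chunksize=2):
    for npy_path in chunk["Feature2"]:
        image = np.load(npy_path)
        loaded_shapes.append(image.shape)

print(loaded_shapes)
```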

Last, you may go along the initially planned way of storing the actual image in a dataframe cell, but instead of writing to csv, you may pickle the dataframe, such that it can be read out just as if it had never been saved before.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import pickle

#create an image
imar = np.array([[[1.,0.],[0.,0.]],
                 [[0.,1.],[0.,1.]],
                 [[0.,0.],[1.,1.]]]).transpose()
plt.imsave('pic.jpg', imar)

#create dataframe

df = pd.DataFrame([[0,""]], columns=["Feature1","Feature2"])

# read the image
im = plt.imread('pic.jpg')

plt.imshow(im)
plt.show()

# store the image itself in dataframe
df.iloc[0,1] = [im]
#save dataframe (open() replaces the Python 2-only file() builtin)
with open("mydf.pickle", "wb") as f:
    pickle.dump(df, f)
del df

#read dataframe from pickle
with open("mydf.pickle", "rb") as f:
    df = pickle.load(f)

# show the loaded image from dataframe cell
plt.imshow(df["Feature2"][0][0])
plt.show()
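
pandas also offers its own pickle helpers, which may be a little more convenient than calling the pickle module directly. The sketch below assumes a small synthetic array and a temporary file path, and it builds the dataframe from a dict so that the array sits directly in the cell (a slight variation on the code above, not the answer's exact method).

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Hypothetical stand-in for the image array
im = np.arange(12, dtype=np.uint8).reshape(2, 2, 3)

# One row whose second column holds the array itself
df = pd.DataFrame({"Feature1": [0], "Feature2": [im]})

# Assumed temporary path, for illustration only
pickle_path = os.path.join(tempfile.mkdtemp(), "mydf.pickle")
df.to_pickle(pickle_path)

# Read it back; the cell is still a numpy array
restored = pd.read_pickle(pickle_path)
restored_im = restored["Feature2"][0]
print(type(restored_im), restored_im.shape)
```

As with any pickle file, this should only be used with files from a trusted source, because unpickling can execute arbitrary code.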

Answered by Harvey

How to display images in pandas dataframe

If you have a Pandas column that contains URLs or local paths, you can generate an image column that displays a thumbnail or any other image size.

1. In case you have the image URLs in a list.

You will first need to download the images based on their URLs. adImageList contains the list of image URLs which you want to add to pandas as a column.

import os
import urllib.request

dir_base = os.getcwd() # Get your current directory
for i, URL in enumerate(adImageList):
    image_name = '0{}_image.jpg'.format(i+1) # This will give for example 01_image.jpg
    local_path = os.path.join(dir_base, image_name)
    urllib.request.urlretrieve(URL, local_path) # download the image to the local path
    # adding that locally fetched image path to pandas column
    # (assumes df has one row per URL and a default 0..n-1 index)
    df.loc[i, 'local_image_path'] = local_path

2. In case you have the image URLs in a separate column of the Pandas dataframe. First create a function for getting the local path of a single image:

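import os
import urllib.parse
import urllib.request

def get_image_local(URL):
    # Assumption: the last segment of the URL path is a usable, unique file name
    image_name = os.path.basename(urllib.parse.urlparse(URL).path)
    local_path_image = os.path.join(dir_base, image_name)
    urllib.request.urlretrieve(URL, local_path_image)
    return local_path_image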

Then use a lambda expression to map that to a new column imageLocal:

df['imageLocal'] = df.URL.map(lambda f: get_image_local(f)) 

df['imageLocal'] should look something like this:

0 C:\Users\username\Documents\Base_folder_image.jpg         
1 C:\Users\username\Documents\Base_folder_image.jpg                          
2 C:\Users\username\Documents\Base_folder_image.jpg

The next three PIL functions can simply be copied and pasted:

import glob
import random
import base64
import pandas as pd

from PIL import Image
from io import BytesIO
from IPython.display import HTML
import io

pd.set_option('display.max_colwidth', None)  # None means no truncation (-1 is deprecated in newer pandas)


def get_thumbnail(path):
    path = "\\\\?\\" + path # The \\?\ prefix is used to prevent problems with long Windows paths
    i = Image.open(path)    
    return i

def image_base64(im):
    if isinstance(im, str):
        im = get_thumbnail(im)
    with BytesIO() as buffer:
        im.save(buffer, 'jpeg')
        return base64.b64encode(buffer.getvalue()).decode()

def image_formatter(im):
    return f'<img src="data:image/jpeg;base64,{image_base64(im)}">'
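
Note that get_thumbnail as shown above opens the image but does not resize it. If smaller previews are wanted, a resizing step could look like the sketch below. It assumes a maximum size of 150x150 pixels and uses the hypothetical helper names make_thumbnail and thumbnail_to_img_tag, neither of which appears in the answer; Image.thumbnail is Pillow's in-place resize that preserves the aspect ratio.

```python
import base64
from io import BytesIO

from PIL import Image


def make_thumbnail(image, max_size=(150, 150)):
    """Return a resized copy that fits within max_size, keeping the aspect ratio."""
    thumb = image.copy()       # thumbnail() works in place, so resize a copy
    thumb.thumbnail(max_size)
    return thumb


def thumbnail_to_img_tag(image):
    """Encode the thumbnail as a base64 JPEG inside an HTML img tag."""
    with BytesIO() as buffer:
        make_thumbnail(image).convert("RGB").save(buffer, "jpeg")
        encoded = base64.b64encode(buffer.getvalue()).decode()
    return f'<img src="data:image/jpeg;base64,{encoded}">'


# Hypothetical sample image standing in for a file opened with Image.open
sample = Image.new("RGB", (300, 400), color=(200, 30, 30))
thumb = make_thumbnail(sample)
tag = thumbnail_to_img_tag(sample)
print(sample.size, thumb.size)
```

Resizing a copy keeps the original PIL object in the dataframe untouched, so the full-size image remains available.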

We can pass our local image path to get_thumbnail(path) with the following:

df['imagePILL'] = df.imageLocal.map(lambda f: get_thumbnail(f))

And df['imagePILL'] should look like this:

0    <PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=300x400 at 0x265BA323240>
1    <PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=200x150 at 0x265BA3231D0>
2    <PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=300x400 at 0x265BA3238D0>

You can re-sort the dataframe's columns to put your new column in the desired position:

df= df.reindex(sorted(df.columns), axis=1)

And now, if you want to view the pandas dataframe with the images rendered, just pass the image_formatter function to df.to_html and wrap the result in the HTML function from IPython.display:

HTML(df.to_html(formatters={'imagePILL': image_formatter}, escape=False))

You can use any other way of showing HTML; the important thing is to get the PIL object inside the pandas dataframe.
