pandas: creating a pandas dataframe from a list of image files

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/38351224/

Time: 2020-09-14 01:34:20  Source: igfitidea

creating a pandas dataframe from a list of image files

python pandas

Asked by amitava

I am trying to create a pandas dataframe from a list of image files (.png files)

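import pandas as pd
from scipy import misc  # misc.imread was deprecated in SciPy 1.0 and later removed; imageio.imread is the suggested replacement

samples = []
img = misc.imread('a.png')
X = img.reshape(-1, 3)  # shape: (number_of_pixels, 3)
samples.append(X)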

I added multiple .png files to samples this way, and then tried to create a pandas dataframe from the list.

df = pd.DataFrame(samples)

It throws the error "ValueError: Must pass 2-d input". What is wrong here? Is it actually possible to convert a list of image files to a pandas dataframe? I am totally new to pandas, so please do not mind if this looks silly.
For example,
X = [[1,2,3,4],[2,3,4,5]]
df = pd.DataFrame(X)
gives me a nice dataframe of 2 samples as expected (2 rows, 4 columns), but this does not happen with image files.

Answered by Nils

You can use:

df = pd.DataFrame.from_records(samples)

Answered by Gal Dreiman

If you want to create a DataFrame from a list, the easiest way to do this is to create a pandas.Series, like the following example:

import pandas as pd

samples = ['a','b','c']
s = pd.Series(samples)
print(s)

output:

0    a
1    b
2    c
dtype: object

Answered by Philippe Grassia

X = img.reshape(-1, 3)
samples.append(X)

So X is a 2D array of shape (number_of_pixels, 3), which makes samples a 3D list of shape (number_of_images, number_of_pixels, 3). The error you're getting ("ValueError: Must pass 2-d input") is therefore expected.
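
The shape mismatch can be reproduced without any image files. This is a minimal sketch that uses small synthetic arrays in place of decoded PNGs; the 2x2 RGB arrays and the `images` and `stacked` names are made up for illustration and are not part of the original question.

```python
import numpy as np
import pandas as pd

# Two fake 2x2 RGB "images" standing in for the decoded .png files.
images = [np.zeros((2, 2, 3), dtype=np.uint8) for _ in range(2)]

# Same reshape as in the question: one row per pixel, one column per channel.
samples = [img.reshape(-1, 3) for img in images]

# Stacking the list shows the three dimensions:
# (number_of_images, number_of_pixels, 3)
stacked = np.array(samples)
print(stacked.shape)

# A DataFrame is 2-D, so pandas rejects the 3-D input.
error = None
try:
    pd.DataFrame(samples)
except ValueError as exc:
    error = exc
print(error)
```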

What you probably want is:

X = img.flatten() 

or

X = img.reshape(-1)

Either will give you X of shape (number_of_pixels*3,) and samples of shape (number_of_images, number_of_pixels*3).
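
As a rough illustration of the resulting shapes, the sketch below flattens two synthetic 2x2 RGB arrays (stand-ins for real images, chosen only for this example) and builds the dataframe from them.

```python
import numpy as np
import pandas as pd

# Two fake 2x2 RGB images: one all black, one all white.
images = [np.full((2, 2, 3), fill, dtype=np.uint8) for fill in (0, 255)]

# Flatten each image to a 1-D vector of length number_of_pixels * 3.
samples = [img.reshape(-1) for img in images]

# A list of equal-length 1-D arrays becomes one row per image.
df = pd.DataFrame(samples)
print(df.shape)
```

For a contiguous array, `reshape(-1)` returns a view where possible, while `flatten()` always returns a copy, so the two differ only in memory behaviour here.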

You will probably need to take extra care to ensure that all images have the same number of pixels and channels.
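
One possible way to enforce that is to compare the shapes before flattening. The helper below is a hypothetical sketch (`images_to_frame` is not a pandas or NumPy function), and it assumes the images are already loaded as NumPy arrays.

```python
import numpy as np
import pandas as pd


def images_to_frame(images):
    """Flatten equally shaped images into one DataFrame row each.

    Hypothetical helper for illustration; not part of pandas or NumPy.
    """
    shapes = {img.shape for img in images}
    if len(shapes) != 1:
        raise ValueError(f"images have differing shapes: {sorted(shapes)}")
    return pd.DataFrame([img.reshape(-1) for img in images])


# Images with identical shapes are accepted.
same = [np.zeros((2, 2, 3)), np.ones((2, 2, 3))]
print(images_to_frame(same).shape)

# A mismatched image is reported instead of silently producing ragged rows.
mixed = [np.zeros((2, 2, 3)), np.zeros((3, 2, 3))]
try:
    images_to_frame(mixed)
except ValueError as exc:
    print(exc)
```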

Answered by Juan Pablo Pineda

You can use reshape(-1)

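x = []  # one flattened vector per image
# Keep every second pixel in each direction, scale to [0, 1], then flatten.
x.append((img[::2,::2]/255.0).reshape(-1))
df = pd.DataFrame(x)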
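
To see what each step of that one-liner does, here is a small sketch on a synthetic 4x4 RGB array (an assumption standing in for a loaded image; the intermediate names `small`, `scaled` and `row` are introduced only for this example).

```python
import numpy as np
import pandas as pd

# A fake 4x4 RGB image with every channel value set to 255.
img = np.full((4, 4, 3), 255, dtype=np.uint8)

small = img[::2, ::2]     # keep every second row and column: shape (2, 2, 3)
scaled = small / 255.0    # convert to floats in the range [0.0, 1.0]
row = scaled.reshape(-1)  # 1-D vector of length 2 * 2 * 3

x = [row]
df = pd.DataFrame(x)
print(small.shape, row.shape, df.shape)
```

Subsampling with `[::2, ::2]` quarters the number of pixels, which keeps the dataframe narrower at the cost of image detail.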