pandas: creating a pandas DataFrame from a list of image files
Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me): Stack Overflow
Original URL: http://stackoverflow.com/questions/38351224/
Creating a pandas DataFrame from a list of image files
Asked by amitava
I am trying to create a pandas DataFrame from a list of image files (.png files).
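from scipy import misc  # note: misc.imread has been removed from newer SciPy releases
samples = []
img = misc.imread('a.png')
X = img.reshape(-1, 3)  # 2-D array with 3 values per row
samples.append(X)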
I added multiple .png files to samples like this. I am then trying to create a pandas DataFrame from it.
df = pd.DataFrame(samples)
It throws the error "ValueError: Must pass 2-d input". What is wrong here? Is it actually possible to convert a list of image files to a pandas DataFrame? I am totally new to pandas, so please do not mind if this looks silly.
For example,
X = [[1,2,3,4],[2,3,4,5]]
df = pd.DataFrame(X)
gives me a nice DataFrame of 2 samples as expected (2 rows, 4 columns), but this does not happen with image files.
Answered by Nils
You can use:
df = pd.DataFrame.from_records(samples)
Answered by Gal Dreiman
If you want to create a DataFrame from a list, the easiest way to do this is to create a pandas.Series, like the following example:
import pandas as pd
samples = ['a','b','c']
s = pd.Series(samples)
print(s)
Output:
0 a
1 b
2 c
Answered by Philippe Grassia
X = img.reshape(-1, 3)
samples.append(X)
So X is a 2D array of shape (number_of_pixels, 3), which makes samples a 3D list of shape (number_of_images, number_of_pixels, 3). The error you are getting ("ValueError: Must pass 2-d input") is therefore expected.
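The shapes described in this answer can be checked with a small sketch. It uses synthetic NumPy arrays in place of real PNG files (an assumption, as are the 4x5 image size and the name `stacked`), so it illustrates only the dimensionality problem, not image loading.

```python
import numpy as np

# Two synthetic "images" standing in for decoded PNGs
# (assumed size: 4x5 pixels, 3 channels each).
images = [np.zeros((4, 5, 3), dtype=np.uint8) for _ in range(2)]

samples = []
for img in images:
    X = img.reshape(-1, 3)   # shape (number_of_pixels, 3) = (20, 3)
    samples.append(X)

# Stacking the list shows its overall shape.
stacked = np.array(samples)
print(stacked.shape)  # (2, 20, 3): three dimensions, not two
```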
What you probably want is:
X = img.flatten()
or
X = img.reshape(-1)
Either one gives you X of shape (number_of_pixels*3,) and samples of shape (number_of_images, number_of_pixels*3).
You will probably need to take extra care to ensure that all images have the same number of pixels and channels.
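A minimal sketch of this approach, assuming synthetic arrays instead of real PNG files: the image size, the fill values, and the shape check are illustrative choices, not part of the original answer. Each image is flattened to one row, after a guard that rejects mixed image shapes.

```python
import numpy as np
import pandas as pd

# Synthetic stand-ins for decoded PNG files
# (assumed size: 4x5 pixels, 3 channels each).
images = [np.full((4, 5, 3), fill, dtype=np.uint8) for fill in (0, 128, 255)]

# Guard against mixed sizes: every row of the DataFrame needs the same length.
shapes = {img.shape for img in images}
if len(shapes) != 1:
    raise ValueError(f"images differ in shape: {shapes}")

# Flatten each image to a 1-D row of 4 * 5 * 3 = 60 values.
samples = [img.reshape(-1) for img in images]

df = pd.DataFrame(samples)
print(df.shape)  # (3, 60): one row per image, one column per pixel value
```

With real files, the list `images` would come from whichever image reader is in use; the flattening and the shape check stay the same.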
Answered by Juan Pablo Pineda
You can use reshape(-1):
x = []
# keep every second row and column, scale values to [0, 1], then flatten
x.append((img[::2,::2]/255.0).reshape(-1))
df = pd.DataFrame(x)
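To see what the slicing and scaling in this answer do to the shape, here is a small sketch on a synthetic array. The 4x6 image size, the constant value 255, and the names `small` and `row` are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for a decoded PNG
# (assumed size: 4x6 pixels, 3 channels, every value 255).
img = np.full((4, 6, 3), 255, dtype=np.uint8)

small = img[::2, ::2]               # every second row and column: shape (2, 3, 3)
row = (small / 255.0).reshape(-1)   # scale to [0, 1], then flatten to 18 values

x = [row]
df = pd.DataFrame(x)
print(df.shape)  # (1, 18)
```

The step-2 slicing halves each spatial dimension, so the row is a quarter of the full-resolution length (18 values instead of 72 here).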