pandas: How to convert Excel file data to a NumPy array using pandas?

Question

Asked by Rian Zaman

I am really new to the keras library and to Python. I am trying to import an Excel file using pandas and convert it to a numpy.ndarray using the as_matrix() function of pandas. But it seems to read my file wrong. I have a 90x1049 data set in the Excel file, but when I try to convert it into a numpy array it reads my data as 89x1049. I am using the following code, which is not working:

training_data_x = pd.read_excel("/home/workstation/ANN/new_input.xlsx")
X_train = training_data_x.as_matrix()

Answer 1

Accepted answer by Ilja Everilä

Probably what happens is that your Excel file has no header row, so pandas.read_excel consumes your first data row as the header.

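I tried creating an xlsx containing ten rows of three integers and no header row (1 2 3 in the first row through 10 11 12 in the last).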

Reading that resulted in

In [3]: df = pandas.read_excel('test.xlsx')

In [4]: df
Out[4]: 
    1   2   3
0   2   3   4
1   3   4   5
2   4   5   6
3   5   6   7
4   6   7   8
5   7   8   9
6   8   9  10
7   9  10  11
8  10  11  12

As can be seen, the first data row has been used as labels for columns.

To avoid consuming the first data row as headers, pass header=None to read_excel. Interestingly, the documentation did not mention this usage before, but it has been fixed since:

header: int, list of ints, default 0
Row (0-indexed) to use for the column labels of the parsed DataFrame. If a list of integers is passed those row positions will be combined into a MultiIndex. Use None if there is no header.
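
The effect of the header parameter can be reproduced without a file on disk. The following is only a sketch: the in-memory buffer and the ten-row sample data are stand-ins chosen for illustration, and it assumes the openpyxl engine is installed.

```python
import io

import pandas as pd

# Hypothetical stand-in for the test file: ten data rows, no header row.
rows = [[i, i + 1, i + 2] for i in range(1, 11)]
buffer = io.BytesIO()
pd.DataFrame(rows).to_excel(buffer, header=False, index=False, engine="openpyxl")

# Default header=0: the first data row is consumed as the column labels.
buffer.seek(0)
with_default = pd.read_excel(buffer, engine="openpyxl")

# header=None: every row is kept as data and columns are labeled 0, 1, 2.
buffer.seek(0)
without_header = pd.read_excel(buffer, header=None, engine="openpyxl")

default_shape = with_default.shape
fixed_shape = without_header.shape
print(default_shape, fixed_shape)
```

With the default, one row goes missing from the data, which matches the 89-versus-90 discrepancy described in the question.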

Answer 2

Answered by pylang

If you have no header, try the following:

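import pandas as pd
# header=None keeps the first row as data rather than column labels
training_data_x = pd.read_excel("/home/workstation/ANN/new_input.xlsx", header=None)
# as_matrix() was removed in pandas 1.0; to_numpy() replaces it
X_train = training_data_x.to_numpy()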
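
Before feeding the array to a model, it may help to check the conversion result. This is a minimal sketch on a small made-up frame, not the asker's file; it assumes pandas 0.24 or later, where DataFrame.to_numpy() is available, and the float32 dtype is only an illustrative choice.

```python
import numpy as np
import pandas as pd

# Hypothetical 4x3 frame standing in for the 90x1049 training data.
training_data_x = pd.DataFrame(np.arange(12).reshape(4, 3))

# to_numpy() returns the values as an ndarray; the dtype argument is optional.
X_train = training_data_x.to_numpy(dtype="float32")

# The shape should equal the number of data rows and columns in the sheet.
print(type(X_train).__name__, X_train.shape, X_train.dtype)
```

If the first dimension is one less than the row count in Excel, the header row setting is the first thing to check.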

See also answers from a previous question.
