Notice: This page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/17052991/

Time: 2020-08-19 00:18:01  Source: igfitidea

Load Excel file into numpy 2D array

python excel numpy

Asked by AniketD

Is there an easier way to load an Excel file directly into a NumPy array?

I have looked at the numpy.genfromtxt autoloading function in the NumPy documentation, but it doesn't load Excel files directly.

array = np.genfromtxt("Stats.xlsx")
ValueError: Some errors were detected !
Line #3 (got 2 columns instead of 1)
Line #5 (got 5 columns instead of 1)
......

Right now I am using openpyxl.reader.excel to read the Excel file and then append to NumPy 2D arrays. This seems to be inefficient. Ideally, I would like to have the Excel file loaded directly into a NumPy 2D array.

Accepted answer by Joe Kington

Honestly, if you're working with heterogeneous data (as spreadsheets are likely to contain), using a pandas.DataFrame is a better choice than using numpy directly.

While pandas is in some sense just a wrapper around numpy, it handles heterogeneous data very nicely. (As well as a ton of other things. For "spreadsheet-like" data, it's the gold standard in the Python world.)

If you decide to go that route, just use pandas.read_excel.
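
As a rough illustration of that route, here is a minimal sketch that reads one worksheet with pandas.read_excel and converts the result to a 2D array with DataFrame.to_numpy. The helper name excel_to_array and its default arguments are hypothetical and not part of the original answer. It also assumes that an Excel engine such as openpyxl is installed, since pandas needs one to read .xlsx files.

```python
import pandas as pd


def excel_to_array(path, sheet_name=0, header=None):
    """Read one worksheet and return its cells as a 2D NumPy array.

    Hypothetical helper for illustration only.
    """
    # header=None treats every row as data; pass header=0 instead
    # if the first row of the sheet holds column names.
    df = pd.read_excel(path, sheet_name=sheet_name, header=header)
    # to_numpy() gives a numeric dtype when every cell is numeric and
    # falls back to dtype=object when the sheet mixes numbers and text.
    return df.to_numpy()
```

With the file from the question, this would be called as excel_to_array("Stats.xlsx"). If the sheet really is heterogeneous, keeping the DataFrame rather than converting it is probably the more convenient choice, as the answer suggests.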
