How to read certain columns from Excel using Pandas - Python

Question

Asked by Ana

I am reading from an Excel sheet and I want to read certain columns: column 0 because it is the row-index, and columns 22:37. Now here is what I do:

import pandas as pd
import numpy as np
file_loc = "path.xlsx"
df = pd.read_excel(file_loc, index_col=None, na_values=['NA'], parse_cols=37)
df = pd.concat([df[df.columns[0]], df[df.columns[22:]]], axis=1)

But I hope there is a better way to do that! I know I can do it with parse_cols=[0, 22,..,37], but for large datasets this doesn't make sense.

I also did this:

s = pd.Series(0)
s[1]=22
for i in range(2,14):
    s[i]=s[i-1]+1
df = pd.read_excel(file_loc, index_col=None, na_values=['NA'], parse_cols = s)

But it reads the first 15 columns, which is the length of s.

Answer 1

Answered by Martin Vseticka

You can use column indices (letters) like this:

import pandas as pd
import numpy as np
file_loc = "path.xlsx"
df = pd.read_excel(file_loc, index_col=None, na_values=['NA'], usecols = "A,C:AA")
print(df)

Corresponding documentation:

usecols : int, str, list-like, or callable, default None
    If None, then parse all columns.
    If str, then indicates comma separated list of Excel column letters and column ranges (e.g. “A:E” or “A,C,E:F”). Ranges are inclusive of both sides.
    If list of int, then indicates list of column numbers to be parsed.
    If list of string, then indicates list of column names to be parsed. New in version 0.24.0.
    If callable, then evaluate each column name against it and parse the column if the callable returns True.
    Returns a subset of the columns according to behavior above.
    New in version 0.24.0.
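
For the columns in the question, the list-of-int form from the documentation excerpt may be the most direct option. The following is only a sketch: it assumes the asker wants 0-based positions 22 through 37 with both ends included, and the names first, last and usecols are illustrative, not part of the original post.

```python
# Build a list of 0-based column positions: column 0 plus columns 22..37.
# Assumption: both ends of 22:37 are wanted; adjust `last` if the end
# should be excluded.
first, last = 22, 37
usecols = [0] + list(range(first, last + 1))

print(usecols)
```

The resulting list could then be passed as pd.read_excel(file_loc, usecols=usecols), which avoids typing every position by hand.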

Answer 2

Answered by Leoli

parse_cols is deprecated; use usecols instead. That is:

df = pd.read_excel(file_loc, index_col=None, na_values=['NA'], usecols = "A,C:AA")

Answer 3

Answered by Uday Kiran

usecols should help. Give it the columns or column ranges as they are lettered in the Excel worksheet (A, B, etc.). Some examples follow.

Selected Columns

df = pd.read_excel(file_location, sheet_name='Sheet1', usecols="A,C,F")

Range of Columns and selected column

df = pd.read_excel(file_location, sheet_name='Sheet1', usecols="A:F,H")

Multiple Ranges

df = pd.read_excel(file_location, sheet_name='Sheet1', usecols="A:F,H,J:N")

Range of columns

df = pd.read_excel(file_location,sheet_name='Sheet1', usecols="A:N")
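
The string form needs Excel letters, while the question counts columns by 0-based position. A small helper can translate between the two. This is a sketch only: excel_col_letter and usecols_spec are hypothetical names introduced here, and the range 22 to 37 is assumed to include both ends.

```python
def excel_col_letter(index: int) -> str:
    """Convert a 0-based column position to Excel letters (0 -> 'A', 26 -> 'AA')."""
    if index < 0:
        raise ValueError("column index must be non-negative")
    letters = ""
    n = index + 1  # Excel letters behave like a 1-based, base-26 numbering
    while n > 0:
        n, remainder = divmod(n - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


# Column 0 plus the range 22..37 (both ends included, by assumption).
usecols_spec = f"{excel_col_letter(0)},{excel_col_letter(22)}:{excel_col_letter(37)}"
print(usecols_spec)
```

The string can then be passed as pd.read_excel(file_location, sheet_name='Sheet1', usecols=usecols_spec), in the same way as the examples above.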
