Notice: this page reproduces a popular Stack Overflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original Stack Overflow URL: http://stackoverflow.com/questions/49801060/

Date: 2020-09-14 05:27:37  Source: igfitidea

Skipping range of rows after header through pandas.read_excel

Tags: python, excel, pandas, dataframe

Asked by florence-y

I know the argument usecols in pandas.read_excel() allows you to select specific columns.

Say I read an Excel file in with pandas.read_excel(). My Excel spreadsheet has 1161 rows. I want to keep the 1st row (with index 0) and skip rows 2:337. It seems the argument skiprows works only when 0-indexing is involved. I could be wrong, but several runs of my code always produce output that reads all 1161 rows rather than only the rows after the 337th. For example:

documentationscore_dataframe = pd.read_excel("Documentation Score Card_17DEC2015 Rev 2 17JAN2017.xlsx",
                                        sheet_name = "Sheet1",
                                        skiprows = "336",
                                        usecols = "H:BD")

Here is another attempt at what I have set up:

documentationscore_dataframe = pd.read_excel("Documentation Score Card_17DEC2015 Rev 2 17JAN2017.xlsx",
                                        sheet_name = "Sheet1",
                                        skiprows = "1:336",
                                        usecols = "H:BD")

I would like the dataframe to exclude rows 2 through 337 in the original Excel import.

Answered by jpp

As per the documentation for pandas.read_excel, skiprows must be list-like.

Try this instead to exclude rows 1 to 336 inclusive (0-indexed, i.e. spreadsheet rows 2 through 337):

import pandas as pd

# Skip 0-indexed rows 1..336; row 0 (the header) is kept.
df = pd.read_excel("file.xlsx",
                   sheet_name = "Sheet1",
                   skiprows = range(1, 337),
                   usecols = "H:BD")

Note: a range object is considered list-like for this purpose, so no explicit list conversion is necessary.
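
The gap between 1-based spreadsheet row numbers and the 0-based indices that skiprows expects is easy to get wrong by one. A minimal sketch of the conversion follows; rows_to_skip is a hypothetical helper written for illustration and is not part of pandas.

```python
def rows_to_skip(first_row, last_row):
    """Convert an inclusive, 1-based spreadsheet row span to 0-based indices.

    Hypothetical helper for illustration; not part of pandas.
    Spreadsheet row 1 (the header) has index 0, so row n has index n - 1.
    """
    if first_row < 1 or last_row < first_row:
        raise ValueError("expected 1 <= first_row <= last_row")
    # range() excludes its stop value, so the stop is (last_row - 1) + 1.
    return range(first_row - 1, last_row)


# Spreadsheet rows 2 through 337 map to indices 1 through 336.
skip = rows_to_skip(2, 337)
```

The resulting range can be passed directly as skiprows = skip in the pd.read_excel call above.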

Answered by Abdul-Razak Adam

Try this out:

rows_to_skip = list(range(1, 337)) #list of rows you want to skip
documentationscore_dataframe = pd.read_excel("Documentation Score Card_17DEC2015 Rev 2 17JAN2017.xlsx",
                                    sheet_name = "Sheet1",
                                    skiprows = rows_to_skip,
                                    usecols = "H:BD")
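
As a variation on both answers, recent pandas versions document that skiprows also accepts a callable, which is evaluated against each 0-based row index and returns True for rows to skip. The sketch below assumes such a version is installed; the names in_skipped_block and load_scores are hypothetical.

```python
def in_skipped_block(row_index):
    """Return True for 0-based row indices 1..336 (spreadsheet rows 2..337)."""
    return 1 <= row_index <= 336


def load_scores(path):
    """Read the sheet, keeping the header row and everything after row 337.

    Hypothetical wrapper. pandas is imported inside the function so the
    predicate above can be checked without pandas installed.
    """
    import pandas as pd

    return pd.read_excel(path,
                         sheet_name="Sheet1",
                         skiprows=in_skipped_block,
                         usecols="H:BD")
```

A predicate like this is mainly useful when the rows to skip follow a rule rather than one contiguous block; for a single block, range(1, 337) is simpler.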