Skipping a range of rows after the header with pandas.read_excel
Original URL: http://stackoverflow.com/questions/49801060/
Note: this content comes from Stack Overflow and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Asked by florence-y
I know the usecols argument of pandas.read_excel() allows you to select specific columns.
Say I read an Excel file in with pandas.read_excel(). My Excel spreadsheet has 1161 rows. I want to keep the 1st row (with index 0) and skip rows 2 through 337. It seems the skiprows argument works only when 0-based indexing is involved. I could be wrong, but several runs of my code always produce output that reads all of my 1161 rows rather than only the rows after the 337th. For example:
documentationscore_dataframe = pd.read_excel("Documentation Score Card_17DEC2015 Rev 2 17JAN2017.xlsx",
                                             sheet_name = "Sheet1",
                                             skiprows = "336",
                                             usecols = "H:BD")
Here is another attempt I set up:
documentationscore_dataframe = pd.read_excel("Documentation Score Card_17DEC2015 Rev 2 17JAN2017.xlsx",
                                             sheet_name = "Sheet1",
                                             skiprows = "1:336",
                                             usecols = "H:BD")
I would like the dataframe to exclude rows 2 through 337 in the original Excel import.
Answered by jpp
As per the documentation for pandas.read_excel, skiprows must be list-like.
Try this instead to exclude rows 1 to 336 inclusive (0-indexed, which corresponds to Excel rows 2 through 337):
import pandas as pd

# Skip 0-indexed rows 1 through 336; row 0 is kept as the header.
df = pd.read_excel("file.xlsx",
                   sheet_name="Sheet1",
                   skiprows=range(1, 337),
                   usecols="H:BD")
Note: a range object is considered list-like for this purpose, so no explicit list conversion is necessary.
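The off-by-one between Excel's 1-based row numbers and the 0-based indices that skiprows expects is easy to get wrong. A minimal sketch of the conversion follows; the helper name excel_rows_to_skiprows is hypothetical and not part of pandas.

```python
def excel_rows_to_skiprows(first_row, last_row):
    """Convert an inclusive, 1-based Excel row span to 0-based skiprows indices.

    Hypothetical helper for illustration; not part of pandas.
    """
    if first_row < 1 or last_row < first_row:
        raise ValueError("expected 1 <= first_row <= last_row")
    # Excel row n is index n - 1, and range() excludes its stop value,
    # so the inclusive span first_row..last_row maps to range(first_row - 1, last_row).
    return range(first_row - 1, last_row)


# Excel rows 2 through 337 from the question
rows_to_skip = excel_rows_to_skiprows(2, 337)
print(rows_to_skip)       # range(1, 337)
print(len(rows_to_skip))  # 336
```

The result can be passed directly as skiprows, since a range is already list-like.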
Answered by Abdul-Razak Adam
Try this out
import pandas as pd

# 0-indexed rows to skip (Excel rows 2 through 337)
rows_to_skip = list(range(1, 337))
documentationscore_dataframe = pd.read_excel("Documentation Score Card_17DEC2015 Rev 2 17JAN2017.xlsx",
                                             sheet_name="Sheet1",
                                             skiprows=rows_to_skip,
                                             usecols="H:BD")
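To check the behaviour without the original workbook, the same skiprows values can be tried on a small in-memory table. This sketch assumes that pandas.read_csv treats skiprows the same way as pandas.read_excel (both document it as 0-indexed line numbers, and current pandas versions also accept a callable); the sample data is made up for illustration.

```python
import io

import pandas as pd

# Made-up data: one header line (index 0) followed by five data lines (indices 1-5).
csv_text = "id,score\n1,10\n2,20\n3,30\n4,40\n5,50\n"

# Skip lines 1-3 with a range; line 0 is kept and becomes the header.
df_range = pd.read_csv(io.StringIO(csv_text), skiprows=range(1, 4))

# Equivalent callable form: return True for every line index that should be skipped.
df_callable = pd.read_csv(io.StringIO(csv_text), skiprows=lambda i: 1 <= i <= 3)

print(df_range["id"].tolist())       # [4, 5]
print(df_range.equals(df_callable))  # True
```

The callable form may be convenient when the rows to skip follow a rule rather than a fixed span.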