pandas pd.read_html() imports a list rather than a dataframe

Note: this page is a Chinese-English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/39710903/

Time: 2020-09-14 02:05:52  Source: igfitidea

python html pandas

Asked by AlK

I used pd.read_html() to import a table from a webpage, but instead of structuring the data as a dataframe, Python imported it as a list. How can I import the data as a dataframe? Thank you!

The code is the following:

import pandas as pd
import html5lib
url = 'http://www.fdic.gov/bank/individual/failed/banklist.html'
dfs = pd.read_html(url)
type(dfs)
Out[1]: list

Answered by alecxe

.read_html() returns a list of dataframes, because an HTML source can contain multiple tables. Select the one you want by index. In your case, there is a single dataframe:

dfs = pd.read_html(url)
df = dfs[0]
print(df)
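
To make the list behaviour concrete, here is a minimal sketch that parses a small inline HTML string instead of the FDIC page. The HTML content and the variable names are made up for illustration, and it assumes an HTML parser that pandas can use (lxml here) is installed. It shows selection by position and, as an alternative, the `match` argument of `read_html`, which keeps only tables whose text matches a string or regex.

```python
from io import StringIO

import pandas as pd

# Made-up page with two tables (an assumption, not the FDIC page from the question).
HTML = """
<html><body>
<table>
  <tr><th>name</th><th>city</th></tr>
  <tr><td>Example Bank</td><td>Springfield</td></tr>
</table>
<table>
  <tr><th>year</th><th>count</th></tr>
  <tr><td>2001</td><td>3</td></tr>
  <tr><td>2002</td><td>5</td></tr>
</table>
</body></html>
"""

# read_html returns one dataframe per <table> element it finds.
dfs = pd.read_html(StringIO(HTML))
print(type(dfs), len(dfs))

# Select a table by its position in the list.
first = dfs[0]
print(first)

# Or keep only the tables whose text matches a string or regex.
counts = pd.read_html(StringIO(HTML), match="count")[0]
print(counts)
```

Selecting by index is enough when the page layout is stable. `match` may be more robust if tables can be added or reordered, since it picks tables by their content rather than their position.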

Note that if the HTML source contains no tables, read_html() raises an error (typically a ValueError) rather than returning an empty list.
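
If an empty list is more convenient than an exception, the call can be wrapped. The sketch below is only one way to do it: `read_tables_or_empty` is a hypothetical helper name, the HTML strings are made up, and `flavor="lxml"` is pinned as an assumption so that the outcome does not depend on which fallback parsers are installed.

```python
from io import StringIO

import pandas as pd


def read_tables_or_empty(html):
    """Hypothetical helper: return [] instead of raising when no table is found."""
    try:
        # flavor="lxml" is an assumption; it requires lxml to be installed.
        return pd.read_html(StringIO(html), flavor="lxml")
    except ValueError:
        # read_html raises ValueError when the document contains no tables.
        return []


no_tables = read_tables_or_empty("<html><body><p>No tables here.</p></body></html>")
print(no_tables)

one_table = read_tables_or_empty(
    "<table><tr><th>a</th></tr><tr><td>1</td></tr></table>"
)
print(len(one_table))
```

Catching ValueError this broadly can also hide other parsing problems, so in real code it may be safer to inspect the error message before swallowing it.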

Answered by Nikhil Chawla

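import pandas as pd
import html5lib  # not used directly; it is one of the parsers pandas can use for read_html
url = 'http://www.fdic.gov/bank/individual/failed/banklist.html'
dfs = pd.read_html(url)  # list of dataframes, one per table found
df = pd.concat(dfs)  # stack every table into a single dataframe
df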
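
A caveat on the concat approach when a page has several tables: `pd.concat` stacks rows, keeps each table's original row labels, and fills columns that a table lacks with NaN. The sketch below uses two hand-built dataframes as stand-ins for the list returned by `read_html` (the data is made up) and passes `ignore_index=True`, which is an addition to the answer's code, to renumber the rows.

```python
import pandas as pd

# Stand-ins for the list returned by pd.read_html (made-up data).
dfs = [
    pd.DataFrame({"name": ["A", "B"], "city": ["X", "Y"]}),
    pd.DataFrame({"name": ["C"], "state": ["Z"]}),
]

# ignore_index=True renumbers the rows instead of repeating each table's own index.
df = pd.concat(dfs, ignore_index=True)
print(df)

# Columns missing from a table are filled with NaN in that table's rows.
print(df["state"].isna().sum())
```

When the page has a single table, `dfs[0]` and `pd.concat(dfs)` give the same data, so concatenation mainly pays off when the tables share the same columns.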