pandas pd.read_html() imports a list rather than a DataFrame
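Notice: this page is a translated copy of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. If you use it, you must also comply with CC BY-SA, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow
Original source: http://stackoverflow.com/questions/39710903/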
Asked by AlK
I used pd.read_html() to import a table from a webpage, but instead of structuring the data as a DataFrame, Python imported it as a list. How can I import the data as a DataFrame? Thank you!
The code is the following:
import pandas as pd
import html5lib
url = 'http://www.fdic.gov/bank/individual/failed/banklist.html'
dfs = pd.read_html(url)
type(dfs)
Out[1]: list
Answered by alecxe
.read_html() produces a list of DataFrames (there could be multiple tables in an HTML source); get the desired one by index. In your case, there is a single DataFrame:
dfs = pd.read_html(url)
df = dfs[0]
print(df)
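To make the list-of-tables behavior concrete without depending on a live page, here is a minimal sketch that parses a small inline HTML string. The HTML content and the variable names (`html`, `banks`, `failures`) are hypothetical and only for illustration; it assumes an HTML parser such as lxml or html5lib with BeautifulSoup is installed.

```python
import io

import pandas as pd

# Hypothetical HTML with two tables, standing in for a real page.
html = """
<table>
  <tr><th>Bank</th><th>City</th></tr>
  <tr><td>Alpha Bank</td><td>Austin</td></tr>
</table>
<table>
  <tr><th>Year</th><th>Failures</th></tr>
  <tr><td>2015</td><td>8</td></tr>
  <tr><td>2016</td><td>5</td></tr>
</table>
"""

# read_html returns one DataFrame per <table> it finds, always inside a list.
dfs = pd.read_html(io.StringIO(html))
print(type(dfs), len(dfs))

# Pick the table you want by position.
banks = dfs[0]
failures = dfs[1]
print(failures)

# The match parameter keeps only tables whose text matches the given pattern.
matched = pd.read_html(io.StringIO(html), match="Failures")
print(len(matched))
```

Wrapping the string in io.StringIO keeps the sketch compatible with newer pandas releases, which deprecate passing literal HTML directly.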
Note that if there are no table elements in the HTML source, read_html() raises an error (a ValueError) instead of returning an empty list.
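Because a page without tables triggers an exception rather than an empty list, code that scrapes unknown pages may want to guard the call. The following is a sketch under that assumption; the helper name `read_first_table` and the sample HTML are hypothetical.

```python
import io

import pandas as pd

# Hypothetical page content that contains no <table> element.
html_without_tables = "<html><body><p>No tables here.</p></body></html>"


def read_first_table(html):
    """Return the first table as a DataFrame, or None if the page has none."""
    try:
        return pd.read_html(io.StringIO(html))[0]
    except ValueError:
        # pandas raises ValueError when it finds no tables in the document.
        return None


result = read_first_table(html_without_tables)
print(result)
```

Returning None is just one design choice; re-raising with a clearer message or returning an empty DataFrame would work equally well, depending on the caller.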
Answered by Nikhil Chawla
import pandas as pd
import html5lib
url = 'http://www.fdic.gov/bank/individual/failed/banklist.html'
dfs = pd.read_html(url)
df = pd.concat(dfs)
df
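The pd.concat() approach above stacks every table found on the page into a single DataFrame, which is most useful when the tables share the same columns. A small sketch of that case, using hypothetical inline HTML and adding ignore_index=True (not in the original answer) so the combined rows get a fresh 0..n-1 index:

```python
import io

import pandas as pd

# Hypothetical HTML: two tables with identical column headers.
html = """
<table>
  <tr><th>name</th><th>value</th></tr>
  <tr><td>a</td><td>1</td></tr>
  <tr><td>b</td><td>2</td></tr>
</table>
<table>
  <tr><th>name</th><th>value</th></tr>
  <tr><td>c</td><td>3</td></tr>
</table>
"""

dfs = pd.read_html(io.StringIO(html))

# Stack the tables row-wise; ignore_index avoids repeated index labels.
combined = pd.concat(dfs, ignore_index=True)
print(combined)
```

If the tables have different columns, concat still succeeds but fills the gaps with NaN, so selecting a single table by index is usually the clearer option in that situation.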