Original source: http://stackoverflow.com/questions/42128760/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Converting an HTML table to a pandas DataFrame
Asked by beebeckzzz
I have been trying to import an HTML table from a website and convert it into a pandas DataFrame. This is my code:
import pandas as pd
table = pd.read_html("http://www.sharesansar.com/c/today-share-price.html")
dfs = pd.DataFrame(data = table)
print dfs
It just displays this:
0 S.No ...
But if I do:
for df in dfs:
    print df
It outputs the table.
How can I use pd.DataFrame to scrape the table?
Answered by MYGz
The HTML table at the given URL is rendered by JavaScript. pd.read_html() doesn't support JavaScript-rendered pages. You can try dryscrape, like so:
import pandas as pd
import dryscrape
s = dryscrape.Session()
s.visit("http://www.sharesansar.com/c/today-share-price.html")
df = pd.read_html(s.body())[5]
df.head()
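A note on the [5] index in the snippet above: pd.read_html() returns a list with one DataFrame per table it finds, so you select a table by position rather than wrapping the result in pd.DataFrame(). Here is a minimal sketch of that behaviour. The two-table HTML string and its values are made up for illustration (they are not taken from the site in the question), and it assumes an HTML parser that pandas can use, such as lxml, is installed.

```python
from io import StringIO

import pandas as pd

# Hypothetical two-table page, inlined so the example needs no network access.
HTML = """
<table>
  <tr><th>S.No</th><th>Symbol</th></tr>
  <tr><td>1</td><td>AAA</td></tr>
  <tr><td>2</td><td>BBB</td></tr>
</table>
<table>
  <tr><th>Name</th><th>Value</th></tr>
  <tr><td>x</td><td>10</td></tr>
</table>
"""

# read_html returns a list containing one DataFrame per <table> element.
tables = pd.read_html(StringIO(HTML))

# Each list element is already a DataFrame, so pick one by position
# instead of passing the whole list to pd.DataFrame(...).
first = tables[0]

print(type(tables).__name__, len(tables))
print(first)
```

On a real page the position of the table you want depends on the markup, so it can help to print len(tables) and look at each element before settling on an index.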
Output:
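As a separate illustration of why a JavaScript-rendered table is invisible to a plain HTML parser, here is a rough sketch using only the standard library. The markup strings, the TableCounter class and the count_tables helper are all hypothetical stand-ins, not the real page or any pandas internals. The idea it shows is that a parser which does not execute scripts only sees the markup as served, where the table does not exist yet.

```python
from html.parser import HTMLParser


class TableCounter(HTMLParser):
    """Count <table> start tags present in the markup (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.tables = 0

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables += 1


def count_tables(markup):
    """Return how many <table> elements a non-JavaScript parser can see."""
    parser = TableCounter()
    parser.feed(markup)
    parser.close()
    return parser.tables


# Hypothetical page as served: the table is only created once the script runs.
SERVED = """
<html><body>
<div id="prices"></div>
<script>
  document.getElementById("prices").innerHTML =
    "<table><tr><td>1</td></tr></table>";
</script>
</body></html>
"""

# Hypothetical DOM after a browser (or a headless session) has run the script.
RENDERED = """
<html><body>
<div id="prices"><table><tr><td>1</td></tr></table></div>
</body></html>
"""

print("served:", count_tables(SERVED))
print("rendered:", count_tables(RENDERED))
```

The text inside the script element is treated as script content, not as markup, so the served page contains no table for a parser to find. Passing the rendered body from a session that executes JavaScript, as in the answer above, is what makes the table available.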