Python Pandas: read_html
Notice: this page reproduces a popular StackOverFlow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow
Original question: http://stackoverflow.com/questions/34555135/
Pandas: read_html
Asked by user4943236
I'm trying to extract the US states from a wiki URL, and I'm using Python Pandas to do it.
import pandas as pd
import html5lib
f_states = pd.read_html('https://simple.wikipedia.org/wiki/List_of_U.S._states')
However, the above code gives me an error:
ImportError                               Traceback (most recent call last)
in ()
      1 import pandas as pd
----> 2 f_states = pd.read_html('https://simple.wikipedia.org/wiki/List_of_U.S._states')

        if flavor in ('bs4', 'html5lib'):
    662     if not _HAS_HTML5LIB:
--> 663         raise ImportError("html5lib not found, please install it")
    664     if not _HAS_BS4:
    665         raise ImportError("BeautifulSoup4 (bs4) not found, please install it")

ImportError: html5lib not found, please install it
I installed html5lib and beautifulsoup4 as well, but it still does not work. Can someone please help?
Accepted answer by Tim Seed
Running Python 3.4 on a Mac, in a new pyvenv:
pip install pandas
pip install lxml
pip install html5lib
pip install BeautifulSoup4
Then run your example and it should work:
import pandas as pd
import html5lib
f_states= pd.read_html('https://simple.wikipedia.org/wiki/List_of_U.S._states')
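If the error persists after installing, one possible cause (an assumption, since the question does not confirm it) is that the packages went into a different Python environment than the one running the code. A minimal sketch to check which parser modules the current interpreter can see; the helper name parser_status is hypothetical, not part of pandas:

```python
import importlib.util
import sys

# Modules that pandas.read_html can use for HTML parsing.
PARSER_MODULES = ("lxml", "bs4", "html5lib")


def parser_status(modules=PARSER_MODULES):
    """Map each module name to True if this interpreter can import it."""
    return {name: importlib.util.find_spec(name) is not None for name in modules}


if __name__ == "__main__":
    # Shows which Python is running, so it can be compared with the one pip used.
    print("Interpreter:", sys.executable)
    for name, found in parser_status().items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

If a module shows as MISSING here even though pip reported a successful install, pip most likely installed it for another interpreter.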
Answer by Tim Seed
Also consider installing the required packages with conda, which is available at https://www.continuum.io/downloads. Instead of running pip install, you would run conda install for each package.
$ conda install html5lib
Answer by Subham Kumar Singh
You need to install lxml using pip.
pip install lxml
This worked for me.
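Installing lxml helps because read_html can parse with lxml alone, whereas the 'bs4' flavor needs both beautifulsoup4 and html5lib, as the traceback in the question shows. A rough sketch of choosing the flavor explicitly; choose_flavor, read_sample_tables and the sample table are made up for illustration, and only the flavor parameter of pd.read_html comes from pandas:

```python
import importlib.util
from io import StringIO


def available(name):
    """True if the named module can be imported by this interpreter."""
    return importlib.util.find_spec(name) is not None


def choose_flavor():
    """Return a read_html flavor this interpreter can support, or None."""
    if available("lxml"):
        return "lxml"
    if available("bs4") and available("html5lib"):
        return "bs4"
    return None


# Tiny hypothetical table standing in for the Wikipedia page.
SAMPLE_HTML = """
<table>
  <tr><th>State</th><th>Abbreviation</th></tr>
  <tr><td>Alabama</td><td>AL</td></tr>
  <tr><td>Alaska</td><td>AK</td></tr>
</table>
"""


def read_sample_tables():
    """Parse SAMPLE_HTML with whichever parser is installed."""
    import pandas as pd  # imported lazily so the helpers above work without pandas

    flavor = choose_flavor()
    if flavor is None:
        raise ImportError("install lxml, or both beautifulsoup4 and html5lib")
    # read_html returns a list of DataFrames, one per table it finds.
    return pd.read_html(StringIO(SAMPLE_HTML), flavor=flavor)
```

With pandas and at least one parser installed, read_sample_tables()[0] should be a DataFrame holding the two sample rows.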
Answer by Manjeet Kumar
For that, you just need to install:
pip install pandas
pip install lxml
Then import pandas (it picks up lxml as the parser on its own, so lxml does not need to be imported) and run your program:
import pandas as pd
f_states=pd.read_html('https://simple.wikipedia.org/wiki/List_of_U.S._states')
Answer by honglin zhang
If your environment is an Anaconda Jupyter notebook, you need a different set of install commands:
conda install lxml
conda install html5lib
conda install beautifulsoup4
Then run the Python code in the Jupyter notebook:
import pandas as pd
f_states= pd.read_html('https://simple.wikipedia.org/wiki/List_of_U.S._states')
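In a notebook, the kernel may run a different Python than the shell where the packages were installed. One workaround, offered here as an assumption rather than as part of the answer above, is to install through the kernel's own interpreter. The helper names below are hypothetical:

```python
import subprocess
import sys


def pip_install_command(packages):
    """Build a pip command bound to the interpreter running this code."""
    return [sys.executable, "-m", "pip", "install", *packages]


def install_into_current_interpreter(packages):
    """Run pip for this interpreter; raises CalledProcessError on failure."""
    subprocess.check_call(pip_install_command(packages))
```

Calling install_into_current_interpreter(["lxml", "html5lib", "beautifulsoup4"]) from a notebook cell would install into the kernel's environment. Restarting the kernel afterwards is a sensible precaution before retrying pd.read_html.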