Python Pandas: read_html

Notice: this page is a Chinese-English side-by-side translation of a popular StackOverFlow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow. Original URL: http://stackoverflow.com/questions/34555135/

Date: 2020-08-19 15:09:27  Source: igfitidea

python pandas

Asked by user4943236

I'm trying to extract the list of US states from a Wikipedia URL, and I'm using Python Pandas to do it.

import pandas as pd
import html5lib
f_states = pd.read_html('https://simple.wikipedia.org/wiki/List_of_U.S._states') 

However, the above code gives me an error:

ImportError                               Traceback (most recent call last)
in ()
      1 import pandas as pd
----> 2 f_states = pd.read_html('https://simple.wikipedia.org/wiki/List_of_U.S._states')

        if flavor in ('bs4', 'html5lib'):
    662     if not _HAS_HTML5LIB:
--> 663         raise ImportError("html5lib not found, please install it")
    664     if not _HAS_BS4:
    665         raise ImportError("BeautifulSoup4 (bs4) not found, please install it")

ImportError: html5lib not found, please install it

I have installed html5lib and beautifulsoup4 as well, but it still does not work. Can someone help, please?

Accepted answer by Tim Seed

I am running Python 3.4 on a Mac.

In a new pyvenv (virtual environment), install the packages:

pip install pandas
pip install lxml
pip install html5lib
pip install BeautifulSoup4

Then run your example and it should work:

import pandas as pd
import html5lib
f_states = pd.read_html('https://simple.wikipedia.org/wiki/List_of_U.S._states')
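
Before re-running the example, it may help to confirm that the parser packages are importable from the same interpreter that runs pandas. The following is a minimal sketch using only the standard library; the helper name find_missing and the PARSER_MODULES list are hypothetical and not part of pandas. Note that BeautifulSoup4 is imported under the name bs4.

```python
import importlib.util
import sys

# Import names of the parser packages named in the error message and answers.
# BeautifulSoup4 is installed as "beautifulsoup4" but imported as "bs4".
PARSER_MODULES = ["lxml", "html5lib", "bs4"]


def find_missing(modules):
    """Return the top-level module names the current interpreter cannot find."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Show which Python is running, since packages are installed per environment.
    print("Interpreter:", sys.executable)
    missing = find_missing(PARSER_MODULES)
    if missing:
        print("Not importable here:", ", ".join(missing))
    else:
        print("All parser packages are importable.")
```

If a package shows up as not importable even though it was installed, it was probably installed into a different Python environment than the one printed as the interpreter.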

Answer by Tim Seed

Also consider installing the required packages with conda, which is available at https://www.continuum.io/downloads. Instead of running pip install, you would run conda install for each package:

$ conda install html5lib 

Answer by Subham Kumar Singh

You need to install lxml using pip.

pip install lxml

This worked for me.

Answer by Manjeet Kumar

For that, you just need to install:

pip install pandas
pip install lxml

Then import pandas and run your program:

import pandas as pd
f_states = pd.read_html('https://simple.wikipedia.org/wiki/List_of_U.S._states')

Answer by honglin zhang

If your environment is an Anaconda Jupyter notebook, you need a different set of install commands:

conda install lxml
conda install html5lib
conda install BeautifulSoup4

Then run the Python code in the Jupyter notebook:

import pandas as pd
f_states = pd.read_html('https://simple.wikipedia.org/wiki/List_of_U.S._states')
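
If the ImportError persists in a notebook after installing, one possible cause (an assumption, not something confirmed in the question) is that the packages were installed into a different Python environment than the one the notebook kernel runs. The sketch below uses only the standard library and a hypothetical helper, pip_install_command, to build a pip command tied to the interpreter that is currently running. It only prints the command and does not install anything.

```python
import sys


def pip_install_command(packages):
    """Build a pip command that targets the interpreter running this code."""
    return [sys.executable, "-m", "pip", "install", *packages]


if __name__ == "__main__":
    cmd = pip_install_command(["lxml", "html5lib", "beautifulsoup4"])
    # sys.executable is the Python binary behind this session or kernel.
    print("Interpreter:", sys.executable)
    print("Command:", " ".join(cmd))
    # To actually install, the list could be passed to subprocess.check_call(cmd).
```

Running the printed command in a terminal installs the packages for that specific interpreter. Restart the kernel afterwards so the new packages are picked up.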