Python: How to get rid of the BeautifulSoup user warning?
Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original source: http://stackoverflow.com/questions/33511544/
Asked by jellyfishhuang
After I installed BeautifulSoup, this warning appears whenever I run my Python script in cmd.
D:\Application\python\lib\site-packages\beautifulsoup4-4.4.1-py3.4.egg\bs4\__init__.py:166:
UserWarning: No parser was explicitly specified, so I'm using the best
available HTML parser for this system ("html.parser"). This usually isn't a
problem, but if you run this code on another system, or in a different
virtual environment, it may use a different parser and behave differently.
To get rid of this warning, change this:
BeautifulSoup([your markup])
to this:
BeautifulSoup([your markup], "html.parser")
I have no idea why it appears or how to fix it.
Accepted answer by Ethan Bierlein
The solution to your problem is stated in the warning message itself. Code like the following does not specify which parser (HTML, XML, etc.) to use:
BeautifulSoup( ... )
To get rid of the warning, specify which parser you'd like to use, like so:
BeautifulSoup( ..., "html.parser" )
You can also install a third-party parser, such as lxml or html5lib, and pass its name instead.
Answer by Gayan Weerakutti
The documentation recommends that you install and use lxml for speed.
BeautifulSoup(html, "lxml")
If you're using a version of Python 2 earlier than 2.7.3, or a version of Python 3 earlier than 3.2.2, it's essential that you install lxml or html5lib, because Python's built-in HTML parser is just not very good in those older versions.
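As a rough sketch of the advice above (not part of the original answer): one way to prefer lxml when it is installed and fall back otherwise is to check which parser libraries can be imported. The function name pick_parser and its preferred parameter are made up for illustration. Keep in mind that letting the parser vary between machines is exactly what the warning cautions about, so this only makes sense if you accept that different parsers may build slightly different trees from invalid markup.

```python
from importlib.util import find_spec


def pick_parser(preferred=("lxml", "html5lib")):
    """Return the name of the first installed third-party parser.

    Hypothetical helper: falls back to "html.parser", which ships
    with Python's standard library and is therefore always available.
    """
    for name in preferred:
        # find_spec returns None when the top-level module is not installed
        if find_spec(name) is not None:
            return name
    return "html.parser"


# Show which parser name would be used on this machine
print(pick_parser())
```

The returned name can then be passed as the second argument, for example BeautifulSoup(html, pick_parser()).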
Installing the lxml parser
On Ubuntu (Debian-based):
apt-get install python-lxml
On Fedora (RHEL family):
dnf install python-lxml
Using pip:
pip install lxml
For Python 3, the distribution packages are named python3-lxml.
Answer by Wilson Wu
To use the html5lib parser, install it first:
pip install html5lib
Then pass 'html5lib' as the second argument to the BeautifulSoup constructor:
htmlDoc = bs4.BeautifulSoup(req1.text, 'html5lib')
print(htmlDoc)