Python: What should I use to open a URL instead of urlopen in urllib3?

Notice: This page is a Chinese-English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow original: http://stackoverflow.com/questions/36516183/

Time: 2020-08-19 17:58:22  Source: igfitidea

What should I use to open a url instead of urlopen in urllib3

python web-scraping beautifulsoup urllib3

Asked by niloofar

I wanted to write a piece of code like the following:

from bs4 import BeautifulSoup
import urllib2

url = 'http://www.thefamouspeople.com/singers.php'
html = urllib2.urlopen(url)
soup = BeautifulSoup(html)

But I found that I have to install the urllib3 package now.

Moreover, I couldn't find any tutorial or example showing how to rewrite the above code; for example, urllib3 does not have urlopen.

Any explanation or example, please?!

P.S. I'm using Python 3.4.

Answered by shazow

urllib3 is a separate library from urllib and urllib2. It offers many features beyond those of the urllib modules in the standard library, such as re-using connections, if you need them. The documentation is here: https://urllib3.readthedocs.org/

If you'd like to use urllib3, you'll need to pip install urllib3. A basic example looks like this:

from bs4 import BeautifulSoup
import urllib3

http = urllib3.PoolManager()

url = 'http://www.thefamouspeople.com/singers.php'
response = http.request('GET', url)
soup = BeautifulSoup(response.data)
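
As a rough sketch of the connection re-use mentioned above: a single PoolManager can be created once and shared by every request. The helper name fetch_html and the choice to raise on any non-200 status are assumptions made for illustration here, not something urllib3 prescribes.

```python
import urllib3

# One PoolManager can be shared by all requests; it keeps connections
# open and re-uses them for later requests to the same host.
http = urllib3.PoolManager()


def fetch_html(url, pool=http):
    """Return the response body as bytes (hypothetical helper).

    Raises RuntimeError when the server answers with a status other than 200,
    because urllib3 does not raise on 4xx/5xx responses by default.
    """
    response = pool.request('GET', url)
    if response.status != 200:
        raise RuntimeError('unexpected HTTP status %d for %s' % (response.status, url))
    return response.data
```

The returned bytes can be handed to BeautifulSoup in the same way as response.data in the example above.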

Answered by alecxe

You do not have to install urllib3. You can choose any HTTP-request-making library that fits your needs and feed the response to BeautifulSoup. The usual choice, though, is requests because of its rich feature set and convenient API. You can install requests by entering pip install requests in the command line. Here is a basic example:

from bs4 import BeautifulSoup
import requests

url = "url"
response = requests.get(url)

soup = BeautifulSoup(response.content, "html.parser")
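
Since "any HTTP-request-making library" includes the standard library, here is a minimal sketch that needs no extra install. In Python 3, urllib2.urlopen from the question lives in urllib.request. The helper name fetch_text and the UTF-8 fallback are assumptions made for illustration.

```python
from urllib.request import urlopen


def fetch_text(url, default_charset="utf-8"):
    """Fetch url with the standard library and return the body as str.

    Hypothetical helper: uses the charset declared in the Content-Type
    header when there is one, otherwise falls back to default_charset.
    """
    with urlopen(url) as response:
        charset = response.headers.get_content_charset() or default_charset
        return response.read().decode(charset, errors="replace")
```

The returned string can be passed to BeautifulSoup(html, "html.parser") just like response.content above.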

Answered by Lan Vuku?i?

The new urllib3 library has nice documentation.
To get your desired result, you can do the following:

import urllib3
from bs4 import BeautifulSoup

url = 'http://www.thefamouspeople.com/singers.php'

http = urllib3.PoolManager()
response = http.request('GET', url)
soup = BeautifulSoup(response.data.decode('utf-8'))

The decode('utf-8') part is optional. It worked without it when I tried, but I posted the option anyway.
Source: User Guide
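
A likely reason the decode step is optional is that BeautifulSoup accepts raw bytes and works out the encoding itself. If you do decode by hand, hard-coding 'utf-8' can garble pages served in another encoding. Below is a small sketch that reads the charset from a Content-Type header value instead. The helper names charset_from_content_type and decode_body are made up for this example, and the UTF-8 fallback is an assumption.

```python
from email.message import Message


def charset_from_content_type(content_type, default="utf-8"):
    """Extract the charset parameter from a Content-Type header value."""
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset returns the charset in lower case,
    # or the given default when the header declares none.
    return msg.get_content_charset(default)


def decode_body(data, content_type, default="utf-8"):
    """Decode response bytes using the charset declared by the server."""
    charset = charset_from_content_type(content_type, default)
    try:
        return data.decode(charset, errors="replace")
    except LookupError:
        # The server declared a charset Python does not know; use the default.
        return data.decode(default, errors="replace")
```

With urllib3, the header value would come from something like response.headers.get('Content-Type', ''), and the bytes from response.data as in the code above.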
