Python: What should I use to open a URL instead of urlopen in urllib3?

Question

Asked by niloofar

I wanted to write a piece of code like the following:

from bs4 import BeautifulSoup
import urllib2

url = 'http://www.thefamouspeople.com/singers.php'
html = urllib2.urlopen(url)
soup = BeautifulSoup(html)

But I found that I now have to install the urllib3 package.

Moreover, I couldn't find any tutorial or example that shows how to rewrite the above code. For example, urllib3 does not have urlopen.

Any explanation or example, please?!

P.S. I'm using Python 3.4.

Answer 1

Answered by shazow

urllib3 is a different library from urllib and urllib2. It offers many features beyond the urllib modules in the standard library, such as connection reuse, if you need them. The documentation is here: https://urllib3.readthedocs.org/

If you'd like to use urllib3, you'll need to run pip install urllib3. A basic example looks like this:

from bs4 import BeautifulSoup
import urllib3

http = urllib3.PoolManager()

url = 'http://www.thefamouspeople.com/singers.php'
response = http.request('GET', url)
soup = BeautifulSoup(response.data)
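
If the extra features of urllib3 are not needed, the original snippet can probably be ported with the standard library alone: in Python 3, urllib2.urlopen lives in urllib.request. The following is a minimal sketch, not part of the original answer; the helper name fetch_html and the 10-second timeout are assumptions chosen for illustration.

```python
from urllib.request import urlopen


def fetch_html(url, timeout=10):
    """Return the raw response body for `url` as bytes.

    `fetch_html` and the default timeout are illustrative choices,
    not something the original answer prescribes.
    """
    # urlopen returns a response object that works as a context manager,
    # so the connection is closed once the body has been read.
    with urlopen(url, timeout=timeout) as response:
        return response.read()


# Usage with BeautifulSoup (requires the bs4 package and network access):
# from bs4 import BeautifulSoup
# html = fetch_html('http://www.thefamouspeople.com/singers.php')
# soup = BeautifulSoup(html, 'html.parser')
```

The returned bytes can be handed to BeautifulSoup directly, in the same way as response.data in the urllib3 example.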

Answer 2

Answered by alecxe

You do not have to install urllib3. You can choose any HTTP library that fits your needs and feed the response to BeautifulSoup. The usual choice, though, is requests, because of its rich feature set and convenient API. You can install requests by entering pip install requests on the command line. Here is a basic example:

from bs4 import BeautifulSoup
import requests

url = "url"
response = requests.get(url)

soup = BeautifulSoup(response.content, "html.parser")

Answer 3

Answered by Lan Vuku?i?

The new urllib3 library has nice documentation here.
To get your desired result, you should do the following:

import urllib3
from bs4 import BeautifulSoup

url = 'http://www.thefamouspeople.com/singers.php'

http = urllib3.PoolManager()
response = http.request('GET', url)
soup = BeautifulSoup(response.data.decode('utf-8'))

The decode('utf-8') part is optional. It worked without it when I tried, but I included the option anyway.
Source: User Guide
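
The decoding step is optional because BeautifulSoup accepts raw bytes and tries to detect the encoding itself. If you do decode by hand, hard-coding UTF-8 can garble pages served in another charset. The sketch below shows one way to pick the charset from a Content-Type header value and fall back to UTF-8; the helper name decode_body and the fallback behaviour are assumptions for illustration, not part of the original answer.

```python
from email.message import Message


def decode_body(body, content_type=None, default="utf-8"):
    """Decode response bytes using the charset named in a Content-Type value.

    `decode_body` is a hypothetical helper; `content_type` is expected to be
    a header value such as 'text/html; charset=ISO-8859-1'.
    """
    charset = None
    if content_type:
        # Reuse the standard library's header parser to extract the charset.
        msg = Message()
        msg["Content-Type"] = content_type
        charset = msg.get_content_charset()
    # Fall back to the default when no charset is declared; undecodable
    # bytes are replaced instead of raising an exception.
    return body.decode(charset or default, errors="replace")
```

With urllib3, the header value would typically come from response.headers.get('Content-Type'), and the resulting string can be passed to BeautifulSoup in place of response.data.decode('utf-8').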
