Asynchronous HTTP calls in Python

Warning: this page is a translated copy of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me). Original question: http://stackoverflow.com/questions/4962808/

Tags: python, asynchronous, asyncore

Asked by kasceled

I need callback-style functionality in Python: I am sending a request to a web service multiple times, with a change in the parameter each time. I want these requests to happen concurrently instead of sequentially, so I want the function to be called asynchronously.

It looks like asyncore is what I might want to use, but the examples I've seen of how it works all look like overkill, so I'm wondering if there's another path I should be going down. Any suggestions on modules/process? Ideally I'd like to use these in a procedural fashion instead of creating classes but I may not be able to get around that.

Accepted answer by Keith

The Twisted framework is just the ticket for that. But if you don't want to take that on, you might also use pycurl, a wrapper for libcurl, which has its own async event loop and supports callbacks.

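For illustration, here is a minimal sketch of pycurl's multi interface driving several transfers from a single loop, with a write callback collecting each response. It assumes pycurl is installed, and the URLs are placeholders:

import pycurl
from io import BytesIO

URLS = ['http://www.example.com/', 'http://www.example.org/']

multi = pycurl.CurlMulti()
requests = []
for url in URLS:
    buf = BytesIO()
    handle = pycurl.Curl()
    handle.setopt(pycurl.URL, url)
    handle.setopt(pycurl.WRITEFUNCTION, buf.write)  # callback invoked as data arrives
    multi.add_handle(handle)
    requests.append((url, handle, buf))

# Drive pycurl's own event loop until every transfer has finished.
num_active = len(requests)
while num_active:
    # Run perform() until it no longer asks to be called again immediately.
    while True:
        ret, num_active = multi.perform()
        if ret != pycurl.E_CALL_MULTI_PERFORM:
            break
    if num_active:
        multi.select(1.0)  # wait for network activity before the next pass

for url, handle, buf in requests:
    print('%s: %d bytes' % (url, len(buf.getvalue())))
    multi.remove_handle(handle)
    handle.close()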

Answered by Corey Goldberg

Starting in Python 3.2, you can use concurrent.futures for launching parallel tasks.

Check out this ThreadPoolExecutor example:

http://docs.python.org/dev/library/concurrent.futures.html#threadpoolexecutor-example

It spawns threads to retrieve HTML and acts on responses as they are received.

import concurrent.futures
import urllib.request

URLS = ['http://www.foxnews.com/',
        'http://www.cnn.com/',
        'http://europe.wsj.com/',
        'http://www.bbc.co.uk/',
        'http://some-made-up-domain.com/']

# Retrieve a single page and report the url and contents
def load_url(url, timeout):
    with urllib.request.urlopen(url, timeout=timeout) as conn:
        return conn.read()

# We can use a with statement to ensure threads are cleaned up promptly
with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
    # Start the load operations and mark each future with its URL
    future_to_url = {executor.submit(load_url, url, 60): url for url in URLS}
    for future in concurrent.futures.as_completed(future_to_url):
        url = future_to_url[future]
        try:
            data = future.result()
        except Exception as exc:
            print('%r generated an exception: %s' % (url, exc))
        else:
            print('%r page is %d bytes' % (url, len(data)))

The above example uses threading. There is also a similar ProcessPoolExecutor that uses a pool of processes rather than threads (processes help with CPU-bound work; for I/O-bound HTTP calls like these, threads are usually sufficient):

http://docs.python.org/dev/library/concurrent.futures.html#processpoolexecutor-example

import concurrent.futures
import urllib.request

URLS = ['http://www.foxnews.com/',
        'http://www.cnn.com/',
        'http://europe.wsj.com/',
        'http://www.bbc.co.uk/',
        'http://some-made-up-domain.com/']

# Retrieve a single page and report the url and contents
def load_url(url, timeout):
    with urllib.request.urlopen(url, timeout=timeout) as conn:
        return conn.read()

# A __main__ guard is required so worker processes can import this module safely
if __name__ == '__main__':
    # We can use a with statement to ensure processes are cleaned up promptly
    with concurrent.futures.ProcessPoolExecutor(max_workers=5) as executor:
        # Start the load operations and mark each future with its URL
        future_to_url = {executor.submit(load_url, url, 60): url for url in URLS}
        for future in concurrent.futures.as_completed(future_to_url):
            url = future_to_url[future]
            try:
                data = future.result()
            except Exception as exc:
                print('%r generated an exception: %s' % (url, exc))
            else:
                print('%r page is %d bytes' % (url, len(data)))

Answered by Raj

Do you know about eventlet? It lets you write what appears to be synchronous code, but have it operate asynchronously over the network.

Here's an example of a super minimal crawler:

# Note: this answer predates Python 3; eventlet.green.urllib2 and the
# print statement below are Python 2 idioms.
import eventlet
from eventlet.green import urllib2

urls = ["http://www.google.com/intl/en_ALL/images/logo.gif",
        "https://wiki.secondlife.com/w/images/secondlife.jpg",
        "http://us.i1.yimg.com/us.yimg.com/i/ww/beta/y3.gif"]

def fetch(url):
    return urllib2.urlopen(url).read()

# Each fetch runs in its own green thread; imap yields bodies as they arrive.
pool = eventlet.GreenPool()
for body in pool.imap(fetch, urls):
    print "got body", len(body)

Answered by Venkatt Guhesan

(Although this thread is about server-side Python, and this question was asked a while back, others might stumble on it while looking for a similar answer on the client side.)

For a client-side solution, you might want to take a look at the Async.js library, especially the "Control-Flow" section.

https://github.com/caolan/async#control-flow

By combining the "Parallel" with a "Waterfall" you can achieve your desired result.

通过将“平行”与“瀑布”相结合,您可以获得您想要的结果。

Waterfall(Parallel(TaskA, TaskB, TaskC) -> PostParallelTask)

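As a rough analogue in Python (this document's language), the same parallel-then-waterfall composition might look like the sketch below; the task functions here are hypothetical placeholders, not part of any library:

import concurrent.futures

# Hypothetical parallel-stage tasks (stand-ins for TaskA, TaskB, TaskC).
def task_a(): return 'result A'
def task_b(): return 'result B'
def task_c(): return 'result C'

# Hypothetical post-parallel step, run only after all parallel tasks finish.
def post_parallel_task(results):
    print('all tasks done:', results)

with concurrent.futures.ThreadPoolExecutor() as executor:
    futures = [executor.submit(t) for t in (task_a, task_b, task_c)]
    results = [f.result() for f in futures]  # blocks until the parallel stage completes
post_parallel_task(results)  # the "waterfall" step runs last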

If you examine the example under Control-Flow - "Auto", they give you an example of the above: https://github.com/caolan/async#autotasks-callback where "write_file" depends on "get_data" and "make_folder", and "email_link" depends on "write_file".

Please note that all of this happens on the client side (unless you're using Node.js on the server side).

For server-side Python, look at PyCURL @ https://github.com/pycurl/pycurl/blob/master/examples/basicfirst.py

By combining the example linked above with pycurl, you can achieve non-blocking, multi-threaded functionality.

Hope this helps. Good luck.

Venkatt @ http://MyThinkpond.com
