Asynchronous HTTP calls in Python

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use/share it under the same license, but you must attribute it to the original authors (not me): StackOverflow
Original: http://stackoverflow.com/questions/4962808/
Asked by kasceled
I have a need for a callback kind of functionality in Python where I am sending a request to a webservice multiple times, with a change in the parameter each time. I want these requests to happen concurrently instead of sequentially, so I want the function to be called asynchronously.
It looks like asyncore is what I might want to use, but the examples I've seen of how it works all look like overkill, so I'm wondering if there's another path I should be going down. Any suggestions on modules/process? Ideally I'd like to use these in a procedural fashion instead of creating classes but I may not be able to get around that.
Accepted answer by Keith
The Twisted framework is just the ticket for that. But if you don't want to take that on, you might also use pycurl, a wrapper for libcurl, which has its own async event loop and supports callbacks.
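The answer itself includes no code, so as a rough illustration (not from the original answer), here is a minimal Twisted sketch of the callback style it describes, using the Python 2-era twisted.web.client.getPage API (since deprecated in favour of Agent); the URLs are placeholders:

# A minimal sketch, assuming Python 2 and an old Twisted release where
# twisted.web.client.getPage is still available.
from twisted.internet import reactor, defer
from twisted.web.client import getPage

urls = ["http://www.python.org/", "http://www.example.com/"]

def on_body(body, url):
    # Fires as soon as this particular response arrives
    print("%s: %d bytes" % (url, len(body)))

# getPage() returns immediately with a Deferred; all requests run
# concurrently on the reactor's event loop
deferreds = [getPage(url).addCallback(on_body, url) for url in urls]

# Stop the reactor once every request has succeeded or failed
defer.DeferredList(deferreds).addBoth(lambda result: reactor.stop())
reactor.run()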
Answered by Corey Goldberg
Starting in Python 3.2, you can use concurrent.futures for launching parallel tasks.
Check out this ThreadPoolExecutor example:
http://docs.python.org/dev/library/concurrent.futures.html#threadpoolexecutor-example
It spawns threads to retrieve HTML and acts on responses as they are received.
import concurrent.futures
import urllib.request

URLS = ['http://www.foxnews.com/',
        'http://www.cnn.com/',
        'http://europe.wsj.com/',
        'http://www.bbc.co.uk/',
        'http://some-made-up-domain.com/']

# Retrieve a single page and report the URL and contents
def load_url(url, timeout):
    with urllib.request.urlopen(url, timeout=timeout) as conn:
        return conn.read()

# We can use a with statement to ensure threads are cleaned up promptly
with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
    # Start the load operations and mark each future with its URL
    future_to_url = {executor.submit(load_url, url, 60): url for url in URLS}
    for future in concurrent.futures.as_completed(future_to_url):
        url = future_to_url[future]
        try:
            data = future.result()
        except Exception as exc:
            print('%r generated an exception: %s' % (url, exc))
        else:
            print('%r page is %d bytes' % (url, len(data)))
The above example uses threading. There is also a similar ProcessPoolExecutor that uses a pool of processes rather than threads:
http://docs.python.org/dev/library/concurrent.futures.html#processpoolexecutor-example
import concurrent.futures
import urllib.request

URLS = ['http://www.foxnews.com/',
        'http://www.cnn.com/',
        'http://europe.wsj.com/',
        'http://www.bbc.co.uk/',
        'http://some-made-up-domain.com/']

# Retrieve a single page and report the URL and contents
def load_url(url, timeout):
    with urllib.request.urlopen(url, timeout=timeout) as conn:
        return conn.read()

# ProcessPoolExecutor must be driven from the main module so that
# worker processes can import it safely
if __name__ == '__main__':
    with concurrent.futures.ProcessPoolExecutor(max_workers=5) as executor:
        # Start the load operations and mark each future with its URL
        future_to_url = {executor.submit(load_url, url, 60): url for url in URLS}
        for future in concurrent.futures.as_completed(future_to_url):
            url = future_to_url[future]
            try:
                data = future.result()
            except Exception as exc:
                print('%r generated an exception: %s' % (url, exc))
            else:
                print('%r page is %d bytes' % (url, len(data)))
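As a side note not in the original answer: when per-future error handling isn't needed, Executor.map is a more compact equivalent. A sketch reusing load_url and URLS from above:

# Compact variant using Executor.map: results come back in input order,
# and an exception from any call is re-raised when its result is reached.
with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
    for url, data in zip(URLS, executor.map(lambda u: load_url(u, 60), URLS)):
        print('%r page is %d bytes' % (url, len(data)))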
Answered by Raj
Do you know about eventlet? It lets you write what appears to be synchronous code, but have it operate asynchronously over the network.
Here's an example of a super minimal crawler:
urls = ["http://www.google.com/intl/en_ALL/images/logo.gif",
        "https://wiki.secondlife.com/w/images/secondlife.jpg",
        "http://us.i1.yimg.com/us.yimg.com/i/ww/beta/y3.gif"]

import eventlet
from eventlet.green import urllib2

def fetch(url):
    return urllib2.urlopen(url).read()

pool = eventlet.GreenPool()
for body in pool.imap(fetch, urls):
    print "got body", len(body)
Answered by Venkatt Guhesan
(Although this thread is about server-side Python, this question was asked a while back, and others might stumble on it while looking for a similar answer on the client side.)
For a client-side solution, you might want to take a look at the Async.js library, especially the "Control-Flow" section.
https://github.com/caolan/async#control-flow
By combining "Parallel" with "Waterfall", you can achieve your desired result.
WaterFall( Parallel(TaskA, TaskB, TaskC) -> PostParallelTask)
If you examine the example under Control-Flow - "Auto", they give you an example of the above: https://github.com/caolan/async#autotasks-callback where "write-file" depends on "get_data" and "make_folder", and "email_link" depends on "write-file".
Please note that all of this happens on the client side (unless you're using Node.JS on the server side).
For server-side Python, look at PyCURL @ https://github.com/pycurl/pycurl/blob/master/examples/basicfirst.py
By combining the example below with pyCurl, you can achieve non-blocking, multi-threaded functionality.
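As a rough illustration of that combination (a sketch, not from the original answer; the URLs are placeholders), pycurl's multi interface drives many transfers on a single event loop:

# A minimal sketch of pycurl's multi interface. Each handle collects its
# response body through a write callback; CurlMulti drives all transfers
# concurrently without blocking on any single one.
from io import BytesIO
import pycurl

urls = ["http://www.python.org/", "http://www.example.com/"]

multi = pycurl.CurlMulti()
handles = []
for url in urls:
    buf = BytesIO()
    c = pycurl.Curl()
    c.setopt(pycurl.URL, url)
    c.setopt(pycurl.WRITEFUNCTION, buf.write)  # callback invoked per chunk
    c.buf = buf  # stash the buffer on the handle for retrieval later
    multi.add_handle(c)
    handles.append(c)

# Pump the event loop until every transfer finishes
num_active = len(handles)
while num_active:
    ret, num_active = multi.perform()
    if ret != pycurl.E_CALL_MULTI_PERFORM:
        multi.select(1.0)  # wait for socket activity instead of spinning

for c in handles:
    print("%s: %d bytes" % (c.getinfo(pycurl.EFFECTIVE_URL), len(c.buf.getvalue())))
    multi.remove_handle(c)
    c.close()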
Hope this helps. Good luck.
Venkatt @ http://MyThinkpond.com

