Original source: http://stackoverflow.com/questions/4412581/
Warning: these are provided under the CC BY-SA 4.0 license. You are free to use/share it, but you must attribute it to the original authors (not me): StackOverflow
Seriously simple Python HTTP proxy?
Asked by jma
I have looked everywhere and found millions of Python proxy servers, but none do precisely what I would like (I think :s)
I have had quite a bit of experience with Python generally, but I'm quite new to the world of the deep dark secrets of the HTTP protocol.
What I think might be useful would be a very simple proxy example that can be connected to and will then itself try to connect to the address passed to it.
Also, I think what has been confusing me is everything the hidden stuff is doing, e.g. if the class inherits from BaseHTTPServer.BaseHTTPRequestHandler, what precisely happens when a page is requested? In many of the examples I have found there is no reference to a path variable, then suddenly, poof! self.path is used in a function. I'm assuming it's been inherited, but how does it end up holding the path that was requested?
I'm sorry if that didn't make much sense, as my idea of my problem is probably scrambled :(
If you can think of anything which would make my question clearer, please suggest I add it. xxx
Edit:
Also, a link to an explanation of the detailed process through which the proxy handles the request, requests the page (and how to read/modify the data at this point) and passes it back to the original requester would be greatly appreciated xxxx
Accepted answer by Laurence Gonsalves
"a very simple proxy example that can be connected to and will then itself try to connect to the address passed to it." That is practically the definition of an HTTP proxy.
There's a really simple proxy example here: http://effbot.org/librarybook/simplehttpserver.htm
The core of it is just 3 lines:
class Proxy(SimpleHTTPServer.SimpleHTTPRequestHandler):
    def do_GET(self):
        self.copyfile(urllib.urlopen(self.path), self.wfile)
So it's a SimpleHTTPRequestHandler that, in response to a GET request, opens the URL in the path (a request to a proxy typically looks like "GET http://example.com/", not like "GET /index.html"). It then just copies whatever it can read from that URL to the response.
Note that this is really minimal. It doesn't deal with headers at all, I believe.
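The three lines above are Python 2 (SimpleHTTPServer and urllib.urlopen no longer exist under those names in Python 3). As a rough sketch of the same idea in Python 3, something like the following should work. It is not the effbot example: the choice to forward only the status code and Content-Type, the error handling, the `_direct` opener, the `run` helper, and port 8000 are all assumptions made for illustration.

```python
import http.server
import shutil
import urllib.error
import urllib.request

# Opener that ignores any *_proxy environment variables, so the proxy
# always fetches the target directly instead of calling another proxy.
_direct = urllib.request.build_opener(urllib.request.ProxyHandler({}))


class Proxy(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        # For a proxied request, self.path is the absolute URL,
        # e.g. "http://example.com/".
        try:
            upstream = _direct.open(self.path, timeout=10)
        except urllib.error.HTTPError as err:
            # The target answered with an error status; pass the code on.
            self.send_error(err.code)
            return
        except ValueError:
            # Not an absolute URL, so there is nothing to proxy to.
            self.send_error(400)
            return
        except OSError:
            # Could not reach the target at all.
            self.send_error(502)
            return
        with upstream:
            self.send_response(upstream.status)
            content_type = upstream.headers.get("Content-Type")
            if content_type:
                self.send_header("Content-Type", content_type)
            self.end_headers()
            # Stream the body through unchanged.
            shutil.copyfileobj(upstream, self.wfile)


def run(port=8000):
    # Hypothetical entry point: listen on localhost only.
    http.server.HTTPServer(("127.0.0.1", port), Proxy).serve_forever()
```

Calling run() and then pointing a client at it, for example with curl -x http://127.0.0.1:8000 http://example.com/, should return the page. It still only handles GET over plain HTTP and drops most headers, so it is a learning aid rather than a usable proxy.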
BTW: path is documented at http://docs.python.org/library/basehttpserver.html. It is set before your do_* method (e.g. do_GET) is called.
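To see where self.path comes from, here is a small Python 3 sketch (http.server is the Python 3 home of BaseHTTPServer). The handler never assigns self.path itself: the base class reads the request line, parses it, stores the request target in self.path, and only then dispatches to do_GET. The names ShowPath and fetch_path are made up for this illustration.

```python
import http.client
import http.server
import threading


class ShowPath(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        # Nothing in this class sets self.path; the inherited request
        # parsing already filled it in before calling do_GET.
        body = self.path.encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        # Keep the demo quiet.
        pass


def fetch_path(request_target):
    """Send 'GET <request_target>' to a throwaway local server and
    return whatever the handler saw in self.path."""
    server = http.server.HTTPServer(("127.0.0.1", 0), ShowPath)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        conn = http.client.HTTPConnection(
            "127.0.0.1", server.server_address[1], timeout=10
        )
        conn.request("GET", request_target)
        reply = conn.getresponse().read().decode("utf-8")
        conn.close()
        return reply
    finally:
        server.shutdown()
        server.server_close()
```

Calling fetch_path("/index.html") echoes back the ordinary path, while fetch_path("http://example.com/") echoes back the full URL, which is the form a client sends when it talks to a proxy. That full URL is what the proxy handler passes to urlopen.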

