Python httplib.InvalidURL: nonnumeric port
Original source: http://stackoverflow.com/questions/14491814/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
httplib.InvalidURL: nonnumeric port:
Asked by user1985563
I'm trying to write a script that checks whether each URL in a list exists:
import httplib

with open('urls.txt') as urls:
    for url in urls:
        connection = httplib.HTTPConnection(url)
        connection.request("GET")
        response = connection.getresponse()
        if response.status == 200:
            print '[{}]: '.format(url), "Up!"
But I got this error:
Traceback (most recent call last):
  File "test.py", line 5, in <module>
    connection = httplib.HTTPConnection(url)
  File "/usr/lib/python2.7/httplib.py", line 693, in __init__
    self._set_hostport(host, port)
  File "/usr/lib/python2.7/httplib.py", line 721, in _set_hostport
    raise InvalidURL("nonnumeric port: '%s'" % host[i+1:])
httplib.InvalidURL: nonnumeric port: '//globo.com/galeria/amazonas/a.html
What's wrong?
Accepted answer by tom
httplib.HTTPConnection takes the host and port of the remote URL in its constructor, not the whole URL.
For your use case, it's easier to use urllib2.urlopen.
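import urllib2

with open('urls.txt') as urls:
    for line in urls:
        url = line.strip()  # drop the trailing newline read from the file
        if not url:
            continue
        try:
            r = urllib2.urlopen(url)
        except urllib2.HTTPError as e:
            # HTTPError doubles as a response object and carries a status code
            r = e
        except urllib2.URLError as e:
            # No HTTP response at all (DNS failure, refused connection, ...)
            print '[{}]: '.format(url), "Unreachable:", e.reason
            continue
        if r.code in (200, 401):
            print '[{}]: '.format(url), "Up!"
        elif r.code == 404:
            print '[{}]: '.format(url), "Not Found!"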
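If you would rather stay with httplib, one possible approach is to split each URL into its parts first and hand only the host and port to the connection. The sketch below is not part of the original answer: `split_url` and `status_of` are hypothetical helper names, the 10-second timeout and the use of a HEAD request are assumptions, and the imports are written to work on both Python 2 and Python 3.

```python
try:
    import httplib                      # Python 2
    from urlparse import urlsplit
except ImportError:
    import http.client as httplib       # Python 3
    from urllib.parse import urlsplit


def split_url(url):
    """Return (scheme, host, port, path) for a URL line read from a file."""
    parts = urlsplit(url.strip())       # strip() removes the trailing newline
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    return parts.scheme, parts.hostname, parts.port, path


def status_of(url):
    """Send a HEAD request for the URL and return the HTTP status code."""
    scheme, host, port, path = split_url(url)
    if scheme == "https":
        conn_class = httplib.HTTPSConnection
    else:
        conn_class = httplib.HTTPConnection
    # Only the host and port go to the constructor; port=None means the default.
    conn = conn_class(host, port, timeout=10)
    try:
        conn.request("HEAD", path)      # the path belongs in the request
        return conn.getresponse().status
    finally:
        conn.close()
```

Calling `status_of(line)` for each line of urls.txt would then give the status code to compare against 200 or 404. Unlike urlopen, this does not follow redirects, so a 301 or 302 comes back as is.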
Answer by Atul Arvind
There may be a simpler fix. In this line:
connection = httplib.HTTPConnection(url)
you are using HTTPConnection, so you should not pass a full URL such as http://OSMQuote.com. Pass only the host name instead, for example OSMQuote.com.
In short, remove the http:// or https:// prefix from your URL: httplib treats whatever follows the last : as a port number, and a port number must be numeric. Any path after the host has to go as well; it belongs in the request() call.
Hope this helps!
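The port parsing described in this answer can be observed without any network access, because constructing an HTTPConnection only parses its argument and does not open a socket. The following is a small sketch for illustration; `parse_host` is a hypothetical helper name, and the import fallback is there so the snippet also runs on Python 3, where the module is called http.client.

```python
try:
    import httplib                 # Python 2
except ImportError:
    import http.client as httplib  # Python 3


def parse_host(value):
    """Return (host, port) as HTTPConnection parsed them, or the error text."""
    try:
        conn = httplib.HTTPConnection(value)  # no socket is opened here
    except httplib.InvalidURL as exc:
        return str(exc)
    return conn.host, conn.port


# A bare host name parses cleanly and gets the default HTTP port.
print(parse_host("globo.com"))
# An explicit numeric port after the colon is accepted.
print(parse_host("globo.com:8080"))
# A full URL fails: the text after the last ":" is not a number.
print(parse_host("http://globo.com/galeria/amazonas/a.html"))
```

The last call reports the same "nonnumeric port" error as the traceback in the question, which is why the scheme has to be removed before the string reaches the constructor.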

