Original question: http://stackoverflow.com/questions/2529682/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on StackOverflow (not to this page).
Setting user agent of a java URLConnection
Asked by DiglettPotato
I'm trying to parse a webpage using Java with URLConnection. I try to set up the user-agent like this:
java.net.URLConnection c = url.openConnection();
c.setRequestProperty("User-Agent", "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.4; en-US; rv:1.9.2.2) Gecko/20100316 Firefox/3.6.2");
But the resulting user agent is the one I specify, with "Java/1.5.0_19" appended to the end. Is there a way to truly set the user agent without this addition?
Accepted answer by Tom Hawtin - tackline
Off hand, setting the http.agent system property to "" might do the trick (I don't have the code in front of me).
You might get away with:
System.setProperty("http.agent", "");
but that might require a race between you and initialisation of the URL protocol handler, if it caches the value at startup (actually, I don't think it does).
The property can also be set through JNLP files (available to applets from 6u10) and on the command line:
-Dhttp.agent=
Or for wrapper commands:
-J-Dhttp.agent=
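To see what the http.agent property actually does on a given JVM, it may help to print the header a server receives. The following is a minimal sketch, not a definitive implementation: the class name HttpAgentProperty, the method defaultUserAgent, and the value "ExampleAgent/1.0" are made up for illustration, and the throwaway local server uses the JDK's bundled com.sun.net.httpserver package. Oracle's networking properties documentation describes http.agent as being checked only once at startup and as having "Java/<version>" appended to it, so the sketch sets the property before the first HTTP connection and simply prints the result rather than assuming what it will be.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.URL;
import java.util.concurrent.atomic.AtomicReference;

public class HttpAgentProperty {

    /**
     * Sets the http.agent system property, sends one GET without an explicit
     * User-Agent, and returns the User-Agent header a local server received.
     * Assumption: no HTTP connection has been opened in this JVM yet, because
     * the property is documented as being read only once.
     */
    public static String defaultUserAgent(String agentProperty) throws IOException {
        System.setProperty("http.agent", agentProperty);

        AtomicReference<String> seen = new AtomicReference<>();
        // Port 0 asks the OS for any free port.
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/", exchange -> {
            seen.set(exchange.getRequestHeaders().getFirst("User-Agent"));
            exchange.sendResponseHeaders(204, -1); // 204 No Content, no response body
            exchange.close();
        });
        server.start();
        try {
            int port = server.getAddress().getPort();
            URL url = URI.create("http://127.0.0.1:" + port + "/foobar").toURL();
            HttpURLConnection c = (HttpURLConnection) url.openConnection();
            c.getResponseCode(); // forces the request to be sent
            c.disconnect();
            return seen.get();
        } finally {
            server.stop(0);
        }
    }

    public static void main(String[] args) throws IOException {
        // "ExampleAgent/1.0" is a hypothetical value chosen for this sketch.
        System.out.println("Server received: " + defaultUserAgent("ExampleAgent/1.0"));
    }
}
```

If the printed value still ends in a Java/<version> suffix, setting the header per connection with setRequestProperty is the alternative to try.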
Answer by juwens
Just for clarification: setRequestProperty("User-Agent", "Mozilla ...") now works just fine and doesn't append java/xx at the end! At least with Java 1.6.0_30 and newer.
I listened on my machine with netcat (a port listener):
$ nc -l -p 8080
It simply listens on the port, so you see anything which gets requested, like raw http-headers.
And got the following http-headers without setRequestProperty:
GET /foobar HTTP/1.1
User-Agent: Java/1.6.0_30
Host: localhost:8080
Accept: text/html, image/gif, image/jpeg, *; q=.2, */*; q=.2
Connection: keep-alive
And WITH setRequestProperty:
GET /foobar HTTP/1.1
User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.4; en-US; rv:1.9.2.2) Gecko/20100316 Firefox/3.6.2
Host: localhost:8080
Accept: text/html, image/gif, image/jpeg, *; q=.2, */*; q=.2
Connection: keep-alive
As you can see the user agent was properly set.
Full example:
import java.io.IOException;
import java.net.URL;
import java.net.URLConnection;

public class TestUrlOpener {

    public static void main(String[] args) throws IOException {
        URL url = new URL("http://localhost:8080/foobar");
        URLConnection hc = url.openConnection();
        // Set the header before the request is sent.
        hc.setRequestProperty("User-Agent", "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.4; en-US; rv:1.9.2.2) Gecko/20100316 Firefox/3.6.2");

        // Reading the content type triggers the actual request.
        System.out.println(hc.getContentType());
    }
}
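If netcat is not at hand, the same check can be done inside a single Java program. The sketch below is one possible way to do it, under a few assumptions: the class name UserAgentEcho and the method sentUserAgent are invented for this example, and the local server comes from the JDK's bundled com.sun.net.httpserver package rather than from anything in the answer above. It records the User-Agent header the server receives and hands it back to the caller.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.URL;
import java.util.concurrent.atomic.AtomicReference;

public class UserAgentEcho {

    /**
     * Sends one GET with the given User-Agent to a throwaway local server
     * and returns the User-Agent header that server actually received.
     */
    public static String sentUserAgent(String userAgent) throws IOException {
        AtomicReference<String> seen = new AtomicReference<>();
        // Port 0 asks the OS for any free port.
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/", exchange -> {
            seen.set(exchange.getRequestHeaders().getFirst("User-Agent"));
            exchange.sendResponseHeaders(204, -1); // 204 No Content, no response body
            exchange.close();
        });
        server.start();
        try {
            int port = server.getAddress().getPort();
            URL url = URI.create("http://127.0.0.1:" + port + "/foobar").toURL();
            HttpURLConnection c = (HttpURLConnection) url.openConnection();
            c.setRequestProperty("User-Agent", userAgent);
            c.getResponseCode(); // forces the request to be sent
            c.disconnect();
            return seen.get();
        } finally {
            server.stop(0);
        }
    }

    public static void main(String[] args) throws IOException {
        String ua = "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.4; en-US; rv:1.9.2.2) Gecko/20100316 Firefox/3.6.2";
        System.out.println("Server received: " + sentUserAgent(ua));
    }
}
```

On a JVM that behaves as described in this answer, the printed value should match the string passed to setRequestProperty, with no Java/<version> suffix.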
Answer by Bachan Joseph
Setting the User-Agent with addRequestProperty works for me.
URL url = new URL(<URL>);
HttpURLConnection httpConn = (HttpURLConnection) url.openConnection();
httpConn.addRequestProperty("User-Agent","Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:56.0) Gecko/20100101 Firefox/56.0");
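Since this answer uses addRequestProperty where the others use setRequestProperty, a short comparison may be useful. According to the URLConnection Javadoc, setRequestProperty overwrites an existing value for the key, while addRequestProperty adds a value without overwriting. The sketch below illustrates that on an unconnected connection; the class name AddVsSetRequestProperty, the method userAgentValuesAfterSetSetAdd, and the agent strings are hypothetical, and no request is actually sent.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLConnection;
import java.util.List;

public class AddVsSetRequestProperty {

    /**
     * Returns the User-Agent values queued on an unconnected URLConnection
     * after two set calls followed by one add call.
     */
    public static List<String> userAgentValuesAfterSetSetAdd() throws IOException {
        // openConnection() only creates the object; nothing is sent to this URL.
        URLConnection c = URI.create("http://localhost:8080/foobar").toURL().openConnection();

        c.setRequestProperty("User-Agent", "First/1.0");
        c.setRequestProperty("User-Agent", "Second/1.0"); // overwrites First/1.0
        c.addRequestProperty("User-Agent", "Third/1.0");  // kept alongside Second/1.0

        // The map is only readable before the connection is established.
        return c.getRequestProperties().get("User-Agent");
    }

    public static void main(String[] args) throws IOException {
        System.out.println(userAgentValuesAfterSetSetAdd());
    }
}
```

For a connection that has no User-Agent set yet, as in the answer above, either method should give the same result; the difference only matters when the key already has a value.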
Answer by Sam Ginrich
HTTP Servers tend to reject old browsers and systems.
The page "Tech Blog (wh): Most Common User Agents" shows the user-agent string of your current browser in the section "Your user agent is:". You can use that value for the "User-Agent" request property of a java.net.URLConnection or for the system property "http.agent".