Java HTMLUnit: super slow execution?
Original URL: http://stackoverflow.com/questions/10442803/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me):
StackOverFlow
HTMLUnit : super slow execution?
Asked by Kanishka Dilshan
I have been using HTMLUnit. It suits my requirements well, but it seems to be extremely slow. For example, I have automated the following scenario using HTMLUnit:
Go to the Google page
Enter some text
Click on the search button
Get the title of the results page
Click on the first result
Code:
long t1=System.currentTimeMillis();
Logger logger=Logger.getLogger("");
logger.setLevel(Level.OFF);
WebClient webClient=createWebClient();
WebRequest webReq=new WebRequest(new URL("http://google.lk"));
HtmlPage googleMainPage=webClient.getPage(webReq);
HtmlTextInput searchTextField=(HtmlTextInput) googleMainPage.getByXPath("//input[@name='q']").get(0);
HtmlButton searchButton=(HtmlButton) googleMainPage.getByXPath("//button[@name='btnK']").get(0);
searchTextField.type("Sri Lanka");
System.out.println("Text typed!");
HtmlPage googleResultsPage= searchButton.click();
System.out.println("Search button clicked!");
System.out.println("Title : " + googleResultsPage.getTitleText());
HtmlAnchor firstResultLink=(HtmlAnchor) googleResultsPage.getByXPath("//a[@class='l']").get(0);
HtmlPage firstResultPage=firstResultLink.click();
System.out.println("First result clicked!");
System.out.println("Title : " + firstResultPage.getTitleText());
//System.out.println(firstResultPage.asText());
long t2=System.currentTimeMillis();
long diff=t2-t1;
System.out.println("Time elapsed : " + milliSecondsToHrsMinutesAndSeconds(diff));
webClient.closeAllWindows();
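The question does not show the `milliSecondsToHrsMinutesAndSeconds` helper it calls. A minimal sketch of what such a helper might look like, using only the standard library; the class name `ElapsedTimeFormatter` and the output format are assumptions, not the asker's actual code:

```java
import java.util.concurrent.TimeUnit;

public class ElapsedTimeFormatter {

    // Hypothetical stand-in for the helper used in the question:
    // splits a millisecond duration into hours, minutes and seconds.
    static String milliSecondsToHrsMinutesAndSeconds(long millis) {
        long hours = TimeUnit.MILLISECONDS.toHours(millis);
        long minutes = TimeUnit.MILLISECONDS.toMinutes(millis) % 60;
        long seconds = TimeUnit.MILLISECONDS.toSeconds(millis) % 60;
        return String.format("%d hours, %d minutes, %d seconds", hours, minutes, seconds);
    }

    public static void main(String[] args) {
        // 221,000 ms is the 3 minutes, 41 seconds reported in the question.
        System.out.println(milliSecondsToHrsMinutesAndSeconds(221_000L));
        // prints "0 hours, 3 minutes, 41 seconds"
    }
}
```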
It works correctly, but it takes 3 minutes, 41 seconds.
I guess the reason for the slow execution is validating each and every element on the page.
My question is: how can I reduce the execution time of HTMLUnit? Is there any way to disable validation of web pages?
Thanks in advance!
Answered by fstang
For the current htmlUnit 2.13, setting options is slightly different from what maxmax has provided:
final WebClient webClient = new WebClient(BrowserVersion.CHROME);
webClient.getOptions().setCssEnabled(false);//if you don't need css
webClient.getOptions().setJavaScriptEnabled(false);//if you don't need js
HtmlPage page = webClient.getPage("http://XXX.xxx.xx");
...
In my own test, this was 8 times faster than the default options. (Note that this could be webpage-dependent.)
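Because the gain depends on the page, it may be worth timing the same task under each configuration before settling on one. A minimal sketch using only the standard library; `TimingHarness` and `timeMillis` are hypothetical names, and the sleeping lambdas are placeholders standing in for page loads with different WebClient options:

```java
import java.util.concurrent.Callable;

public class TimingHarness {

    // Runs the task once and returns the elapsed wall-clock time in milliseconds.
    static long timeMillis(Callable<?> task) throws Exception {
        long start = System.nanoTime();
        task.call();
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) throws Exception {
        // Placeholder workloads (assumptions): replace each body with a page load
        // that uses one particular set of options.
        long withDefaults = timeMillis(() -> { Thread.sleep(200); return null; });
        long withCssAndJsOff = timeMillis(() -> { Thread.sleep(50); return null; });

        System.out.println("Default options:     " + withDefaults + " ms");
        System.out.println("CSS and JS disabled: " + withCssAndJsOff + " ms");
    }
}
```

A single run is noisy, since network latency dominates; repeating each measurement a few times gives a fairer comparison.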
Answered by maxmax
- Be sure to use the latest htmlunit version (2.9). I got a performance boost over the previous version.
I can run your example in 20 s or 40 s, depending on the options I set. Since I can't see your webClient initialisation, I guess that may be the problem.
Here's my initialisation for the 20 s run:
WebClient client = new WebClient(BrowserVersion.FIREFOX_3_6);
client.setTimeout(60000);
client.setRedirectEnabled(true);
client.setJavaScriptEnabled(true);
client.setThrowExceptionOnFailingStatusCode(false);
client.setThrowExceptionOnScriptError(false);
client.setCssEnabled(false);
client.setUseInsecureSSL(true);
Answered by MrSmith42
I also recommend setting a time limit for JavaScript:
client.setJavaScriptTimeout(30000); //e.g. 30s