Java: using HtmlUnit to view page source

Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/5996559/

Date: 2020-10-30 13:53:28  Source: igfitidea

HtmlUnit to view source

Tags: java, htmlunit

Asked by Jake Sankey

HtmlUnit for Java is great, but I haven't been able to figure out how to view the full source or return the source of a web site as a string. Can anyone help me with this?

I know the following will read the site, but now I just want to return the source as a string.

HtmlPage mySite = webClient.getPage("http://mysite.com");

Thanks!

Answered by Jeremy

From looking through the API, my thought would be:

mySite.getWebResponse().getContentAsString();

Answered by Jesse Webb

String pageSource = myPage.asXml();

That will get you the full HTML of the web page, serialized as XML from the parsed DOM.

String pageText = myPage.asText();

That will get you all of the visible text on the page, including line breaks and white space. It would be the same as if you opened the page in your browser, selected everything with Ctrl+A, copied it, and pasted it with Ctrl+V into a variable.

Answered by Kal

Have you tried mySite.asXml()? Or you can do mySite.getDocumentElement().toString().
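
Putting the suggestions from this thread together, a minimal sketch might look like the following. It assumes HtmlUnit 2.x, where the classes live under com.gargoylesoftware.htmlunit (3.x moved them to org.htmlunit). The class name PageSourceExample and the helper names rawSource and domAsXml are made up for illustration and are not part of the library. getWebResponse().getContentAsString() returns the body as the server sent it, while asXml() serializes the parsed DOM, so the two strings can differ.

```java
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlPage;

import java.io.IOException;

// Hypothetical example class; assumes HtmlUnit 2.x is on the classpath.
public class PageSourceExample {

    /** Returns the response body as the server sent it, before parsing. */
    public static String rawSource(WebClient webClient, String url) throws IOException {
        HtmlPage page = webClient.getPage(url);
        return page.getWebResponse().getContentAsString();
    }

    /** Returns the parsed DOM serialized back out as XML. */
    public static String domAsXml(WebClient webClient, String url) throws IOException {
        HtmlPage page = webClient.getPage(url);
        return page.asXml();
    }

    public static void main(String[] args) throws IOException {
        // The URL from the question is used as a placeholder default.
        String url = args.length > 0 ? args[0] : "http://mysite.com";

        // WebClient is AutoCloseable in recent 2.x releases, so
        // try-with-resources releases its windows and connections.
        try (WebClient webClient = new WebClient()) {
            System.out.println(rawSource(webClient, url));
            System.out.println(domAsXml(webClient, url));
        }
    }
}
```

If the goal is the exact bytes the server returned (for example, to compare against "View source" in a browser), rawSource is probably the closer match; domAsXml reflects whatever the parser and any executed JavaScript did to the page.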
