Original question: http://stackoverflow.com/questions/2376471/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
How do I get the web page contents from a WebView?
Asked by gregm
On Android, I have a WebView that is displaying a page.
How do I get the page source without requesting the page again?
It seems WebView should have some kind of getPageSource() method that returns a string, but alas it does not.
If I enable JavaScript, what is the appropriate JavaScript to put in this call to get the contents?
webview.loadUrl("javascript:(function() { " +
    "document.getElementsByTagName('body')[0].style.color = 'red'; " +
    "})()");
Answer by jluckyiv
I know this is a late answer, but I found this question because I had the same problem. I think I found the answer in this post on lexandera.com. The code below is basically a cut-and-paste from the site. It seems to do the trick.
final Context myApp = this;

/* An instance of this class will be registered as a JavaScript interface */
class MyJavaScriptInterface
{
    @JavascriptInterface
    @SuppressWarnings("unused")
    public void processHTML(String html)
    {
        // process the html as needed by the app
    }
}

final WebView browser = (WebView)findViewById(R.id.browser);

/* JavaScript must be enabled if you want it to work, obviously */
browser.getSettings().setJavaScriptEnabled(true);

/* Register a new JavaScript interface called HTMLOUT */
browser.addJavascriptInterface(new MyJavaScriptInterface(), "HTMLOUT");

/* WebViewClient must be set BEFORE calling loadUrl! */
browser.setWebViewClient(new WebViewClient() {
    @Override
    public void onPageFinished(WebView view, String url)
    {
        /* This call injects JavaScript into the page which just finished loading. */
        /* innerHTML omits the root element itself, so wrap it in <html> tags again. */
        browser.loadUrl("javascript:window.HTMLOUT.processHTML('<html>'+document.getElementsByTagName('html')[0].innerHTML+'</html>');");
    }
});

/* load a web page */
browser.loadUrl("http://lexandera.com/files/jsexamples/gethtml.html");
Answer by durka42
Per issue 12987, Blundell's answer crashes (at least on my 2.3 VM). Instead, I intercept a call to console.log with a special prefix:
// intercept calls to console.log
web.setWebChromeClient(new WebChromeClient() {
    @Override
    public boolean onConsoleMessage(ConsoleMessage cmsg)
    {
        // check secret prefix
        if (cmsg.message().startsWith("MAGIC"))
        {
            String msg = cmsg.message().substring(5); // strip off prefix
            /* process HTML */
            return true;
        }
        return false;
    }
});

// inject the JavaScript on page load
web.setWebViewClient(new WebViewClient() {
    @Override
    public void onPageFinished(WebView view, String address)
    {
        // have the page spill its guts, with a secret prefix
        view.loadUrl("javascript:console.log('MAGIC'+document.getElementsByTagName('html')[0].innerHTML);");
    }
});

web.loadUrl("http://www.google.com");
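The prefix check inside onConsoleMessage is plain string handling, so it can be pulled into a small helper and tested off-device. What follows is only a minimal sketch of that idea: the class name ConsoleHtmlExtractor and the method extract are hypothetical and not part of the Android SDK, and the "MAGIC" marker is simply the one used in the answer above.

```java
final class ConsoleHtmlExtractor {
    // Marker the injected console.log call prepends (taken from the answer above).
    static final String PREFIX = "MAGIC";

    private ConsoleHtmlExtractor() {}

    /**
     * Returns the text that follows the prefix, or null when the console
     * message did not come from the injected script.
     */
    static String extract(String consoleMessage) {
        if (consoleMessage == null || !consoleMessage.startsWith(PREFIX)) {
            return null;
        }
        // Use the prefix length instead of a hard-coded 5 so the two stay in sync.
        return consoleMessage.substring(PREFIX.length());
    }
}
```

Inside onConsoleMessage, one would presumably call ConsoleHtmlExtractor.extract(cmsg.message()) and return true only when the result is non-null.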
Answer by nagoya0
This answer is based on jluckyiv's, but I think it is better and simpler to change the JavaScript as follows.
browser.loadUrl("javascript:HTMLOUT.processHTML(document.documentElement.outerHTML);");
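On API 19 (KitKat) and later, WebView also has evaluateJavascript(String, ValueCallback<String>), which passes the script's result to a callback, so the same document.documentElement.outerHTML expression can be read without registering a JavaScript interface. As far as I know, the callback receives the value as a JSON-encoded string literal (quoted, with some characters written as backslash escapes), so it has to be decoded before use. Here is a minimal plain-Java sketch of that decoding step; JsonStringDecoder and decode are hypothetical names, and a real JSON parser would be a reasonable alternative.

```java
final class JsonStringDecoder {
    private JsonStringDecoder() {}

    // Hypothetical Android usage (API 19+), shown as a comment because it needs a device:
    //   view.evaluateJavascript("document.documentElement.outerHTML",
    //       value -> processHTML(JsonStringDecoder.decode(value)));

    /** Decodes a JSON string literal into plain text; other JSON values pass through. */
    static String decode(String json) {
        if (json == null || json.equals("null")) {
            return null; // the script produced null or undefined
        }
        int end = json.length() - 1;
        if (end < 1 || json.charAt(0) != '"' || json.charAt(end) != '"') {
            return json; // not a string literal (number, boolean, object)
        }
        StringBuilder out = new StringBuilder(json.length());
        for (int i = 1; i < end; i++) {
            char c = json.charAt(i);
            if (c != '\\' || i + 1 >= end) {
                out.append(c);
                continue;
            }
            char esc = json.charAt(++i);
            switch (esc) {
                case 'n': out.append('\n'); break;
                case 't': out.append('\t'); break;
                case 'r': out.append('\r'); break;
                case 'b': out.append('\b'); break;
                case 'f': out.append('\f'); break;
                case 'u':
                    if (i + 4 < end) {
                        // Four hex digits follow; malformed digits throw NumberFormatException.
                        out.append((char) Integer.parseInt(json.substring(i + 1, i + 5), 16));
                        i += 4;
                    } else {
                        out.append('u'); // truncated escape: keep the character
                    }
                    break;
                default:
                    out.append(esc); // covers quote, backslash and slash
            }
        }
        return out.toString();
    }
}
```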
Answer by larham1
Have you considered fetching the HTML separately, and then loading it into a WebView?
String fetchContent(WebView view, String url) throws IOException {
    HttpClient httpClient = new DefaultHttpClient();
    HttpGet get = new HttpGet(url);
    HttpResponse response = httpClient.execute(get);
    StatusLine statusLine = response.getStatusLine();
    int statusCode = statusLine.getStatusCode();
    HttpEntity entity = response.getEntity();
    String html = EntityUtils.toString(entity); // assume html for simplicity
    view.loadDataWithBaseURL(url, html, "text/html", "utf-8", url); // todo: get mime, charset from entity
    if (statusCode != 200) {
        // handle fail
    }
    return html;
}
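The Apache HttpClient classes used above (DefaultHttpClient and friends) were dropped from the Android SDK in API 23, so on current targets the same fetch would need a different client. Below is a minimal sketch using java.net.HttpURLConnection instead; HtmlFetcher, readAll and fetch are hypothetical names, UTF-8 is assumed rather than read from the response headers, and the call has to run off the main thread on Android.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;

final class HtmlFetcher {
    private HtmlFetcher() {}

    /** Reads the whole stream and decodes it as UTF-8 (an assumption, see above). */
    static String readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    /** Fetches the response body, failing on any status other than 200. */
    static String fetch(String url) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(url).toURL().openConnection();
        try {
            int status = conn.getResponseCode();
            if (status != HttpURLConnection.HTTP_OK) {
                throw new IOException("Unexpected HTTP status " + status);
            }
            try (InputStream in = conn.getInputStream()) {
                return readAll(in);
            }
        } finally {
            conn.disconnect();
        }
    }
}
```

The returned string could then be handed to view.loadDataWithBaseURL(url, html, "text/html", "utf-8", url) on the UI thread, as in the answer. Unlike the original, this sketch checks the status before reading the body, so a failed request never reaches the WebView.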
Answer by dr_sulli
I managed to get this working using the code from @jluckyiv's answer, but I had to add the @JavascriptInterface annotation to the processHTML method in MyJavaScriptInterface.
class MyJavaScriptInterface
{
    @SuppressWarnings("unused")
    @JavascriptInterface
    public void processHTML(String html)
    {
        // process the html as needed by the app
    }
}
Answer by javauser71
You also need to annotate the method with @JavascriptInterface if your targetSdkVersion is >= 17, because SDK 17 introduced a new security requirement: every method exposed to JavaScript must be annotated with @JavascriptInterface. Otherwise you will see an error like: Uncaught TypeError: Object [object Object] has no method 'processHTML' at null:1
Answer by onusopus
If you are working on KitKat or later, you can use Chrome's remote debugging tools to inspect all the requests and responses going in and out of your WebView, as well as the HTML source of the page being viewed.