Original source: http://stackoverflow.com/questions/8701432/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
How to download entire HTML of a webpage using javascript?
Asked by Meysam
Is it possible to download the entire HTML of a webpage using JavaScript, given the URL? What I want to do is develop a Firefox add-on that downloads the content of all the links found in the source of the current page in the browser.
Update: the URLs reside in the same domain.
Answered by erturne
It should be possible to do this using jQuery's ajax support. JavaScript in a Firefox extension is not subject to the cross-origin restriction. Here are some tips for using jQuery in a Firefox extension:
Add the jQuery library to your extension's chrome/content/ directory.
Load jQuery in the window load event callback rather than including it in your browser overlay XUL. Otherwise it can cause conflicts (e.g. clobbers a user's customized toolbar).
(function (loader) {
    loader.loadSubScript("chrome://ryebox/content/jquery-1.6.2.min.js");
})(Components.classes["@mozilla.org/moz/jssubscript-loader;1"]
    .getService(Components.interfaces.mozIJSSubScriptLoader));
Use "jQuery" instead of "$". I experienced weird behavior when using $ instead of jQuery (a conflict of some kind, I suppose).
Use jQuery(content.document) instead of jQuery(document) to access a page's DOM. In a Firefox extension "document" refers to the browser's XUL whereas "content.document" refers to the page's DOM.
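As a rough sketch of how the last tip could feed the original goal, the helper below gathers the link targets from a page document so they can be downloaded afterwards. The name collectLinkUrls is hypothetical, and the code assumes the object passed in behaves like a DOM document with the standard links collection; in an overlay extension you would pass content.document rather than document.

```javascript
// Hypothetical helper: gather unique link URLs from a page document.
// Assumes `doc` exposes the standard DOM `links` collection
// (all <a> and <area> elements that have an href attribute).
function collectLinkUrls(doc) {
  var seen = {};
  var urls = [];
  for (var i = 0; i < doc.links.length; i++) {
    var href = doc.links[i].href; // resolved absolute URL in a real DOM
    // Skip empty hrefs and anything already collected.
    if (href && !Object.prototype.hasOwnProperty.call(seen, href)) {
      seen[href] = true;
      urls.push(href);
    }
  }
  return urls;
}
```

Each returned URL can then be handed to whatever download routine you use, such as jQuery's ajax support.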
I wrote a Firefox extension for getting bookmarks from my friend's bookmark site. It uses jQuery to fetch my bookmarks in a JSON response from his service, then creates a menu of those bookmarks so that I can easily access them. You can browse the source at https://github.com/erturne/ryebox
Answered by Christofer Eliasson
For JavaScript in general, the short answer is no, not unless all pages are within the same domain. JavaScript is limited by the same-origin policy, so for security reasons, you cannot do cross-domain requests like that.
However, as pointed out by Max and erturne in the comments, when JavaScript is written as part of a browser extension/add-on, the regular rules about the same-origin policy and cross-domain requests do not seem to apply, at least not for Firefox and Chrome. Therefore, it should be possible to download the pages with JavaScript using an XMLHttpRequest, or using one of the wrapper methods included in your favorite JS library.
If, like me, you prefer jQuery, have a look at jQuery's .load() method, which loads HTML from a given resource and injects it into an element that you specify.
Edit: Made some updates to my answer based on the comments about cross-domain requests made by add-ons.
Answered by Thomas Johan Eggum
You can do XMLHttpRequests (XHRs) if the combination scheme://domain:port of the target URL is the same as that of the page hosting the JavaScript that fetches the HTML.
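To make the scheme://domain:port rule concrete, here is a minimal sketch of a same-origin check. The function name isSameOrigin is hypothetical, and the code relies on the standard URL class, which normalizes default ports (80 for http, 443 for https) to an empty string.

```javascript
// Hypothetical helper: true when both URLs share scheme, host, and port.
// Relies on the standard URL class (available in browsers and Node.js).
function isSameOrigin(urlA, urlB) {
  var a = new URL(urlA);
  var b = new URL(urlB);
  return a.protocol === b.protocol &&
         a.hostname === b.hostname &&
         a.port === b.port; // default ports are normalized to ""
}
```

For example, http://example.com/a and http://example.com:80/b count as the same origin, while https://example.com/ and http://example.com/ do not because the scheme differs.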
Many JS frameworks, such as jQuery and Dojo, give you easy XHR support. Example using Dojo:
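function getText() {
    // dojo.xhrGet is asynchronous and returns a Deferred. Returning it lets
    // the caller attach callbacks, because a value returned from load/error
    // does not become the return value of getText().
    return dojo.xhrGet({
        url: "test/someHtml.html",
        handleAs: "text",
        load: function (response, ioArgs) {
            // The response is the HTML
            return response;
        },
        error: function (response, ioArgs) {
            return response;
        }
    });
}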
If you prefer writing your own XMLHttpRequest-handler, take a look here: http://www.w3schools.com/xml/xml_http.asp
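A hand-written handler might look roughly like the sketch below. The name downloadHtml and the optional XhrCtor parameter are assumptions added for illustration (the parameter only exists so the constructor can be swapped out, for example in tests); by default the browser's global XMLHttpRequest is used.

```javascript
// Hypothetical helper: download a URL as text with a plain XMLHttpRequest.
// `XhrCtor` is an illustrative assumption; when omitted, the global
// XMLHttpRequest provided by the browser is used.
function downloadHtml(url, XhrCtor) {
  var Xhr = XhrCtor || XMLHttpRequest;
  return new Promise(function (resolve, reject) {
    var xhr = new Xhr();
    xhr.open("GET", url, true); // true = asynchronous
    xhr.onreadystatechange = function () {
      if (xhr.readyState !== 4) return; // 4 = DONE
      if (xhr.status >= 200 && xhr.status < 300) {
        resolve(xhr.responseText); // the page's HTML as a string
      } else {
        reject(new Error("Request for " + url + " failed with status " + xhr.status));
      }
    };
    xhr.send(null);
  });
}
```

Calling downloadHtml("test/someHtml.html").then(function (html) { ... }) hands the HTML to the callback once the request completes; in an ordinary web page the URL still has to satisfy the same-origin rule described above.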
Answered by qidizi
If you only want to write a simple text web page downloader, and you only know HTML and JavaScript, you can write a downloader named "download.hta" with HTML and JavaScript that controls Msxml2.ServerXMLHTTP.6.0 and FSO (Scripting.FileSystemObject).