How to create a Document object with JavaScript

Question

Asked by jayarjo

Basically that's the question: how is one supposed to construct a Document object from a string of HTML dynamically in JavaScript?

Answer 1

Accepted answer by Andy E

There are two methods defined in specifications: createDocument from DOM Core Level 2 and createHTMLDocument from HTML5. The former creates an XML document (including XHTML); the latter creates an HTML document. Both reside, as functions, on the DOMImplementation interface.

var impl    = document.implementation,
    xmlDoc  = impl.createDocument(namespaceURI, qualifiedNameStr, documentType),
    htmlDoc = impl.createHTMLDocument(title);

In reality, these methods are rather young and only implemented in recent browser releases. According to http://quirksmode.org and MDN, the following browsers support createHTMLDocument:

Chrome 4
Opera 10
Firefox 4
Internet Explorer 9
Safari 4
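
Since support depends on the browser version, it may be safer to feature-detect the method than to rely on a version list. The following is only a minimal sketch: canCreateHTMLDocument and its doc parameter are hypothetical names introduced for illustration, not part of any API.

```javascript
// Sketch: feature-detect createHTMLDocument instead of sniffing browser versions.
// `doc` is a hypothetical parameter; it falls back to the global `document` when one exists.
function canCreateHTMLDocument(doc) {
  var target = doc || (typeof document !== "undefined" ? document : null);
  return !!(target &&
            target.implementation &&
            typeof target.implementation.createHTMLDocument === "function");
}

// Browser usage: only call the method when it is actually available.
if (canCreateHTMLDocument()) {
  var detectedDoc = document.implementation.createHTMLDocument("title");
}
```

Passing the document in as an argument keeps the check usable for frames or other documents, at the cost of one extra parameter.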

Interestingly enough, you can (kind of) create an HTML document in older versions of Internet Explorer, using ActiveXObject:

var htmlDoc = new ActiveXObject("htmlfile");

The resulting object will be a new document, which can be manipulated just like any other document.

Answer 2

Answer by ecmanaut

Assuming you are trying to create a fully parsed Document object from a string of markup and a content type you also happen to know (maybe because you got the HTML from an XMLHttpRequest, and thus got the content type from its Content-Type HTTP header; probably usually text/html), it should be this easy:

var doc = (new DOMParser).parseFromString(markup, mime_type);

in an ideal future world where browser DOMParser implementations are as strong and competent as their document rendering is. Maybe that's a good pipe-dream requirement for future HTML6 standards efforts. It turns out no current browsers do, though.

You probably have the easier (but still messy) problem of having a string of HTML that you want a fully parsed Document object for. Here is another take on how to do that, which also ought to work in all browsers. First you make an HTML Document object:

var doc = document.implementation.createHTMLDocument('');

and then populate it with your HTML fragment:

doc.open();
doc.write(html);
doc.close();

Now you should have a fully parsed DOM in doc, which you can run alert(doc.title) on, slice with CSS selectors like doc.querySelectorAll('p'), or do the same with XPath using doc.evaluate.
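
The create, open, write and close steps above can be wrapped in one function. This is only a sketch: htmlToDocument and its hostDocument parameter are hypothetical names, and whether document.write behaves well on such a document may vary between browsers, as this answer notes.

```javascript
// Sketch: wrap the create / open / write / close steps in a single function.
// `hostDocument` is a hypothetical parameter; in a browser it defaults to the global `document`.
function htmlToDocument(html, hostDocument) {
  var host = hostDocument || document;
  var doc = host.implementation.createHTMLDocument("");
  doc.open();      // start a fresh parse
  doc.write(html); // feed the markup to the parser
  doc.close();     // signal that no more markup follows
  return doc;
}
```

In a browser, htmlToDocument('<title>foo</title><p>bar</p>') would be called without the second argument, and the result can then be queried with doc.title or doc.querySelectorAll('p') as described above.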

This actually works in modern WebKit browsers like Chrome and Safari (I just tested in Chrome 22 and Safari 6 respectively). Here is an example that takes the current page's source code, recreates it in a new document variable src, reads out its title, overwrites it with an HTML-quoted version of the same source code, and shows the result in an iframe: http://codepen.io/johan/full/KLIeE

Sadly, I don't think any other contemporary browsers have quite as solid implementations yet.

Answer 3

Answer by Neil F

An updated answer for 2014, as DOMParser has evolved. This works in all current browsers I can find, and should also work in earlier versions of IE, using ecmanaut's document.implementation.createHTMLDocument('') approach above.

Essentially, IE, Opera, and Firefox can all parse as "text/html"; Safari parses as "text/xml".

Beware of intolerant XML parsing, though. The Safari parse will break on non-breaking spaces and other HTML character entities (such as French and German accented letters) written with ampersands. Rather than handle each entity separately, the code below replaces all ampersands with the meaningless character string "j!J!". This string can subsequently be turned back into an ampersand when displaying the results in a browser (simpler, I have found, than trying to handle ampersands in "false" XML parsing).

function parseHTML(sText) {
    try {
        var doc;

        console.log("DOMParser: " + typeof window.DOMParser);

        // typeof yields a string, so compare against "undefined" rather than null
        if (typeof window.DOMParser !== "undefined") {
            var parser = new DOMParser();

            // modern IE, Firefox, Opera parse text/html
            doc = parser.parseFromString(sText, "text/html");
            if (doc != null) {
                console.log("parsed as HTML");
                return doc;
            }

            // replace ampersands with a harmless character string to avoid XML parsing issues
            sText = sText.replace(/&/g, "j!J!");
            // Safari parses as text/xml
            doc = parser.parseFromString(sText, "text/xml");
            console.log("parsed as XML");
            return doc;
        }

        // older IE
        doc = document.implementation.createHTMLDocument('');
        doc.write(sText);
        doc.close(); // call the method; a bare `doc.close` does nothing
        return doc;
    } catch (err) {
        alert("Error parsing html:\n" + err.message);
    }
}
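
The ampersand placeholder trick described in this answer can be isolated into two small helpers, which makes the restore step explicit. This is a sketch only: maskAmpersands, unmaskAmpersands and AMP_PLACEHOLDER are hypothetical names, and the round trip assumes the source text never contains the placeholder string itself.

```javascript
// Sketch: swap ampersands for a placeholder before XML parsing, and back afterwards.
// Assumption: the input never contains the placeholder string on its own.
var AMP_PLACEHOLDER = "j!J!";

function maskAmpersands(text) {
  return text.replace(/&/g, AMP_PLACEHOLDER);
}

function unmaskAmpersands(text) {
  // split/join replaces every occurrence without needing to escape regex characters
  return text.split(AMP_PLACEHOLDER).join("&");
}

var masked = maskAmpersands("caf&eacute;&nbsp;au lait");
var restored = unmaskAmpersands(masked);
```

unmaskAmpersands would typically be applied to text read back out of the parsed document, just before it is shown to the user.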

Answer 4

Answer by RobG

The following works in most common browsers, but not in all of them. This is how simple it should be (but isn't):

// Fails if UA doesn't support parseFromString for text/html (e.g. IE)
function htmlToDoc(markup) {
  var parser = new DOMParser();
  return parser.parseFromString(markup, "text/html");
}

var htmlString = "<title>foo bar</title><div>a div</div>";
alert(htmlToDoc(htmlString).title);

To account for user agent vagaries, the following may be better (please note attribution):

/*
 * DOMParser HTML extension
 * 2012-02-02
 *
 * By Eli Grey, http://eligrey.com
 * Public domain.
 * NO WARRANTY EXPRESSED OR IMPLIED. USE AT YOUR OWN RISK.
 *
 * Modified to work with IE 9 by RobG
 * 2012-08-29
 *
 * Notes:
 *
 *  1. Supplied markup should be a valid HTML document with or without HTML tags and
 *     no DOCTYPE (DOCTYPE support can be added, I just didn't do it)
 *
 *  2. Host method used where host supports text/html
 */

/*! @source https://gist.github.com/1129031 */
/*! @source https://developer.mozilla.org/en-US/docs/DOM/DOMParser */

/*global document, DOMParser*/

(function(DOMParser) {
    "use strict";

    var DOMParser_proto;
    var real_parseFromString;
    var textHTML;         // Flag for text/html support
    var textXML;          // Flag for text/xml support
    var htmlElInnerHTML;  // Flag for support for setting html element's innerHTML

    // Stop here if DOMParser not defined
    if (!DOMParser) return;

    // Firefox, Opera and IE throw errors on unsupported types
    try {
        // WebKit returns null on unsupported types
        textHTML = !!(new DOMParser).parseFromString('', 'text/html');

    } catch (er) {
      textHTML = false;
    }

    // If text/html supported, don't need to do anything.
    if (textHTML) return;

    // Next try setting innerHTML of a created document
    // IE 9 and lower will throw an error (can't set innerHTML of its HTML element)
    try {
      var doc = document.implementation.createHTMLDocument('');
      doc.documentElement.innerHTML = '<title></title><div></div>';
      htmlElInnerHTML = true;

    } catch (er) {
      htmlElInnerHTML = false;
    }

    // If that failed, try text/xml
    if (!htmlElInnerHTML) {

        try {
            textXML = !!(new DOMParser).parseFromString('', 'text/xml');

        } catch (er) {
            textXML = false;
        }
    }

    // Mess with DOMParser.prototype (less than optimal...) if one of the above worked
    // Assume can write to the prototype, if not, make this a stand alone function
    if (DOMParser.prototype && (htmlElInnerHTML || textXML)) { 
        DOMParser_proto = DOMParser.prototype;
        real_parseFromString = DOMParser_proto.parseFromString;

        DOMParser_proto.parseFromString = function (markup, type) {

            // Only do this if type is text/html
            if (/^\s*text\/html\s*(?:;|$)/i.test(type)) {
                var doc, doc_el, first_el;

                // Use innerHTML if supported
                if (htmlElInnerHTML) {
                    doc = document.implementation.createHTMLDocument("");
                    doc_el = doc.documentElement;
                    doc_el.innerHTML = markup;
                    first_el = doc_el.firstElementChild;

                // Otherwise use XML method
                } else if (textXML) {

                    // Make sure markup is wrapped in HTML tags
                    // Should probably allow for a DOCTYPE
                    if (!(/^<html.*html>$/i.test(markup))) {
                        markup = '<html>' + markup + '<\/html>'; 
                    }
                    doc = (new DOMParser).parseFromString(markup, 'text/xml');
                    doc_el = doc.documentElement;
                    first_el = doc_el.firstElementChild;
                }

                // RG: I don't understand the point of this, I'll leave it here though 
                //     In IE, doc_el is the HTML element and first_el is the HEAD.
                //
                // Is this an entire document or a fragment?
                if (doc_el.childElementCount == 1 && first_el.localName.toLowerCase() == 'html') {
                    doc.replaceChild(first_el, doc_el);
                }

                return doc;

            // If not text/html, send as-is to host method
            } else {
                return real_parseFromString.apply(this, arguments);
            }
        };
    }
}(DOMParser));

// Now some test code
var htmlString = '<html><head><title>foo bar</title></head><body><div>a div</div></body></html>';
var dp = new DOMParser();
var doc = dp.parseFromString(htmlString, 'text/html');

// Treat as an XML document and only use DOM Core methods
alert(doc.documentElement.getElementsByTagName('title')[0].childNodes[0].data);

Don't be put off by the amount of code; there are a lot of comments. It can be shortened quite a bit, but it becomes less readable.
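
The patched parseFromString only takes over when the requested type is text/html, and the regular expression that decides this can be tried on its own. A small sketch follows; isTextHtml is a hypothetical helper name, while the pattern is the one used in the code above.

```javascript
// Sketch: the MIME-type check from the patched parseFromString, isolated for experimentation.
// Accepts optional surrounding whitespace and parameters such as "; charset=utf-8".
function isTextHtml(type) {
  return /^\s*text\/html\s*(?:;|$)/i.test(type);
}

var typeChecks = [
  isTextHtml("text/html"),
  isTextHtml("text/html; charset=utf-8"),
  isTextHtml(" TEXT/HTML "),
  isTextHtml("text/xml"),
  isTextHtml("text/html5")
];
```

Anything that fails this check is handed unchanged to the host's original parseFromString.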

Oh, and if the markup is valid XML, life is much simpler:

var stringToXMLDoc = (function(global) {

  // W3C DOMParser support
  if (global.DOMParser) {
    return function (text) {
      var parser = new global.DOMParser();
      return parser.parseFromString(text,"application/xml");
    }

  // MS ActiveXObject support
  } else {
    return function (text) {
      var xmlDoc;

      // Can't assume support and can't test, so try..catch
      try {
        xmlDoc = new ActiveXObject("Microsoft.XMLDOM");
        xmlDoc.async = false;
        xmlDoc.loadXML(text);
      } catch (e){}
      return xmlDoc;
    }
  }
}(this));


var doc = stringToXMLDoc('<books><book title="foo"/><book title="bar"/><book title="baz"/></books>');
alert(
  doc.getElementsByTagName('book')[2].getAttribute('title')
);
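
One caveat when the markup turns out not to be valid XML: as far as I know, browser DOMParser implementations do not throw for XML types but return a document that contains a parsererror element. A minimal sketch of a check follows; hasParserError is a hypothetical helper name, and a legitimate document that happens to use an element named parsererror would be misreported.

```javascript
// Sketch: detect a failed XML parse by looking for a <parsererror> element.
// `hasParserError` is a hypothetical helper, not part of any API.
function hasParserError(xmlDoc) {
  return !xmlDoc || xmlDoc.getElementsByTagName("parsererror").length > 0;
}

// Browser-only usage, guarded so the snippet is harmless where DOMParser is missing.
if (typeof DOMParser !== "undefined") {
  var brokenDoc = new DOMParser().parseFromString("<books><book></books>", "application/xml");
  console.log("parse failed: " + hasParserError(brokenDoc));
}
```

The ActiveXObject branch would need a different check, which is not covered here.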

Answer 5

Answer by Chris Baker

Per the spec, one may use the createHTMLDocument method of DOMImplementation, accessible via document.implementation, as follows:

var doc = document.implementation.createHTMLDocument('My title');
// The new document already has html, head, title and body elements,
// so append content to doc.body instead of creating a second body
var div = doc.createElement('div');
doc.body.appendChild(div);
// and so on

jsFiddle: http://jsfiddle.net/9Fh7R/
MDN document for DOMImplementation: https://developer.mozilla.org/en/DOM/document.implementation
MDN document for DOMImplementation.createHTMLDocument: https://developer.mozilla.org/En/DOM/DOMImplementation.createHTMLDocument
