Original question: http://stackoverflow.com/questions/12808770/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Parsing of html string using jquery
Asked by lazyguy
I am trying to parse this HTML with jQuery to get data1, data2, and data3. While I do get data2 and data3, I am unable to get data1 with my approach. I am fairly new to jQuery, so please pardon my ignorance.
<html>
<body>
<div class="class0">
<h4>data1</h4>
<p class="class1">data2</p>
<div id="mydivid"><p>data3</p></div>
</div>
</body>
</html>
Here is how I am calling this in my jquery.
var datahtml = "<html><body><div class=\"class0\"><h4>data1</h4><p class=\"class1\">data2</p><div id=\"mydivid\"><p>data3</p></div></div></body></html>";
alert($(datahtml).find(".class0").text()); // Doesn't Work
alert($(datahtml).find(".class1").text()); // work
alert($(datahtml).find("#mydivid").text()); // work
Only alert($(datahtml).find(".class0").text()); is not working; the rest work as expected. I wonder whether this is because class0 has multiple tags inside it. How do I get data1 in such a scenario?
Accepted answer by Adil
Its behaviour is odd: it ignores the html and body tags and starts from the first div, the one with class="class0". The HTML is parsed into DOM elements but not added to the DOM. For elements that are in the DOM, the selector does not ignore the body tag and is applied to the whole document. You need to add the HTML to the DOM as shown below.
$('#div1').append($(datahtml)); //Add in DOM before applying jquery methods.
alert($('#div1').find(".class0").text()); // Now it Works too
alert($('#div1').find(".class1").text()); // work
alert($('#div1').find("#mydivid").text()); // work
If we wrap your HTML in another element, so that it becomes the starting point instead of your first div with class="class0", then your selector will work as expected.
var datahtml = "<html><body><div><div class=\"class0\"><h4>data1</h4><p class=\"class1\">data2</p><div id=\"mydivid\"><p>data3</p></div></div></div></body></html>";
alert($(datahtml).find(".class0").text()); // Now it Works too
alert($(datahtml).find(".class1").text()); // work
alert($(datahtml).find("#mydivid").text()); // work
What the jQuery docs say about the jQuery parsing function jQuery(), i.e. $():
When passing in complex HTML, some browsers may not generate a DOM that exactly replicates the HTML source provided. As mentioned, jQuery uses the browser's .innerHTML property to parse the passed HTML and insert it into the current document. During this process, some browsers filter out certain elements such as <html>, <title>, or <head> elements. As a result, the elements inserted may not be representative of the original string passed.
Answer by Fabrício Matté
None of the current answers addressed the real issue, so I'll give it a go.
var datahtml = "<html><body><div class=\"class0\"><h4>data1</h4><p class=\"class1\">data2</p><div id=\"mydivid\"><p>data3</p></div></div></body></html>";
console.log($(datahtml));
$(datahtml) is a jQuery object containing only the div.class0 element, thus when you call .find on it, you're actually looking for descendants of div.class0 instead of the whole HTML document that you'd expect.
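Because the top-level element of the parsed set is div.class0 itself, one option is to test the set with .filter() in addition to searching its descendants with .find(). The following is a minimal sketch, assuming jQuery is loaded on the page; findIncludingRoots is a hypothetical helper name introduced here for illustration and is not part of jQuery.

```javascript
// Hypothetical helper (not a jQuery API): match `selector` against the
// top-level elements of a jQuery set as well as against their descendants.
function findIncludingRoots($set, selector) {
  // .filter() tests the elements in the set itself;
  // .find() only looks at their descendants.
  return $set.filter(selector).add($set.find(selector));
}

// Usage sketch; this part runs only where jQuery is available (e.g. a browser page).
if (typeof jQuery !== "undefined") {
  var sampleHtml = "<html><body><div class=\"class0\"><h4>data1</h4>" +
    "<p class=\"class1\">data2</p>" +
    "<div id=\"mydivid\"><p>data3</p></div></div></body></html>";
  var $parsed = jQuery(sampleHtml);
  // Text of div.class0, matched as a top-level element of the set.
  console.log(findIncludingRoots($parsed, ".class0").text());
  // Text of p.class1, matched as a descendant.
  console.log(findIncludingRoots($parsed, ".class1").text());
}
```

The helper takes a jQuery object rather than a string so that the markup is parsed only once, however many selectors are tried against it.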
A quick solution is to wrap the parsed data in an element so the .find will work as intended:
var parsed = $('<div/>').append(datahtml);
console.log(parsed.find(".class0").text());
The reason for this isn't very simple, but I assume that jQuery "parses" more complex HTML strings by simply dropping your HTML string into a separate, created-on-the-fly DOM fragment and then retrieving the parsed elements. This operation would most likely make the DOM parser ignore the html and body tags, as they would be illegal in this case.
Here is a very small test suite which demonstrates that this behavior is consistent from jQuery 1.8.2 all the way down to 1.6.4.
Edit: quoting this post:
Problem is that jQuery creates a DIV and sets innerHTML and then takes DIV children, but since BODY and HEAD elements are not valid DIV childs, then those are not created by browser.
This makes me more confident that my theory is correct. I'll share it here; hopefully it makes some sense to you. Have jQuery 1.8.2's uncompressed source side by side with this. The # indicates line numbers.
All document fragments made through jQuery.buildFragment (defined @#6122) will go through jQuery.clean (#6151) (even if it is a cached fragment, it already went through jQuery.clean when it was created). As the quoted text above implies, jQuery.clean (defined @#6275) creates a fresh div inside the safe fragment to serve as a container for the parsed data: the div element is created at #6301-6303, childNodes are retrieved at #6344, the div is removed at #6347 for cleaning up (plus #6359-6361 as a bug fix), and childNodes are merged into the return array at #6351-6355 and returned at #6406.
Therefore, this applies to all methods that invoke jQuery.buildFragment, which include jQuery.parseHTML and jQuery.fn.domManip (among those are .append(), .after(), and .before(), which invoke the domManip jQuery object method), as well as $(html), which is handled at jQuery.fn.init (defined @#97, handling of complex [more than a single tag] html strings @#125, invokes jQuery.parseHTML @#131).
It makes sense that virtually all jQuery HTML string parsing (besides single tag html strings) is done using a div element as container, and html/body tags are not valid descendants of a div element, so they are stripped out.
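The container effect can be observed without jQuery by repeating the same step with plain DOM calls. This is only an illustration of the mechanism described above, not jQuery's actual code; topLevelNodeNames is a hypothetical name, and the sketch assumes it is given a real DOM Document such as window.document.

```javascript
// Illustration only: mimic the "parse inside a <div>" step with plain DOM calls.
// `doc` is expected to be a DOM Document (for example window.document).
function topLevelNodeNames(htmlString, doc) {
  var container = doc.createElement("div");
  // The browser parses the string as the children of a div element.
  container.innerHTML = htmlString;
  var names = [];
  for (var i = 0; i < container.childNodes.length; i++) {
    names.push(container.childNodes[i].nodeName);
  }
  return names;
}

// Usage sketch; this part runs only in an environment that has a DOM.
if (typeof document !== "undefined") {
  var sample = "<html><body><div class=\"class0\"><h4>data1</h4></div></body></html>";
  console.log(topLevelNodeNames(sample, document));
}
```

If the theory holds, the logged list should contain the div but no html or body entries, which matches what $(datahtml) ends up holding.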
Addendum: Newer versions of jQuery (1.9+) have refactored the HTML parsing logic (for instance, the internal jQuery.clean method no longer exists), but the overall parsing logic remains the same.
Answer by Gershom
I think I have an even better way:
Let's say you've got your HTML:
var htmlText = '<html><body><div class="class0"><h4>data1</h4><p class="class1">data2</p><div id="mydivid"><p>data3</p></div></div></body></html>'
Here's the thing you've been hoping to do:
var dataHtml = $($.parseXML(htmlText)).children('html');
dataHtml now works exactly like the ordinary jQuery objects you're familiar with!
The wonderful thing about this solution is that it will not strip body, head, or script tags!
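$.parseXML expects well-formed XML, so markup that is valid HTML but not valid XML (an unclosed <br>, for example) is likely to be rejected. A possible alternative, offered here as an assumption rather than as part of this answer, is the browser's DOMParser with the "text/html" type, which returns a whole Document that keeps html, head, and body. parseHtmlDocument is a hypothetical helper name, and the optional second parameter exists only so a parser can be supplied where DOMParser is missing.

```javascript
// Hypothetical helper: parse a full HTML document string into a Document,
// keeping <html>, <head> and <body>.
// ParserCtor is optional and defaults to the browser's DOMParser.
function parseHtmlDocument(htmlText, ParserCtor) {
  var Parser = ParserCtor || (typeof DOMParser !== "undefined" ? DOMParser : null);
  if (!Parser) {
    throw new Error("DOMParser is not available in this environment");
  }
  return new Parser().parseFromString(htmlText, "text/html");
}

// Usage sketch; this part runs only where DOMParser exists (e.g. a browser).
if (typeof DOMParser !== "undefined") {
  var sampleDocHtml = "<html><body><div class=\"class0\"><h4>data1</h4>" +
    "<p class=\"class1\">data2</p>" +
    "<div id=\"mydivid\"><p>data3</p></div></div></body></html>";
  var parsedDoc = parseHtmlDocument(sampleDocHtml);
  // The result is a regular Document, so standard DOM queries apply.
  console.log(parsedDoc.querySelector(".class0 h4").textContent); // data1
  // With jQuery loaded, $(parsedDoc).find(".class0") could be used the same way.
}
```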
Answer by Sushanth --
Try this
alert($(datahtml).find(".class0 h4").text());
The reason is that the text you are referring to is inside the h4 element of class0, so your selector will not work. Or access the contents directly:
alert($(".class0 h4").text());
alert($(".class1").text());
alert($("#mydivid").text());
EDIT
var datahtml = "<html><body><div class=\"class0\"><h4>data1</h4><p class=\"class1\">data2</p><div id=\"mydivid\"><p>data3</p></div></div></body></html>";
$('body').html(datahtml);
alert($(".class0 h4").text());
alert($(".class1").text());
alert($("#mydivid").text());
Answer by Sem
I don't know of any other way than placing the HTML in a temporary invisible container.
$(document).ready(function(){
  // Keep the markup as a plain string.
  var datahtml = "<html><body><div class=\"class0\"><h4>data1</h4><p class=\"class1\">data2</p><div id=\"mydivid\"><p>data3</p></div></div></body></html>";
  // Parse the string into a hidden container element.
  var tempContainer = $('<div style="display:none;"></div>').append(datahtml);
  $('body').append(tempContainer);
  alert(tempContainer.find('.class1').text());
  // Remove the temporary container once the data has been read.
  tempContainer.remove();
});
Here is a jsfiddle demo.
Answer by danwellman
It doesn't work because the <div> with the class class0 doesn't have any text nodes as direct children. Add the class to the <h4> and it will work.
Answer by John Skoumbourdis
I think the main problem is that you cannot pass an html tag to jQuery like this. What happens in your case is that jQuery looks for the first tag it can use, which is the div with class0.
Test this to see that I am right:
if($(datahtml).hasClass('class0'))
alert('Yes you are right :-)');
So this means that you cannot use the html and/or body tag as the element to query within.
If you want to make it work, try wrapping the markup like this:
<div>
<div class="class0">
<h4>data1</h4>
<p class="class1">data2</p>
<div id="mydivid"><p>data3</p></div>
</div>
</div>
So try this:
var datahtml = "<div><div class=\"class0\"><h4>data1</h4><p class=\"class1\">data2</p><div id=\"mydivid\"><p>data3</p></div></div></div>";
alert($(datahtml).find(".class0").text()); // work
alert($(datahtml).find(".class1").text()); // work
alert($(datahtml).find("#mydivid").text()); // work