Note: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must apply the same CC BY-SA license, cite the original URL and authors, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/3974734/

Time: 2020-08-23 06:57:20  Source: igfitidea

How to normalize HTML in JavaScript or jQuery?

Tags: javascript, jquery, html, html-parsing

Asked by Julien

Tags can have multiple attributes. The order in which attributes appear in the code does not matter. For example:

<a href="#" title="#">
<a title="#" href="#">

How can I "normalize" the HTML in JavaScript so that the order of the attributes is always the same? I don't care which order is chosen, as long as it is always the same.

UPDATE: My original goal was to make it easier to diff (in JavaScript) two HTML pages with slight differences. Because users could use different software to edit the code, the order of the attributes could change. This makes the diff too verbose.

ANSWER: First, thanks for all the answers. And YES, it is possible. Here is how I managed to do it. This is a proof of concept and can certainly be optimized:

// Comparator: order attributes alphabetically by name.
function sort_attributes(a, b) {
  if (a.name === b.name) {
    return 0;
  }

  return (a.name < b.name) ? -1 : 1;
}

$("#original").find('*').each(function() {
  if (this.attributes.length > 1) {
    var attributes = this.attributes;
    var list = [];
    var i;

    // Copy name/value pairs first: this.attributes is a live collection.
    for (i = 0; i < attributes.length; i++) {
      list.push({ name: attributes[i].name, value: attributes[i].value });
    }

    list.sort(sort_attributes);

    // removeAttribute() takes only the attribute name.
    for (i = 0; i < list.length; i++) {
      this.removeAttribute(list[i].name);
    }

    // Re-add the attributes in sorted order.
    for (i = 0; i < list.length; i++) {
      this.setAttribute(list[i].name, list[i].value);
    }
  }
});

Same thing for the second element of the diff, $('#different'). Now $('#original').html() and $('#different').html() show HTML code with attributes in the same order.

Accepted answer by Julien

This is a proof of concept and can certainly be optimized:

// Comparator: order attributes alphabetically by name.
function sort_attributes(a, b) {
  if (a.name === b.name) {
    return 0;
  }

  return (a.name < b.name) ? -1 : 1;
}

$("#original").find('*').each(function() {
  if (this.attributes.length > 1) {
    var attributes = this.attributes;
    var list = [];
    var i;

    // Copy name/value pairs first: this.attributes is a live collection.
    for (i = 0; i < attributes.length; i++) {
      list.push({ name: attributes[i].name, value: attributes[i].value });
    }

    list.sort(sort_attributes);

    // removeAttribute() takes only the attribute name.
    for (i = 0; i < list.length; i++) {
      this.removeAttribute(list[i].name);
    }

    // Re-add the attributes in sorted order.
    for (i = 0; i < list.length; i++) {
      this.setAttribute(list[i].name, list[i].value);
    }
  }
});

Same thing for the second element of the diff, $('#different'). Now $('#original').html() and $('#different').html() show HTML code with attributes in the same order.
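
Since the same steps have to run on both containers, they could be wrapped in a reusable function. The following is only a sketch without jQuery: normalizeAttributes is a hypothetical helper name (not part of jQuery or the DOM API), and it assumes that the serialized HTML lists attributes in the order in which they were set, which is what the proof of concept above relies on.

```javascript
// Hypothetical helper: re-insert every descendant's attributes in name order.
// Like .find('*'), querySelectorAll('*') does not include the root itself.
function normalizeAttributes(root) {
  var elements = root.querySelectorAll('*');

  for (var i = 0; i < elements.length; i++) {
    var el = elements[i];
    var pairs = [];

    // Copy name/value pairs before touching the live attributes collection.
    for (var j = 0; j < el.attributes.length; j++) {
      pairs.push({ name: el.attributes[j].name, value: el.attributes[j].value });
    }

    // Nothing to reorder with fewer than two attributes.
    if (pairs.length < 2) {
      continue;
    }

    pairs.sort(function (a, b) {
      return a.name < b.name ? -1 : (a.name > b.name ? 1 : 0);
    });

    // Remove everything, then add it back in sorted order.
    pairs.forEach(function (p) { el.removeAttribute(p.name); });
    pairs.forEach(function (p) { el.setAttribute(p.name, p.value); });
  }

  return root;
}
```

In a browser it might be called as normalizeAttributes(document.getElementById('original')) and again for 'different' before reading each container's innerHTML for the diff.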

Answer by Tung Nguyen

JavaScript doesn't actually see a web page in the form of text-based HTML, but rather as a tree structure known as the DOM, or Document Object Model. The order of HTML element attributes in the DOM is not defined (in fact, as Svend comments, they're not even part of the DOM), so the idea of sorting them at the point where JavaScript runs is irrelevant.

I can only guess what you're trying to achieve. If you're trying to do this to improve JavaScript/page performance, most HTML document renderers already presumably put a lot of effort into optimising attribute access, so there's little to be gained there.

If you're trying to order attributes to make gzip compression of pages more effective as they're sent over the wire, understand that JavaScript runs after that point in time. You may want to look at things that run server-side instead, though it's probably more trouble than it's worth.

Answer by Kim Bruning

Take the HTML and parse it into a DOM structure. Then take the DOM structure and write it back out to HTML. While writing, sort the attributes using any stable sort. Your HTML will now be normalized with regard to attributes.

This is a general way to normalize things: parse the non-normalized data, then write it back out in normalized form.
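
The parse-then-write-back idea could be sketched roughly as follows. This is an illustration under assumptions, not a complete HTML serializer: serializeSorted, escapeText and escapeAttr are hypothetical helper names, comments and other non-element nodes are dropped, and the text inside script or style elements is escaped like any other text.

```javascript
// Elements that have no closing tag in HTML.
var VOID_ELEMENTS = ['area', 'base', 'br', 'col', 'embed', 'hr', 'img',
                     'input', 'link', 'meta', 'source', 'track', 'wbr'];

function escapeText(s) {
  return String(s).replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;');
}

function escapeAttr(s) {
  return String(s).replace(/&/g, '&amp;').replace(/"/g, '&quot;');
}

// Hypothetical helper: write a DOM node back out as HTML,
// emitting each element's attributes sorted by name.
function serializeSorted(node) {
  if (node.nodeType === 3) {            // text node
    return escapeText(node.nodeValue);
  }
  if (node.nodeType !== 1) {            // comments etc. are skipped in this sketch
    return '';
  }

  var tag = node.tagName.toLowerCase();
  var attrs = Array.prototype.slice.call(node.attributes).map(function (a) {
    return { name: a.name, value: a.value };
  });
  attrs.sort(function (a, b) {
    return a.name < b.name ? -1 : (a.name > b.name ? 1 : 0);
  });

  var out = '<' + tag;
  attrs.forEach(function (a) {
    out += ' ' + a.name + '="' + escapeAttr(a.value) + '"';
  });
  out += '>';

  if (VOID_ELEMENTS.indexOf(tag) !== -1) {
    return out;                         // no children, no closing tag
  }

  Array.prototype.forEach.call(node.childNodes, function (child) {
    out += serializeSorted(child);
  });
  return out + '</' + tag + '>';
}
```

In a browser, the parsing half could be done with new DOMParser().parseFromString(html, 'text/html'), passing the resulting document's body to serializeSorted. Unlike reordering attributes in place, this variant does not depend on how the browser itself orders attributes when serializing.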

I'm not sure why you'd want to normalize HTML, but there you have it. Data is data. ;-)

Answer by tsurahman

You can try opening the HTML tab in Firebug; the attributes are always shown in the same order.

Answer by Snowhare

Actually, I can think of a few good reasons. One would be comparison for identity matching and for use with 'diff' type tools where it is quite annoying that semantically equivalent lines can be marked as "different".
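
As an illustration of the identity-matching case, two elements could be compared on their attributes regardless of the order in which those attributes were written. This is only a sketch: attributeKey and sameAttributes are hypothetical helper names, and attribute values are compared as exact strings.

```javascript
// Hypothetical helper: an element's attributes as [name, value] pairs sorted by name.
function attributeKey(el) {
  return Array.prototype.slice.call(el.attributes)
    .map(function (a) { return [a.name, a.value]; })
    .sort(function (x, y) {
      return x[0] < y[0] ? -1 : (x[0] > y[0] ? 1 : 0);
    });
}

// True when both elements carry the same attribute names and values,
// whatever order they appear in.
function sameAttributes(a, b) {
  return JSON.stringify(attributeKey(a)) === JSON.stringify(attributeKey(b));
}
```

With this, the two <a> tags from the question would be treated as equal, while a change to any attribute value would still be reported as a difference.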

The real question is "Why in JavaScript?"

This question "smells" of "I have a problem and I think I have an answer...but I have a problem with my answer, too."

If the OP would explain why they want to do this, their chances of getting a good answer would go up dramatically.

Answer by signedbit

The question: "What is the need for this?" The answer: it makes the code more readable and easier to understand.

Why most UI sucks... Many programmers fail to understand the need for simplifying the user's job. In this case, the user's job is reading and understanding the code. One reason to order the attributes is for the human who has to debug and maintain the code. An ordered list, which the programmer becomes familiar with, makes his job easier. He can more quickly find attributes, realize which attributes are missing, and change attribute values.

Answer by Ali

This only matters when someone is reading the source, so for me it's semantic attributes first, less semantic ones next...

There are exceptions, of course. If you have, for example, consecutive <li> elements that all share one attribute while other attributes appear only on some of them, you may want to ensure the shared ones all come first, followed by the individual ones, e.g.:

<li a="x">A</li>
<li a="y" b="t">B</li>
<li a="z">C</li>

(Even if the "b" attribute is more semantically useful than "a")

You get the idea.

Answer by Nasaralla

It is actually possible, I think, if the HTML contents are passed as XML and rendered through XSLT. That way, your original content in XML can be in whatever order you want.
