How to compare XML files

Notice: this page is based on a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/12176239/

Date: 2020-09-06 13:39:44  Source: igfitidea

xmldiff

Asked by Avner Levy

I have two XML files (XSD) which are generated by some tool.
The tool doesn't preserve the order of elements, so although the content is equal, comparing the files as text reports them as different.
Is there a tool that can sort the elements before comparing, so that the documents can be compared as text? Of course, the sorting needs to be done recursively.

Data example:
File A:

<xml>
  <A/>
  <B/>
</xml>

File B:

<xml>
  <B/>
  <A/>
</xml>

Answered by James Oravec

I had a similar problem and I eventually found: http://superuser.com/questions/79920/how-can-i-diff-two-xml-files

That post suggests doing a canonical XML sort then doing a diff. The following should work for you if you are on Linux, Mac, or if you have Windows with something like Cygwin installed:

$ xmllint --c14n FileA.xml > 1.xml
$ xmllint --c14n FileB.xml > 2.xml
$ diff 1.xml 2.xml
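
Note that canonicalization (C14N) normalizes details such as attribute order and empty-element syntax but keeps child elements in document order, so for the question's example a recursive sort of the elements is still needed before the diff. Below is a minimal sketch of that step in Python, assuming the standard library's xml.etree.ElementTree (Python 3.8+ for `canonicalize`) and data-style XML without mixed content. The helper names `sort_key`, `sort_children`, and `normalized` are made up for illustration and are not part of xmllint or of the linked post.

```python
import xml.etree.ElementTree as ET


def sort_key(elem):
    """Comparable key: tag, sorted attributes, stripped text, then child keys."""
    return (
        elem.tag,
        sorted(elem.attrib.items()),
        (elem.text or "").strip(),
        sorted(sort_key(child) for child in elem),
    )


def sort_children(elem):
    """Recursively reorder child elements so sibling order no longer matters."""
    for child in elem:
        sort_children(child)
    elem[:] = sorted(elem, key=sort_key)


def normalized(xml_text):
    """Sort the tree, then canonicalize it so the result can be compared as text."""
    root = ET.fromstring(xml_text)
    sort_children(root)
    # strip_text=True drops the indentation whitespace that moved with the elements.
    return ET.canonicalize(ET.tostring(root, encoding="unicode"), strip_text=True)


# The two documents from the question.
file_a = "<xml>\n  <A/>\n  <B/>\n</xml>"
file_b = "<xml>\n  <B/>\n  <A/>\n</xml>"

print(normalized(file_a) == normalized(file_b))
```

The normalized strings could also be written to 1.xml and 2.xml and compared with diff as in the commands above.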

Answered by mkdev

Have a look at "Using XSLT to Assist Regression Testing", which describes a solution using XSLT.

Answered by Maximilian

You can use the Perl module XML::DifferenceMarkup (http://metacpan.org/pod/XML::DifferenceMarkup) or the xmldiff extension for PHP (pecl.php.net/xmldiff). Both will produce a human-readable XML diff document.

Answered by austincheney

The XML samples are fundamentally different. Even though the content and the hierarchy may be identical, the relationships between peers are different. When XML is parsed, it is parsed into a structure called a DOM, where relationships between units are very important. If you want to discount the relationships between peer entities, you will likely need custom software. I recommend finding a simple open-source XML-aware diff tool and adding the requirements that you need. I wrote one at http://prettydiff.com/ but I suggest you look around to see what is available before making a decision, because editing somebody else's algorithms may require a bit of heavy lifting.

Answered by Jens Krogsboell

For what it's worth, I have created a Java tool (Kotlin, actually) for efficient and configurable canonicalization of XML files.

It will always:

  • Sort nodes and attributes by name.
  • Remove namespaces (yes - it could - hypothetically - be a problem).
  • Pretty-print the result.

In addition you can tell it to:

  • Remove a given list of node names - maybe you do not want to know that the value of a piece of metadata, say <RequestReceivedTimestamp>, has changed.
  • Sort a given list of collections in the context of the parent - maybe you do not care that the order of <Contact> entries in <ListOfFavourites> has changed.

It uses XSLT and does all the above efficiently using chaining.
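
To make the steps listed above concrete, here is a rough approximation in Python rather than XSLT. It is only a sketch of the idea, not the tool's actual implementation: the `local_name` and `normalize` helpers, the `ignore` parameter, and the sample document are all assumptions made for illustration, and the per-collection sorting option is not shown. It relies on xml.etree.ElementTree from the standard library (Python 3.9+ for `indent`).

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Drop a '{namespace}' prefix from an ElementTree tag or attribute name."""
    return tag.rsplit("}", 1)[-1]


def normalize(elem, ignore=frozenset()):
    """Strip namespaces, drop ignored nodes, sort attributes and children by name."""
    elem.tag = local_name(elem.tag)
    # Re-insert attributes in name order; serialization keeps insertion order.
    attrs = sorted((local_name(k), v) for k, v in elem.attrib.items())
    elem.attrib.clear()
    elem.attrib.update(attrs)
    # Filter out nodes whose (namespace-free) name is on the ignore list.
    kept = [c for c in elem if local_name(c.tag) not in ignore]
    for child in kept:
        normalize(child, ignore)
    # Stable sort: siblings with the same name keep their document order.
    elem[:] = sorted(kept, key=lambda c: c.tag)
    return elem


# Hypothetical sample document, for illustration only.
doc = ET.fromstring(
    '<r:Request xmlns:r="urn:example">'
    "<r:RequestReceivedTimestamp>2020-09-06</r:RequestReceivedTimestamp>"
    '<r:Contact name="Bob" id="2"/>'
    "<r:Address/>"
    "</r:Request>"
)
normalize(doc, ignore={"RequestReceivedTimestamp"})
ET.indent(doc)  # pretty-print in place
print(ET.tostring(doc, encoding="unicode"))
```

As the answer notes, removing namespaces can hypothetically be a problem: two elements with the same local name from different namespaces become indistinguishable after this step.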

Limitations

It does support sorting nested lists - sorting innermost lists before outer. But it cannot reliably sort arbitrary levels of recursively nested lists.

If you have such needs you can - after having used this tool - compare the sorted byte arrays of the results. They will be equal if only list sorting issues remain.
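
The sorted-byte-array check can be sketched in a few lines of Python. The function name `same_bytes_any_order` and the sample inputs are assumptions for illustration. Keep in mind that this is a coarse test: equal sorted bytes only show that both outputs contain the same bytes, so it is a necessary condition for "only ordering differs" rather than proof of it.

```python
def same_bytes_any_order(a: bytes, b: bytes) -> bool:
    """True when both inputs contain exactly the same bytes, in any order."""
    return sorted(a) == sorted(b)


# Stand-ins for two normalized outputs that differ only in element order.
out_a = b"<xml>\n  <A/>\n  <B/>\n</xml>\n"
out_b = b"<xml>\n  <B/>\n  <A/>\n</xml>\n"

print(same_bytes_any_order(out_a, out_b))
```

In practice the two byte strings would be read from the normalized output files, for example with `pathlib.Path(name).read_bytes()`.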

Where to get it

You can get it here: XMLNormalize
