C#: How would you compare two XML documents?
Original question: http://stackoverflow.com/questions/167946/
Notice: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
How would you compare two XML Documents?
Asked by Neil C. Obremski
As part of the base class for some extensive unit testing, I am writing a helper function which recursively compares the nodes of one XmlDocument object to another in C# (.NET). Some requirements of this:
- The first document is the source, i.e. what I want the XML document to look like. Thus the second is the one I want to find differences in, and it must not contain extra nodes that are not in the first document.
- Must throw an exception when too many significant differences are found, and it should be easily understood by a human glancing at the description.
- Child element order is important, attributes can be in any order.
- Some attributes are ignorable; specifically xsi:schemaLocation and xmlns:xsi, though I would like to be able to pass in which ones are.
- Prefixes for namespaces must match in both attributes and elements.
- Whitespace between elements is irrelevant.
- Elements will either have child elements or InnerText, but not both.
While I'm scraping something together: has anyone written such code, and would it be possible to share it here?
As an aside, what would you call the first and second documents? I've been referring to them as "source" and "target", but it feels wrong since the source is what I want the target to look like, else I throw an exception.
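The requirements above could be sketched roughly as follows. This is an illustration, not the asker's code: the names XmlExpectation, AssertMatches and maxDifferences are hypothetical, and "too many differences" is assumed to mean "more than a caller-supplied count". Attribute order is ignored, child element order is significant, qualified names (prefix included) must match, and whitespace between elements is skipped.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml;

public static class XmlExpectation
{
    // Throws when more than maxDifferences differences are found (assumed threshold).
    public static void AssertMatches(XmlDocument expected, XmlDocument actual,
        ICollection<string> ignoredAttributes, int maxDifferences = 0)
    {
        var differences = new List<string>();
        CompareElements(expected.DocumentElement, actual.DocumentElement, "",
            ignoredAttributes, differences);
        if (differences.Count > maxDifferences)
            throw new InvalidOperationException(
                differences.Count + " difference(s):" + Environment.NewLine +
                string.Join(Environment.NewLine, differences));
    }

    private static void CompareElements(XmlElement expected, XmlElement actual,
        string parentPath, ICollection<string> ignored, List<string> differences)
    {
        string path = parentPath + "/" + expected.Name;

        // Name is the qualified name, so namespace prefixes must match too.
        if (expected.Name != actual.Name || expected.NamespaceURI != actual.NamespaceURI)
        {
            differences.Add(path + ": expected <" + expected.Name + "> but found <" + actual.Name + ">");
            return;
        }

        // Attributes are compared by name, so their order does not matter.
        var expectedAttrs = AttributeMap(expected, ignored);
        var actualAttrs = AttributeMap(actual, ignored);
        foreach (var pair in expectedAttrs)
        {
            if (!actualAttrs.TryGetValue(pair.Key, out string value))
                differences.Add(path + ": missing attribute " + pair.Key);
            else if (value != pair.Value)
                differences.Add(path + ": attribute " + pair.Key + " expected \"" +
                    pair.Value + "\" but found \"" + value + "\"");
        }
        foreach (string name in actualAttrs.Keys.Where(k => !expectedAttrs.ContainsKey(k)))
            differences.Add(path + ": unexpected attribute " + name);

        // Only element children are considered, so whitespace nodes are skipped.
        var expectedChildren = expected.ChildNodes.OfType<XmlElement>().ToList();
        var actualChildren = actual.ChildNodes.OfType<XmlElement>().ToList();
        if (expectedChildren.Count == 0 && actualChildren.Count == 0)
        {
            if (expected.InnerText != actual.InnerText)
                differences.Add(path + ": expected text \"" + expected.InnerText +
                    "\" but found \"" + actual.InnerText + "\"");
            return;
        }
        if (expectedChildren.Count != actualChildren.Count)
        {
            differences.Add(path + ": expected " + expectedChildren.Count +
                " child element(s) but found " + actualChildren.Count);
            return;
        }
        // Child order is significant: compare position by position.
        for (int i = 0; i < expectedChildren.Count; i++)
            CompareElements(expectedChildren[i], actualChildren[i], path, ignored, differences);
    }

    private static Dictionary<string, string> AttributeMap(XmlElement element,
        ICollection<string> ignored)
    {
        return element.Attributes.Cast<XmlAttribute>()
            .Where(a => !ignored.Contains(a.Name))
            .ToDictionary(a => a.Name, a => a.Value);
    }
}
```

A call might look like `XmlExpectation.AssertMatches(expectedDoc, actualDoc, new[] { "xsi:schemaLocation", "xmlns:xsi" });`. Each message starts with a slash-separated element path so a reader can locate the difference quickly.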
Accepted answer by Danimal
Microsoft has an XML diff API that you can use.
Unofficial NuGet: https://www.nuget.org/packages/XMLDiffPatch.
Answer by runrig
Comparing XML documents is complicated. Google for xmldiff (there's even a Microsoft solution) for some tools. I've solved this a couple of ways. I used XSLT to sort elements and attributes (because sometimes they would appear in a different order, and I didn't care about that) and to filter out attributes I didn't want to compare. Then I either used the XML::Diff or XML::SemanticDiff Perl module, or pretty-printed each document with every element and attribute on a separate line and ran the Unix command-line diff on the results.
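The normalize-then-diff idea can be sketched in C# as well. Note the swap: this uses LINQ to XML instead of the XSLT and Perl tooling described above, it sorts attributes only (element order is kept), and the names XmlLineDiff, Normalize, PrettyPrint and FirstDifferentLine are hypothetical. It is a rough sketch under those assumptions, not the answerer's implementation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Xml;
using System.Xml.Linq;

public static class XmlLineDiff
{
    // Copies the element, dropping ignored attributes and sorting the rest by name.
    public static XElement Normalize(XElement element, ISet<XName> ignoredAttributes)
    {
        var attributes = element.Attributes()
            .Where(a => !ignoredAttributes.Contains(a.Name))
            .OrderBy(a => a.Name.ToString(), StringComparer.Ordinal);
        var children = element.Nodes().Select(n =>
            n is XElement child ? Normalize(child, ignoredAttributes) : n);
        return new XElement(element.Name, attributes, children);
    }

    // Writes the element with every attribute on its own line.
    public static string PrettyPrint(XElement element)
    {
        var settings = new XmlWriterSettings
        {
            Indent = true,
            NewLineOnAttributes = true,
            OmitXmlDeclaration = true
        };
        var builder = new StringBuilder();
        using (var writer = XmlWriter.Create(builder, settings))
            element.Save(writer);
        return builder.ToString();
    }

    // Returns the 1-based number of the first differing line, or 0 when the texts match.
    public static int FirstDifferentLine(string left, string right)
    {
        string[] a = left.Split('\n'), b = right.Split('\n');
        for (int i = 0; i < Math.Max(a.Length, b.Length); i++)
            if (i >= a.Length || i >= b.Length || a[i] != b[i])
                return i + 1;
        return 0;
    }
}
```

Both documents would be parsed with XElement.Parse, passed through Normalize and PrettyPrint, and then compared line by line; any text diff tool can be used in place of FirstDifferentLine.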
Answer by Alex Gulin
Answer by Do Will
Another way to do this would be:
- Get the contents of both files into two different strings.
- Transform the strings using an XSLT (which will just copy everything over to two new strings). This will ensure that all spaces outside the elements are removed. This will result in two new strings.
- Now, just compare the two strings with each other.
This won't give you the exact location of the difference, but if you just want to know if there is a difference, this is easy to do without any third party libraries.
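The steps above might look like this with the built-in XslCompiledTransform. This is a minimal sketch: the stylesheet is a plain identity transform with xsl:strip-space, and the names XmlStringCompare, Normalize and AreEqual are hypothetical. Attribute order is preserved by the copy, so documents that differ only in attribute order would still be reported as different.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Xsl;

public static class XmlStringCompare
{
    // Identity transform that also strips whitespace-only text nodes.
    private const string IdentityXslt = @"<xsl:stylesheet version='1.0'
    xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>
  <xsl:strip-space elements='*'/>
  <xsl:output method='xml' indent='no' omit-xml-declaration='yes'/>
  <xsl:template match='@*|node()'>
    <xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>
  </xsl:template>
</xsl:stylesheet>";

    // Runs the XML text through the identity transform and returns the result.
    public static string Normalize(string xml)
    {
        var transform = new XslCompiledTransform();
        using (var styleReader = XmlReader.Create(new StringReader(IdentityXslt)))
            transform.Load(styleReader);

        using (var input = XmlReader.Create(new StringReader(xml)))
        using (var output = new StringWriter())
        {
            transform.Transform(input, null, output);
            return output.ToString();
        }
    }

    // True when both documents serialize to the same normalized string.
    public static bool AreEqual(string left, string right)
    {
        return Normalize(left) == Normalize(right);
    }
}
```

The file contents can be read with File.ReadAllText and passed to AreEqual; the result only says whether a difference exists, not where it is.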
Answer by Santhosh Kumar Tekuri
Answer by Eli Algranti
Not relevant for the OP since it currently ignores child order, but if you want a code-only solution you can try XmlSpecificationCompare, which I somewhat misguidedly developed.
Answer by Andrej Adamenko
I googled a more complete list of solutions to this problem today, and I am going to try one of them soon:
- http://xmlunit.sourceforge.net/
- http://msdn.microsoft.com/en-us/library/aa302294.aspx
- http://jolt.codeplex.com/wikipage?title=Jolt.Testing.Assertions.XML.Adaptors
- http://www.codethinked.com/checking-xml-for-semantic-equivalence-in-c
- https://vkreynin.wordpress.com/tag/xml/
- http://gandrusz.blogspot.com/2008/07/recently-i-have-run-into-usual-problem.html
- http://xmlspecificationcompare.codeplex.com/
- https://github.com/netbike/netbike.xmlunit
Answer by Two Cents
This code doesn't satisfy all your requirements, but it's simple and I'm using it for my unit tests. Attribute order doesn't matter, but element order does. Element inner text is not compared. I also ignored case when comparing attribute values, but you can easily remove that.
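// Requires: using System; using System.Linq; using System.Xml.Linq;
// Returns true when both elements have the same name, the same attributes
// (any order, values compared case-insensitively) and matching child
// elements in the same order. Element inner text is not compared.
public bool XMLCompare(XElement primary, XElement secondary)
{
    // Element names must match before anything else is compared.
    if (primary.Name != secondary.Name)
        return false;

    // Compare the counts unconditionally so extra attributes on secondary are caught.
    if (primary.Attributes().Count() != secondary.Attributes().Count())
        return false;
    foreach (XAttribute attr in primary.Attributes())
    {
        // Look up by the full XName so namespaced attributes are found as well.
        XAttribute other = secondary.Attribute(attr.Name);
        if (other == null)
            return false;
        if (!string.Equals(attr.Value, other.Value, StringComparison.OrdinalIgnoreCase))
            return false;
    }

    // Materialize the children once instead of re-enumerating with Skip/Take.
    var primaryChildren = primary.Elements().ToList();
    var secondaryChildren = secondary.Elements().ToList();
    if (primaryChildren.Count != secondaryChildren.Count)
        return false;
    for (var i = 0; i < primaryChildren.Count; i++)
    {
        if (!XMLCompare(primaryChildren[i], secondaryChildren[i]))
            return false;
    }
    return true;
}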
Answer by cwills
https://github.com/CameronWills/FatAntelope is another alternative library to the Microsoft XML Diff API. It has an XML diffing algorithm that does an unordered comparison of two XML documents and produces an optimal matching.
It is a C# port of the X-Diff algorithm described here: http://pages.cs.wisc.edu/~yuanwang/xdiff.html
Disclaimer: I wrote it :)
Answer by Chetan Mehra
Based on @Two Cents' answer and using this link, XMLSorting, I have created my own XmlComparer.
Compare XML program
// Requires: using System; using System.Xml;
private static bool compareXML(XmlNode node, XmlNode comparenode)
{
    // Names (including prefix) and values must match.
    if (node.Name != comparenode.Name || node.Value != comparenode.Value)
        return false;

    // Attributes is null for non-element nodes such as text nodes.
    if (node.Attributes != null && node.Attributes.Count > 0)
    {
        foreach (XmlAttribute parentnodeattribute in node.Attributes)
        {
            // An attribute missing from comparenode counts as a difference.
            XmlAttribute compareattribute = comparenode.Attributes?[parentnodeattribute.Name];
            if (compareattribute == null || parentnodeattribute.Value != compareattribute.Value)
            {
                return false;
            }
        }
    }
    if (node.HasChildNodes)
    {
        // Only comparenode is sorted here; sort node beforehand if its child order may differ.
        sortXML(comparenode);
        if (node.ChildNodes.Count != comparenode.ChildNodes.Count)
            return false;
        for (int i = 0; i < node.ChildNodes.Count; i++)
        {
            if (!compareXML(node.ChildNodes[i], comparenode.ChildNodes[i]))
                return false;
        }
    }
    return true;
}
Sort XML program
// Recursively sorts the attributes and child nodes of a node by name.
private static void sortXML(XmlNode documentElement)
{
    SortAttributes(documentElement.Attributes);
    SortElements(documentElement);
    foreach (XmlNode childNode in documentElement.ChildNodes)
    {
        sortXML(childNode);
    }
}

// Bubble sort of the child nodes by name.
private static void SortElements(XmlNode rootNode)
{
    for (int j = 0; j < rootNode.ChildNodes.Count; j++)
    {
        for (int i = 1; i < rootNode.ChildNodes.Count; i++)
        {
            // Compare each node with its predecessor and move it up when out of order.
            if (String.Compare(rootNode.ChildNodes[i].Name, rootNode.ChildNodes[i - 1].Name) < 0)
            {
                // InsertBefore moves a node that is already in the tree.
                rootNode.InsertBefore(rootNode.ChildNodes[i], rootNode.ChildNodes[i - 1]);
            }
        }
    }
}

// Bubble sort of the attributes by name; Attributes is null for non-element nodes.
private static void SortAttributes(XmlAttributeCollection attribCol)
{
    if (attribCol == null)
        return;
    bool changed = true;
    while (changed)
    {
        changed = false;
        for (int i = 1; i < attribCol.Count; i++)
        {
            if (String.Compare(attribCol[i].Name, attribCol[i - 1].Name) < 0)
            {
                // Move the attribute in front of its predecessor.
                attribCol.InsertBefore(attribCol[i], attribCol[i - 1]);
                changed = true;
            }
        }
    }
}