C#: Check for well-formed XML without try/catch?

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/1026247/

Time: 2020-08-06 05:50:47  Source: igfitidea

Check well-formed XML without a try/catch?

Tags: c#, xml, well-formed

Asked by Steve Cooper

Does anyone know how I can check if a string contains well-formed XML without using something like XmlDocument.LoadXml() in a try/catch block? I've got input that may or may not be XML, and I want code that recognises that input may not be XML without relying on a try/catch, both for speed and on the general principle that non-exceptional circumstances shouldn't raise exceptions. I currently have code that does this:

private bool IsValidXML(string value)
{
    try
    {
        // Check we actually have a value
        if (string.IsNullOrEmpty(value) == false)
        {
            // Try to load the value into a document
            XmlDocument xmlDoc = new XmlDocument();

            xmlDoc.LoadXml(value);

            // If we managed with no exception then this is valid XML!
            return true;
        }
        else
        {
            // A blank value is not valid xml
            return false;
        }
    }
    catch (System.Xml.XmlException)
    {
        return false;
    }
}

But it seems like something that shouldn't require the try/catch. The exception is causing merry hell during debugging because every time I check a string the debugger will break here, 'helping' me with my pesky problem.

Accepted answer by Jon Skeet

I don't know a way of validating without the exception, but you can change the debugger settings to only break for XmlException if it's unhandled - that should solve your immediate issues, even if the code is still inelegant.

To do this, go to Debug / Exceptions... / Common Language Runtime Exceptions and find System.Xml.XmlException, then make sure only "User-unhandled" is ticked (not Thrown).

Answer by Matthew Flaschen

That's a reasonable way to do it, except that the IsNullOrEmpty is redundant (LoadXml can figure that out fine). If you do keep IsNullOrEmpty, do if(!string.IsNullOrEmpty(value)).

Basically, though, your debugger is the problem, not the code.

Answer by Steve Cooper

Add the [System.Diagnostics.DebuggerStepThrough] attribute to the IsValidXml method. This stops the debugger from breaking on the XmlException, which means you can turn on breaking on first-chance exceptions and this particular method will not be debugged.
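
A minimal sketch of what that looks like, assuming the check lives in a static helper class (the class name XmlCheck is hypothetical). The try/catch is still there; the attribute only changes how the debugger treats the method, and the exact behaviour may depend on debugger settings such as Just My Code.

```csharp
using System.Diagnostics;
using System.Xml;

// XmlCheck is a hypothetical container class used for illustration only.
public static class XmlCheck
{
    // Ask the debugger to step over this method rather than stop inside it.
    [DebuggerStepThrough]
    public static bool IsValidXml(string value)
    {
        // A null or empty string is not well-formed XML.
        if (string.IsNullOrEmpty(value))
        {
            return false;
        }

        try
        {
            // LoadXml throws XmlException when the text is not well-formed.
            new XmlDocument().LoadXml(value);
            return true;
        }
        catch (XmlException)
        {
            return false;
        }
    }
}
```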

Answer by Shivanath D

The XmlTextReader class is an implementation of XmlReader, and provides a fast, performant parser. It enforces the rules that XML must be well-formed. It is neither a validating nor a non-validating parser since it does not have DTD or schema information. It can read text in blocks, or read characters from a stream.

And an example from another MSDN article to which I have added code to read the whole contents of the XML stream.

string str = "<ROOT>AQID</ROOT>";
XmlTextReader r = new XmlTextReader(new StringReader(str));
try
{
  while (r.Read())
  {
  }
}
finally
{
  r.Close();
}


source: http://bytes.com/topic/c-sharp/answers/261090-check-wellformedness-xml
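
The snippet above only reads the stream; to turn it into a true/false check, the read loop still has to be wrapped in a try/catch, because the reader reports malformed input by throwing XmlException. A rough sketch, using XmlReader.Create rather than constructing XmlTextReader directly (the class and method names here are hypothetical):

```csharp
using System.IO;
using System.Xml;

// WellFormedCheck and IsWellFormed are hypothetical names for illustration.
public static class WellFormedCheck
{
    public static bool IsWellFormed(string text)
    {
        if (string.IsNullOrEmpty(text))
        {
            return false;
        }

        try
        {
            // Stream through the input without building a DOM.
            using (XmlReader reader = XmlReader.Create(new StringReader(text)))
            {
                while (reader.Read())
                {
                    // Nothing to do: reading to the end is the check.
                }
            }
            return true;
        }
        catch (XmlException)
        {
            // The reader signals malformed XML with an exception.
            return false;
        }
    }
}
```

Compared with XmlDocument.LoadXml, this avoids allocating a document tree, but it does not remove the exception on bad input.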

Answer by nickdu

I disagree that the problem is the debugger. In general, for non-exceptional cases, exceptions should be avoided. This means that if someone is looking for a method like IsWellFormed() which returns true/false based on whether the input is well-formed XML or not, exceptions should not be thrown within this implementation, regardless of whether they are caught and handled or not.

Exceptions are expensive and they should not be encountered during normal successful execution. An example is writing a method which checks for the existence of a file by calling File.Open and catching the exception when the file doesn't exist. This would be a poor implementation. Instead, File.Exists() should be used (and hopefully the implementation of that does not simply put a try/catch around some method which throws an exception if the file doesn't exist; I'm sure it doesn't).
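
To make the contrast concrete, here is a small sketch of the two approaches (the class and method names are hypothetical). Both return the same answer for a missing file, but only the first pays for a throw and a catch to get it.

```csharp
using System.IO;

// FileCheck and its method names are hypothetical, for illustration only.
public static class FileCheck
{
    // Exception-driven: a missing file is reported by throwing.
    public static bool ExistsViaException(string path)
    {
        try
        {
            using (File.Open(path, FileMode.Open, FileAccess.Read))
            {
                return true;
            }
        }
        catch (FileNotFoundException)
        {
            return false;
        }
        catch (DirectoryNotFoundException)
        {
            return false;
        }
    }

    // Query-driven: a missing file is an ordinary false result.
    public static bool ExistsViaQuery(string path)
    {
        return File.Exists(path);
    }
}
```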

Answer by Greg Finzer

Steve,

We had a third party that sometimes accidentally sent us JSON instead of XML. Here is what I implemented:

public static bool IsValidXml(string xmlString)
{
    // Verbatim string: \w must reach the regex engine (it is not a valid C# escape)
    Regex tagsWithData = new Regex(@"<\w+>[^<]+</\w+>");

    // Light checking: reject empty input and input with no <tag>data</tag> pair
    if (string.IsNullOrEmpty(xmlString) || !tagsWithData.IsMatch(xmlString))
    {
        return false;
    }

    try
    {
        XmlDocument xmlDocument = new XmlDocument();
        xmlDocument.LoadXml(xmlString);
        return true;
    }
    catch (Exception)
    {
        return false;
    }
}

[TestMethod()]
public void TestValidXml()
{
    string xml = "<result>true</result>";
    Assert.IsTrue(Utility.IsValidXml(xml));
}

[TestMethod()]
public void TestIsNotValidXml()
{
    string json = "{ \"result\": \"true\" }";
    Assert.IsFalse(Utility.IsValidXml(json));
}

Answer by hello_earth

Just my 2 cents: there are various questions about this around, and most people agree on the "garbage in, garbage out" principle. I don't disagree with that, but personally I found the following quick and dirty solution useful, especially when you deal with XML data from third parties who are not easy to communicate with. It doesn't avoid try/catch, but it uses it with finer granularity, so it helps in cases where the number of invalid XML characters is not that large. I used XmlTextReader and its ReadChars() method for each parent element; ReadChars() is one of the methods that do not perform well-formedness checks, unlike ReadInnerXml/ReadOuterXml. So it's a combination of Read() and ReadChars(), used when Read() stumbles upon a parent node. Of course, this works only because I can assume that the basic structure of the XML is okay and that only the contents (values) of certain nodes may contain special characters that haven't been replaced with their &...; equivalents. (I found an article about this somewhere, but can't find the source link at the moment.)

Answer by toddmo

My two cents. This was pretty simple and follows some common conventions since it's about parsing...

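public bool TryParse(string s, out XmlDocument result)
{
    // "out" follows the usual TryParse convention: the caller need not initialise result
    result = new XmlDocument();
    try {
        result.LoadXml(s);
        return true;
    } catch (XmlException) {
        // Do not hand back a half-loaded document on failure
        result = null;
        return false;
    }
}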

Answer by golfalot

Be careful when relying on XmlDocument: it is possible to load an element along the lines of <0>some text</0> using XmlDocument doc = (XmlDocument)JsonConvert.DeserializeXmlNode(object) without an exception being thrown.

Numeric element names are not valid XML, and in my case an error did not occur until I tried to write the xmlDoc.innerText to a SQL Server column of the xml data type.

This is how I validate now, and an exception gets thrown:
XmlDocument tempDoc = (XmlDocument)JsonConvert.DeserializeXmlNode(formData.ToString(), "data");
doc.LoadXml(tempDoc.InnerXml);
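
If the element names are known before the document is built (JSON property names, for instance), they can also be checked up front without any exception. The sketch below is one possible approach, not part of the original answer: it uses XmlConvert.IsStartNCNameChar and XmlConvert.IsNCNameChar, and the class and method names are hypothetical. It checks unprefixed names only (no colon) and does not handle characters outside the Basic Multilingual Plane.

```csharp
using System.Xml;

// XmlNameCheck and IsValidElementName are hypothetical names for illustration.
public static class XmlNameCheck
{
    // True if name could be used as an unprefixed XML element name.
    public static bool IsValidElementName(string name)
    {
        if (string.IsNullOrEmpty(name))
        {
            return false;
        }

        // The first character has stricter rules: digits, '-' and '.' are not allowed.
        if (!XmlConvert.IsStartNCNameChar(name[0]))
        {
            return false;
        }

        for (int i = 1; i < name.Length; i++)
        {
            if (!XmlConvert.IsNCNameChar(name[i]))
            {
                return false;
            }
        }

        return true;
    }
}
```

With this check, a name such as "0" is rejected before it ever reaches the database.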

Answer by VoteCoffee

I'm using this function for verifying strings/fragments

<Runtime.CompilerServices.Extension()>
Public Function IsValidXMLFragment(ByVal xmlFragment As String, Optional Strict As Boolean = False) As Boolean
    IsValidXMLFragment = True

    Dim NameTable As New Xml.NameTable

    Dim XmlNamespaceManager As New Xml.XmlNamespaceManager(NameTable)
    XmlNamespaceManager.AddNamespace("xsd", "http://www.w3.org/2001/XMLSchema")
    XmlNamespaceManager.AddNamespace("xsi", "http://www.w3.org/2001/XMLSchema-instance")

    Dim XmlParserContext As New Xml.XmlParserContext(Nothing, XmlNamespaceManager, Nothing, Xml.XmlSpace.None)

    Dim XmlReaderSettings As New Xml.XmlReaderSettings
    XmlReaderSettings.ConformanceLevel = Xml.ConformanceLevel.Fragment
    XmlReaderSettings.ValidationType = Xml.ValidationType.Schema
    If Strict Then
        XmlReaderSettings.ValidationFlags = (XmlReaderSettings.ValidationFlags Or XmlSchemaValidationFlags.ProcessInlineSchema)
        XmlReaderSettings.ValidationFlags = (XmlReaderSettings.ValidationFlags Or XmlSchemaValidationFlags.ReportValidationWarnings)
    Else
        XmlReaderSettings.ValidationFlags = XmlSchemaValidationFlags.None
        XmlReaderSettings.ValidationFlags = (XmlReaderSettings.ValidationFlags Or XmlSchemaValidationFlags.AllowXmlAttributes)
    End If

    AddHandler XmlReaderSettings.ValidationEventHandler, Sub() IsValidXMLFragment = False
    AddHandler XmlReaderSettings.ValidationEventHandler, AddressOf XMLValidationCallBack

    Dim XmlReader As Xml.XmlReader = Xml.XmlReader.Create(New IO.StringReader(xmlFragment), XmlReaderSettings, XmlParserContext)
    While XmlReader.Read
        'Read entire XML
    End While
End Function

I'm using this function for verifying files:

Public Function IsValidXMLDocument(ByVal Path As String, Optional Strict As Boolean = False) As Boolean
    IsValidXMLDocument = IO.File.Exists(Path)
    If Not IsValidXMLDocument Then Exit Function

    Dim XmlReaderSettings As New Xml.XmlReaderSettings
    XmlReaderSettings.ConformanceLevel = Xml.ConformanceLevel.Document
    XmlReaderSettings.ValidationType = Xml.ValidationType.Schema
    If Strict Then
        XmlReaderSettings.ValidationFlags = (XmlReaderSettings.ValidationFlags Or XmlSchemaValidationFlags.ProcessInlineSchema)
        XmlReaderSettings.ValidationFlags = (XmlReaderSettings.ValidationFlags Or XmlSchemaValidationFlags.ReportValidationWarnings)
    Else
        XmlReaderSettings.ValidationFlags = XmlSchemaValidationFlags.None
        XmlReaderSettings.ValidationFlags = (XmlReaderSettings.ValidationFlags Or XmlSchemaValidationFlags.AllowXmlAttributes)
    End If
    XmlReaderSettings.CloseInput = True

    AddHandler XmlReaderSettings.ValidationEventHandler, Sub() IsValidXMLDocument = False
    AddHandler XmlReaderSettings.ValidationEventHandler, AddressOf XMLValidationCallBack

    Using FileStream As New IO.FileStream(Path, IO.FileMode.Open)
        Using XmlReader As Xml.XmlReader = Xml.XmlReader.Create(FileStream, XmlReaderSettings)
            While XmlReader.Read
                'Read entire XML
            End While
        End Using
    End Using
End Function