How to get case-insensitive elements in XML
Original question: http://stackoverflow.com/questions/868850/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on Stack Overflow.
Asked by Luixv
As far as I know, XML element type names and attribute names are case sensitive.
Is there a way or any trick to get case-insensitive elements?
Clarification: A grammar has been defined via an XSD, which some clients use to upload data. The users (the content generators) create XML files with different tools, but many of them use plain text editors or whatever is at hand. Sometimes when these people try to upload their files they get incompatibility errors. A common mistake is mixing lowercase and uppercase tags, although it was always clear that tags ARE case sensitive.
I have access to the XSD file which defines this grammar and I can change it. The question is how to avoid this error-prone lower/upper case tag problem.
Any ideas?
Thanks in advance!
Answered by melkisadek
If I understand your problem correctly, the case errors can only be corrected between creation and upload, by a third-party parsing tool.
i.e. XML file > parsed against XSD and corrected > upload approved
You could do this at run-time by developing a container application in which your clients create their XML files. Alternatively, you could write an application on the server side that takes the uploaded file and checks the syntax. Either way you're going to have to make a decision and then do some work!
A lot depends on the scale of the problem. If your XSD contains similar tags that differ only in case, and you are receiving yet another casing of them, then you will need a complicated solution based on node counting etc.
If you are simply stuck with clients using random case against an XSD that contains only lower-case tags, then you should be able to parse the files and convert all tags to lower case in one go. This assumes the content between the tags is mixed-case, so you can't just convert the full document.
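A minimal sketch of that "convert all tags in one go" idea, in Python with the standard library. The function name `lowercase_names` and the sample document are made up for illustration, and the sketch assumes the input is already well-formed and uses no namespaces (a namespaced tag would have its URI lower-cased too).

```python
import xml.etree.ElementTree as ET


def lowercase_names(xml_text: str) -> str:
    """Lower-case every element and attribute name; leave text content alone."""
    root = ET.fromstring(xml_text)  # raises ParseError if not well-formed
    for elem in root.iter():
        elem.tag = elem.tag.lower()
        # Rebuild the attribute dict with lower-cased keys, values unchanged.
        attrs = {name.lower(): value for name, value in elem.attrib.items()}
        elem.attrib.clear()
        elem.attrib.update(attrs)
    return ET.tostring(root, encoding="unicode")


# Hypothetical upload with mixed-case names but consistent start/end tags.
sample = '<Order ID="A1"><ITEM>Blue Widget</ITEM></Order>'
print(lowercase_names(sample))
# <order id="A1"><item>Blue Widget</item></order>
```

The text "Blue Widget" and the attribute value "A1" keep their case; only the names change. The result can then be validated against the all-lower-case XSD.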
How you do this depends on the mechanics of your situation. Obviously it will be easier to get the clients to error-check their own submissions. If this isn't practical, then you'll need to identify a point in the process where you can convert the file to the correct format before errors are encountered.
There are far too many ways to go about this to discuss here. It mainly depends on the skill sets or budget available to you.
Answered by Cerebrus
As @Melkisadek said, the XSD validation exists for a purpose. If you allow users to upload files with invalid XML, your application is bound to fail at some point when the data within those files is accessed. Furthermore, the whole purpose of having an XSD validate the input XML is defeated. If you are willing to forgo the whole schema validation feature, then you would need to use an XSLT to convert all tags to uppercase or lowercase as you desire (see @Rashmi's answer).
It would be analogous to allowing a user to input special characters in a Social Security Number entry field, just because the user is more comfortable entering special characters. (Yes, this example is silly, I couldn't think of a better one!)
Therefore, in my mind, the solution lies in keeping the schema validation as-is, but providing users a way to validate their files against the schema before uploading. For instance, if this is a web app, you could provide a button on the page which uses JavaScript to validate the file against your schema. Alternatively, validate on the server only when the file is uploaded. In both cases, provide appropriate feedback such as the line number on which the errant entities lie, the character position, and the reason for flagging an error.
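To illustrate the feedback part only, here is a rough server-side sketch in Python. It checks well-formedness with the standard library and reports line, column and reason; it does not validate against the XSD, which would need a schema-aware library not shown here. The function name `describe_parse_error` and the sample upload are assumptions for the example.

```python
import xml.etree.ElementTree as ET
from typing import Optional
from xml.parsers import expat


def describe_parse_error(xml_text: str) -> Optional[str]:
    """Return a human-readable error message, or None if the text parses."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        line, column = err.position           # position reported by the parser
        reason = expat.ErrorString(err.code)  # e.g. "mismatched tag"
        return f"line {line}, column {column}: {reason}"
    return None


# Hypothetical upload where the end tag's case differs from the start tag's.
upload = "<item>\n  <Name>x</name>\n</item>"
print(describe_parse_error(upload))
```

A message like this can be shown next to the upload form so the user knows which line to fix.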
Answered by Hoylen
In theory, you could try to hack the XML Schema to validate incorrectly capitalised element names.
This can be done by using the substitution group mechanism in XML Schema. For example, if your schema had defined:
<xsd:element name="foobar" type="xsd:string"/>
then you could add the following to the XML Schema:
<xsd:element name="Foobar" type="xsd:string" substitutionGroup="foobar"/>
<xsd:element name="FooBar" type="xsd:string" substitutionGroup="foobar"/>
<xsd:element name="fooBar" type="xsd:string" substitutionGroup="foobar"/>
<xsd:element name="FOOBAR" type="xsd:string" substitutionGroup="foobar"/>
etc.
to try and anticipate the possible mistakes they could make. For each element, there could be 2^n possible combinations of case, where n is the length of the name (assuming each character of the name is a letter).
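To get a feel for how quickly this grows, here is a small Python sketch that enumerates the case variants of a name and prints the matching substitution-group declarations. The helper names `case_variants` and `substitution_declarations` are invented for the example, and it assumes ASCII element names.

```python
from itertools import product
from typing import List


def case_variants(name: str) -> List[str]:
    """All spellings of name that differ only in letter case."""
    # Letters contribute two choices (upper, lower); other characters one.
    choices = [sorted({ch.lower(), ch.upper()}) for ch in name]
    return ["".join(combo) for combo in product(*choices)]


def substitution_declarations(name: str, xsd_type: str = "xsd:string") -> List[str]:
    """One xsd:element per variant, excluding the original name itself."""
    return [
        f'<xsd:element name="{variant}" type="{xsd_type}" substitutionGroup="{name}"/>'
        for variant in case_variants(name)
        if variant != name
    ]


variants = case_variants("foobar")
print(len(variants))  # 64, i.e. 2^6
for line in substitution_declarations("foobar")[:3]:
    print(line)
```

For the six-letter name "foobar" that is already 63 extra declarations for a single element.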
In practice, this is too much trouble, only delays the problem rather than solving it, and probably won't work. If the users don't realise that XML is case sensitive, then their end tags might not match the case of the start tags, and the document will still fail to validate.
As other people have said, either pre-process the submitted input to fix the case, or get the users to produce correct input before they submit it.
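If start and end tags can disagree in case, the file is not even well-formed, so an XML parser cannot be used for the pre-processing step. One possible workaround, sketched here in Python, is to lower-case element names in the raw text with a regular expression before parsing. This is only a sketch under assumptions: the pattern and the function name `lowercase_tag_names` are mine, attribute names are left alone, namespace prefixes are not handled, and anything that looks like a tag inside a comment or CDATA section would be altered too.

```python
import re
import xml.etree.ElementTree as ET

# "<" or "</" followed by an element name; skips "<?xml", "<!--" and "<![CDATA[".
TAG_NAME = re.compile(r"(</?)([A-Za-z_][\w.\-]*)")


def lowercase_tag_names(raw_xml: str) -> str:
    """Lower-case element names in start and end tags of raw XML text."""
    return TAG_NAME.sub(lambda m: m.group(1) + m.group(2).lower(), raw_xml)


# Hypothetical submission whose end tag does not match its start tag's case.
broken = "<FooBar>Mixed Case Text</foobar>"
fixed = lowercase_tag_names(broken)
ET.fromstring(fixed)  # now parses; the original would raise ParseError
print(fixed)
# <foobar>Mixed Case Text</foobar>
```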
Answered by SO User
XPath/XSLT processors are case sensitive. They can't select a node or attribute if you specify the wrong case.
If you want to output the node name in upper case, you can do:
upper-case(local-name())
Answered by Volchik
The simplest solution is to convert all tags and attributes to lowercase when you load the XML from the user, and only then check it against an XSD designed for all-lowercase tags and attributes.
Answered by JBRWilkinson
After uploading, walk the XML file (via DOM or SAX) and fix the casing before you validate?
Answered by Zack Marrapese
XML is normally machine generated. Therefore, you should have no real issue here with <RANdOm /> case.
If the real issue is that two different systems are generating two different forms of the tag (<Widget /> vs. <widget />), I guess you could simply define both cases in your XSD.

