Original question: http://stackoverflow.com/questions/8485436/
Warning: This content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me):
Stack Overflow
Unicode characters like \u0016 in XML
Asked by user1081449
Is there a way to handle Unicode characters like \u0016 in XML? As I understand it, loading such characters into an XMLDocument throws an invalid hexadecimal character error. I tried other Unicode characters, and they seem to work fine. Only the control characters cause this error. Can we remove these characters without actually parsing the XML?
Answered by scessor
Characters are denoted using the notation used in the Unicode Standard, that is, an optional U+ followed by their hexadecimal number, using at least 4 digits, such as
U+1234 or U+10FFFD. In XML or HTML this could be expressed as &#x1234; or &#x10FFFD;.
From the Unicode Technical Report.
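To make the notation in the quote concrete, here is a minimal Python sketch that prints a character both in U+ notation and as a hexadecimal numeric character reference. The function name `describe_code_point` is made up for this illustration and does not come from the original answer.

```python
from typing import Tuple


def describe_code_point(ch: str) -> Tuple[str, str]:
    """Return (U+ notation, hexadecimal character reference) for one character.

    Hypothetical helper for illustration only.
    """
    cp = ord(ch)
    # U+ notation pads to at least 4 hex digits; the reference form does not need padding.
    return f"U+{cp:04X}", f"&#x{cp:X};"


if __name__ == "__main__":
    for ch in ("\u1234", "\U0010FFFD", "\u0016"):
        print(describe_code_point(ch))
```

Note that writing U+0016 as `&#x16;` does not make it acceptable in XML 1.0: a character reference has to refer to a character that is itself allowed by the Char production, so the parser still rejects it.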
Valid characters in XML:
Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
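As for removing such characters without parsing the XML, one possible approach is to filter the raw text against the Char production above before handing it to a parser. The following is only a sketch in Python, not the answerer's code; the name `strip_invalid_xml_chars` is an assumption made for this example, and it works on an already decoded string.

```python
import re

# Negated character class built from the XML 1.0 Char production:
# #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
# Anything that matches this pattern is NOT a legal XML 1.0 character.
_INVALID_XML_CHARS = re.compile(
    "[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def strip_invalid_xml_chars(text: str) -> str:
    """Drop every character that the Char production does not allow.

    Hypothetical helper: it only removes raw characters and does not
    touch escaped references such as "&#x16;".
    """
    return _INVALID_XML_CHARS.sub("", text)


if __name__ == "__main__":
    dirty = "<item>bad\u0016value</item>"
    print(strip_invalid_xml_chars(dirty))
```

Tab, line feed, and carriage return are kept because the production lists them explicitly; every other character below U+0020 is dropped, as are U+FFFE and U+FFFF. Whether silently deleting characters is acceptable depends on the data, so replacing them with a visible placeholder may be the safer choice in some cases.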

