Notice: this page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/8485436/

Time: 2020-09-06 15:03:46  Source: igfitidea

Unicode characters like \u0016 in XML

xml, unicode

Asked by user1081449

Is there a way to handle Unicode characters like \u0016 in XML? As I understand it, loading such characters into an XMLDocument throws an "invalid hexadecimal character" error. I tried other Unicode characters, and they seem to work fine; only the control characters cause this error. Can we remove these characters without actually parsing the XML?

Answered by scessor

Characters are denoted using the notation used in the Unicode Standard, that is, an optional U+ followed by their hexadecimal number, using at least 4 digits, such as U+1234 or U+10FFFD. In XML or HTML this could be expressed as &#x1234; or &#x10FFFD;.

from Unicode Technical Report.

Valid characters in XML:

Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]

from Extensible Markup Language (XML) 1.0 (Fifth Edition)
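
The Char production above can be turned into a filter that drops disallowed characters from the raw text before any XML parser sees it, which is what the question asks for. The following is only a minimal sketch in Python, not something taken from the answer: the function name strip_invalid_xml_chars and the regex-based approach are assumptions made for illustration. It removes raw characters only; a numeric character reference such as &#x16; is also rejected by XML 1.0 parsers and would need separate handling.

```python
import re
import xml.etree.ElementTree as ET

# Matches any code point NOT allowed by the XML 1.0 Char production:
# #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_INVALID_XML_CHARS = re.compile(
    "[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def strip_invalid_xml_chars(text: str) -> str:
    """Return text with every character outside the Char production removed.

    Hypothetical helper for illustration; it works on the raw string and
    does not parse the XML.
    """
    return _INVALID_XML_CHARS.sub("", text)


if __name__ == "__main__":
    # \u0016 is a control character outside the allowed ranges.
    raw = "<root>bad\u0016char</root>"
    cleaned = strip_invalid_xml_chars(raw)
    # The cleaned string can now be handed to a parser.
    root = ET.fromstring(cleaned)
    print(root.text)
```

Dropping the characters silently loses information, so replacing them with a visible placeholder may be preferable when the control characters carry meaning.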

Answered by Darin Dimitrov

You cannot use control characters directly in XML. If you need to store binary data in an XML file, you could Base64-encode it. That way you can store images, ...
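
As a rough illustration of the Base64 suggestion, the sketch below round-trips a byte string containing the control byte 0x16 through an XML element. It uses Python's standard base64 and xml.etree.ElementTree modules; the helper names embed_binary and extract_binary and the element name "data" are made up for this example and are not part of the answer.

```python
import base64
import xml.etree.ElementTree as ET


def embed_binary(tag: str, payload: bytes) -> str:
    """Serialize payload as Base64 text inside a single XML element.

    Hypothetical helper: Base64 output uses only ASCII letters, digits,
    '+', '/' and '=', all of which are valid XML characters.
    """
    elem = ET.Element(tag)
    elem.text = base64.b64encode(payload).decode("ascii")
    return ET.tostring(elem, encoding="unicode")


def extract_binary(xml_text: str) -> bytes:
    """Parse the element and decode its Base64 text back to bytes."""
    elem = ET.fromstring(xml_text)
    return base64.b64decode(elem.text or "")


if __name__ == "__main__":
    payload = b"abc\x16def"  # contains a control byte XML cannot hold directly
    xml_text = embed_binary("data", payload)
    print(xml_text)
    print(extract_binary(xml_text) == payload)
```

The trade-off is size and readability: Base64 grows the data by about a third and the stored text is no longer human-readable, so it suits opaque payloads better than mostly-textual content.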
