How to encode characters from Oracle to XML?
Notice: this page is adapted from a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/156697/
Asked by Andre Bossard
In my environment I use Java to serialize the result set to XML. It basically works like this:
//foreach column of each row
xmlHandler.startElement(uri, lname, "column", attributes);
String chars = rs.getString(i);
xmlHandler.characters(chars.toCharArray(), 0, chars.length());
xmlHandler.endElement(uri, lname, "column");
The XML looks like this in Firefox:
<row num="69004">
<column num="1">10069</column>
<column num="2">sd</column>
<column num="3">FCVolume </column>
</row>
But when I parse the XML I get a
org.xml.sax.SAXParseException: Character reference "&#26" is an invalid XML character.
My question now is: which characters do I have to replace, or how do I have to encode my characters, so that they are valid XML?
Accepted answer by Andre Bossard
I found an interesting list in the XML spec. According to that list, character #26 (hex: #x1A) is outside the ranges allowed for XML 1.0 characters, so it must not appear in a document, not even as a character reference. The spec additionally says:
The characters defined in the following ranges are also discouraged. They are either control characters or permanently undefined Unicode characters
See the complete ranges.
This code strips all characters that are not valid in XML 1.0 from a String:
public String stripNonValidXMLCharacters(String in) {
    if (in == null || in.isEmpty()) return ""; // vacancy test.
    StringBuilder out = new StringBuilder(); // Used to hold the output.
    // Walk the string by code point, so that supplementary characters
    // (stored as surrogate pairs) are kept instead of being dropped.
    for (int i = 0; i < in.length(); ) {
        int current = in.codePointAt(i); // The current code point.
        i += Character.charCount(current);
        if ((current == 0x9) ||
            (current == 0xA) ||
            (current == 0xD) ||
            ((current >= 0x20) && (current <= 0xD7FF)) ||
            ((current >= 0xE000) && (current <= 0xFFFD)) ||
            ((current >= 0x10000) && (current <= 0x10FFFF)))
            out.appendCodePoint(current);
    }
    return out.toString();
}
It's taken from the article "Invalid XML Characters: when valid UTF8 does not mean valid XML".
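As a usage sketch (not part of the original answer), the hypothetical class `XmlCharDemo` below applies the same range check per code point and uses the JDK's built-in DOM parser to compare a column value containing a raw U+001A with its filtered form. The class and method names are assumptions chosen for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical demo class; the names are illustrative, not from the original answer.
final class XmlCharDemo {

    // True if the code point matches the XML 1.0 Char production.
    static boolean isLegalXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // Drops every code point that XML 1.0 does not allow.
    static String strip(String in) {
        if (in == null) return "";
        StringBuilder out = new StringBuilder(in.length());
        in.codePoints().filter(XmlCharDemo::isLegalXmlChar).forEach(out::appendCodePoint);
        return out.toString();
    }

    // True if the JDK's default DOM parser accepts the given document text.
    static boolean parses(String xml) {
        try {
            DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException | IOException | ParserConfigurationException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String value = "sd" + (char) 0x1A; // column value ending in U+001A
        // Raw control character in element content: the parser rejects it.
        System.out.println(parses("<column>" + value + "</column>"));
        // Filtered value: the document is well-formed.
        System.out.println(parses("<column>" + strip(value) + "</column>"));
    }
}
```

Filtering silently discards data, so in practice it may be worth logging which rows contained illegal characters before they are dropped.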
But even with that, I still had a UTF-8 compatibility issue:
org.xml.sax.SAXParseException: Invalid byte 1 of 1-byte UTF-8 sequence
After reading "XML - returning XML as UTF-8 from a servlet", I just tried out what happens if I set the content type like this:
response.setContentType("text/xml;charset=utf-8");
And it worked.
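The underlying rule is that the bytes sent must match the encoding the parser assumes. Here is a minimal sketch of that mismatch, assuming a hypothetical helper class `EncodingDemo` and the JDK's default DOM parser (no servlet container involved). A document without an encoding declaration is read as UTF-8, so a non-ASCII character written as ISO-8859-1 bytes should fail to parse, while the same text written as UTF-8 bytes round-trips.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;

// Hypothetical demo class; the names are illustrative only.
final class EncodingDemo {

    // Encodes the XML text with the given charset, parses the bytes, and
    // returns the root element's text, or null if parsing fails.
    static String roundTrip(String xml, Charset charset) {
        try {
            byte[] bytes = xml.getBytes(charset);
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(bytes))
                .getDocumentElement().getTextContent();
        } catch (Exception e) {
            return null; // malformed byte sequence or other parse failure
        }
    }

    public static void main(String[] args) {
        // "Z" + u-umlaut + "rich"; the umlaut is a single byte 0xFC in ISO-8859-1.
        String xml = "<column>Z" + (char) 0xFC + "rich</column>";
        System.out.println(roundTrip(xml, StandardCharsets.UTF_8));
        System.out.println(roundTrip(xml, StandardCharsets.ISO_8859_1));
    }
}
```

In a servlet, the equivalent is to make sure the response writer really uses UTF-8, which is why setting the charset in `setContentType` helps. It has to be called before `getWriter()` to take effect.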
Answer by Eugene Yokota
Extensible Markup Language (XML) 1.0 says:
The ampersand character (&) and the left angle bracket (<) must not appear in their literal form, except when used as markup delimiters, or within a comment, a processing instruction, or a CDATA section. If they are needed elsewhere, they must be escaped using either numeric character references or the strings "&amp;" and "&lt;" respectively. The right angle bracket (>) may be represented using the string "&gt;", and must, for compatibility, be escaped using either "&gt;" or a character reference when it appears in the string "]]>" in content, when that string is not marking the end of a CDATA section.
You can skip the escaping of & and < if you use CDATA (characters that are not legal in XML at all, such as #x1A, are still not allowed inside a CDATA section):
<column num="1"><![CDATA[10069]]></column>
<column num="2"><![CDATA[sd&]]></column>
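For text that is not wrapped in CDATA, the escaping rule quoted above can be sketched as a small helper. This is only an illustration: `XmlEscapeDemo` and `escapeText` are hypothetical names. A serializer that is fed through `ContentHandler.characters` usually performs this escaping itself, so a helper like this is only needed when building XML strings by hand.

```java
// Hypothetical helper; the names are illustrative, not from the original answer.
final class XmlEscapeDemo {

    // Escapes the three markup-significant characters in element content.
    // Escaping every '>' also covers the "]]>" case mentioned in the spec.
    static String escapeText(String text) {
        StringBuilder out = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                default:  out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String value = "a < b && b > c";
        System.out.println("<column>" + escapeText(value) + "</column>");
    }
}
```

Escaping does not make an illegal character legal: a value containing #x1A still has to be filtered or replaced before it is written.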
Answer by Eugene Yokota
Which version of the JRE are you running? The SAX Project says:
J2SE 1.4 bundles an old version of SAX2. How do I make SAX2 r2 or later available?