Escaping control characters in Oracle XDB
Source: http://stackoverflow.com/questions/7270445/
This content comes from Stack Overflow and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors.
Asked by DaveyDaveDave
I'm completely new to Oracle's XDB, in particular using it to generate XML output from a database table, and am working on an application which is moving from 9i (Oracle9i Enterprise Edition Release 9.2.0.5.0 - Production) to 11g (Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production). Here's a small test case which illustrates the problem I'm having:
select xmlelement("test", test) from (select 'a' test from dual);
This works and gives me:
<test>a</test>
However, in 11g, if I swap 'a' for an invalid character such as U+0013, I get the following error:
ORA-31061: XDB error: special char to escaped char conversion failed.
Under 9i the same thing works successfully, with no error.
Obviously the ideal answer is to have some validation in place to prevent control characters getting into the simple character data that I'm trying to convert into XML, but unfortunately that's outside the scope of what I'm doing.
Is this something anyone else has experienced, and if so, is there a simple change I can make to my XML generating script, or do I need to do some other kind of cleansing? Or just manually fix the problem on the rare occasions that it happens (which would be a perfectly reasonable option for my needs).
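For the manual-fix option mentioned above, the offending rows have to be located first. The following is a minimal sketch, assuming REGEXP_LIKE is available (Oracle 10g and later, so the 11g side of the migration). The sample_rows data set and its id and test columns are hypothetical stand-ins for a real table.

```sql
-- sample_rows is made-up inline data standing in for the real table.
with sample_rows as (
  select 1 id, 'clean text' test from dual
  union all
  select 2 id, 'a' || chr(19) || 'b' test from dual  -- chr(19) is U+0013
)
select id,
       dump(test) as byte_dump  -- DUMP lists the byte codes, exposing the hidden character
  from sample_rows
 where regexp_like(test, '[[:cntrl:]]');  -- any control character, including tab and line breaks
```

Note that [[:cntrl:]] also matches tab, line feed, and carriage return, which are legal in XML, so some of the rows it reports may be harmless.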
Answered by Rebecca J Coleman
While always fixing the data at the source is the best solution, I also found this to be useful in the case where I cannot control the data at the source:
select xmlelement("test", test)
  from (select regexp_replace(unistr('a\0013b'), '[[:cntrl:]]', '') test from dual);
The important piece is regexp_replace(your_field, '[[:cntrl:]]', ''), which removes control characters from the data.
Answered by user272735
U+0013 is not a valid Unicode code point for XML. See e.g. Valid characters in XML. So 11g correctly raises an exception.
SQL> select xmlelement("test", unistr('a\0013b')) from dual;
ERROR:
ORA-31061: XDB error: special char to escaped char conversion failed.
no rows selected
SQL> select xmlelement("test", unistr('a\00aeb')) from dual;
XMLELEMENT("TEST",UNISTR('A\00AEB'))
--------------------------------------------------------------------------------
<test>a?b</test>
SQL>
No idea why this passes in 9i (I don't have it available), but that's probably simply because Oracle's implementation has evolved to be more standards-conforming and/or the standard has evolved.
Your fix is correct.
Answered by DaveyDaveDave
Just to follow up on this for anyone interested: as far as I can tell, 9i just passed the invalid character through, producing invalid XML. 11g throws an error, which is probably the more correct behaviour, even if it is annoying in my case.
The only reasonable solution I found was to fix the content at source.
Answered by J. Chomel
If you wish to keep line breaks, you may try the following:
select xmlelement("test", regexp_replace(test, '[^[:print:]|[:space:]]', '#')) from
(select '- <- to keep line break after weird char
-' test from dual )
This replaces with '#' every character that is not (^) in the set of printing ([:print:]) or whitespace ([:space:]) characters. Inside a bracket expression the | is a literal character rather than an alternation operator, so it is redundant here.
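As an alternative to the character classes used in the answers above, the characters to drop can be listed explicitly. The sketch below assumes that only the C0 control characters excluded by the XML 1.0 Char production need removing (U+0000 is not handled) and that the database character set is ASCII-based. It keeps tab, line feed, and carriage return. It relies on TRANSLATE deleting every character that has no counterpart in the replacement string; the leading 'x' is only a placeholder so that the replacement string is not empty.

```sql
-- Strip U+0001..U+0008, U+000B, U+000C and U+000E..U+001F before building the XML.
-- chr(9), chr(10) and chr(13) are deliberately left out of the list so they survive.
select xmlelement("test",
         translate(test,
                   'x' || chr(1)  || chr(2)  || chr(3)  || chr(4)
                       || chr(5)  || chr(6)  || chr(7)  || chr(8)
                       || chr(11) || chr(12)
                       || chr(14) || chr(15) || chr(16) || chr(17)
                       || chr(18) || chr(19) || chr(20) || chr(21)
                       || chr(22) || chr(23) || chr(24) || chr(25)
                       || chr(26) || chr(27) || chr(28) || chr(29)
                       || chr(30) || chr(31),
                   'x'))  -- 'x' maps to itself; every other listed character is deleted
  from (select 'a' || chr(19) || 'b' test from dual);
```

Unlike [:print:], whose membership depends on the character set and locale, an explicit list makes it clear exactly which characters are removed.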