Oracle PL/SQL: RTF VARCHAR2 field to plain text format
Original question: http://stackoverflow.com/questions/6265233/
Notice: this page is a translated copy of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author, and attribute it to the original authors (not me): Stack Overflow
Asked by Stephan Schielke
I need to convert rich text (RTF) stored in a VARCHAR2 field to plain text.
For example:
{\rtf1\ansi\ansicpg1252\deff0{\fonttbl{\f0\fnil Tahoma;}{\f1\fnil\fcharset0 Tahoma;}}
{\colortbl ;\red0\green0\blue255;}
\viewkind4\uc1\pard\cf1\lang1031\b\f0\fs16 NUMBER_A\cf0\b0\f1 *\cf1\b\protect NUMBER_B\cf0\b0\protect0\f0\par
}
should be converted to:
NUMBER_A * NUMBER_B
I have tried parsing the RTF string character by character, but that isn't a very smart solution. A PL/SQL utility function that works for any RTF text would be the nicest way. Is there a native solution? Any ideas on how to convert the RTF text?
Thanks for sharing your time and ideas.
Accepted answer by Tony Andrews
Apparently this can be done via Oracle Text - see this AskTom question
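A minimal sketch of what the Oracle Text route could look like, not taken from the linked AskTom thread. It assumes Oracle Text is installed, that the user has EXECUTE on CTX_DDL, and that the CTXSYS.AUTO_FILTER preference recognizes the RTF input. The policy name rtf_policy, the variable names, and the shortened RTF sample are made up for illustration.

```plsql
-- One-time setup: a policy that sends documents through the automatic filter.
-- 'rtf_policy' is a hypothetical name chosen for this sketch.
begin
  ctx_ddl.create_policy('rtf_policy', 'CTXSYS.AUTO_FILTER');
end;
/

declare
  -- Shortened sample; in practice this would be the VARCHAR2 column value.
  vRtf   varchar2(4000) :=
    '{\rtf1\ansi\deff0{\fonttbl{\f0\fnil Tahoma;}}\pard\f0\fs16 NUMBER_A * NUMBER_B\par}';
  vRaw   raw(32767);
  vBlob  blob;
  vPlain clob;
begin
  -- Hand the RTF to the filter as binary data.
  vRaw := utl_raw.cast_to_raw(vRtf);
  dbms_lob.createtemporary(vBlob, true);
  dbms_lob.writeappend(vBlob, utl_raw.length(vRaw), vRaw);

  -- Fourth argument TRUE requests plain text instead of the default HTML.
  -- vPlain is passed in as NULL, so a temporary CLOB is allocated for the result.
  ctx_doc.policy_filter('rtf_policy', vBlob, vPlain, true);

  dbms_output.put_line(dbms_lob.substr(vPlain, 4000, 1));

  -- Temporary LOBs should be released by the caller.
  dbms_lob.freetemporary(vBlob);
  dbms_lob.freetemporary(vPlain);
end;
/
```

The filtered text may carry extra leading or trailing whitespace, so trimming the result is probably needed before comparing it with the expected string.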
Answer by CLS
I did it with plain PL/SQL. It is based on this SQL source, but completely rewritten. (The backslash + apostrophe sequence confuses the code coloring here.)
    CREATE OR REPLACE FUNCTION Rtf2Txt
    (
      pRtf varchar2
    )
    return nvarchar2 is
      /*
        Converts RTF text to TXT format by removing headers, commands, and formatting
      */
      vPos1 int;
      vPos2 int;
      vPos3 int;
      vPos4 int;
      vTmp  int;
      vText varchar2(4000);
    begin
      vText := pRtf;
      if vText is null then
        return vText;
      end if;

      -- Remove outer { and } pair
      vPos1 := instr(vText, '{', +1); -- The first {
      vPos2 := instr(vText, '}', -1); -- The last }
      if vPos1 > 0 and vPos2 > 0 then
        vText := substr(vText, vPos1 +1, vPos2 - vPos1 -1);
      end if;

      -- Remove inner { and } pairs, innermost first
      loop
        vPos2 := instr(vText, '}', +1); -- The first }
        vPos1 := instr(vText, '{', (length(vText) - vPos2) *-1 -1); -- The last { before the found }
        if vPos1 > 0 and vPos2 > 0 and vPos1 < vPos2 then
          vText := substr(vText, 1, vPos1 -1) || substr(vText, vPos2 +1);
        else
          exit;
        end if;
      end loop;

      -- Cleaning up: drop source line breaks, turn \par into Enter, trim both ends
      vText := replace(vText, '\pard', '');
      vText := replace(vText, chr(13), '');
      vText := replace(vText, chr(10), '');
      vText := replace(vText, '\par', chr(13));
      vText := ltrim(vText, ' ' || chr(13) || chr(10));
      vText := rtrim(vText, ' ' || chr(13) || chr(10));

      -- Remove \ commands and replace \'XX character coding
      vPos2 := 1;
      loop
        vPos1 := instr(vText, '\', vPos2);
        -- instr returns NULL once vText has become empty, so test for NULL too;
        -- without this check the loop never ends for input such as {\rtf1\ansi}
        exit when vPos1 is null or vPos1 = 0;

        if substr(vText, vPos1 +1, 1) = '\' then -- Skip \\ escape sequence (left in the text as is)
          vPos2 := vPos1 +2;
          continue;
        end if;

        if substr(vText, vPos1 +1, 1) = '''' then -- Decode \'XX hex sequence
          vTmp  := to_number(substr(vText, vPos1 +2, 2), 'xx');
          vText := substr(vText, 1, vPos1 -1) || chr(vTmp) || substr(vText, vPos1 +4);
          vPos2 := vPos1 +1;
          continue;
        end if;

        -- Skip \anything sequence: it ends at the next \, space, or Enter
        vPos2 := instr(vText, '\', vPos1 +1);     -- The next \
        vPos3 := instr(vText, ' ', vPos1 +1);     -- The next ' '
        vPos4 := instr(vText, chr(13), vPos1 +1); -- The next Enter
        -- Treat "not found" as "just past the end of the text"
        if vPos2 = 0 then vPos2 := length(vText) +1; end if;
        if vPos3 = 0 then vPos3 := length(vText) +1; end if;
        if vPos4 = 0 then vPos4 := length(vText) +1; end if;

        if vPos3 < vPos2 and vPos3 < vPos4 then
          -- Ends at a space: the space only delimits the command, so remove it too
          vText := substr(vText, 1, vPos1 -1) || substr(vText, vPos3 +1);
        else
          -- Ends at a \, an Enter, or the end of the text: keep that character
          vText := substr(vText, 1, vPos1 -1) || substr(vText, least(vPos2, vPos4));
        end if;
        vPos2 := vPos1;
      end loop;

      return vText;
    end;
    /
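A possible way to try the function on the sample from the question, assuming Rtf2Txt has been compiled in the current schema. The column alias plain_text is just an illustrative name, and the sample is split into concatenated pieces only to keep each line short.

```plsql
-- Calls the Rtf2Txt function defined above with the RTF string from the question.
select Rtf2Txt(
         '{\rtf1\ansi\ansicpg1252\deff0{\fonttbl{\f0\fnil Tahoma;}{\f1\fnil\fcharset0 Tahoma;}}' || chr(10) ||
         '{\colortbl ;\red0\green0\blue255;}' || chr(10) ||
         '\viewkind4\uc1\pard\cf1\lang1031\b\f0\fs16 NUMBER_A\cf0\b0\f1 *\cf1\b\protect NUMBER_B\cf0\b0\protect0\f0\par' || chr(10) ||
         '}'
       ) as plain_text
from dual;
```

Note that the function treats the space after a control word as a delimiter and drops it, so the result may not contain the spaces around * that the question shows in its expected output.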