How to output extended ascii characters using Oracle utl_file
Original source: http://stackoverflow.com/questions/17041419/
Warning: these are provided under the CC BY-SA 4.0 license. You are free to use/share it, but you must attribute it to the original authors (not me): StackOverFlow
Asked by Superdooperhero
I was writing files using
l_file := utl_file.fopen('OUT', 'a.txt', 'w');
utl_file.put_line(l_file, 'Rosëttenville');
but I changed this to
l_file := utl_file.fopen_nchar('OUT', 'a.txt', 'w', 32767);
utl_file.put_line_nchar(l_file, 'Rosëttenville');
when I found out that the extended ASCII characters (those above code 127) were not written out correctly. However, the second Unicode version also does not write the extended characters correctly. Instead of Rosëttenville I'm getting RosÃ«ttenville. Anyone know how to fix this?
Answered by Alex Poole
You haven't said what your database character set is, and thus whether it's legitimate to have 'extended ascii' (probably 8859-1, with chr(235) in this case) in a string, or if this is just a demo. Either way, I think, your problem is trying to implicitly convert a non-Unicode string.
ë is code point EB, which is also UTF-8 C3 AB. You're getting separate characters Ã (code point C3) and « (code point AB). So it can't do a direct translation from chr(235), which is 0x00EB, to U+00EB. It seems to be going via the UTF-8 C3 AB as two separate characters. I'm not going to try to understand exactly why...
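The byte-level effect described here can be reproduced outside Oracle. The following is a minimal Python sketch (not Oracle code; the variable names are purely illustrative) showing that ë is one byte in ISO-8859-1, two bytes in UTF-8, and that reading those two bytes as ISO-8859-1 yields the two characters seen in the question.

```python
# U+00EB is the character that chr(235) denotes in ISO-8859-1.
char = "\u00eb"

latin1_bytes = char.encode("iso-8859-1")  # single byte 0xEB
utf8_bytes = char.encode("utf-8")         # two bytes 0xC3 0xAB

# Interpreting each UTF-8 byte as its own ISO-8859-1 character gives
# U+00C3 followed by U+00AB, i.e. two characters instead of one.
misread = utf8_bytes.decode("iso-8859-1")

print(latin1_bytes.hex(), utf8_bytes.hex(), len(misread))  # eb c3ab 2
```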
You can either use the convert function:
l_file := utl_file.fopen('OUT', 'a.txt', 'w');
utl_file.put_line(l_file,
  convert('Rosëttenville', 'WE8ISO8859P1', 'UTF8'));
... or, as use of that is discouraged by Oracle, the utl_raw.convert function:
l_file := utl_file.fopen('OUT', 'a.txt', 'w');
utl_file.put_line(l_file,
  utl_raw.cast_to_varchar2(utl_raw.convert(utl_raw.cast_to_raw('Rosëttenville'),
    'ENGLISH_UNITED KINGDOM.WE8ISO8859P1', 'ENGLISH_UNITED KINGDOM.UTF8')));
Both give me the value you want, and your original gave me the same value you saw (where my DB character set is AL32UTF8 in 11gR2 on Linux). If your DB character set is not Unicode, your national character set certainly appears to be (it isn't clear in the question if you got the same output with both attempts), so the nchar version should work instead:
l_file := utl_file.fopen_nchar('OUT', 'a.txt', 'w', 32767);
utl_file.put_line_nchar(l_file,
  utl_raw.cast_to_varchar2(utl_raw.convert(utl_raw.cast_to_raw('Rosëttenville'),
    'ENGLISH_UNITED KINGDOM.WE8ISO8859P1', 'ENGLISH_UNITED KINGDOM.UTF8')));
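For readers who want to see what the cast_to_raw / utl_raw.convert / cast_to_varchar2 chain amounts to at the byte level, here is a rough Python analogue. It is only a sketch: convert_raw is a hypothetical helper, and its argument order (destination character set first, then source) is an assumption based on how the calls above are written.

```python
def convert_raw(raw: bytes, to_charset: str, from_charset: str) -> bytes:
    """Re-encode raw bytes from one character set to another.

    Hypothetical helper; the destination-first argument order is assumed
    from the utl_raw.convert calls shown in the answer.
    """
    return raw.decode(from_charset).encode(to_charset)


# The literal as stored in a UTF-8 (AL32UTF8) database: 14 bytes.
utf8_raw = "Ros\u00ebttenville".encode("utf-8")

# After conversion the accented letter is the single byte 0xEB: 13 bytes.
latin1_raw = convert_raw(utf8_raw, "iso-8859-1", "utf-8")

print(len(utf8_raw), len(latin1_raw))  # 14 13
print(latin1_raw)                      # b'Ros\xebttenville'
```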
It would probably be better to be working with Unicode values in the first place, particularly if you currently have a mix of 'extended ascii' and other string types in a table; applying the conversion to everything in that case might give some odd results...
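To illustrate the kind of odd result that warning refers to, the Python sketch below applies the same "from UTF-8 to ISO-8859-1" conversion to a value that is genuinely UTF-8 and to one that is already ISO-8859-1. The helper to_latin1 is hypothetical, and Oracle's exact handling of unconvertible bytes may differ from Python's replacement behaviour, so treat this as an analogy only.

```python
def to_latin1(raw: bytes) -> bytes:
    """Treat the input as UTF-8 and re-encode it as ISO-8859-1.

    Bytes that are not valid UTF-8 become U+FFFD, which ISO-8859-1
    cannot represent, so they end up as '?' in the output.
    """
    text = raw.decode("utf-8", errors="replace")
    return text.encode("iso-8859-1", errors="replace")


good = to_latin1("Ros\u00ebttenville".encode("utf-8"))      # genuine UTF-8 input
bad = to_latin1("Ros\u00ebttenville".encode("iso-8859-1"))  # already ISO-8859-1

print(good)  # b'Ros\xebttenville'
print(bad)   # b'Ros?ttenville'
```

The second value loses its accented character because the conversion was applied to data that did not need it.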
Answered by DejanR
UTL_FILE.PUT_LINE does not convert the data; it writes it out in the database default character set.
So you need to do the conversion yourself when writing:
UTL_FILE.PUT_LINE(file,CONVERT(text,'WE8ISO8859P1'),FALSE);
You must set:
LANG=GERMAN_AUSTRIA.WE8ISO8859P1;export LANG
LC_CTYPE=ISO-8859-1;export LC_CTYPE
NLS_LANG=GERMAN_AUSTRIA.WE8ISO8859P1;export NLS_LANG
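Whichever approach is used, it can help to inspect the bytes that actually reached the file rather than trust how a terminal or editor renders them. The Python sketch below is a hypothetical diagnostic, not part of either answer: describe_e_diaeresis is an invented name, and the check is a simple heuristic that only looks for the letter ë from the question's example.

```python
def describe_e_diaeresis(data: bytes) -> str:
    """Guess how the letter U+00EB was encoded in the given file bytes."""
    if b"\xc3\x83\xc2\xab" in data:
        # UTF-8 bytes were misread as ISO-8859-1 and encoded to UTF-8 again.
        return "double-encoded"
    if b"\xc3\xab" in data:
        return "utf-8"
    if b"\xeb" in data:
        return "iso-8859-1"
    return "absent"


# In practice the bytes would come from something like open(path, "rb").read().
print(describe_e_diaeresis(b"Ros\xebttenville\n"))              # iso-8859-1
print(describe_e_diaeresis(b"Ros\xc3\xabttenville\n"))          # utf-8
print(describe_e_diaeresis(b"Ros\xc3\x83\xc2\xabttenville\n"))  # double-encoded
```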