How to output extended ASCII characters using Oracle utl_file

Question

Asked by Superdooperhero

I was writing files using

l_file := utl_file.fopen('OUT', 'a.txt', 'w');
utl_file.put_line(l_file, 'Rosëttenville');

but I changed this to

l_file := utl_file.fopen_nchar('OUT', 'a.txt', 'w', 32767);
utl_file.put_line_nchar(l_file, 'Rosëttenville');

when I found out that the extended ASCII characters (those above code 127) were not written out correctly. However, the second, Unicode version does not write the extended characters correctly either. Instead of Rosëttenville I'm getting RosÃ«ttenville. Does anyone know how to fix this?

Answer 1

Answered by Alex Poole

You haven't said what your database character set is, and thus whether it's legitimate to have 'extended ASCII' (probably 8859-1, with chr(235) in this case) in a string, or if this is just a demo. Either way, I think your problem is trying to implicitly convert a non-Unicode string.
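
If you're not sure which character sets are in play, a query along these lines should show both of them. This is only a sketch, and it assumes your account can read the nls_database_parameters view:

```sql
-- NLS_CHARACTERSET is the database character set (CHAR/VARCHAR2 data);
-- NLS_NCHAR_CHARACTERSET is the national character set (NCHAR/NVARCHAR2 data).
select parameter, value
from   nls_database_parameters
where  parameter in ('NLS_CHARACTERSET', 'NLS_NCHAR_CHARACTERSET');
```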

ë is code point EB, which is encoded in UTF-8 as the two bytes C3 AB. You're getting the separate characters Ã (code point C3) and « (code point AB). So it can't be doing a direct translation from chr(235), which is 0x00EB, to U+00EB. It seems to be going via the UTF-8 bytes C3 AB and treating them as two separate characters. I'm not going to try to understand exactly why...
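
One way to look at the bytes involved is to dump them as hex before and after conversion. The block below is a rough sketch rather than a tested recipe: it assumes a UTF-8 database character set (as in my test database), that server output is enabled in your client (for example with set serveroutput on), and the variable names are just for illustration.

```sql
declare
  -- unistr builds ë from its code point, so the value does not depend on
  -- how the client encodes a literal; to_char converts it to the
  -- database character set
  l_text varchar2(10 char) := to_char(unistr('\00EB'));
  l_raw  raw(40);
begin
  l_raw := utl_raw.cast_to_raw(l_text);
  -- bytes as held in the database character set
  dbms_output.put_line('stored:    ' || rawtohex(l_raw));
  -- bytes after converting to ISO 8859-1 (assumes a UTF-8 database)
  dbms_output.put_line('converted: ' || rawtohex(utl_raw.convert(l_raw,
    'ENGLISH_UNITED KINGDOM.WE8ISO8859P1', 'ENGLISH_UNITED KINGDOM.UTF8')));
end;
/
```

If the first line shows two bytes and the second shows one, the string is being held as UTF-8 and the conversion is collapsing it to the single 8859-1 byte.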

You can either use the convert function:

l_file := utl_file.fopen('OUT', 'a.txt', 'w');
utl_file.put_line(l_file,
  convert('Rosëttenville', 'WE8ISO8859P1', 'UTF8'));

... or, as use of that is discouraged by Oracle, the utl_raw.convert function:

l_file := utl_file.fopen('OUT', 'a.txt', 'w');
utl_file.put_line(l_file,
  utl_raw.cast_to_varchar2(utl_raw.convert(utl_raw.cast_to_raw('Rosëttenville'),
    'ENGLISH_UNITED KINGDOM.WE8ISO8859P1', 'ENGLISH_UNITED KINGDOM.UTF8')));

Both give me the value you want, and your original gave me the same value you saw (my DB character set is AL32UTF8, in 11gR2 on Linux). If your DB character set is not Unicode, your national character set certainly appears to be (it isn't clear from the question whether you got the same output with both attempts), so the nchar version should work instead:

l_file := utl_file.fopen_nchar('OUT', 'a.txt', 'w', 32767);
utl_file.put_line_nchar(l_file,
  utl_raw.cast_to_varchar2(utl_raw.convert(utl_raw.cast_to_raw('Rosëttenville'),
    'ENGLISH_UNITED KINGDOM.WE8ISO8859P1', 'ENGLISH_UNITED KINGDOM.UTF8')));
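
The snippets above leave out the declarations and the file close. As a complete anonymous block, the utl_raw.convert variant might look roughly like this. It is a sketch under a few assumptions: the OUT directory object from the question exists and is writable, the database character set is UTF-8, and the l_text variable and the unistr call are my additions so the example doesn't depend on how the client encodes the literal.

```sql
declare
  l_file utl_file.file_type;
  -- unistr keeps the example independent of the client's encoding
  l_text varchar2(100 char) := to_char(unistr('Ros\00EBttenville'));
begin
  l_file := utl_file.fopen('OUT', 'a.txt', 'w');
  -- convert the UTF-8 bytes to ISO 8859-1 before writing
  utl_file.put_line(l_file,
    utl_raw.cast_to_varchar2(utl_raw.convert(utl_raw.cast_to_raw(l_text),
      'ENGLISH_UNITED KINGDOM.WE8ISO8859P1', 'ENGLISH_UNITED KINGDOM.UTF8')));
  utl_file.fclose(l_file);
exception
  when others then
    -- don't leave the file handle open if anything fails
    if utl_file.is_open(l_file) then
      utl_file.fclose(l_file);
    end if;
    raise;
end;
/
```

Closing the handle in the exception handler as well avoids leaving the file open if fopen succeeds but the write fails.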

It would probably be better to be working with Unicode values in the first place, particularly if you currently have a mix of 'extended ascii' and other string types in a table; applying the conversion to everything in that case might give some odd results...

Answer 2

Answered by DejanR

UTL_FILE.PUT_LINE does not convert the data; it exports it in the database's default character set.

So you need to apply the proper conversion when writing:

UTL_FILE.PUT_LINE(file,CONVERT(text,'WE8ISO8859P1'),FALSE);

You must also set these environment variables:

LANG=GERMAN_AUSTRIA.WE8ISO8859P1;export LANG
LC_CTYPE=ISO-8859-1;export LC_CTYPE
NLS_LANG=GERMAN_AUSTRIA.WE8ISO8859P1;export NLS_LANG
