Loading Unicode Characters with Oracle SQL Loader (sqlldr) results in question marks

Disclaimer: this page is a Chinese–English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. If you use or share it, you must do so under the same license and attribute it to the original authors (not the translator). Original question: http://stackoverflow.com/questions/8405956/

Date: 2020-09-10 03:45:32  Source: igfitidea

Tags: oracle, unicode, csv, sql-loader

Asked by philrabin

I'm trying to use SQL Loader to load localized strings from a Unicode (UTF-8-encoded) CSV into an Oracle database. I've tried all sorts of combinations, but nothing gives me the result I'm looking for, which is to have special Greek characters like Δ not get converted to ¿ or ?.
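
The garbling described here is reproducible outside Oracle. A small Python sketch (not specific to sqlldr) shows how the UTF-8 bytes for Δ become two wrong characters when decoded under a mismatched single-byte character set, which is the kind of conversion a wrongly configured client performs:

```python
# 'Δ' (GREEK CAPITAL LETTER DELTA) encodes to two bytes in UTF-8.
raw = "Δ".encode("utf-8")
print(raw)  # b'\xce\x94'

# Decoding those two bytes as a single-byte Western European charset
# produces two wrong characters instead of one delta.
print(raw.decode("cp1252"))  # Î”
```

When a terminal or client cannot render such characters at all, they are typically displayed as ? or ¿ instead, which matches the symptom described above.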

My table definition looks like this:

CREATE TABLE "GLOBALIZATIONRESOURCE"
(
    "RESOURCETYPE" VARCHAR2(255 CHAR) NOT NULL ENABLE,
    "CULTURE"      VARCHAR2(20 CHAR) NOT NULL ENABLE,
    "KEY"          VARCHAR2(128 CHAR) NOT NULL ENABLE,
    "VALUE"        VARCHAR2(2048 CHAR),
    "DESCRIPTION"  VARCHAR2(512 CHAR),
    CONSTRAINT "PK_GLOBALIZATIONRESOURCE" PRIMARY KEY ("RESOURCETYPE","CULTURE","KEY") USING INDEX TABLESPACE REPSPACE_IX ENABLE
)
TABLESPACE REPSPACE; 

I have tried the following configurations in my control file (and, in fact, every permutation I could think of):

load data
TRUNCATE
INTO TABLE "GLOBALIZATIONRESOURCE"
FIELDS TERMINATED BY "," OPTIONALLY ENCLOSED BY '"'
TRAILING NULLCOLS
(   
    "RESOURCETYPE" CHAR(255), 
    "CULTURE" CHAR(20), 
    "KEY" CHAR(128), 
    "VALUE" CHAR(2048), 
    "DESCRIPTION" CHAR(512)
)


load data
CHARACTERSET UTF8
TRUNCATE
INTO TABLE "GLOBALIZATIONRESOURCE"
FIELDS TERMINATED BY "," OPTIONALLY ENCLOSED BY '"'
TRAILING NULLCOLS
(   
    "RESOURCETYPE" CHAR(255), 
    "CULTURE" CHAR(20), 
    "KEY" CHAR(128), 
    "VALUE" CHAR(2048), 
    "DESCRIPTION" CHAR(512)
)


load data
CHARACTERSET UTF16
TRUNCATE
INTO TABLE "GLOBALIZATIONRESOURCE"
FIELDS TERMINATED BY X'002c' OPTIONALLY ENCLOSED BY X'0022'
TRAILING NULLCOLS
(   
    "RESOURCETYPE" CHAR(255), 
    "CULTURE" CHAR(20), 
    "KEY" CHAR(128), 
    "VALUE" CHAR(2048), 
    "DESCRIPTION" CHAR(512)
)

With the first two options, the Unicode characters are not decoded correctly and just show up as upside-down question marks.

If I choose the last option, UTF16, then I get the following error, even though all the data in my fields is much shorter than the specified lengths.

Field in data file exceeds maximum length

It seems as though every possible combination of CTL file configurations (even setting the byte order to little- or big-endian) fails to work correctly. Can someone please give an example of a configuration (table structure and CTL file) that correctly loads Unicode data from a CSV? Any help would be greatly appreciated.

Note: I've already been through http://docs.oracle.com/cd/B19306_01/server.102/b14215/ldr_concepts.htm and http://docs.oracle.com/cd/B10501_01/server.920/a96652/ch10.htm.

Answered by ridonekorkmaz

You have two problems:

  1. Character set.

Answer: You can solve this by finding the character set of your text file (Notepad++ can usually detect it). Once you know it, look up the corresponding SQL*Loader character set name; the list is at https://docs.oracle.com/cd/B10501_01/server.920/a96529/appa.htm#975313. After that, the character set problem should be solved.
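
As a quick programmatic sanity check (a sketch, not part of the original answer), the first few bytes of the data file often reveal the encoding via a byte-order mark:

```python
# Sketch: look for a Unicode BOM at the start of the data file to guess its
# encoding before picking a CHARACTERSET for the sqlldr control file.
import codecs

def sniff_bom(path):
    """Return a best guess at the file's encoding based on its BOM, if any."""
    with open(path, "rb") as f:
        head = f.read(4)
    if head.startswith(codecs.BOM_UTF8):
        return "UTF-8"
    if head.startswith(codecs.BOM_UTF16_LE) or head.startswith(codecs.BOM_UTF16_BE):
        return "UTF-16"
    return "unknown (no BOM) - check in an editor such as Notepad++"
```

Note that many UTF-8 files carry no BOM at all, in which case an editor's detection is still the easiest check.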

  2. Despite your actual data length, sqlldr reports: Field in data file exceeds maximum length.

Answer: You can solve this by adding CHAR(4000) (or whatever the actual length is) to the problematic column. In my case, the problematic column was column "E". An example is below; this is how I solved my problem, hope it helps.

LOAD DATA
CHARACTERSET UTF8
-- Turkish charset (for ü, ğ, ş etc.):
-- CHARACTERSET WE8ISO8859P9
-- Character set list:
-- https://docs.oracle.com/cd/B10501_01/server.920/a96529/appa.htm#975313
INFILE 'data.txt' "STR '~|~\n'"
TRUNCATE
INTO TABLE SILTAB
FIELDS TERMINATED BY '#'
TRAILING NULLCOLS
(
    a,
    b,
    c,
    d,
    e CHAR(4000)
)
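
One common explanation for the length error (a sketch of the arithmetic, not from the original answer) is that field limits are counted in bytes rather than characters: a string that is "short" in characters can exceed a byte-based limit once encoded, and UTF-16 at least doubles the byte count.

```python
# Byte arithmetic behind "Field in data file exceeds maximum length":
# 100 characters can be far more than 100 bytes once encoded.
value = "Δ" * 100                      # 100 characters
print(len(value))                      # 100
print(len(value.encode("utf-8")))      # 200 -> Δ is 2 bytes in UTF-8
print(len(value.encode("utf-16-le")))  # 200 -> every char is at least 2 bytes in UTF-16

ascii_value = "a" * 100
print(len(ascii_value.encode("utf-8")))      # 100
print(len(ascii_value.encode("utf-16-le")))  # 200 -> UTF-16 doubles plain ASCII too
```

This is consistent with the workaround above: oversizing the column specification (e.g. CHAR(4000)) leaves headroom for the encoded byte length.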

Answered by davidsr

You must ensure that the following character sets are the same:

  1. the database character set
  2. the dump file character set
  3. the character set of the client from which you are doing the import (NLS_LANG)

If the client-side character set is different, Oracle will attempt to convert characters to the native database character set, and this may not always produce the desired result.

Answered by user1019903

Don't use MS Office to save the spreadsheet as a Unicode .csv. Instead, use OpenOffice to save it as a UTF-8 .csv file. Then add "CHARACTERSET UTF8" to the loader control file and run Oracle SQL*Loader; this gave me correct results.
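
If re-saving through OpenOffice is not an option, the CSV can also be produced (or re-encoded) programmatically so its encoding is unambiguous. A minimal sketch, with a hypothetical file name and row values matching the GLOBALIZATIONRESOURCE columns:

```python
# Sketch: write the data file as genuine UTF-8 so CHARACTERSET UTF8 matches it.
import csv

rows = [("Labels", "el-GR", "Delta", "Δ", "Greek capital delta")]

with open("data.csv", "w", encoding="utf-8", newline="") as f:
    csv.writer(f).writerows(rows)

# Confirm the bytes on disk are real UTF-8 (Δ encodes to b'\xce\x94').
with open("data.csv", "rb") as f:
    assert b"\xce\x94" in f.read()
```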

Answered by pavangulhane

There is a range of character set encodings you can use in the control file when loading data with SQL*Loader.

For Greek characters, I believe a Western European character set should do the trick:

LOAD DATA
CHARACTERSET WE8ISO8859P1

or, in the case of MS Word input files with smart (curly) characters, try in the control file:

LOAD DATA
CHARACTERSET WE8MSWIN1252