oracle 使用 utf8 的 SQL 加载器

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/11979495/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-09-19 01:03:12  来源:igfitidea点击:

SQL Loader with utf8

oracleutf-8sql-loader

提问by AbhiDwi

I am getting following error while loading Japanese data using SQL*Loader. My Database is UTF8 (NLS parameters) and my OS supports UTF8.

使用 SQL*Loader 加载日语数据时出现以下错误。我的数据库是 UTF8(NLS 参数),我的操作系统支持 UTF8。

Record 5: Rejected - Error on table ACTIVITY_FACT, column METADATA.
ORA-12899: value too large for column METADATA (actual: 2624, maximum: 3500)

My Control file:

我的控制文件:

load data
characterset UTF8
infile '../tab_files/activity_fact.csv' "STR ';'"
APPEND
into tableactivity_fact
fields terminated by ',' optionally enclosed by '~'
TRAILING NULLCOLS
(metadata CHAR(3500))

My table

我的桌子

create table actvuty_facr{
metadata varchar2(3500 char)
}

Why SQL Loader is throwing the wrong exception, (actual: 2624, maximum: 3500). 2624 is less than 3500.

为什么 SQL Loader 抛出错误的异常,(actual: 2624, maximum: 3500). 2624 小于 3500。

回答by

The default length semantics for all datafiles (except UFT-16) is byte. So in your case you have a CHAR of 3500 bytes rather than characters. You have some multi-byte characters in your file and the 2624 characters is therefore using more than 3500 bytes, hence the (misleading) message.

所有数据文件(UFT-16 除外)的默认长度语义都是字节。因此,在您的情况下,您的 CHAR 为 3500 个字节而不是字符。您的文件中有一些多字节字符,因此 2624 个字符使用了超过 3500 个字节,因此是(误导性)消息。

You can sort this out by using character length semantics instead

您可以使用字符长度语义来解决这个问题

alter this line in your control file

更改控制文件中的这一行

characterset UTF8

to this

对此

characterset UTF8 length semantics char

and it will work on characters for CHAR fields (and some others) - in the same way that you have set up your table, so 3500 characters of up to four bytes each.

它将处理 CHAR 字段(和其他一些)的字符 - 与您设置表格的方式相同,因此 3500 个字符,每个字符最多四个字节。

See the Utilities Guide on Character Length Semanticsfor more information

有关详细信息,请参阅有关字符长度语义的实用程序指南