PostgreSQL error: COPY delimiter must be a single one-byte character

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and authors, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/6930242/

ERROR: COPY delimiter must be a single one-byte character

Tags: postgresql, postgresql-copy

Asked by vchitta

I want to load data from a flat file with the delimiter "~,~" into a PostgreSQL table. I tried it as shown below, but it looks like there is a restriction on the delimiter. If the COPY statement doesn't allow a multi-character delimiter, is there an alternative way to do this?

metadb=# \COPY public.CME_DATA_STAGE_TRANS FROM 'E:\Infor\Outbound_Marketing.2.1\EM\metadata\pgtrans.log' WITH DELIMITER AS '~,~'
ERROR:  COPY delimiter must be a single one-byte character
\copy: ERROR:  COPY delimiter must be a single one-byte character

Answer by Susheel Javadi

If you are using Vertica, you could use E'\t' or U&'\0009'.

To indicate a non-printing delimiter character (such as a tab), specify the character in extended string syntax (E'...'). If your database has StandardConformingStrings enabled, use a Unicode string literal (U&'...'). For example, use either E'\t' or U&'\0009' to specify tab as the delimiter.

Answer by Grzegorz Szpetkowski

Unfortunately, there is no way to load a flat file with the multi-character delimiter ~,~ in Postgres unless you want to modify the source code yourself (and recompile, of course) in some (terrific) way:

/* Only single-byte delimiter strings are supported. */
if (strlen(cstate->delim) != 1)
    ereport(ERROR,
            (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
             errmsg("COPY delimiter must be a single one-byte character")));
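
To make "single one-byte character" concrete, here is a minimal Python sketch that mirrors only the length check quoted above. The function name is hypothetical, and UTF-8 is assumed as the encoding; the server may apply further restrictions that are not shown here.

```python
def is_valid_copy_delimiter(delim, encoding="utf-8"):
    """Mirror the strlen(...) != 1 check: the delimiter must encode to exactly one byte.

    Hypothetical helper for illustration; it is not part of PostgreSQL.
    """
    return len(delim.encode(encoding)) == 1


# '~,~' is three bytes, 'é' is two bytes in UTF-8, the others are one byte each.
for candidate in ["~,~", "~", "\t", "|", "é"]:
    print(repr(candidate), is_valid_copy_delimiter(candidate))
```

So the restriction is about bytes, not just about the number of visible characters: a single non-ASCII character can fail the check as well.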

What you want is to preprocess your input file with some external tool. On a GNU/Linux platform, sed might be the best companion, for example:

sed 's/~,~/\t/g' inputFile
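
If sed is not available (on Windows, for instance), the same preprocessing can be sketched in Python. This is only a rough sketch: the function name and the sample rows are made up for illustration, and it assumes that a line which already contains the new delimiter should be rejected rather than silently converted.

```python
import io


def convert_delimiter(src, dst, old="~,~", new="\t"):
    """Copy lines from src to dst, replacing `old` with `new`.

    Raises ValueError if a line already contains `new`, because the
    converted file would then be ambiguous for COPY.
    Returns the number of lines written.
    """
    count = 0
    for lineno, line in enumerate(src, start=1):
        if new in line:
            raise ValueError(f"line {lineno} already contains the new delimiter")
        dst.write(line.replace(old, new))
        count += 1
    return count


# Demo on in-memory data; with real files, pass open(...) handles instead.
source = io.StringIO("1~,~alice~,~10\n2~,~bob~,~20\n")
target = io.StringIO()
convert_delimiter(source, target)
print(target.getvalue())
```

The converted output can then be loaded with a plain COPY using the tab delimiter.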

Answer by Erwin Brandstetter

The obvious thing to do is what all the other answers advised: edit the import file. I would do that, too.

However, as a proof of concept, here are two ways to accomplish this without additional tools.

1) General solution

CREATE OR REPLACE FUNCTION f_import_file(OUT my_count integer)
  RETURNS integer AS
$BODY$
DECLARE
    myfile    text;  -- the file content is read into this variable
    datafile text := '\path\to\file.txt'; -- !pg_read_file only accepts relative path in database dir!
BEGIN

myfile := pg_read_file(datafile, 0, 100000000);  -- arbitrary 100 MB max.

INSERT INTO public.my_tbl
SELECT ('(' || regexp_split_to_table(replace(myfile, '~,~', ','), E'\n') || ')')::public.my_tbl;

-- !depending on file format, you might need additional quotes to create a valid format.

GET DIAGNOSTICS my_count = ROW_COUNT;

END;
$BODY$
  LANGUAGE plpgsql VOLATILE;

This uses a number of pretty advanced features. If anybody is actually interested and needs an explanation, leave a comment to this post and I will elaborate.
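
For readers who just want to see what the string handling in the function does, here is a rough Python sketch of the same transformation: replace the delimiter with a comma, split on newlines, and wrap each line in parentheses so that it looks like a row literal. The function name is hypothetical, and skipping empty lines is an assumption of this sketch that the SQL above does not make.

```python
def to_row_literals(file_content):
    """Replace '~,~' with ',' and wrap each non-empty line in parentheses.

    Mirrors replace(...) and regexp_split_to_table(..., E'\n') from the
    function above; the cast to the table's row type happens only in SQL.
    """
    unified = file_content.replace("~,~", ",")
    return ["(" + line + ")" for line in unified.split("\n") if line]


print(to_row_literals("1~,~2~,~3\n4~,~5~,~6\n"))
# prints ['(1,2,3)', '(4,5,6)']
```

This also shows why the comment about additional quotes matters: a field that itself contains a comma or a parenthesis would break the row literal unless it is quoted.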

2) Special case

If you can guarantee that '~' is only present in the delimiter '~,~', then you can go ahead with a plain COPY in this special case. Just treat the ',' in '~,~' as an additional column. Say your table looks like this:

CREATE TABLE foo (a int, b int, c int);

Then you can (in one transaction):

CREATE TEMP TABLE foo_tmp (
 a int, tmp1 "char"
,b int, tmp2 "char"
,c int)
ON COMMIT DROP;

COPY foo_tmp FROM '\path\to\file.txt' WITH DELIMITER AS '~';

ALTER TABLE foo_tmp DROP COLUMN tmp1;
ALTER TABLE foo_tmp DROP COLUMN tmp2;

INSERT INTO foo SELECT * FROM foo_tmp;
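
To see why the two extra "char" columns line up, here is a small Python sketch of what happens to one input line when it is split on '~' alone. The function name and the sample line are made up for illustration; the check on the filler fields is an extra safeguard of this sketch, not something COPY does.

```python
def split_special_case(line):
    """Split on '~' and keep every other field, dropping the ',' filler fields.

    Raises ValueError if a filler field is not exactly ',', which means that
    '~' also occurs outside the '~,~' delimiter.
    """
    fields = line.rstrip("\n").split("~")
    fillers = fields[1::2]
    if any(f != "," for f in fillers):
        raise ValueError("'~' appears outside the '~,~' delimiter")
    return fields[0::2]


print("1~,~2~,~3".split("~"))           # prints ['1', ',', '2', ',', '3']
print(split_special_case("1~,~2~,~3"))  # prints ['1', '2', '3']
```

The five fields of the raw split correspond to the columns a, tmp1, b, tmp2, c of foo_tmp; dropping tmp1 and tmp2 leaves the three real values.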

Answer by user606723

Not quite sure if you're looking for a postgresql solution or just a general one.

If it were me, I would open up a copy of vim (or gvim) and run the command :%s/~,~/~/g
That replaces all "~,~" with "~".

Answer by Umur Kontac?

You can use a single-character delimiter: open Notepad, press Ctrl+H, and replace ~,~ with something that will not interfere, like |.
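
Whether a replacement character "will not interfere" can be checked before editing the file. Below is a minimal Python sketch, assuming the whole file fits in memory; the function name, the candidate list, and the sample data are hypothetical.

```python
def pick_safe_delimiter(text, candidates="|\t;^"):
    """Return the first candidate that never occurs in the data itself.

    The existing '~,~' delimiters are removed first, so only the field
    contents are inspected. Returns None if every candidate is in use.
    """
    remaining = text.replace("~,~", "")
    for ch in candidates:
        if ch not in remaining:
            return ch
    return None


# The sample data contains '|' inside a field, so '|' is skipped here.
sample = "1~,~a|b~,~3\n2~,~c~,~4\n"
delim = pick_safe_delimiter(sample)
print(repr(delim))
print(sample.replace("~,~", delim))
```

If the function returns None, none of the candidates is safe and a different character has to be tried.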
