PostgreSQL: character with byte sequence 0x9d in encoding 'WIN1252' has no equivalent in encoding 'UTF8'

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/42130110/

Time: 2020-09-09 06:23:01  Source: igfitidea

Character with byte sequence 0x9d in encoding 'WIN1252' has no equivalent in encoding 'UTF8'

postgresql, encoding, utf-8

Asked by Sahil Doshi

I am reading a CSV file in my SQL script and copying its data into a PostgreSQL table. The line of code is below:

\copy participants_2013 from 'C:/Users/Acrotrend/Desktop/mip_sahil/mip/reelportdata/Participating_Individual_Extract_Report_MIPJunior_2013_160414135957.Csv' with CSV delimiter ',' quote '"' HEADER;

I am getting the following error: character with byte sequence 0x9d in encoding 'WIN1252' has no equivalent in encoding 'UTF8'.

Can anyone help me understand the cause of this issue and how I can resolve it?

Answered by Philip Couling

The problem is that 0x9D is not a valid byte value in WIN1252. There is a table of the code page here: https://en.wikipedia.org/wiki/Windows-1252
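To check this claim without reading the table, here is a small Python sketch. It assumes that Python's cp1252 codec (Python's name for Windows-1252) leaves the same positions undefined as PostgreSQL's WIN1252; the function name is made up for illustration.

```python
def undefined_cp1252_bytes() -> list:
    """Return every byte value that Python's cp1252 codec refuses to decode."""
    undefined = []
    for value in range(256):
        try:
            bytes([value]).decode("cp1252")
        except UnicodeDecodeError:
            # No character is assigned to this position in the code page.
            undefined.append(value)
    return undefined


if __name__ == "__main__":
    # Print the unassigned positions as hex so they can be compared with the table.
    print([hex(v) for v in undefined_cp1252_bytes()])
```

If 0x9d appears in the printed list, a file containing that byte cannot be pure Windows-1252 text.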

The problem may be that you are importing a UTF-8 file and PostgreSQL is defaulting to Windows-1252 (which I believe is the default on many Windows systems).
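One way a 0x9D byte can end up in a UTF-8 file is as part of a multi-byte character. The Python sketch below uses U+201D (the right curly double quote) purely as an assumed example; the question does not say which character is in the CSV. The helper `first_undecodable` is hypothetical.

```python
def first_undecodable(data: bytes, encoding: str):
    """Return (offset, byte value) of the first byte the codec rejects, or None."""
    try:
        data.decode(encoding)
    except UnicodeDecodeError as exc:
        return exc.start, data[exc.start]
    return None


# Assumed example character: a "smart quote" as produced by many office tools.
utf8_bytes = "\u201d".encode("utf-8")

if __name__ == "__main__":
    print(utf8_bytes.hex())                        # bytes of the UTF-8 encoding
    print(first_undecodable(utf8_bytes, "cp1252"))  # where a cp1252 reader fails
    print(first_undecodable(utf8_bytes, "utf-8"))   # None: valid as UTF-8
```

Read as Windows-1252, the same three bytes are treated as three separate characters, and the last one has no assignment, which matches the error in the question.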

You need to change the character set on your Windows command line with chcp before running the script. Or, in PostgreSQL, you can run:

SET CLIENT_ENCODING TO 'utf8';

Run this before importing the file.

Answered by Pavel Stehule

Every encoding has numeric ranges of valid codes. Are you sure your data is in WIN1252 encoding?
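A quick way to answer that question is to try a strict decode with each candidate encoding. This is a minimal Python sketch, not a full encoding detector; the function name and the sample data are made up for illustration.

```python
def strict_decodes(data: bytes, encodings=("utf-8", "cp1252")) -> dict:
    """Map each candidate encoding to True if `data` decodes without errors."""
    results = {}
    for name in encodings:
        try:
            data.decode(name)
            results[name] = True
        except UnicodeDecodeError:
            results[name] = False
    return results


# Hypothetical sample: a tiny CSV with curly quotes, saved as UTF-8.
sample = "name,quote\nAnna,\u201cbonjour\u201d\n".encode("utf-8")

if __name__ == "__main__":
    print(strict_decodes(sample))
```

For a real file, read it in binary mode (for example `open(path, "rb").read()`, where `path` is your CSV) and pass the bytes in. A file that decodes as UTF-8 but not as cp1252 should be imported with a UTF-8 client encoding.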

Postgres is very strict and refuses to import files with broken encoding. You can use iconv, which can work in a tolerant mode and remove broken characters. After cleaning the file with iconv, you can import it.
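If iconv is not available (it is not installed by default on Windows), the same kind of tolerant cleaning can be sketched in Python. This is a substitute for iconv rather than the tool the answer names, and it assumes the file is meant to be UTF-8; the function name is hypothetical. It behaves roughly like iconv's -c option, which omits invalid characters from the output.

```python
def drop_invalid_bytes(data: bytes, encoding: str = "utf-8") -> bytes:
    """Return `data` with every byte sequence that is invalid in `encoding` removed."""
    # errors="ignore" silently skips undecodable bytes instead of raising.
    text = data.decode(encoding, errors="ignore")
    return text.encode(encoding)


if __name__ == "__main__":
    broken = b"abc\x9ddef"  # a lone 0x9d is not valid UTF-8
    print(drop_invalid_bytes(broken))
```

Dropping bytes silently loses data, so it is worth comparing the file size before and after, and preferring the correct client encoding when the file is actually valid.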

Answered by isapir

Simply specify encoding 'UTF-8' as the encoding in the \copy command, e.g. (I broke it into two lines for readability, but keep it all on the same line):

\copy dest_table from 'C:/src-data.csv' 
                 (format csv, header true, delimiter ',', encoding 'UTF8');

More details:


The problem is that the client encoding is set to WIN1252, most likely because psql is running on a Windows machine, but the file contains a UTF-8 character.

You can check the client encoding with:

SHOW client_encoding;

 client_encoding
-----------------
 WIN1252