Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/1565234/

Time: 2020-09-10 22:21:55  Source: igfitidea

Character with encoding UTF8 has no equivalent in WIN1252

Tags: postgresql, encoding, utf-8, character

Asked by Monis Iqbal

I am getting the following exception:

Caused by: org.postgresql.util.PSQLException: ERROR: character 0xefbfbd of encoding "UTF8" has no equivalent in "WIN1252"

Is there a way to eradicate such characters, either via SQL or programmatically?
(SQL solution should be preferred).

I was thinking of connecting to the DB using WIN1252, but it will give the same problem.

Accepted answer by Tometzky

What are you doing when you get this message? Are you importing a file into Postgres? As devstuff said, it is a BOM character. This is the character Windows writes first in a text file when it is saved in UTF8 encoding. It is an invisible, zero-width character, so you will not see it when you open the file in a text editor.

Try opening the file in, for example, Notepad, save it with ANSI encoding, and add a set client_encoding to 'WIN1252'; line to the file (or replace a similar existing line).
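
If the offending character really is a leading BOM, it can also be dropped programmatically before the text is sent to the server. The following is a minimal Java sketch, assuming the file content has already been read into a String; the class and method names are illustrative and not part of any library.

```java
public final class BomStripper {
    // U+FEFF is the Unicode byte-order mark (BOM).
    private static final char BOM = '\uFEFF';

    /** Returns the text without a leading BOM; other text is returned unchanged. */
    public static String stripLeadingBom(String text) {
        if (text != null && !text.isEmpty() && text.charAt(0) == BOM) {
            return text.substring(1);
        }
        return text;
    }

    public static void main(String[] args) {
        // Hypothetical file content that starts with a BOM.
        String fromFile = "\uFEFFselect 1;";
        System.out.println(stripLeadingBom(fromFile)); // prints: select 1;
    }
}
```

Only a BOM at the very start is removed here; a U+FEFF in the middle of the text is left alone, since there it may be intentional.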

Answer by airstrike

I had a similar issue, and I solved it by setting the encoding to UTF8 with \encoding UTF8 in the client before attempting an INSERT INTO foo (SELECT * from bar WHERE x=y); statement. My client was using WIN1252 encoding but the database was in UTF8, hence the error.

More info is available on the PostgreSQL wiki under Character Set Support (devel docs).

Answer by MSalters

Don't eradicate the characters, they're real and used for good reasons. Instead, eradicate Win1252.

Answer by Maurice M

I had a very similar issue. I had a linked server from SQL Server to a PostgreSQL database. Some of the data in the table I was selecting from with an openquery statement contained a character that has no equivalent in Win1252. The problem was that the System DSN entry I had used for the connection (found under the ODBC Data Source Administrator) was configured to use PostgreSQL ANSI(x64) rather than PostgreSQL Unicode(x64). Creating a new data source with Unicode support, creating a new linked server that uses it, and referencing the new linked server in the openquery resolved the issue for me. Happy days.

Answer by devstuff

That looks like the byte sequence 0xBD, 0xBF, 0xEF as a little-endian integer. This is the UTF8-encoded form of the Unicode byte-order-mark (BOM) character 0xFEFF.

I'm not sure what PostgreSQL's normal behaviour is, but the BOM is normally used only for encoding detection at the beginning of an input stream, and is usually not returned as part of the result.
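
One way to check what the bytes in the error message stand for is to decode them directly. In the Java sketch below (class and method names are illustrative), the three bytes from the error, 0xEF 0xBF 0xBD, decode to U+FFFD, the Unicode replacement character, while the UTF-8 form of the BOM U+FEFF is 0xEF 0xBB 0xBF. It may therefore be worth verifying which of the two the data actually contains.

```java
import java.nio.charset.StandardCharsets;

public final class ErrorBytes {
    /** Decodes UTF-8 bytes and lists the resulting code points as U+XXXX. */
    public static String describe(byte[] utf8Bytes) {
        String decoded = new String(utf8Bytes, StandardCharsets.UTF_8);
        StringBuilder sb = new StringBuilder();
        decoded.codePoints().forEach(cp -> sb.append(String.format("U+%04X ", cp)));
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // Bytes reported in the PostgreSQL error: character 0xefbfbd
        byte[] fromError = {(byte) 0xEF, (byte) 0xBF, (byte) 0xBD};
        // UTF-8 encoding of the byte-order mark, for comparison
        byte[] utf8Bom = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF};

        System.out.println(describe(fromError)); // prints: U+FFFD
        System.out.println(describe(utf8Bom));   // prints: U+FEFF
    }
}
```

U+FFFD is typically inserted by a decoder in place of bytes it could not interpret, so its presence may point to data that was already damaged before it was stored.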

In any case, your exception is due to this code point not having a mapping in the Win1252 code page. This will occur with most other non-Latin characters too, such as those used in Asian scripts.
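
The missing mapping can be checked on the client side as well. The following Java sketch assumes the JDK in use provides the windows-1252 charset; the class and method names are hypothetical. It tests whether a string is representable in that code page and, as a last resort, drops the characters that are not.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

public final class Win1252Check {
    // Assumption: the running JDK provides the windows-1252 charset.
    private static final Charset WIN1252 = Charset.forName("windows-1252");

    /** True if every character of the text has a mapping in windows-1252. */
    public static boolean isRepresentable(String text) {
        return WIN1252.newEncoder().canEncode(text);
    }

    /** Returns the text with all code points removed that windows-1252 cannot encode. */
    public static String dropUnmappable(String text) {
        CharsetEncoder encoder = WIN1252.newEncoder();
        StringBuilder kept = new StringBuilder();
        text.codePoints()
            .filter(cp -> encoder.canEncode(new String(Character.toChars(cp))))
            .forEach(kept::appendCodePoint);
        return kept.toString();
    }

    public static void main(String[] args) {
        System.out.println(isRepresentable("caf\u00E9"));   // prints: true
        System.out.println(isRepresentable("\uFFFD"));      // prints: false
        System.out.println(dropUnmappable("a\uFFFDb"));     // prints: ab
    }
}
```

Dropping characters loses data, so this only makes sense when the client truly cannot use a Unicode encoding.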

Can you change the database encoding to be UTF8 instead of 1252? This will allow your columns to contain almost any character.

Answer by Christian A Strasser

I was able to get around it by using Postgres' substring function and selecting that instead:

select substring(comments from 1 for 200) from billing

The comment that the special character started each field was a great help in finally resolving it.

Answer by Maxime Langlois

This problem appeared for us around 19/11/2016 with our old Access 97 app accessing a PostgreSQL 9.1 DB.

This was solved by changing the driver to Unicode instead of ANSI (see plang's comment).

Answer by s6a6n6d6m6a6n

Here's what worked for me:
1. Enable ad hoc queries with sp_configure.
2. Add an ODBC DSN for your linked PostgreSQL server.
3. Make sure you have both the ANSI and Unicode (x64) drivers installed (try with both).
4. Run a query like the one below, changing the UID, server IP, database name, and password.
5. Keep the query in the last line in PostgreSQL syntax.

EXEC sp_configure 'show advanced options', 1
RECONFIGURE
GO
EXEC sp_configure 'ad hoc distributed queries', 1
RECONFIGURE
GO

-- The last argument is passed through to PostgreSQL, so it uses PostgreSQL syntax (e.g. LIMIT)
SELECT * FROM OPENROWSET('MSDASQL',
'Driver=PostgreSQL Unicode(x64);
uid=loginid;
Server=1.2.3.41;
port=5432;
database=dbname;
pwd=password',
'select * FROM table_name limit 10;')

Answer by yodi

I faced this issue when my Windows 10 was using Mandarin (China) as the default language. The problem occurred because I tried to import a database in UTF-8. Checking in psql with "\l" showed that Collate and Ctype were set to Mandarin (China).

The solution: reset the OS language to US English and re-install PostgreSQL. Once the collation is back to UTF-8, you can switch your OS language back again.

I wrote up the full context and solution here: https://www.yodiw.com/fix-utf8-encoding-win1252-cputf8-postgresql-windows-10/

Answer by Mahdi Ben Selimene

You can change the encoding.

Example

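Charset utf8 = Charset.forName("UTF-8");
String tmp = "text to store"; // placeholder for the String that will be saved in the PostgreSQL database
// Pass the charset to the String constructor too; without it the bytes are decoded with the platform default charset
String utfString = new String(tmp.getBytes(utf8), utf8);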

I use java.nio.charset.Charset to set the charset.