Original question: http://stackoverflow.com/questions/1377662/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverFlow
Encoding problems with ogr2ogr and Postgis/PostgreSQL database
Asked by Chau
In our organization, we handle GIS content in different file formats. I need to put these files into a PostGIS database, and that is done using ogr2ogr. The problem is that the database is UTF8 encoded, and the files might have a different encoding.
I found descriptions of how I can specify the encoding by adding an options parameter to ogr2ogr, but apparently it doesn't have any effect.
ogr2ogr -f PostgreSQL PG:"host=localhost user=username dbname=dbname \
password=password options='-c client_encoding=latin1'" sourcefile;
The error I receive is:
ERROR 1: ALTER TABLE "soer_vd" ADD COLUMN "m?ls?tning" CHAR(10)
ERROR: invalid byte sequence for encoding "UTF8": 0xe56c73
HINT: This error can also happen if the byte sequence does not match the encoding expected by the server, which is controlled by "client_encoding".
ERROR 1: ALTER TABLE "soer_vd" ADD COLUMN "p?virkning" CHAR(10)
ERROR: invalid byte sequence for encoding "UTF8": 0xe57669
HINT: This error can also happen if the byte sequence does not match the encoding expected by the server, which is controlled by "client_encoding".
ERROR 1: INSERT command for new feature failed.
ERROR: invalid byte sequence for encoding "UTF8": 0xf8
HINT: This error can also happen if the byte sequence does not match the encoding expected by the server, which is controlled by "client_encoding".
Currently, my source file is a Shapefile, and I'm pretty sure that it is Latin1 encoded.
What am I doing wrong here and can you help me?
Kind regards, Casper
Answered by Chau
Magnus is right, and I will discuss the solution here.
I have seen the option to inform PostgreSQL about character encoding, options='-c client_encoding=xxx', used in many places, but it does not seem to have any effect. If someone knows how this part works, feel free to elaborate.
Magnus suggested setting the environment variable PGCLIENTENCODING to LATIN1. According to a mailing list I queried, this can be done by modifying the call to ogr2ogr:
ogr2ogr --config PGCLIENTENCODING LATIN1 -f PostgreSQL \
PG:"host=hostname user=username dbname=databasename password=password" inputfile
This didn't do anything for me. What worked for me was to run the following before calling ogr2ogr:
SET PGCLIENTENCODING=LATIN1
It would be great to hear more details from experienced users and I hope it can help others :)
Answered by Magnus Hagander
That does sound like it would set the client encoding to LATIN1. Exactly what error do you get?
Just in case ogr2ogr doesn't pass it along properly, you can also try setting the environment variable PGCLIENTENCODING to latin1.
I suggest you double check that they are actually LATIN1. Simply running file on it will give you a good idea, assuming it's actually consistent within the file. You can also try sending it through iconv to convert it to either LATIN1 or UTF8.
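As a rough illustration of the iconv check suggested above, the sketch below feeds the bytes from the question's error message (0xe5 0x6c 0x73) through iconv. The file names sample_latin1.txt and sample_utf8.txt are made up for this example, and the sample text is a plain text file, not a Shapefile. Running iconv directly on a binary .dbf attribute file is probably not safe, so treat this only as a way to confirm which encoding a byte sequence belongs to.

```shell
# The bytes 0xE5 0x6C 0x73 from the error message, written with octal escapes.
# Check whether they are accepted as UTF-8 (iconv exits non-zero on invalid input).
if printf '\345ls' | iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1; then
  echo "bytes are valid UTF-8"
else
  echo "bytes are not valid UTF-8"
fi

# Hypothetical sample file holding Latin-1 bytes (0xE5, 0xE6, 0xF8).
printf 'm\345ls\346tning p\345virkning \370\n' > sample_latin1.txt

# Convert the sample from Latin-1 (ISO-8859-1) to UTF-8.
iconv -f ISO-8859-1 -t UTF-8 sample_latin1.txt > sample_utf8.txt

# Confirm that the converted file now passes a UTF-8 round trip.
iconv -f UTF-8 -t UTF-8 sample_utf8.txt >/dev/null && echo "converted file is valid UTF-8"
```

If the first check reports invalid UTF-8 while the Latin-1 conversion succeeds and produces readable text, that is consistent with the source data being Latin-1, although other single-byte encodings would pass the same test.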
Answered by mloskot
Currently, OGR from GDAL does not perform any recoding of character data during translation between vector formats. The team has prepared the document RFC 23.1: Unicode support in OGR, which discusses recoding support for OGR drivers. RFC 23 was adopted and the core functionality was already released in GDAL 1.6.0. However, most OGR drivers have not been updated, including the Shapefile driver.
For the time being, I would describe OGR as encoding agnostic and ignorant. That is, OGR takes what it gets and sends it out without any processing. OGR uses the char type to manipulate textual data. This is fine for handling multi-byte encoded strings (like UTF-8): it's just a plain stream of bytes stored as an array of char elements.
It is advised that developers of OGR drivers return UTF-8 encoded strings of attribute values; however, this rule has not been widely adopted across OGR drivers, so this functionality is not end-user ready yet.
Answered by Sylvain
You need to write your command line like this:
PGCLIENTENCODING=LATIN1 ogr2ogr -f PostgreSQL PG:"dbname=...
Answered by Roberto Marzocchi
On Windows, the command is:
SET PGCLIENTENCODING=LATIN1
On Linux:
export PGCLIENTENCODING=LATIN1
or
PGCLIENTENCODING=LATIN1
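The Linux forms shown in these answers behave differently with respect to child processes such as ogr2ogr. The sketch below is only a demonstration in a POSIX shell, using sh -c as a stand-in for ogr2ogr (an assumption made so the example runs without GDAL installed). It suggests that a bare assignment on its own line is not visible to a child process unless the variable was already exported.

```shell
# Start from a clean state so the demonstration is not affected by the caller.
unset PGCLIENTENCODING

# 1) Prefix form: the variable is set only for this one command.
PGCLIENTENCODING=LATIN1 sh -c 'echo "prefix form: ${PGCLIENTENCODING:-unset}"'

# 2) Bare assignment: a shell variable only, not passed to child processes.
PGCLIENTENCODING=LATIN1
sh -c 'echo "bare assignment: ${PGCLIENTENCODING:-unset}"'

# 3) export: visible to every child process started from this shell.
export PGCLIENTENCODING=LATIN1
sh -c 'echo "export: ${PGCLIENTENCODING:-unset}"'
```

In practice this means the prefix form or export is the safer choice before calling ogr2ogr from a Linux shell.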
Moreover, this discussion helped me:
On Windows,
SET PGCLIENTENCODING=LATIN1 ogr2ogr...
does not give me any error, but ogr2ogr does not work. I needed to change the system variable (e.g. System --> Advanced system settings --> Environment variables --> New system variable), reboot the system, and then run
ogr2ogr...
Answered by Sivaguru
I solved this problem using this command:
pg_restore --host localhost --port 5432 --username postgres --dbname {DBNAME} --schema public --verbose "{FILE_PATH to import}"
I don't know if this is the right solution, but it worked for me.