postgresql COPY and CSV data w/ double quotes
Original source: http://stackoverflow.com/questions/10195749/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverFlow
Asked by Wells
Example CSV line:
"2012","Test User","ABC","First","71.0","","","0","0","3","3","0","0","","0","","","","","0.1","","4.0","0.1","4.2","80.8","847"
All values after "First" are numeric columns. Lots of NULL values, each just written as a quoted empty string ("").
Attempt at COPY:
copy mytable from 'myfile.csv' with csv header quote '"';
NOPE: ERROR: invalid input syntax for type numeric: ""
Well, yeah. It's a null value. Attempt 2 at COPY:
copy mytable from 'myfile.csv' with csv header quote '"' null '""';
NOPE: ERROR: CSV quote character must not appear in the NULL specification
What's a fella to do? Strip out all double quotes from the file before running COPY? Can do that, but I figured there's a proper solution to what must be an incredibly common problem.
Accepted answer by kgrittn
While some database products treat an empty string as a NULL value, the standard says that they are distinct, and PostgreSQL treats them as distinct.
It would be best if you could generate your CSV file with an unambiguous representation. While you could use sed or something similar to filter the file into a good format, the other option would be to COPY the data into a table where a text column could accept the empty strings, and then populate the target table from it. The NULLIF function (http://www.postgresql.org/docs/9.1/interactive/functions-conditional.html#FUNCTIONS-NULLIF) may help with that: it will return NULL if both arguments match and the first value if they don't. So, something like NULLIF(txtcol, '')::numeric might work for you.
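With 26 columns, typing the NULLIF expressions by hand gets tedious, so here is a rough sketch that generates the two statements for the staging-table approach. The staging table name and the column names (yr, username, score, hits) are made up for illustration, and the helper assumes the text columns come before the numeric ones, as in the sample line from the question.

```python
def staging_sql(target, staging, text_cols, numeric_cols):
    """Build SQL for loading through an all-text staging table.

    Names are interpolated as-is, so pass trusted identifiers only.
    """
    all_cols = list(text_cols) + list(numeric_cols)
    # Every staging column is text, so COPY accepts "" without complaint.
    create = "CREATE TEMP TABLE {} ({});".format(
        staging, ", ".join(c + " text" for c in all_cols))
    # Text columns pass through unchanged; numeric ones turn '' into NULL.
    exprs = list(text_cols) + [
        "NULLIF({}, '')::numeric".format(c) for c in numeric_cols]
    insert = "INSERT INTO {} ({}) SELECT {} FROM {};".format(
        target, ", ".join(all_cols), ", ".join(exprs), staging)
    return create, insert


# Hypothetical table and column names, for illustration only.
create, insert = staging_sql("mytable", "mytable_staging",
                             ["yr", "username"], ["score", "hits"])
print(create)
print(insert)
```

Run the CREATE statement first, then COPY the file into the staging table with the same options as the first attempt in the question, then run the INSERT.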
Answer by Adam Greenhall
As an alternative, using
sed 's/""//g' myfile.csv > myfile-formatted.csv
psql
# copy mytable from 'myfile-formatted.csv' with csv header;
works as well.
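One thing to watch with the sed expression: it removes every pair of double quotes, including the doubled quotes that CSV uses to escape a quote inside a quoted field. If the data might contain those, a CSV-aware rewrite is safer. A minimal sketch with Python's csv module follows; the function name is made up, and it relies on QUOTE_MINIMAL writing empty fields without quotes, which COPY in CSV mode reads as NULL by default.

```python
import csv
import io


def unquote_empty_fields(src, dst):
    """Copy CSV rows from src to dst, quoting only the fields that need it."""
    reader = csv.reader(src)
    writer = csv.writer(dst, quoting=csv.QUOTE_MINIMAL, lineterminator="\n")
    for row in reader:
        # Empty strings are written as bare empty fields (nothing between commas).
        writer.writerow(row)


# Shortened version of the sample line from the question.
sample = io.StringIO('"2012","Test User","First","71.0","","0"\n')
result = io.StringIO()
unquote_empty_fields(sample, result)
print(result.getvalue())
```

For real files, open both with newline='' as the csv module documentation recommends, and pass the file objects instead of the StringIO buffers.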
Answer by Chris
I think all you need to do here is the following:
COPY mytable FROM '/dir/myfile.csv' WITH DELIMITER ',' NULL '' CSV HEADER QUOTE '"';
Answer by Abdel
COPY mytable from '/dir/myfile.csv' DELIMITER ',' NULL ''
WITH CSV HEADER FORCE QUOTE *;
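Worth noting: in CSV mode, COPY compares the NULL string only against unquoted values, so NULL '' on its own still leaves "" as an empty string. PostgreSQL 9.4 and later document a FORCE_NULL option for COPY FROM that applies the comparison to quoted values in the listed columns as well. A minimal sketch that builds such a statement, assuming a 9.4+ server and hypothetical column names (score, hits):

```python
def copy_force_null_sql(table, path, null_cols):
    """Build a COPY ... FROM statement that reads quoted empty strings as NULL.

    Assumes PostgreSQL 9.4 or later; table, path and columns are interpolated
    as-is, so pass trusted values only.
    """
    return ("COPY {} FROM '{}' WITH (FORMAT csv, HEADER, FORCE_NULL ({}));"
            .format(table, path, ", ".join(null_cols)))


# Hypothetical column names; list every numeric column that can contain "".
sql = copy_force_null_sql("mytable", "/dir/myfile.csv", ["score", "hits"])
print(sql)
```

The generated statement can be pasted into psql as-is; only the columns listed in FORCE_NULL get the quoted-empty-string treatment.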
Answer by Juanse
This worked for me in Python 3.8.X
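import csv
from io import StringIO

import psycopg2

# t_host, t_port, t_dbname, t_user and t_pw are connection settings defined elsewhere.
db_conn = psycopg2.connect(host=t_host, port=t_port,
                           dbname=t_dbname, user=t_user, password=t_pw)
cur = db_conn.cursor()

csv.register_dialect('myDialect',
                     delimiter=',',
                     skipinitialspace=True,
                     quoting=csv.QUOTE_MINIMAL)

with open('files/emp.csv') as f:
    next(f)  # skip the header row
    reader = csv.reader(f, dialect='myDialect')
    # Re-write the rows so that quotes are kept only where a field needs them.
    buffer = StringIO()
    writer = csv.writer(buffer, dialect='myDialect')
    writer.writerows(reader)

buffer.seek(0)
# copy_from sends the buffer in PostgreSQL's text format (not CSV),
# so this relies on no field containing a comma or a quote.
cur.copy_from(buffer, 'personnes', sep=',', columns=('nom', 'prenom', 'telephone', 'email'))
db_conn.commit()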