postgresql - Psycopg2 "copy_from" command: is it possible to ignore the delimiter inside quotes (getting an error)?
Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use/share it, but you must do so under the same CC BY-SA license and attribute it to the original authors (not me): StackOverflow
Original question: http://stackoverflow.com/questions/27055634/
Psycopg2 "copy_from" command, possible to ignore delimiter in quote (getting error)?
Asked by wouldbesmooth
I am trying to load rows of data into postgres in a csv-like structure using the copy_from command (the psycopg2 function that wraps the COPY command in postgres). My data is delimited with commas (and unfortunately, since I am not the data owner, I cannot just change the delimiter). I run into a problem when I try to load a row that has a quoted value containing a comma (i.e. that comma should not be treated as a delimiter).
For example this row of data is fine:
",Madrid,SN,,SEN,,,SN,173,157"
This row of data is not fine:
","Dominican, Republic of",MC,,YUO,,,MC,65,162",
Some code:
conn = get_psycopg_conn()
cur = conn.cursor()
_io_buffer.seek(0) #This buffer is holding the csv-like data
cur.copy_from(_io_buffer, str(table_name), sep=',', null='', columns=column_names)
conn.commit()
Answered by Craig Ringer
It looks like copy_from doesn't expose the csv mode or quote options, which are available from the underlying PostgreSQL COPY command. So you'll need to either patch psycopg2 to add them, or use copy_expert.
I haven't tried it, but something like
curs.copy_expert("""COPY mytable FROM STDIN WITH (FORMAT CSV)""", _io_buffer)
might be sufficient.
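Adapted to the question's snippet, a rough, untested sketch might look like the following (table_name, column_names, _io_buffer, cur, and conn are the question's own variables; the NULL '' option is assumed here only to mirror the null='' argument passed to copy_from):
copy_sql = "COPY {} ({}) FROM STDIN WITH (FORMAT CSV, NULL '')".format(
    str(table_name), ', '.join(column_names))

_io_buffer.seek(0)                     # rewind the buffer before handing it to COPY
cur.copy_expert(copy_sql, _io_buffer)  # copy_expert runs an arbitrary COPY ... FROM STDIN statement
conn.commit()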
Answered by jhtravis
I had this same error and was able to get close to a fix based on the single line of code listed by craig-ringer. The other item I needed was to include quotes for the initial object by using df.to_csv(index=False, header=False, quoting=csv.QUOTE_NONNUMERIC, sep=','), and specifically quoting=csv.QUOTE_NONNUMERIC.
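For context, the quoting constants come from Python's standard csv module; a minimal standalone sketch (with made-up example values, not taken from the answer) of what QUOTE_NONNUMERIC does to a row containing a comma:
import csv
import io

buf = io.StringIO()
writer = csv.writer(buf, quoting=csv.QUOTE_NONNUMERIC)
writer.writerow(['Dominican, Republic of', 'MC', 65])
print(buf.getvalue())  # "Dominican, Republic of","MC",65 -- strings are quoted, numbers are left bare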
The full example of pulling one data source from MySQL and storing it in Postgres is below:
# run in python 3.6
import MySQLdb
import psycopg2
import os
from io import StringIO
import pandas as pd
import csv

mysql_db = MySQLdb.connect(host="host_address",  # your host, usually localhost
                           user="user_name",     # your username
                           passwd="source_pw",   # your password
                           db="source_db")       # name of the database

postgres_db = psycopg2.connect("host=dest_address dbname=dest_db_name user=dest_user password=dest_pw")

my_list = ['1','2','3','4']

# you must create a Cursor object. It will let you execute all the queries you need
mysql_cur = mysql_db.cursor()
postgres_cur = postgres_db.cursor()

for item in my_list:
    # Pull cbi data for each state and write it to postgres
    print(item)
    mysql_sql = 'select * from my_table t \
                 where t.important_feature = \'' + item + '\';'
    # Do something to create your dataframe here...
    df = pd.read_sql_query(mysql_sql, mysql_db)

    # Initialize a string buffer
    sio = StringIO()
    sio.write(df.to_csv(index=False, header=False, quoting=csv.QUOTE_NONNUMERIC, sep=','))  # Write the Pandas DataFrame as a csv to the buffer
    sio.seek(0)  # Be sure to reset the position to the start of the stream

    # Copy the string buffer to the database, as if it were an actual file
    with postgres_db.cursor() as c:
        print(c)
        c.copy_expert("""COPY schema.new_table FROM STDIN WITH (FORMAT CSV)""", sio)
        postgres_db.commit()

mysql_db.close()
postgres_db.close()