Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/27055634/

Time: 2020-10-21 01:41:25  Source: igfitidea

Psycopg2 "copy_from" command, possible to ignore delimiter in quote (getting error)?

python postgresql psycopg2

Asked by wouldbesmooth

I am trying to load rows of data in a csv-like structure into postgres using the copy_from command (the function that wraps the copy command in postgres). My data is delimited with commas (and unfortunately, since I am not the data owner, I cannot just change the delimiter). I run into a problem when I try to load a row that has a quoted value containing a comma (i.e. that comma should not be treated as a delimiter).

For example this row of data is fine:

",Madrid,SN,,SEN,,,SN,173,157"

This row of data is not fine:

","Dominican, Republic of",MC,,YUO,,,MC,65,162",

Some code:

    conn = get_psycopg_conn()
    cur = conn.cursor()

    _io_buffer.seek(0) #This buffer is holding the csv-like data
    cur.copy_from(_io_buffer, str(table_name), sep=',', null='', columns=column_names)
    conn.commit()

Answered by Craig Ringer

It looks like copy_from doesn't expose the csv mode or quote options, which are available from the underlying PostgreSQL COPY command. So you'll need to either patch psycopg2 to add them, or use copy_expert.
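
To see why the quoted comma trips up a plain delimiter split, here is a small sketch using only Python's standard library. It assumes that without csv mode every comma counts as a field boundary, which str.split imitates here. The sample row is a tidied-up version of the failing row from the question, not the asker's exact data.

```python
import csv
import io

# Tidied-up version of the failing row (an assumption, not the asker's exact data).
row = '"Dominican, Republic of",MC,,YUO,,,MC,65,162'

# Splitting on every comma ignores quotes, so the quoted value is cut in two.
naive_fields = row.split(",")

# A CSV-aware parser keeps the quoted value together as one field.
csv_fields = next(csv.reader(io.StringIO(row)))

print(len(naive_fields), naive_fields[:2])
print(len(csv_fields), csv_fields[:2])
```

The naive split yields one field more than the CSV-aware parse, which is the kind of column-count mismatch that makes the load fail.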

I haven't tried it, but something like

curs.copy_expert("""COPY mytable FROM STDIN WITH (FORMAT CSV)""", _io_buffer)

might be sufficient.
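
If the column list from the question's copy_from call is still needed, the COPY statement can be built by hand. The following is only a sketch under stated assumptions: quote_ident, build_copy_sql and copy_csv are hypothetical helper names, identifiers are quoted by simple double-quote escaping, and the only psycopg2 call it relies on is cursor.copy_expert(sql, file). In CSV format PostgreSQL reads an unquoted empty field as NULL by default, which lines up with the null='' used in the question.

```python
def quote_ident(name):
    """Quote an SQL identifier by doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def build_copy_sql(table_name, column_names):
    """Build a COPY ... FROM STDIN statement that parses the input as CSV."""
    columns = ", ".join(quote_ident(c) for c in column_names)
    return "COPY {} ({}) FROM STDIN WITH (FORMAT CSV)".format(
        quote_ident(table_name), columns
    )


def copy_csv(cursor, buffer, table_name, column_names):
    """Rewind the buffer and stream it through cursor.copy_expert."""
    buffer.seek(0)
    cursor.copy_expert(build_copy_sql(table_name, column_names), buffer)
```

With the variables from the question this would be called as copy_csv(cur, _io_buffer, str(table_name), column_names), followed by conn.commit(). Quoted identifiers are case-sensitive, and a schema-qualified table name would need each part quoted separately, so this helper may need adjusting for a real schema.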

Answered by jhtravis

I had this same error and was able to get close to a fix based on the single line of code listed by craig-ringer. The other thing I needed was to quote the values in the initial object by using df.to_csv(index=False, header=False, quoting=csv.QUOTE_NONNUMERIC, sep=','), and specifically quoting=csv.QUOTE_NONNUMERIC.
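
The effect of quoting=csv.QUOTE_NONNUMERIC can be seen without pandas. The sketch below uses csv.writer from the standard library as a stand-in for df.to_csv (an assumption made to keep the example self-contained), and the sample row is made up for illustration.

```python
import csv
import io

# Made-up sample row: one string containing a comma, one plain string, two numbers.
row = ["Dominican, Republic of", "MC", 65, 162]

buffer = io.StringIO()
writer = csv.writer(buffer, quoting=csv.QUOTE_NONNUMERIC)
writer.writerow(row)  # every non-numeric field is wrapped in double quotes

line = buffer.getvalue()
print(line)

# Reading the line back with a CSV parser recovers four fields, comma intact.
fields = next(csv.reader(io.StringIO(line)))
print(fields)
```

Because the string fields are quoted on the way out, a CSV-aware reader such as COPY ... WITH (FORMAT CSV) can tell the comma inside the value apart from the delimiters.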

The full example of pulling one data source from MySQL and storing it in Postgres is below:

#run in python 3.6
import MySQLdb
import psycopg2
import os
from io import StringIO
import pandas as pd
import csv

mysql_db = MySQLdb.connect(host="host_address",# your host, usually localhost
                     user="user_name",         # your username
                     passwd="source_pw",  # your password
                     db="source_db")       # name of the data base

postgres_db = psycopg2.connect("host=dest_address dbname=dest_db_name user=dest_user password=dest_pw")

my_list = ['1','2','3','4']

# you must create a Cursor object. It will let you execute all the queries you need
mysql_cur = mysql_db.cursor()
postgres_cur = postgres_db.cursor()

for item in my_list:
  # Pull the rows for each item and write them to postgres
  print(item)
  # Pass the value as a query parameter instead of concatenating it into the SQL
  mysql_sql = 'select * from my_table t where t.important_feature = %s;'

  # Do something to create your dataframe here...
  df = pd.read_sql_query(mysql_sql, mysql_db, params=(item,))

  # Initialize a string buffer
  sio = StringIO()
  sio.write(df.to_csv(index=False,header=False, quoting=csv.QUOTE_NONNUMERIC,sep=','))  # Write the Pandas DataFrame as a csv to the buffer
  sio.seek(0)  # Be sure to reset the position to the start of the stream

  # Copy the string buffer to the database, as if it were an actual file
  with postgres_db.cursor() as c:
      print(c)
      c.copy_expert("""COPY schema.new_table FROM STDIN WITH (FORMAT CSV)""", sio)
      postgres_db.commit()

mysql_db.close()
postgres_db.close()