postgresql: Copy (from) CSV with headers in Postgres using Python

Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. If you use or share it, you must do so under the same CC BY-SA terms and attribute it to the original authors (not me). Original question: http://stackoverflow.com/questions/25566386/

Date: 2020-10-21 01:36:01 · Source: igfitidea

Copy (from) CSV with headers in Postgres with Python

Tags: python, sql, postgresql, csv, psycopg2

Asked by Pablo Pardo

I'm trying to fill a table from CSV files in a python script.

The SQL statement, which follows, runs without error:

COPY registro
FROM '/home/pablo/Escritorio/puntos/20140227.csv'
DELIMITER ','
CSV header;

The CSV has headers, and with the HEADER option the statement imports it without error.

The problem comes when I execute it from my Python script. The only way I've found to avoid importing the headers is the copy_expert() method. I get no error message, but the table is still empty after I run the Python script below.

Any possible clue? Or maybe any other way to copy a table from CSV with headers?

Thanks.

#!/usr/bin/env python
# -*- coding: utf-8 -*-
import psycopg2
import os
import glob
DSN = "dbname=gps user=postgres host=localhost"
con = psycopg2.connect(DSN)
cur = con.cursor()
my_file = open('/home/pablo/Escritorio/puntos/20140227.csv')
#This is only a test file, not all the directory
sql = "COPY registro FROM stdin DELIMITER ',' CSV HEADER;"
cur.copy_expert(sql, my_file)
cur.close()
con.close()

Answered by Seth

I'd try con.commit() after cur.copy_expert().

Also, if the dataset is large, I would avoid preprocessing and uploading the file row by row as Sam P. suggested; cur.copy_expert() is significantly faster.

import psycopg2

conn = psycopg2.connect('postgresql://scott:tiger@localhost:5432/database')
cur = conn.cursor()
copy_sql = """
           COPY table_name FROM stdin WITH CSV HEADER
           DELIMITER as ','
           """
with open(path, 'r') as f:
    cur.copy_expert(sql=copy_sql, file=f)
    conn.commit()
    cur.close()

Answered by Sam P

I would recommend dealing with the CSV file in Python first. It's best to structure the data pulled from the CSV file into rows/columns (in Python this will be a nested list, or a list of tuples); then you can construct and execute SQL commands based on that data iteratively.

Use the csv library to interact with the CSV file; take a look at the documentation here: https://docs.python.org/2/library/csv.html. It's very user friendly and will help with a lot of your problems.

Here's a way to do it without csv (as I can't remember all the functions off the top of my head); however, it would be best not to use this approach:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
import psycopg2

DSN = "dbname=gps user=postgres host=localhost"
con = psycopg2.connect(DSN)
cur = con.cursor()

path = '/home/pablo/Escritorio/puntos/20140227.csv'
delimiter = ','

# 'rb' used as I don't know the encoding of your file;
# just use 'r' if it's in utf-8 or a known/consistent charset
with open(path, 'rb') as open_file:
    my_file = open_file.read().decode('utf-8', 'ignore')

data = my_file.splitlines()
data = [r.split(delimiter) for r in data]

data = data[1:]  # get rid of headers

for r in data:
    # build and execute an INSERT command for each row, e.g.:
    # cur.execute("INSERT INTO registro VALUES (%s, %s)", r)
    pass
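Following the csv-library suggestion above, here is a minimal sketch of the parsing step (the read_rows helper is hypothetical, and the commented-out INSERT assumes the two-column registro layout from the question; adapt it to your schema):

```python
import csv

def read_rows(path):
    """Parse a CSV file, skip the header row, and return a list of row tuples."""
    with open(path, newline='') as f:
        reader = csv.reader(f)
        next(reader)                      # skip the header row
        return [tuple(row) for row in reader]

# The parsed rows can then be inserted with a parameterized query,
# which avoids building SQL strings by hand:
#
#   rows = read_rows('/home/pablo/Escritorio/puntos/20140227.csv')
#   cur.executemany("INSERT INTO registro VALUES (%s, %s)", rows)
#   con.commit()
```

Parameterized queries also let the driver handle quoting and escaping, rather than concatenating raw field values into SQL.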