How to update selected rows with values from a CSV file in Postgres?
Original source: http://stackoverflow.com/questions/8910494/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on StackOverflow.
Asked by user519753
I'm using Postgres and would like to run a big update query that picks up its values from a CSV file. Let's say I have a table with the columns (id, banana, apple).
I'd like to run an update that changes the bananas and not the apples. Each new banana value and its ID would be in a CSV file.
I tried looking at the Postgres site, but the examples are killing me.
Answered by Erwin Brandstetter
COPY the file to a temporary staging table and update the actual table from there. Like:
CREATE TEMP TABLE tmp_x (id int, apple text, banana text); -- but see below
COPY tmp_x FROM '/absolute/path/to/file' (FORMAT csv);
UPDATE tbl
SET banana = tmp_x.banana
FROM tmp_x
WHERE tbl.id = tmp_x.id;
DROP TABLE tmp_x; -- else it is dropped at end of session automatically
If the imported table matches the table to be updated exactly, this may be convenient:
CREATE TEMP TABLE tmp_x AS SELECT * FROM tbl LIMIT 0;
Creates an empty temporary table matching the structure of the existing table, without constraints.
Privileges
SQL COPY requires superuser privileges for this; Postgres 11 and later also allow it for members of the predefined role pg_read_server_files. The manual (as it read for older versions):
COPY naming a file or command is only allowed to database superusers, since it allows reading or writing any file that the server has privileges to access.
The psql meta-command \copy works for any db role. The manual:
Performs a frontend (client) copy. This is an operation that runs an SQL COPY command, but instead of the server reading or writing the specified file, psql reads or writes the file and routes the data between the server and the local file system. This means that file accessibility and privileges are those of the local user, not the server, and no SQL superuser privileges are required.
The scope of temporary tables is limited to a single session of a single role, so the above has to be executed in the same psql session:
CREATE TEMP TABLE ...;
\copy tmp_x FROM '/absolute/path/to/file' (FORMAT csv);
UPDATE ...;
If you are scripting this in a bash command, be sure to wrap it all in a single psql call. Like:
echo 'CREATE TEMP TABLE tmp_x ...; \copy tmp_x FROM ...; UPDATE ...;' | psql
Normally, you need the meta-command \\ to switch between psql meta-commands and SQL commands in psql, but \copy is an exception to this rule. The manual again:
Special parsing rules apply to the \copy meta-command. Unlike most other meta-commands, the entire remainder of the line is always taken to be the arguments of \copy, and neither variable interpolation nor backquote expansion are performed in the arguments.
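Because the rest of the line belongs to \copy, a conservative way to script this is to give \copy a line of its own and feed the whole script to one psql process. The following is only a sketch under assumptions: the function names build_psql_script and run_update and the dbname parameter are hypothetical, the table and column names are the ones used above, and the CSV path is assumed to contain no single quotes.

```python
import subprocess


def build_psql_script(csv_path):
    """Return a psql script in which \\copy sits on a line of its own."""
    return "\n".join([
        "CREATE TEMP TABLE tmp_x (id int, apple text, banana text);",
        # \copy takes the whole remainder of its line as arguments,
        # so no other statement shares this line.
        f"\\copy tmp_x FROM '{csv_path}' (FORMAT csv)",
        "UPDATE tbl SET banana = tmp_x.banana FROM tmp_x WHERE tbl.id = tmp_x.id;",
    ]) + "\n"


def run_update(csv_path, dbname):
    """Pipe the script into a single psql call, i.e. a single session."""
    script = build_psql_script(csv_path)
    # -X skips ~/.psqlrc; ON_ERROR_STOP=1 stops the script at the first error.
    return subprocess.run(
        ["psql", "-X", "-v", "ON_ERROR_STOP=1", "-d", dbname],
        input=script, text=True, check=True,
    )
```

Since \copy reads the file on the client side, csv_path here refers to the machine running psql, not the database server.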
Big tables
If the import table is big, it may pay to increase temp_buffers temporarily for the session (first thing in the session):
SET temp_buffers = '500MB'; -- example value
Add an index to the temporary table:
CREATE INDEX tmp_x_id_idx ON tmp_x(id);
And run ANALYZE manually, since temporary tables are not covered by autovacuum / auto-analyze.
ANALYZE tmp_x;
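The steps above (temp table, client-side copy, index, ANALYZE, UPDATE) can also be driven from Python in a single session. This is a minimal sketch, not the answer's own method: it assumes conn is an open psycopg2 connection, uses psycopg2's cursor.copy_expert with COPY ... FROM STDIN as the client-side counterpart of \copy, and the function name update_from_csv is hypothetical.

```python
def update_from_csv(conn, csv_path):
    """Load csv_path into a temp table and update tbl.banana from it.

    conn is assumed to be an open psycopg2 connection. Returns the
    number of rows reported as updated.
    """
    cur = conn.cursor()
    try:
        cur.execute("CREATE TEMP TABLE tmp_x (id int, apple text, banana text)")
        with open(csv_path) as f:
            # COPY ... FROM STDIN streams the file from the client,
            # so no server-side file access is involved.
            cur.copy_expert("COPY tmp_x FROM STDIN (FORMAT csv)", f)
        # Index and statistics help the join when the import is large.
        cur.execute("CREATE INDEX tmp_x_id_idx ON tmp_x (id)")
        cur.execute("ANALYZE tmp_x")
        cur.execute(
            "UPDATE tbl SET banana = tmp_x.banana "
            "FROM tmp_x WHERE tbl.id = tmp_x.id"
        )
        updated = cur.rowcount
        cur.execute("DROP TABLE tmp_x")
        conn.commit()
        return updated
    except Exception:
        conn.rollback()
        raise
    finally:
        cur.close()

# Possible usage (connection parameters are placeholders):
#   import psycopg2
#   conn = psycopg2.connect("dbname=mydb")
#   update_from_csv(conn, "/absolute/path/to/file")
```

Everything runs on one connection, which matters because the temporary table is only visible within the session that created it.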
Answered by Anupama V Iyengar
You can try the code below, written in Python. The input file is the CSV file whose contents you want to write into the table. Each row is split on commas, so for each row, row[0] is the value in the first column, row[1] is the value in the second column, and so on.
import csv
import os
import psycopg2

filepath = '/path/to/your/data_to_be_updated.csv'
conn = None  # defined up front so the finally block can test it safely
try:
    conn = psycopg2.connect("host=localhost dbname=prodmealsdb user=postgres password=blank")
    cur = conn.cursor()
    ext = os.path.splitext(filepath)[-1].lower()
    if ext == '.csv':
        with open(filepath, newline='') as csvfile:
            next(csvfile)  # skip the header line
            readCSV = csv.reader(csvfile, delimiter=',')
            for row in readCSV:
                print(row[3], row[5])
                # row[3] holds the id, row[5] the new value
                cur.execute(
                    "UPDATE your_table SET column_to_be_updated = %s WHERE id = %s",
                    (row[5], row[3]))
        conn.commit()  # commit once, after all rows have been processed
    cur.close()
except (Exception, psycopg2.DatabaseError) as error:
    print(error)
finally:
    if conn is not None:
        conn.close()