database: Insert SQL statements via command line without reopening the connection to the remote database

Question

Asked by aza07

I have a large number of data files to process and store in a remote database. Each line of a data file represents a row in the database, but must be formatted before it is inserted.

My first solution was to process the data files with bash scripts that produce SQL files, and then import those SQL files into the database. This seems to be too slow and, as you can see, involves the extra step of creating an intermediary SQL file.

My second solution was to write bash scripts that, while processing each line of the data file, create an INSERT INTO ... statement and send it to the remote database:

echo sql_statement | psql -h remote_server -U username -d database

i.e. it does not create an SQL file. This solution, however, has one major issue that I am seeking advice on:
Each time, I have to reconnect to the remote database to insert one single row.

Is there a way to connect to the remote database, stay connected and then "pipe" or "send" the insert-SQL-statement without creating a huge SQL file?

Answer 1

Answered by Erwin Brandstetter

Answer to your actual question

Yes. You can use a named pipe instead of creating a file. Consider the following demo.

Create a schema x in my database event for testing:

-- DROP SCHEMA x CASCADE;
CREATE SCHEMA x;
CREATE TABLE x.x (id int, a text);

Create a named pipe (fifo) from the shell like this:

postgres@db:~$ mkfifo --mode=0666 /tmp/myPipe

Either 1) call the SQL command COPY using a named pipe on the server:

postgres@db:~$ psql event -p5433 -c "COPY x.x FROM '/tmp/myPipe'"

This will acquire a lock on the table x.x in the database (COPY FROM takes a ROW EXCLUSIVE lock, like any command that modifies data). The connection stays open until the fifo gets data. Be careful not to leave this open for too long! You can call this after you have filled the pipe to minimize blocking time. You can choose the sequence of events. The command executes as soon as two processes bind to the pipe. The first waits for the second.

Or 2) you can execute SQL from the pipe on the client:

postgres@db:~$ psql event -p5433 -f /tmp/myPipe

This is better suited to your case. Also, no table locks are taken until the SQL is executed in one piece.

Bash will appear blocked. It is waiting for input to the pipe. To do it all from one bash instance, you can send the waiting process to the background instead. Like this:

postgres@db:~$ psql event -p5433 -f /tmp/myPipe 2>&1 &

Either way, from the same bash or a different instance, you can fill the pipe now.
Demo with three rows for variant 1):

postgres@db:~$ (printf '1\tfoo\n'; printf '2\tbar\n'; printf '3\tbaz\n') >> /tmp/myPipe  # \t writes a real tab; one redirection keeps the pipe open for all three rows

(Take care to use tabs as delimiters, or instruct COPY to accept a different delimiter using WITH DELIMITER 'delimiter_character'.)
That will trigger the pending psql with the COPY command to execute and return:

COPY 3

Demo for variant 2):

postgres@db:~$ (echo -n "INSERT INTO x.x VALUES (1,'foo')"; echo -n ",(2,'bar')"; echo ",(3,'baz')") >> /tmp/myPipe

INSERT 0 3

Delete the named pipe after you are done:

postgres@db:~$ rm /tmp/myPipe
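
The steps for variant 2) can be collected into one script. What follows is a minimal sketch, assuming bash; the helper name stream_through_fifo and the choice to pass the consumer command as arguments are illustrative and not part of the demo above. The consumer would be something like `psql event -p5433 -f`, and the helper appends the pipe path as the last argument.

```shell
#!/usr/bin/env bash
# Sketch: feed a stream of SQL to one long-lived consumer through a named pipe.
# stream_through_fifo is a hypothetical helper; the consumer command is passed
# as arguments and receives the pipe path as its last argument.
stream_through_fifo() {
  local dir fifo status consumer_pid
  dir=$(mktemp -d) || return 1
  fifo="$dir/myPipe"
  mkfifo "$fifo" || return 1

  # Start the consumer in the background; it blocks until a writer opens the pipe.
  "$@" "$fifo" < /dev/null &
  consumer_pid=$!

  # Open the pipe once for writing and copy stdin into it. With a single
  # writer, the consumer sees end-of-file only when all input has been sent.
  cat > "$fifo"

  # Wait for the consumer to finish, then remove the pipe.
  wait "$consumer_pid"
  status=$?
  rm -rf "$dir"
  return "$status"
}

# Example (needs a reachable database):
#   some_generator | stream_through_fifo psql event -p5433 -f
```

One caveat of this sketch: if the consumer exits before it opens the pipe (for example because the connection fails), the writing side keeps waiting and has to be interrupted by hand.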

Check success:

event=# select * from x.x;
 id |         a
----+-------------------
  1 | foo
  2 | bar
  3 | baz

Useful links for the code above

Reading compressed files with postgres using named pipes
Introduction to Named Pipes
Best practice to run bash script in background

Advice you may or may not need

For bulk INSERT you have better solutions than a separate INSERT per row. Use this syntax variant:

INSERT INTO mytable (col1, col2, col3) VALUES
 (1, 'foo', 'bar')
,(2, 'goo', 'gar')
,(3, 'hoo', 'har')
...
;

Write your statements to a file and do one mass INSERT like this:

psql -h remote_server -U username -d database -p 5432 -f my_insert_file.sql

(5432 or whatever port the db-cluster is listening on)
my_insert_file.sql can hold multiple SQL statements. In fact, it's common practice to restore / deploy whole databases like that. Consult the manual about the -f parameter, or in bash: man psql.
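
The multi-row syntax can be generated directly from a data file. Here is a minimal sketch, assuming tab-separated input with an integer id and a text column, and the demo table x.x (id int, a text); the function name rows_to_insert is made up for illustration. It relies on standard_conforming_strings being on (the default in current PostgreSQL), so only single quotes need escaping.

```shell
#!/usr/bin/env bash
# Sketch: turn tab-separated "id<TAB>text" lines on stdin into one multi-row INSERT.
# rows_to_insert and the target x.x (id int, a text) are assumptions for illustration.
rows_to_insert() {
  awk -F '\t' -v q="'" '
    {
      text = $2
      gsub(q, q q, text)   # double single quotes: standard SQL escaping
      # %d forces the id to an integer (non-numeric input becomes 0)
      prefix = (NR == 1) ? "INSERT INTO x.x (id, a) VALUES\n " : ","
      printf "%s(%d, %s%s%s)\n", prefix, $1, q, text, q
    }
    END { if (NR > 0) print ";" }
  '
}

# Example:
#   rows_to_insert < data.tsv > my_insert_file.sql
```

The output follows the same layout as the syntax variant above: one VALUES row per input line, with a closing semicolon only when there was input.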

Or, if you can transfer the (compressed) file to the server, you can use COPY to insert the (decompressed) data even faster.

You can also do some or all of the processing inside PostgreSQL. For that you can load the raw data into a temporary table with COPY ... FROM (or INSERT INTO) and use plain SQL statements to prepare and finally INSERT / UPDATE your tables. I do that a lot. Be aware that temporary tables live and die with the session.
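
A staging run of that kind could look like the following minimal sketch. The table name staging, the trim() clean-up step, and the function name staging_script are assumptions for illustration; x.x is the demo table from above. It uses psql's \copy meta-command, which reads the file on the client side, so the data file does not have to be on the server.

```shell
#!/usr/bin/env bash
# Sketch: emit a psql script that stages raw rows in a temporary table and
# cleans them up in SQL. staging, trim() and the file argument are
# illustrative assumptions; x.x (id int, a text) is the demo table.
staging_script() {
  local data_file=$1   # path must not contain single quotes
  cat <<SQL
CREATE TEMP TABLE staging (id int, a text);
\\copy staging FROM '$data_file'
INSERT INTO x.x (id, a) SELECT id, trim(a) FROM staging;
SQL
}

# Everything runs in one session, so the temporary table still exists for the INSERT:
#   staging_script /path/to/data.tsv | psql -h remote_server -U username -d database
```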

You could use a GUI like pgAdmin for comfortable handling. A session in an SQL Editor window remains open until you close the window. (Therefore, temporary tables live until you close the window.)

Answer 2

Answered by Dan

I know I'm late to the party, but why couldn't you combine all your INSERT statements into a single string, with a semicolon marking the end of each statement? (Warning! Pseudocode ahead...)

Instead of:

for each line
  sql_statement="INSERT whatever YOU want"
  echo $sql_statement | psql ...
done

Use:

sql_statements=""
for each line
  sql_statement="INSERT whatever YOU want;"
  sql_statements="$sql_statements $sql_statement"
done
echo $sql_statements | psql ...

That way you don't have to create anything on your filesystem, do a bunch of redirection, run any tasks in the background, remember to delete anything on your filesystem afterwards, or even remind yourself what a named pipe is.
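
A runnable take on the same idea might look like the sketch below, assuming bash, tab-separated input, and the demo table x.x (id int, a text); the function name generate_inserts is made up for illustration. Instead of collecting the statements in a variable, it prints them and lets a single psql process read them from a pipe, which keeps one connection open for the whole run.

```shell
#!/usr/bin/env bash
# Sketch: print one INSERT per tab-separated input line ("id<TAB>text").
# generate_inserts and the table x.x are assumptions for illustration.
generate_inserts() {
  local id text q="'"
  while IFS=$'\t' read -r id text; do
    [[ $id =~ ^[0-9]+$ ]] || continue   # skip lines without a numeric id
    text=${text//$q/$q$q}               # double single quotes for SQL
    printf "INSERT INTO x.x (id, a) VALUES (%s, '%s');\n" "$id" "$text"
  done
}

# Example: all statements go through one psql process, i.e. one connection:
#   generate_inserts < data.tsv | psql -h remote_server -U username -d database
```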
