Note: this page reproduces a popular StackOverflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/19145761/

Date: 2020-09-11 00:26:31  Source: igfitidea

Postgres FOR LOOP

postgresql stored-procedures for-loop random plpgsql

Asked by user2840106

I am trying to get 25 random samples of 15,000 IDs from a table. Instead of manually pressing run every time, I'm trying to use a loop. I fully understand this is not the optimal use of Postgres, but it is the tool I have. This is what I have so far:

for i in 1..25 LOOP
   insert into playtime.meta_random_sample
   select i, ID
   from   tbl
   order  by random() limit 15000
end loop

Answered by Erwin Brandstetter

Procedural elements like loops are not part of the SQL language and can only be used inside the body of a procedural language function, procedure (Postgres 11 or later) or a DO statement, where such additional elements are defined by the respective procedural language. The default is PL/pgSQL, but there are others.

Example with plpgsql:

DO
$do$
BEGIN 
   FOR i IN 1..25 LOOP
      INSERT INTO playtime.meta_random_sample
         (col_i, col_id)                       -- declare target columns!
      SELECT  i,     id
      FROM   tbl
      ORDER  BY random()
      LIMIT  15000;
   END LOOP;
END
$do$;

For many tasks that can be solved with a loop, there is a shorter and faster set-based solution around the corner. Pure SQL equivalent for your example:

INSERT INTO playtime.meta_random_sample (col_i, col_id)
SELECT t.*
FROM   generate_series(1,25) i
CROSS  JOIN LATERAL (
   SELECT i, id
   FROM   tbl
   ORDER  BY random()
   LIMIT  15000
   ) t;
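
To make explicit what both the loop and the set-based query compute, here is a minimal Python sketch of the same logic outside the database: 25 independent samples, each drawn without replacement. The function name `draw_samples` and the in-memory list standing in for `tbl.id` are assumptions made for illustration only; this is not how Postgres executes the query.

```python
import random


def draw_samples(ids, n_samples=25, sample_size=15000, seed=None):
    """Return (sample_no, id) pairs for n_samples independent random samples."""
    rng = random.Random(seed)
    rows = []
    for i in range(1, n_samples + 1):  # plays the role of generate_series(1, 25)
        # ORDER BY random() LIMIT n yields n distinct rows (or every row if the
        # table is smaller); random.sample with a capped size mirrors that.
        for picked in rng.sample(ids, min(sample_size, len(ids))):
            rows.append((i, picked))
    return rows


if __name__ == "__main__":
    table_ids = list(range(1, 100001))  # hypothetical stand-in for tbl.id
    rows = draw_samples(table_ids, seed=42)
    print(len(rows))  # 25 * 15000 = 375000
```

Each sample is drawn independently, so the same ID may appear in several samples but never twice within one sample.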

About generate_series():

About optimizing performance of random selections:

Answered by Gabriel

Below is an example you can use:

create temp table test2 (
  id1  numeric,
  id2  numeric,
  id3  numeric,
  id4  numeric,
  id5  numeric,
  id6  numeric,
  id7  numeric,
  id8  numeric,
  id9  numeric,
  id10 numeric) 
with (oids = false);

do
$do$
declare
     i int;
begin
for  i in 1..100000
loop
    insert into test2  values (random(), i * random(), i / random(), i + random(), i * random(), i / random(), i + random(), i * random(), i / random(), i + random());
end loop;
end;
$do$;

Answered by Morris de Oryx

I just ran into this question and, while it is old, I figured I'd add an answer for the archives. The OP asked about for loops, but their goal was to gather a random sample of rows from the table. For that task, Postgres 9.5+ offers the TABLESAMPLE clause, which is attached to a table in the FROM list. Here's a good rundown:

https://www.2ndquadrant.com/en/blog/tablesample-in-postgresql-9-5-2/

I tend to use Bernoulli as it's row-based rather than page-based, but the original question is about a specific row count. For that, there's a built-in extension:

https://www.postgresql.org/docs/current/tsm-system-rows.html

CREATE EXTENSION tsm_system_rows;

Then you can grab whatever number of rows you want:

select * from playtime tablesample system_rows (15);
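
The difference between a percentage-based Bernoulli sample and a fixed row count can be sketched in plain Python. This is only a model of the counting behaviour, with hypothetical helper names: Bernoulli keeps each row independently with the given probability, so the sample size varies, while a fixed-size draw always returns the requested number of rows. Note that `system_rows` samples at block level, so its rows may be clustered; the sketch below does not model that.

```python
import random


def bernoulli_sample(rows, percent, rng):
    """Keep each row independently with probability percent / 100."""
    return [r for r in rows if rng.random() * 100 < percent]


def fixed_size_sample(rows, n, rng):
    """Return exactly n rows, or all rows if fewer than n exist."""
    return rng.sample(rows, min(n, len(rows)))


if __name__ == "__main__":
    rng = random.Random(0)
    rows = list(range(100000))  # hypothetical stand-in for the table's rows
    print(len(bernoulli_sample(rows, 15, rng)))      # near 15000, varies per run
    print(len(fixed_size_sample(rows, 15000, rng)))  # exactly 15000
```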

Answered by LoMaPh

I find it more convenient to make a connection using a procedural programming language (like Python) and do these types of queries.

import psycopg2
connection_psql = psycopg2.connect( user="admin_user"
                                  , password="***"
                                  , port="5432"
                                  , database="myDB"
                                  , host="[ENDPOINT]")
cursor_psql = connection_psql.cursor()

myList = [...]
for item in myList:
  cursor_psql.execute('''
    -- The query goes here
  ''')

connection_psql.commit()   # persist the changes made in the loop
cursor_psql.close()
connection_psql.close()    # release the database connection
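
Applied to the original question, the loop above might look like the following sketch. It assumes the table and column names used earlier in this thread (`playtime.meta_random_sample`, `col_i`, `col_id`, `tbl`), which are placeholders, and it passes values as `%s` parameters, the placeholder style psycopg2 uses. `insert_random_samples` and `RecordingCursor` are hypothetical names; the recording cursor exists only so the sketch can be dry-run without a database.

```python
INSERT_SAMPLE_SQL = """
    INSERT INTO playtime.meta_random_sample (col_i, col_id)
    SELECT %s, id
    FROM   tbl
    ORDER  BY random()
    LIMIT  %s
"""


def insert_random_samples(cursor, n_samples=25, sample_size=15000):
    """Run one INSERT per sample number on any DB-API style cursor."""
    for i in range(1, n_samples + 1):
        # Pass values as parameters rather than formatting them into the SQL.
        cursor.execute(INSERT_SAMPLE_SQL, (i, sample_size))


class RecordingCursor:
    """Hypothetical stand-in that records calls instead of querying a database."""

    def __init__(self):
        self.calls = []

    def execute(self, sql, params=None):
        self.calls.append((sql, params))


if __name__ == "__main__":
    fake = RecordingCursor()
    insert_random_samples(fake, n_samples=3, sample_size=10)
    print([params for _, params in fake.calls])  # [(1, 10), (2, 10), (3, 10)]
```

With a real connection, you would pass `connection_psql.cursor()` instead of the recording cursor and call `connection_psql.commit()` afterwards.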