postgresql 如何在 DELETE 语句中使用行的物理位置 (ROWID)

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/15026420/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-10-21 00:48:49  来源:igfitidea点击:

How to use the physical location of rows (ROWID) in a DELETE statement

sqldatabasepostgresqlrowid

提问by endrigoantonini

I have a table that has a lot of duplicated rows and no primary key.
I want to remove just the duplicated records, but when I try to do this it would remove all peers.

我有一个有很多重复行且没有主键的表。
我只想删除重复的记录,但是当我尝试这样做时,它会删除所有对等点。

How can I find the ROWIDfrom a table in Postgres?

如何ROWID从 Postgres 的表格中找到?

采纳答案by endrigoantonini

On PostgreSQL the physical location of the row is called CTID.

在 PostgreSQL 上,行的物理位置称为 CTID。

So if you want to view it use a QUERY like this:

因此,如果您想查看它,请使用这样的 QUERY:

SELECT CTID FROM table_name

To use it on a DELETE statement to remove the duplicated records use it like this:

要在 DELETE 语句上使用它来删除重复的记录,请像这样使用它:

DELETE FROM table_name WHERE CTID NOT IN (
  SELECT RECID FROM 
    (SELECT MIN(CTID) AS RECID, other_columns 
      FROM table_name GROUP BY other_columns) 
  a);

Remember that table_name is the desired table and other_columns are the columns that you want to use to filter that.

请记住 table_name 是所需的表,other_columns 是您要用来过滤它的列。

Ie:

IE:

DELETE FROM user_department WHERE CTID NOT IN (
  SELECT RECID FROM 
    (SELECT MIN(CTID) AS RECID, ud.user_id, ud.department_id
      FROM user_department ud GROUP BY ud.user_id, ud.department_id) 
  a);

回答by Erwin Brandstetter

Simplify this by one query level:

将其简化为一个查询级别:

DELETE FROM table_name
WHERE  ctid NOT IN (
   SELECT min(ctid)
   FROM   table_name
   GROUP  BY $other_columns);

.. where duplicates are defined by equality in $other_columns.
There is no need to include columns from the GROUP BYclause in the SELECTlist, so you don't need another subquery.

.. 其中重复项由$other_columns.
不需要GROUP BYSELECT列表中包含子句中的列,因此您不需要另一个子查询。

ctidin the current manual.

ctid在当前手册中。

回答by Kaushik Nayak

You should consider using row_number()if want to delete based on a unique id column(or a timestamp), since ctidalone is not always reliable when you want to only keep recent records etc.

您应该考虑使用row_number()if 要根据唯一的 id 列(或时间戳)进行删除,因为ctid当您只想保留最近的记录等时,单独使用并不总是可靠的。

WITH d 
     AS (SELECT ctid c, 
                row_number() 
                  OVER ( 
                    partition BY s 
                    ORDER BY id) rn 
         FROM   t) 
DELETE FROM t 
WHERE  ctid IN (SELECT c 
               FROM   d 
               WHERE  rn > 1)  ; 

Demo

演示