PostgreSQL: Postgres update on a large table is very slow

Question

Asked by Aren Cambre

I have a Postgres 9.1.3 table with 2.06 million rows matching WHERE Y=1, as shown below (the whole table, without any WHERE, has only a few tens of thousands more rows). I am trying to populate an empty column with a query like this:

WITH B AS (
    SELECT Z,
           rank() OVER (ORDER BY L, N, M, P) AS X
    FROM   A
    WHERE  Y=1
)

UPDATE A
SET X = B.X
FROM B
WHERE A.Y=1
  AND B.Z = A.Z;

This query runs for hours and appears to progress very slowly. In fact, the second time I tried this, I had a power outage after the query ran for ~3 hours. After restoring power, I analyzed the table and got this:

INFO:  analyzing "consistent.master"
INFO:  "master": scanned 30000 of 69354 pages, containing 903542 live rows and 153552 dead rows; 30000 rows in sample, 2294502 estimated total rows
Total query runtime: 60089 ms.

Is it correct to interpret this as meaning that the query had barely progressed in those hours?

I ran VACUUM FULL and ANALYZE before running the long query.

The query inside the WITH takes only 40 seconds on its own.

All fields referenced above except A.X, and by extension B.X, are indexed: L, M, N, P, Y, Z.

This is being run on a laptop with 8 GB RAM, a Core i7 Q720 1.6 GHz quad core processor, and Windows 7 x64. I am running Postgres 32 bit for compatibility with PostGIS 1.5.3. 64 bit PostGIS for Windows isn't available yet. (32 bit Postgres means it can't use more than 2 GB RAM in Windows, but I doubt that's an issue here.)

Here's the result of EXPLAIN:

Update on A  (cost=727684.76..945437.01 rows=2032987 width=330)
  CTE B
    ->  WindowAgg  (cost=491007.50..542482.47 rows=2058999 width=43)
          ->  Sort  (cost=491007.50..496155.00 rows=2058999 width=43)
                Sort Key: A.L, A.N, A.M, A.P
                ->  Seq Scan on A  (cost=0.00..85066.80 rows=2058999 width=43)
                      Filter: (Y = 1)
  ->  Hash Join  (cost=185202.29..402954.54 rows=2032987 width=330)
        Hash Cond: ((B.Z)::text = (A.Z)::text)
        ->  CTE Scan on B  (cost=0.00..41179.98 rows=2058999 width=88)
        ->  Hash  (cost=85066.80..85066.80 rows=2058999 width=266)
              ->  Seq Scan on A  (cost=0.00..85066.80 rows=2058999 width=266)
                    Filter: (Y = 1)

Answer 1

Answered by maniek

There could be multiple solutions.

The update could be blocked on a lock. Check the pg_locks view.
There may be triggers on A; they could be the reason for the slowdown.
Try EXPLAIN on the UPDATE: is the plan significantly different from the plan of the plain SELECT? You could also do it in two steps: export B to a table, then update from that table.
Try dropping the indexes before the update.
Create a new table, drop the old one, and rename the new table to the old table's name.
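
The first three suggestions above might look roughly like the following. This is only a sketch using the question's placeholder names (A, X, Y, Z, L, N, M, P); the temporary table name b_ranked and the index name b_ranked_z_idx are made up for illustration, and the catalog queries assume a 9.x-era pg_locks and pg_trigger layout.

```sql
-- 1. Lock requests on table A that are still waiting (not granted).
--    Any rows returned here mean some session is blocked on A.
SELECT l.pid, l.locktype, l.mode, l.granted
FROM   pg_locks l
WHERE  l.relation = 'A'::regclass
AND    NOT l.granted;

-- 2. User-defined triggers on A (internal constraint triggers excluded).
SELECT t.tgname, t.tgenabled
FROM   pg_trigger t
WHERE  t.tgrelid = 'A'::regclass
AND    NOT t.tgisinternal;

-- 3. Two-step variant: materialize the ranking first, then update from it.
--    b_ranked is a hypothetical name, not something from the question.
CREATE TEMP TABLE b_ranked AS
SELECT Z,
       rank() OVER (ORDER BY L, N, M, P) AS X
FROM   A
WHERE  Y = 1;

CREATE INDEX b_ranked_z_idx ON b_ranked (Z);
ANALYZE b_ranked;  -- give the planner row counts for the new table

UPDATE A
SET    X = b_ranked.X
FROM   b_ranked
WHERE  A.Y = 1
AND    b_ranked.Z = A.Z;
```

Running the first query from a second session while the UPDATE is in progress shows whether it is waiting on a lock rather than doing work. Materializing B separates the 40-second ranking step from the update itself, which makes it easier to see which part is slow.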

Answer 2

Answered by dpetruha

Try to rewrite the query like this:

UPDATE A
SET X = B.X
FROM B
WHERE A.Y=1
      AND B.Z = A.Z
      AND A.X IS DISTINCT FROM B.X;
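
The extra predicate skips rows whose X already holds the target value, so no new row version is written for them. IS DISTINCT FROM is used rather than <> because it treats NULL as a comparable value; a small check, independent of the tables in the question, shows the difference:

```sql
-- Compare a NULL column value against a non-NULL target.
SELECT NULL::integer <> 1               AS plain_inequality,  -- yields NULL, so a WHERE clause would drop the row
       NULL::integer IS DISTINCT FROM 1 AS null_safe_compare, -- yields true, so the row would be updated
       1 IS DISTINCT FROM 1             AS same_value;        -- yields false, so the row would be skipped
```

If the empty column in A is NULL (an assumption; the question only calls it "empty"), a plain A.X <> B.X would match no rows at all, while IS DISTINCT FROM still selects every row that needs a value.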
