Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and authors, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/9488640/

Time: 2020-10-20 23:41:07  Source: igfitidea

How to find out when data was inserted to Postgres?

postgresql database-metadata

Asked by Ondrae

I have inherited an existing Postgres database full of data. Most of the data has a 'created_date' column value. Some of the earlier data was inserted before this was being tracked.

Is there a Postgres metadata table hidden away somewhere that tracks when INSERT queries were done?

Answered by Erwin Brandstetter

Postgres 9.5 or later

You can enable track_commit_timestamp in postgresql.conf (and restart) to start tracking commit timestamps. Then you can get a timestamp for the xmin of any row committed after that point.
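
As a minimal sketch of these steps (assuming superuser access; tbl is the placeholder table name used elsewhere in this answer), ALTER SYSTEM is one way to set the parameter without editing postgresql.conf by hand. The server still has to be restarted, and rows committed before tracking was enabled are expected to show NULL:

```sql
-- Writes the setting to postgresql.auto.conf; needs superuser and a restart.
ALTER SYSTEM SET track_commit_timestamp = on;

-- After the restart: commit time of the transaction that wrote each row.
-- Transactions committed while tracking was off should yield NULL.
SELECT pg_xact_commit_timestamp(xmin) AS committed_at, *
FROM   tbl
LIMIT  10;
```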

Postgres 9.4 or older

There is no such metadata in PostgreSQL unless you record it yourself.

You may be able to deduce some information from the row headers (HeapTupleHeaderData), in particular from the insert transaction ID xmin. It holds the ID of the transaction in which the row was inserted (needed to decide visibility in PostgreSQL's MVCC model). Try (for any table):

SELECT xmin, * FROM tbl LIMIT 10;

Some limitations apply:

  • If the database was dumped and restored then, obviously, the information is gone - all rows are inserted in the same transaction.
  • If the database is huge / very old / very heavily written, then it may have gone through transaction ID wraparound, and the order of numbers in xmin is ambiguous.

But for most databases you should be able to derive:

  • the chronological order of INSERTs
  • which rows were inserted together
  • when there (probably) was a long period of time between inserts

No timestamp, though.
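
One possible way to read off that ordering, sketched under the assumption that the table is called tbl (a placeholder) and has not been through wraparound: age(xmin) counts how many transactions lie between a row's xmin and the current one, so larger values mean older writes, and rows sharing an xmin were written by the same transaction. Note that an UPDATE writes a new row version with a new xmin, so this reflects the last write to each row, not necessarily the original insert.

```sql
SELECT xmin::text::bigint AS insert_xid,  -- cast only so the value can be grouped and shown as a number
       age(xmin)          AS xid_age,     -- transactions elapsed since this one; larger = older
       count(*)           AS row_count    -- rows written together in that transaction
FROM   tbl
GROUP  BY 1, 2
ORDER  BY xid_age DESC;                   -- oldest transactions first
```

Large jumps in insert_xid between consecutive result rows hint at (but do not prove) longer gaps in time.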

Answered by Danish Shaikh

track_commit_timestamp (boolean)

Records the commit time of transactions; mostly used when setting up a replication server. This parameter can only be set in the postgresql.conf file or on the server command line. The default value is off.
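
To see how the running server treats this parameter, a query along these lines may help (a sketch; pg_settings is the standard settings view, and its pending_restart column is available from 9.5 on):

```sql
SELECT name,
       setting,          -- current value: on / off
       context,          -- 'postmaster' means a server restart is needed to change it
       pending_restart   -- true if the value was changed in the config but the server not yet restarted
FROM   pg_settings
WHERE  name = 'track_commit_timestamp';
```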

Answered by Chapman Flack

Building on Erwin Brandstetter's answer, if you have PostgreSQL 9.5 or later, the timestamps of commits are being recorded in the write-ahead log all the time, even if track_commit_timestamp is off. They are recorded there to support point-in-time recovery, where you can roll the database to an exact past state that you can specify as a date and time.

What you get by turning track_commit_timestamp on is an easier way to retrieve that information, where you can simply query with

SELECT pg_xact_commit_timestamp(xid);

where xid is the xmin from the row you care about, and it gives you the timestamp.

That's convenient, but it only works if:

  • track_commit_timestamp is on
  • it was on when the transaction committed
  • the transaction ID is not far enough in the past to be 'frozen'.

(PostgreSQL controls the overhead of remembering transaction IDs forever, by eventually 'freezing' old ones. That also controls how far the track_commit_timestamp-dependent functions can look back. There is another setting, autovacuum_freeze_max_age, for adjusting that.)
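
A rough way to check these conditions on a given server, as a sketch: the first query lists the tracking and freeze-related settings (the parameter names are those found in current PostgreSQL releases, so verify them against your version), and the second shows how far back unfrozen transaction IDs reach in each database.

```sql
-- Is tracking on, and how aggressively are old transaction IDs frozen?
SELECT name, setting
FROM   pg_settings
WHERE  name IN ('track_commit_timestamp',
                'vacuum_freeze_min_age',
                'autovacuum_freeze_max_age');

-- Age (in transactions) of the oldest unfrozen transaction ID per database.
SELECT datname, age(datfrozenxid) AS oldest_unfrozen_xid_age
FROM   pg_database
ORDER  BY oldest_unfrozen_xid_age DESC;
```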

So what do you do if you need the timestamp for a transaction that happened before you turned on track_commit_timestamp?

As long as it happened in PG 9.5 or later, the timestamp is in the write-ahead log. If you have been keeping backups sufficient for point-in-time recovery, that gives you a crude way to find the answer: you can restore a base backup from before you think it happened, set a recovery 'pause' target timestamp near where you guess it happened, connect when it pauses and query to see if it happened yet. If not, set a slightly later target, let the recovery continue, and check again. This can all be done using the backups in another PostgreSQL instance, to avoid interfering with one running production.
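
While the restored instance is paused, queries like the following can show how far replay has progressed and whether the row of interest exists yet. This is only a sketch: tbl, id, and the value 42 are placeholders, and the pause itself is assumed to come from the recovery_target_time and recovery_target_action = 'pause' recovery settings.

```sql
-- Confirm the instance is still in recovery and see the commit time
-- of the last transaction replayed from the write-ahead log.
SELECT pg_is_in_recovery()             AS in_recovery,
       pg_last_xact_replay_timestamp() AS last_replayed_commit;

-- Has the transaction of interest been replayed yet?
SELECT EXISTS (SELECT 1 FROM tbl WHERE id = 42) AS row_present;
```

If row_present is false at one target and true at a slightly later one, the insert committed somewhere between the two target timestamps.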

That is a clumsy enough procedure that you might wish you could just go back in time and tell your former self to turn track_commit_timestamp on, so it would have been on when the transaction you are interested in happened. You can turn on track_commit_timestamp before starting the server to recover from a backup, but that doesn't quite do the trick: if it was turned off at the time of the backup, it will only begin saving timestamps for new transactions, after the ones it recovers.

It turns out it is possible to fool PostgreSQL into thinking track_commit_timestamp was on, and then start the server in recovery, and that has much the desired effect: as it replays transactions from the write-ahead log, it does remember their timestamps, and you can then use pg_xact_commit_timestamp() to query them. It will not have timestamps for anything that was in the base backup, but only for the transactions that followed the base backup and were replayed from the WAL. Still, by choosing a base backup known to be earlier than the wanted transaction, this allows the timestamp to be recovered.

There is no official tool/option to 'retroactively' set track_commit_timestamp in this way, but the (fiddly and unsupported) proof-of-concept has been discussed on pgsql-hackers.

Answered by Richard Huxton

Short answer: no.

If there was, everyone would complain it was a waste of space on all the tables they didn't want to track.
