SQL 如何找到 PostgreSQL 数据库的上次更新时间?

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/899203/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-09-01 02:11:22  来源:igfitidea点击:

How do I find the last time that a PostgreSQL database has been updated?

sqlpostgresqlmetadatadatabase-administration

提问by WolfmanDragon

I am working with a postgreSQL database that gets updated in batches. I need to know when the last time that the database (or a table in the database)has been updated or modified, either will do.

我正在使用分批更新的 postgreSQL 数据库。我需要知道数据库(或数据库中的表)上次更新或修改的时间,两者都可以。

I saw that someone on the postgeSQL forum had suggested that to use logging and query your logs for the time. This will not work for me as that I do not have control over the clients codebase.

我看到 postgeSQL 论坛上有人建议使用日志记录并查询您的日志。这对我不起作用,因为我无法控制客户端代码库。

回答by araqnid

You can write a triggerto run every time an insert/update is made on a particular table. The common usage is to set a "created" or "last_updated" column of the row to the current time, but you could also update the time in a central location if you don't want to change the existing tables.

您可以编写一个触发器以在每次对特定表进行插入/更新时运行。常见用法是将行的“created”或“last_updated”列设置为当前时间,但如果您不想更改现有表,也可以在中央位置更新时间。

So for example a typical way is the following one:

例如,一种典型的方式是以下方式:

CREATE FUNCTION stamp_updated() RETURNS TRIGGER LANGUAGE 'plpgsql' AS $$
BEGIN
  NEW.last_updated := now();
  RETURN NEW;
END
$$;
-- repeat for each table you need to track:
ALTER TABLE sometable ADD COLUMN last_updated TIMESTAMP;
CREATE TRIGGER sometable_stamp_updated
  BEFORE INSERT OR UPDATE ON sometable
  FOR EACH ROW EXECUTE PROCEDURE stamp_updated();

Then to find the last update time, you need to select "MAX(last_updated)" from each table you are tracking and take the greatest of those, e.g.:

然后要找到上次更新时间,您需要从您正在跟踪的每个表中选择“MAX(last_updated)”并取其中最大的一个,例如:

SELECT MAX(max_last_updated) FROM (
  SELECT MAX(last_updated) AS max_last_updated FROM sometable
  UNION ALL
  SELECT MAX(last_updated) FROM someothertable
) updates

For tables with a serial (or similarly-generated) primary key, you can try avoid the sequential scan to find the latest update time by using the primary key index, or you create indices on last_updated.

对于具有串行(或类似生成)主键的表,您可以尝试通过使用主键索引来避免顺序扫描以查找最新更新时间,或者您在 last_updated 上创建索引。

-- get timestamp of row with highest id
SELECT last_updated FROM sometable ORDER BY sometable_id DESC LIMIT 1

Note that this can give slightly wrong results in the case of IDs not being quite sequential, but how much accuracy do you need? (Bear in mind that transactions mean that rows can become visible to you in a different order to them being created.)

请注意,在 ID 不是完全顺序的情况下,这可能会产生稍微错误的结果,但您需要多少准确度?(请记住,事务意味着您可以以与创建行不同的顺序看到行。)

An alternative approach to avoid adding 'updated' columns to each table is to have a central table to store update timestamps in. For example:

避免向每个表添加“更新”列的另一种方法是使用一个中央表来存储更新时间戳。例如:

CREATE TABLE update_log(table_name text PRIMARY KEY, updated timestamp NOT NULL DEFAULT now());
CREATE FUNCTION stamp_update_log() RETURNS TRIGGER LANGUAGE 'plpgsql' AS $$
BEGIN
  INSERT INTO update_log(table_name) VALUES(TG_TABLE_NAME);
  RETURN NEW;
END
$$;
-- Repeat for each table you need to track:
CREATE TRIGGER sometable_stamp_update_log
 AFTER INSERT OR UPDATE ON sometable
 FOR EACH STATEMENT EXECUTE stamp_update_log();

This will give you a table with a row for each table update: you can then just do:

这将为您提供一个表格,其中每个表格更新都有一行:然后您可以执行以下操作:

SELECT MAX(updated) FROM update_log

To get the last update time. (You could split this out by table if you wanted). This table will of course just keep growing: either create an index on 'updated' (which should make getting the latest one pretty fast) or truncate it periodically if that fits with your use case, (e.g. take an exclusive lock on the table, get the latest update time, then truncate it if you need to periodically check if changes have been made).

获取最后更新时间。(如果需要,您可以按表拆分)。这个表当然会继续增长:要么在“更新”上创建一个索引(这应该可以非常快地获得最新的索引),或者如果适合您的用例,则定期截断它,(例如在表上使用排他锁,获取最新的更新时间,然后在您需要定期检查是否进行了更改时将其截断)。

An alternative approach- which might be what the folks on the forum meant- is to set 'log_statement = mod' in the database configuration (either globally for the cluster, or on the database or user you need to track) and then all statements that modify the database will be written to the server log. You'll then need to write something outside the database to scan the server log, filtering out tables you aren't interested in, etc.

另一种方法——这可能是论坛上的人的意思——是在数据库配置中设置“log_statement = mod”(对于集群全局,或者在你需要跟踪的数据库或用户上),然后所有语句修改数据库将写入服务器日志。然后你需要在数据库之外写一些东西来扫描服务器日志,过滤掉你不感兴趣的表,等等。

回答by WolfmanDragon

It looks like you can use pg_stat_databaseto get a transaction count and check if this changes from one backup run to the next - see this dba.se answerand comments for more details

看起来您可以pg_stat_database用来获取事务计数并检查这是否从一次备份运行更改为下一次 -有关更多详细信息,请参阅此 dba.se 答案和评论

回答by Roy Golan

I like Hyman's approach. You can query the table stats and know the number of inserts, updates, deletes and so:

我喜欢Hyman的方法。您可以查询表统计信息并了解插入、更新、删除等的数量:

select n_tup_upd from pg_stat_user_tables  where relname = 'YOUR_TABLE';

every update will increase the count by 1.

每次更新都会将计数增加 1。

bare in mind this method is viable when you have a single DB. multiple instances will require different approach probably.

请记住,当您只有一个数据库时,此方法是可行的。多个实例可能需要不同的方法。

回答by Robert Harvey

See the following article:

请参阅以下文章:

MySQL versus PostgreSQL: Adding a 'Last Modified Time' Column to a Table http://www.pointbeing.net/weblog/2008/03/mysql-versus-postgresql-adding-a-last-modified-column-to-a-table.html

MySQL 与 PostgreSQL:向表中添加“上次修改时间”列 http://www.pointbeing.net/weblog/2008/03/mysql-versus-postgresql-adding-a-last-modified-column-to-a -table.html

回答by guettli

You can write a stored procedure in an "untrusted language" (e.g. plpythonu): This allows access to the files in the postgres "base" directory. Return the larges mtime of these files in the stored procedure.

您可以使用“不受信任的语言”(例如 plpythonu)编写存储过程:这允许访问 postgres“base”目录中的文件。返回存储过程中这些文件的最大时间。

But this is only vague, since vacuum will change these files and the mtime.

但这只是含糊其辞,因为vacuum 会更改这些文件和mtime。