Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/12563706/

Time: 2020-08-31 14:57:53  Source: igfitidea

Is there a MySQL option/feature to track history of changes to records?

mysql database

Asked by Edward

I've been asked if I can keep track of the changes to the records in a MySQL database. So when a field has been changed, the old and new values are available, along with the date the change took place. Is there a feature or common technique to do this?

If so, I was thinking of doing something like this. Create a table called changes. It would contain the same fields as the master table but prefixed with old and new, but only for those fields which were actually changed, plus a TIMESTAMP for the change. It would be indexed with an ID. This way, a SELECT report could be run to show the history of each record. Is this a good method? Thanks!

Accepted answer by Neville Kuyt

It's subtle.

If the business requirement is "I want to audit the changes to the data - who did what and when?", you can usually use audit tables (as per the trigger example Keethanjan posted). I'm not a huge fan of triggers, but this approach has the great benefit of being relatively painless to implement - your existing code doesn't need to know about the triggers and audit stuff.

If the business requirement is "show me what the state of the data was on a given date in the past", it means that the aspect of change over time has entered your solution. Whilst you can, just about, reconstruct the state of the database just by looking at audit tables, it's hard and error prone, and for any complicated database logic, it becomes unwieldy. For instance, if the business wants to know "find the addresses of the letters we should have sent to customers who had outstanding, unpaid invoices on the first day of the month", you likely have to trawl half a dozen audit tables.

Instead, you can bake the concept of change over time into your schema design (this is the second option Keethanjan suggests). This is a change to your application, definitely at the business logic and persistence level, so it's not trivial.

For example, if you have a table like this:

CUSTOMER
---------
CUSTOMER_ID PK
CUSTOMER_NAME
CUSTOMER_ADDRESS

and you wanted to keep track over time, you would amend it as follows:

CUSTOMER
------------
CUSTOMER_ID            PK
CUSTOMER_VALID_FROM    PK
CUSTOMER_VALID_UNTIL   PK
CUSTOMER_STATUS
CUSTOMER_USER
CUSTOMER_NAME
CUSTOMER_ADDRESS

Every time you want to change a customer record, instead of updating the record in place, you set the VALID_UNTIL on the current record to NOW(), and insert a new record with a VALID_FROM of NOW() and a null VALID_UNTIL. You set CUSTOMER_USER to the login ID of the current user (if you need to keep that). If the customer needs to be deleted, you use the CUSTOMER_STATUS flag to indicate this; you must never delete records from this table.
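
A minimal sketch of that close-and-insert step in MySQL, assuming the versioned CUSTOMER table above already exists. The customer ID 42, the 'ACTIVE' status value, the user 'jsmith', and the name and address values are made up for illustration. MySQL does not allow NULL in primary key columns, so this sketch also assumes CUSTOMER_VALID_UNTIL is nullable and left out of the key.

```sql
-- Hypothetical example: customer 42 moves to a new address.
START TRANSACTION;

-- Use one timestamp so the old row ends exactly when the new row begins
SET @now = NOW();

-- 1. Close the current version of the record
UPDATE CUSTOMER
   SET CUSTOMER_VALID_UNTIL = @now
 WHERE CUSTOMER_ID = 42
   AND CUSTOMER_VALID_UNTIL IS NULL;

-- 2. Insert the new current version (VALID_UNTIL stays NULL)
INSERT INTO CUSTOMER
    (CUSTOMER_ID, CUSTOMER_VALID_FROM, CUSTOMER_VALID_UNTIL,
     CUSTOMER_STATUS, CUSTOMER_USER, CUSTOMER_NAME, CUSTOMER_ADDRESS)
VALUES
    (42, @now, NULL, 'ACTIVE', 'jsmith', 'Acme Ltd', '1 New Street');

COMMIT;
```

Wrapping both statements in one transaction is a design choice here, so that readers never see a customer with no current row; it only helps if the table uses a transactional engine such as InnoDB.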

That way, you can always find what the status of the customer table was for a given date - what was the address? Have they changed name? By joining to other tables with similar valid_from and valid_until dates, you can reconstruct the entire picture historically. To find the current status, you search for records with a null VALID_UNTIL date.
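
As a rough illustration, the two lookups described above might look like this against the versioned CUSTOMER table. The customer ID and the date are made-up values.

```sql
-- State of customer 42 as of a given moment in the past
SELECT CUSTOMER_NAME, CUSTOMER_ADDRESS, CUSTOMER_STATUS
  FROM CUSTOMER
 WHERE CUSTOMER_ID = 42
   AND CUSTOMER_VALID_FROM <= '2020-01-01 00:00:00'
   AND (CUSTOMER_VALID_UNTIL > '2020-01-01 00:00:00'
        OR CUSTOMER_VALID_UNTIL IS NULL);

-- Current state of customer 42: the row that has not been closed yet
SELECT CUSTOMER_NAME, CUSTOMER_ADDRESS, CUSTOMER_STATUS
  FROM CUSTOMER
 WHERE CUSTOMER_ID = 42
   AND CUSTOMER_VALID_UNTIL IS NULL;
```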

It's unwieldy (strictly speaking, you don't need the valid_from, but it makes the queries a little easier). It complicates your design and your database access. But it makes reconstructing the world a lot easier.

Answer by transient closure

Here's a straightforward way to do this:

First, create a history table for each data table you want to track (example query below). This table will have an entry for each insert, update, and delete query performed on each row in the data table.

The structure of the history table will be the same as the data table it tracks except for three additional columns: a column to store the operation that occurred (let's call it 'action'), the date and time of the operation, and a column to store a sequence number ('revision'), which increments per operation and is grouped by the primary key column of the data table.

To get this sequencing behavior, a two-column (composite) index is created on the primary key column and the revision column. Note that you can only do sequencing in this fashion if the engine used by the history table is MyISAM (see 'MyISAM Notes' on this page).

The history table is fairly easy to create. In the ALTER TABLE query below (and in the trigger queries below that), replace 'primary_key_column' with the actual name of that column in your data table.

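-- Create the history table with the same columns as the data table
CREATE TABLE MyDB.data_history LIKE MyDB.data;

-- Drop the copied primary key, switch to MyISAM, and add the three history columns.
-- The composite primary key makes 'revision' auto-increment per primary_key_column value.
ALTER TABLE MyDB.data_history
   MODIFY COLUMN primary_key_column INT(11) NOT NULL,
   DROP PRIMARY KEY,
   ENGINE = MyISAM,
   ADD action VARCHAR(8) DEFAULT 'insert' FIRST,
   ADD revision INT(6) NOT NULL AUTO_INCREMENT AFTER action,
   ADD dt_datetime DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP AFTER revision,
   ADD PRIMARY KEY (primary_key_column, revision);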

And then you create the triggers:

DROP TRIGGER IF EXISTS MyDB.data__ai;
DROP TRIGGER IF EXISTS MyDB.data__au;
DROP TRIGGER IF EXISTS MyDB.data__bd;

CREATE TRIGGER MyDB.data__ai AFTER INSERT ON MyDB.data FOR EACH ROW
    INSERT INTO MyDB.data_history SELECT 'insert', NULL, NOW(), d.* 
    FROM MyDB.data AS d WHERE d.primary_key_column = NEW.primary_key_column;

CREATE TRIGGER MyDB.data__au AFTER UPDATE ON MyDB.data FOR EACH ROW
    INSERT INTO MyDB.data_history SELECT 'update', NULL, NOW(), d.*
    FROM MyDB.data AS d WHERE d.primary_key_column = NEW.primary_key_column;

CREATE TRIGGER MyDB.data__bd BEFORE DELETE ON MyDB.data FOR EACH ROW
    INSERT INTO MyDB.data_history SELECT 'delete', NULL, NOW(), d.* 
    FROM MyDB.data AS d WHERE d.primary_key_column = OLD.primary_key_column;

And you're done. Now, all the inserts, updates and deletes in 'MyDB.data' will be recorded in 'MyDB.data_history', giving you a history table like this (minus the contrived 'data columns' column):

ID    revision   action    data columns..
1     1         'insert'   ....          initial entry for row where ID = 1
1     2         'update'   ....          changes made to row where ID = 1
2     1         'insert'   ....          initial entry, ID = 2
3     1         'insert'   ....          initial entry, ID = 3 
1     3         'update'   ....          more changes made to row where ID = 1
3     2         'update'   ....          changes made to row where ID = 3
2     2         'delete'   ....          deletion of row where ID = 2 

To display the changes for a given column or columns from update to update, you'll need to join the history table to itself on the primary key and sequence columns. You could create a view for this purpose, for example:

CREATE VIEW data_history_changes AS
   -- Pair each revision (t1) with the next revision of the same row (t2);
   -- revision 1 is paired with itself so the initial insert is listed too.
   SELECT t2.dt_datetime, t2.action, t1.primary_key_column AS `row id`,
   IF(t1.a_column = t2.a_column, t1.a_column, CONCAT(t1.a_column, ' to ', t2.a_column)) AS a_column
   FROM MyDB.data_history AS t1 INNER JOIN MyDB.data_history AS t2 ON t1.primary_key_column = t2.primary_key_column
   WHERE (t1.revision = 1 AND t2.revision = 1) OR t2.revision = t1.revision + 1
   ORDER BY t1.primary_key_column ASC, t2.revision ASC;
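
Assuming the view above was created in the current database, a query like this might list the recorded changes for one row. The row ID 1 is a made-up value.

```sql
-- Changes to a_column for the row whose primary key is 1, oldest first.
-- The alias `row id` contains a space, so it needs backticks.
SELECT dt_datetime, action, a_column
  FROM data_history_changes
 WHERE `row id` = 1
 ORDER BY dt_datetime;
```

The explicit ORDER BY is there because the ordering inside a view definition is not guaranteed to carry over to queries that select from it.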

Edit: Oh wow, people like my history table thing from 6 years ago :P

My implementation of it is still humming along, getting bigger and more unwieldy, I would assume. I wrote views and a pretty nice UI to look at the history in this database, but I don't think it was ever used much. So it goes.

To address some comments in no particular order:

  • I did my own implementation in PHP that was a little more involved, and avoided some of the problems described in comments (having indexes transferred over, significantly. If you transfer over unique indexes to the history table, things will break. There are solutions for this in the comments). Following this post to the letter could be an adventure, depending on how established your database is.

  • If the relationship between the primary key and the revision column seems off, it usually means the composite key is borked somehow. On a few rare occasions I had this happen and was at a loss as to the cause.

  • I found this solution to be pretty performant, using triggers as it does. Also, MyISAM is fast at inserts, which is all the triggers do. You can improve this further with smart indexing (or lack of...). Inserting a single row into a MyISAM table with a primary key shouldn't be an operation you need to optimize, really, unless you have significant issues going on elsewhere. In the entire time I was running the MySQL database this history table implementation was on, it was never the cause of any of the (many) performance problems that came up.

  • If you're getting repeated inserts, check your software layer for INSERT IGNORE type queries. Hrmm, can't remember now, but I think there are issues with this scheme and transactions which ultimately fail after running multiple DML actions. Something to be aware of, at least.

  • It's important that the fields in the history table and the data table match up. Or, rather, that your data table doesn't have MORE columns than the history table. Otherwise, insert/update/delete queries on the data table will fail, when the inserts to the history tables put columns in the query that don't exist (due to d.* in the trigger queries), and the trigger fails. It would be awesome if MySQL had something like schema-triggers, where you could alter the history table if columns were added to the data table. Does MySQL have that now? I do React these days :P
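
To illustrate the last point, here is one hedged way to add a column while keeping the two tables in step. The column name new_column and its type are hypothetical.

```sql
-- Hypothetical column name and type, for illustration only.
-- Appending to the end of both tables keeps the d.* column order aligned.
-- Until both statements have run, the trigger's INSERT ... SELECT d.* may
-- fail on a column count mismatch, so run them back to back while no
-- writes to MyDB.data are occurring.
ALTER TABLE MyDB.data_history ADD COLUMN new_column VARCHAR(255) NULL;
ALTER TABLE MyDB.data         ADD COLUMN new_column VARCHAR(255) NULL;
```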

Answer by Keethanjan

You could create triggers to solve this. Here is a tutorial to do so (archived link).

Setting constraints and rules in the database is better than writing special code to handle the same task since it will prevent another developer from writing a different query that bypasses all of the special code and could leave your database with poor data integrity.

For a long time I was copying info to another table using a script since MySQL didn't support triggers at the time. I have now found this trigger to be more effective at keeping track of everything.

This trigger will copy an old value to a history table if it is changed when someone edits a row. Editor ID and last mod are stored in the original table every time someone edits that row; the time corresponds to when it was changed to its current form.

DELIMITER $$

DROP TRIGGER IF EXISTS history_trigger $$

CREATE TRIGGER history_trigger
BEFORE UPDATE ON clients
    FOR EACH ROW
    BEGIN
        -- Log first_name only when it actually changed.
        -- NOT (a <=> b) is a NULL-safe "not equal"; a plain != never fires when either side is NULL.
        IF NOT (OLD.first_name <=> NEW.first_name)
        THEN
                INSERT INTO history_clients
                    (client_id, col, value, user_id, edit_time)
                    VALUES
                    (NEW.client_id, 'first_name', NEW.first_name, NEW.editor_id, NEW.last_mod);
        END IF;

        -- Same check for last_name
        IF NOT (OLD.last_name <=> NEW.last_name)
        THEN
                INSERT INTO history_clients
                    (client_id, col, value, user_id, edit_time)
                    VALUES
                    (NEW.client_id, 'last_name', NEW.last_name, NEW.editor_id, NEW.last_mod);
        END IF;

    END $$

DELIMITER ;
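
The trigger writes to a history_clients table that the answer does not define. A possible definition, inferred only from the column names the trigger uses, could look like this. The column types, the history_id key, and the index are assumptions.

```sql
-- Hypothetical definition; only the five column names below the key
-- come from the trigger, everything else is assumed.
CREATE TABLE history_clients (
    history_id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    client_id  INT NOT NULL,           -- clients.client_id of the edited row
    col        VARCHAR(64) NOT NULL,   -- name of the changed column
    value      TEXT,                   -- value written by the trigger
    user_id    INT,                    -- clients.editor_id at the time of the edit
    edit_time  DATETIME NOT NULL,      -- clients.last_mod at the time of the edit
    KEY idx_client_time (client_id, edit_time)
);
```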

Another solution would be to keep a Revision field and update this field on save. You could decide that the max is the newest revision, or that 0 is the most recent row. That's up to you.

Answer by Zenex

Here is how we solved it.

A Users table looked like this:

Users
-------------------------------------------------
id | name | address | phone | email | created_on | updated_on

Then the business requirement changed, and we needed to check all previous addresses and phone numbers a user ever had. The new schema looks like this:

Users (the data that won't change over time)
-------------
id | name

UserData (the data that can change over time and needs to be tracked)
-------------------------------------------------
id | id_user | revision | city | address | phone | email | created_on
 1 |   1     |    0     | NY   | lake st | 9809  | @long | 2015-10-24 10:24:20
 2 |   1     |    2     | Tokyo| lake st | 9809  | @long | 2015-10-24 10:24:20
 3 |   1     |    3     | Sdny | lake st | 9809  | @long | 2015-10-24 10:24:20
 4 |   2     |    0     | Ankr | lake st | 9809  | @long | 2015-10-24 10:24:20
 5 |   2     |    1     | Lond | lake st | 9809  | @long | 2015-10-24 10:24:20

To find the current address of any user, we search UserData ordered by revision DESC with LIMIT 1.

To get the address of a user within a certain period of time, we can use created_on BETWEEN date1 AND date2.
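
Written out as queries against the UserData table above, the two lookups might look like this. The user ID and the date range are made-up values.

```sql
-- Current data for user 1: the row with the highest revision
SELECT city, address, phone, email
  FROM UserData
 WHERE id_user = 1
 ORDER BY revision DESC
 LIMIT 1;

-- All versions of user 1 recorded within a given period
SELECT revision, city, address, phone, email, created_on
  FROM UserData
 WHERE id_user = 1
   AND created_on BETWEEN '2015-10-01 00:00:00' AND '2015-10-31 23:59:59'
 ORDER BY revision;
```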

Answer by midenok

MariaDB has supported System Versioning since 10.3. This is the standard SQL feature that does exactly what you want: it stores the history of table records and provides access to it via SELECT queries. MariaDB is an open-development fork of MySQL. You can find more on its System Versioning via this link:

https://mariadb.com/kb/en/library/system-versioned-tables/
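
A small sketch of what this looks like in practice, using a hypothetical customer table. This is MariaDB syntax (10.3 or later) and will not run on MySQL.

```sql
-- Hypothetical table; MariaDB keeps the row history automatically
CREATE TABLE customer (
    customer_id INT NOT NULL PRIMARY KEY,
    name        VARCHAR(100),
    address     VARCHAR(255)
) WITH SYSTEM VERSIONING;

-- Ordinary queries only see the current rows
SELECT * FROM customer WHERE customer_id = 1;

-- The row as it was at a given point in time
SELECT * FROM customer
   FOR SYSTEM_TIME AS OF TIMESTAMP '2020-01-01 00:00:00'
 WHERE customer_id = 1;

-- Every recorded version of the row, current and historical
SELECT * FROM customer
   FOR SYSTEM_TIME ALL
 WHERE customer_id = 1;
```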

Answer by Ouroboros

Why not simply use binlog files? If replication is set up on the MySQL server and the binlog file format is set to ROW, then all the changes can be captured.

A good Python library called noplay can be used. More info here.
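
Before relying on the binlog, it may be worth checking how the server is configured. These statements are a sketch; changing the format needs administrative privileges, and whether you want to change it at runtime rather than in the server configuration file depends on your setup.

```sql
-- Is binary logging enabled at all?
SHOW VARIABLES LIKE 'log_bin';

-- Which format is in use (ROW, STATEMENT or MIXED)?
SHOW VARIABLES LIKE 'binlog_format';

-- Switch to row-based logging; this applies to sessions opened afterwards
SET GLOBAL binlog_format = 'ROW';
```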

Answer by Worthy7

Just my 2 cents. I would create a solution which records exactly what changed, very similar to transient's solution.

My ChangesTable would simply be:

DateTime | WhoChanged | TableName | Action | ID | FieldName | OldValue

1) When an entire row is changed in the main table, lots of entries will go into this table, but that is very unlikely, so it is not a big problem (people are usually only changing one thing).

2) OldValue (and NewValue if you want) has to be some sort of epic "anytype" since it could hold any data. There might be a way to do this with RAW types, or by just using JSON strings to convert in and out.

Minimum data usage, stores everything you need and can be used for all tables at once. I'm researching this myself right now, but this might end up being the way I go.

For Create and Delete, just the row ID is recorded; no fields are needed. On delete, a flag on the main table (active?) would be good.
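
One way the ChangesTable described above could be declared in MySQL is sketched here. The column names follow the layout given earlier; the types, the ChangeID surrogate key, and the index are assumptions, and OldValue is stored as text on the assumption that values are serialized (for example as JSON strings).

```sql
-- Hypothetical definition of the single, shared change log table
CREATE TABLE ChangesTable (
    ChangeID   BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,  -- assumed surrogate key
    `DateTime` DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
    WhoChanged VARCHAR(64) NOT NULL,
    TableName  VARCHAR(64) NOT NULL,
    Action     ENUM('create', 'update', 'delete') NOT NULL,
    ID         BIGINT NOT NULL,        -- primary key of the changed row
    FieldName  VARCHAR(64) NULL,       -- NULL for create and delete
    OldValue   TEXT NULL,              -- serialized previous value
    KEY idx_row (TableName, ID, `DateTime`)
);
```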

Answer by goforu

The direct way of doing this is to create triggers on tables. Set some conditions or mapping methods. When an update or delete occurs, the trigger inserts a row into a 'change' table automatically.

But the biggest problem is what happens when we have lots of columns and lots of tables. We have to type the name of every column of every table. Obviously, that's a waste of time.

To handle this more elegantly, we can create some procedures or functions to retrieve the names of columns.
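
Column names can be read from MySQL's information_schema, which is one possible starting point for such a procedure. The schema name 'MyDB' and table name 'data' are placeholders.

```sql
-- One row per column of the given table, in definition order
SELECT COLUMN_NAME
  FROM information_schema.COLUMNS
 WHERE TABLE_SCHEMA = 'MyDB'
   AND TABLE_NAME = 'data'
 ORDER BY ORDINAL_POSITION;

-- The same names as a single comma-separated list,
-- which is handy when generating trigger statements
SELECT GROUP_CONCAT(COLUMN_NAME ORDER BY ORDINAL_POSITION) AS column_list
  FROM information_schema.COLUMNS
 WHERE TABLE_SCHEMA = 'MyDB'
   AND TABLE_NAME = 'data';
```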

We can also simply use a third-party tool to do this. Here, I wrote a Java program, Mysql Tracker.