database 不可重复读和幻读有什么区别?

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/11043712/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-09-08 08:34:57  来源:igfitidea点击:

What is the difference between Non-Repeatable Read and Phantom Read?

databaseoracletransactionsisolation-leveltransaction-isolation

提问by user1357722

What is the difference between non-repeatable read and phantom read?

不可重复读和幻读有什么区别?

I have read the Isolation (database systems) article from Wikipedia, but I have a few doubts. In the below example, what will happen: the non-repeatable readand phantom read?

我已经阅读了Wikipedia 上Isolation (database systems) 文章,但我有一些疑问。在下面的例子中,会发生什么:不可重复读幻读

Transaction A交易A
SELECT ID, USERNAME, accountno, amount FROM USERS WHERE ID=1
OUTPUT:输出:
1----MIKE------29019892---------5000
Transaction B交易B
UPDATE USERS SET amount=amount+5000 where ID=1 AND accountno=29019892;
COMMIT;
Transaction A交易A
SELECT ID, USERNAME, accountno, amount FROM USERS WHERE ID=1

Another doubt is, in the above example, which isolation level should be used? And why?

另一个疑问是,在上面的例子中,应该使用哪种隔离级别?为什么?

回答by Thilo

From Wikipedia(which has great and detailed examples for this):

来自维基百科(其中有很好的详细示例):

A non-repeatable read occurs, when during the course of a transaction, a row is retrieved twice and the values within the row differ between reads.

发生不可重复读取时,在事务过程中,一行被检索两次并且行内的值在读取之间不同。

and

A phantom read occurs when, in the course of a transaction, two identical queries are executed, and the collection of rows returned by the second query is different from the first.

当在事务过程中执行两个相同的查询并且第二个查询返回的行集合与第一个不同时,就会发生幻读。

Simple examples:

简单的例子:

  • User A runs the same query twice.
  • In between, User B runs a transaction and commits.
  • Non-repeatable read: The A row that user A has queried has a different value the second time.
  • Phantom read: All the rows in the query have the same value before and after, but different rows are being selected(because B has deleted or inserted some). Example: select sum(x) from table;will return a different result even if none of the affected rows themselves have been updated, if rows have been added or deleted.
  • 用户 A 两次运行相同的查询。
  • 在这期间,用户 B 运行一个事务并提交。
  • 不可重复读:用户A第二次查询的A行有不同的值。
  • 幻读:查询中的所有行前后都有相同的值,但选择的是不同的行(因为B删除或插入了一些)。示例:select sum(x) from table;如果行被添加或删除,即使受影响的行本身没有被更新,也会返回不同的结果。

In the above example,which isolation level to be used?

在上面的例子中,要使用哪个隔离级别?

What isolation level you need depends on your application. There is a high cost to a "better" isolation level (such as reduced concurrency).

您需要什么隔离级别取决于您的应用程序。“更好”的隔离级别(例如减少并发)的成本很高。

In your example, you won't have a phantom read, because you select only from a single row (identified by primary key). You can have non-repeatable reads, so if that is a problem, you may want to have an isolation level that prevents that. In Oracle, transaction A could also issue a SELECT FOR UPDATE, then transaction B cannot change the row until A is done.

在您的示例中,您不会有幻读,因为您仅从单行(由主键标识)中进行选择。您可以拥有不可重复读取,因此如果这是一个问题,您可能需要一个隔离级别来防止这种情况发生。在 Oracle 中,事务 A 也可以发出 SELECT FOR UPDATE,然后事务 B 在 A 完成之前不能更改行。

回答by BateTech

A simple way I like to think about it is:

我喜欢的一种简单的思考方式是:

Both non-repeatable and phantom reads have to do with data modification operations from a different transaction, which were committed after your transaction began, and then read by your transaction.

不可重复读和幻读都与来自不同事务的数据修改操作有关,这些操作在您的事务开始后提交,然后由您的事务读取。

Non-repeatable reads are when your transaction reads committed UPDATESfrom another transaction. The same row now has different values than it did when your transaction began.

不可重复读取是指您的事务从另一个事务读取已提交的UPDATES。现在,同一行的值与事务开始时的值不同。

Phantom reads are similar but when reading from committed INSERTSand/or DELETESfrom another transaction. There are new rows or rows that have disappeared since you began the transaction.

幻读是类似的,但是当从另一个事务提交的INSERTS和/或DELETES读取时。自您开始事务以来,有新行或已消失的行。

Dirty reads are similarto non-repeatable and phantom reads, but relate to reading UNCOMMITTED data, and occur when an UPDATE, INSERT, or DELETE from another transaction is read, and the other transaction has NOT yet committed the data. It is reading "in progress" data, which may not be complete, and may never actually be committed.

脏读类似于不可重复读和幻读,但与读取未提交的数据有关,发生在读取另一个事务的 UPDATE、INSERT 或 DELETE,而另一个事务尚未提交数据时。它正在读取“进行中”的数据,这些数据可能不完整,并且可能永远不会真正提交。

回答by Vlad Mihalcea

As explained in this article, the Non-Repeatable Readanomaly looks as follows:

本文所述,不可重复读取异常如下所示:

enter image description here

在此处输入图片说明

  1. Alice and Bob start two database transactions.
  2. Bob's reads the post record and title column value is Transactions.
  3. Alice modifies the title of a given post record to the value of ACID.
  4. Alice commits her database transaction.
  5. If Bob's re-reads the post record, he will observe a different version of this table row.
  1. Alice 和 Bob 启动了两个数据库事务。
  2. Bob 读取 post 记录,title 列值为 Transactions。
  3. Alice 将给定的帖子记录的标题修改为 ACID 的值。
  4. Alice 提交她的数据库事务。
  5. 如果 Bob 重新读取 post 记录,他将观察到该表行的不同版本。

In this articleabout Phantom Read, you can see that this anomaly can happen as follows:

这篇关于Phantom Read 的文章中,您可以看到这种异常可能发生如下:

enter image description here

在此处输入图片说明

  1. Alice and Bob start two database transactions.
  2. Bob's reads all the post_comment records associated with the post row with the identifier value of 1.
  3. Alice adds a new post_comment record which is associated with the post row having the identifier value of 1.
  4. Alice commits her database transaction.
  5. If Bob's re-reads the post_comment records having the post_id column value equal to 1, he will observe a different version of this result set.
  1. Alice 和 Bob 启动了两个数据库事务。
  2. Bob 读取与标识符值为 1 的帖子行关联的所有 post_comment 记录。
  3. Alice 添加了一条新的 post_comment 记录,该记录与标识符值为 1 的帖子行相关联。
  4. Alice 提交她的数据库事务。
  5. 如果 Bob 重新读取 post_id 列值等于 1 的 post_comment 记录,他将观察到此结果集的不同版本。

So, while the Non-Repeatable Readapplies to a single row, the Phantom Readis about a range of records which satisfy a given query filtering criteria.

因此,虽然不可重复读适用于单行,但幻读是关于满足给定查询过滤条件的一系列记录。

回答by Subhadeep Ray

Read phenomena

读取现象

  • Dirty reads: read UNCOMMITED data from another transaction
  • Non-repeatable reads: read COMMITTED data from an UPDATEquery from another transaction
  • Phantom reads: read COMMITTED data from an INSERTor DELETEquery from another transaction
  • 脏读:从另一个事务中读取未提交的数据
  • 不可重复读取UPDATE从另一个事务的查询中读取已提交的数据
  • 幻读:从另一个事务INSERTDELETE查询中读取已提交的数据

Note: DELETE statements from another transaction, also have a very low probability of causing Non-repeatable reads in certain cases. It happens when the DELETE statement unfortunately, removes the very same row which your current transaction was querying. But this is a rare case, and far more unlikely to occur in a database which have millions of rows in each table. Tables containing transaction data usually have high data volume in any production environment.

注意:来自另一个事务的 DELETE 语句,在某些情况下也有非常低的概率导致不可重复读取。不幸的是,当 DELETE 语句删除了您当前事务正在查询的同一行时,就会发生这种情况。但这是一种罕见的情况,更不可能发生在每个表中有数百万行的数据库中。包含交易数据的表在任何生产环境中通常都具有很高的数据量。

Also we may observe that UPDATES may be a more frequent job in most use cases rather than actual INSERT or DELETES (in such cases, danger of non-repeatable readsremain only - phantom readsare not possible in those cases). This is why UPDATES are treated differently from INSERT-DELETE and the resulting anomaly is also named differently.

此外,我们可能会观察到,在大多数用例中,UPDATES 可能是更频繁的工作,而不是实际的 INSERT 或 DELETES(在这种情况下,不可重复读取的危险仅存在 -在这些情况下不可能进行幻像读取)。这就是为什么 UPDATES 与 INSERT-DELETE 的处理方式不同,并且由此产生的异常也以不同的方式命名。

There is also an additional processing cost associated with handling for INSERT-DELETEs, rather than just handling the UPDATES.

还有一个与处理 INSERT-DELETE 相关的额外处理成本,而不仅仅是处理 UPDATES。



Benefits of different isolation levels

不同隔离级别的好处

  • READ_UNCOMMITTED prevents nothing. It's the zero isolation level
  • READ_COMMITTED prevents just one, i.e. Dirty reads
  • REPEATABLE_READ prevents two anomalies: Dirty reads and Non-repeatable reads
  • SERIALIZABLE prevents all three anomalies: Dirty reads, Non-repeatable reads and Phantom reads
  • READ_UNCOMMITTED 不会阻止任何事情。这是零隔离级别
  • READ_COMMITTED 只阻止一个,即脏读
  • REPEATABLE_READ 防止两种异常:脏读和不可重复读
  • SERIALIZABLE 可防止所有三种异常情况:脏读、不可重复读和幻读

Then why not just set the transaction SERIALIZABLE at all times? Well, the answer to the above question is: SERIALIZABLE setting makes transactions very slow, which we again don't want.

那么为什么不一直设置事务 SERIALIZABLE 呢?好吧,上述问题的答案是: SERIALIZABLE 设置使事务非常缓慢,这是我们再次不想要的。

In fact transaction time consumption is in the following rate:

事实上,交易时间消耗是在以下比率:

SERIALIZABLE >REPEATABLE_READ >READ_COMMITTED >READ_UNCOMMITTED

SERIALIZABLE >REPEATABLE_READ >READ_COMMITTED >READ_UNCOMMITTED

So READ_UNCOMMITTED setting is the fastest.

所以 READ_UNCOMMITTED 设置是最快的



Summary

概括

Actually we need to analyze the use case and decide an isolation levelso that we optimize the transaction time and also prevent most anomalies.

实际上我们需要分析用例并决定隔离级别,以便我们优化事务时间并防止大多数异常。

Note that databases by default have REPEATABLE_READ setting.

请注意,数据库默认具有 REPEATABLE_READ 设置。

回答by egraldlo

There is a difference in the implementation between these two kinds isolation levels.
For "non-repeatable read", row-locking is needed.
For "phantom read",scoped-locking is needed, even a table-locking.
We can implement these two levels by using two-phase-lockingprotocol.

这两种隔离级别在实现上有所不同。
对于“不可重复读”,需要行锁。
对于“幻读”,需要作用域锁,甚至是表锁。
我们可以通过使用两相锁定协议来实现这两个级别。

回答by Jeffrey Kemp

In a system with non-repeatable reads, the result of Transaction A's second query will reflect the update in Transaction B - it will see the new amount.

在具有不可重复读取的系统中,事务 A 的第二次查询的结果将反映事务 B 中的更新——它将看到新的金额。

In a system that allows phantom reads, if Transaction B were to inserta new row with ID = 1, Transaction A will see the new row when the second query is executed; i.e. phantom reads are a special case of non-repeatable read.

在允许幻读的系统中,如果事务 B 要插入ID = 1 的新行,事务 A 将在执行第二个查询时看到新行;即幻读是不可重复读的特例。

回答by Erwin Smout

The accepted answer indicates most of all that the so-called distinction between the two is actually not significant at all.

公认的答案最重要的是,两者之间的所谓区别实际上根本不重要。

If "a row is retrieved twice and the values within the row differ between reads", then they are not the same row (not the same tuple in correct RDB speak) and it is then indeed by definition also the case that "the collection of rows returned by the second query is different from the first".

如果“一行被检索两次并且行内的值在读取之间不同”,那么它们不是同一行(在正确的 RDB 中不是同一个元组),那么根据定义,“集合第二个查询返回的行与第一个不同”。

As to the question "which isolation level should be used", the more your data is of vital importance to someone, somewhere, the more it will be the case that Serializable is your only reasonable option.

至于“应该使用哪种隔离级别”的问题,您的数据对某人或某处的重要性越高,Serializable 是您唯一合理的选择的情况就越多。

回答by Han R

I think there are some difference between Non-repeateable-read & phantom-read.

我认为不可重复读和幻读之间有一些区别。

The Non-repeateable means there are tow transaction A & B. if B can notice the modification of A, so maybe happen dirty-read, so we let B notices the modification of A after A committing.

Non-repeateable 意味着有两个事务 A 和 B。如果 B 可以注意到 A 的修改,那么可能会发生脏读,所以我们让 B 在 A 提交后注意到 A 的修改。

There is new issue: we let B notice the modification of A after A committing, it means A modify a value of row which the B is holding, sometime B will read the row again, so B will get new value different with first time we get, we call it Non-repeateable, to deal with the issue, we let the B remember something(cause i don't know what will be remembered yet) when B start.

有一个新问题:我们让 B 在 A 提交后注意到 A 的修改,这意味着 A 修改了 B 持有的行的值,有时 B 会再次读取该行,因此 B 将获得与我们第一次不同的新值得到,我们称之为不可重复,为了解决这个问题,我们让 B 在 B 开始时记住一些东西(因为我不知道会记住什么)。

Let's think about the new solution, we can notice there is new issue as well, cause we let B remember something, so whatever happened in A, the B can't be affected, but if B want to insert some data into table and B check the table to make sure there is no record, but this data has been inserted by A, so maybe occur some error. We call it Phantom-read.

考虑一下新的解决方案,我们可以注意到也有新的问题,因为我们让B记住了一些东西,所以无论A发生了什么,B都不会受到影响,但是如果B要向表和B中插入一些数据检查表以确保没有记录,但此数据已被A插入,因此可能会出现一些错误。我们称之为幻读。

回答by sn.anurag

non-repeatable read is an isolation level and phantom read (reading committed value by other transactions) is a concept (type of read e.g. dirty read or snapshot read). Non-repeatable read isolation level allows phantom read but not dirty reads or snapshot reads.

不可重复读是一个隔离级别,幻读(读取其他事务提交的值)是一个概念(读的类型,例如脏读或快照读)。不可重复读隔离级别允许幻读,但不允许脏读或快照读。