java 可线性化和可串行化有什么区别?

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/4179587/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-10-30 05:11:15  来源:igfitidea点击:

What is the difference between linearizability and serializability?

javaconcurrency

提问by devnull

What is the difference between linearizability and serializability (in the context of Java)? Can you please explain the difference between these with an example or provide a good reference?

可线性化和可序列化(在 Java 上下文中)有什么区别?你能用一个例子解释这些之间的区别或提供一个很好的参考吗?

回答by andersoj

The central distinction between the two is that serializabilityis a globalproperty; a property of an entire history of operations/transactions. Linearizabilityis a local property; a property of a single operation/transaction. Another distinction is that linearizability includes a notion of real-time, which serializability does not: the linearization point of an operation must lie between its invocation and response times. (See Tim Harris: Transactional Memory, 2ed. See Herlihy's slides from The Art of Multiprocessor Programming, the section on Linearizability, which are available here, for some examples and proofs.

两者之间的主要区别是可串行化是一个全局属性;整个操作/交易历史的属性。 线性化是一个局部属性;单个操作/事务的属性。另一个区别是线性化包括实时的概念,而串行化没有:操作的线性化点必须位于其调用和响应时间之间。(请参阅 Tim Harris:事务性内存,第 2 版。请参阅 Herlihy 的幻灯片,幻灯片来自The Art of Multiprocessor Programming,关于 Linearizability 的部分可在此处获得,以获取一些示例和证明。

Both properties are aimed at the same goal: sequential consistency. From Herlihy's paper:

这两个属性都针对同一个目标:顺序一致性。来自 Herlihy 的论文:

Much work on databases and distributed systems uses serializability as the basic correctness condition for concurrent computations. In this model, a transaction is a thread of control that applies a finite sequence of primitive operations to a set of objects shared with other transactions. A history is serializable if it is equivalent to one in which transactions appear to execute sequentially, i.e., without interleaving. A (partial) precedence order can be defined on non-overlapping pairs of transactions in the obvious way. A history is strictly serializable if the transactions' order in the sequential history is compatible with their precedence order...

...Linearizability can be viewed as a special case of strict serializability where transactions are restricted to consist of a single operation applied to a single object. Nevertheless, this single-operation restriction has far-reaching practical and formal consequences, giving linearizable computations a different flavor from their serializable counterparts. An immediate practical consequence is that concurrency control mechanisms appropriate for serializability are typically inappropriate for linearizability because they introduce unnecessary overhead and place unnecessary restrictions on concurrency.

许多数据库和分布式系统的工作都使用可串行化作为并发计算的基本正确性条件。在此模型中,事务是一个控制线程,它将有限的原始操作序列应用于与其他事务共享的一组对象。如果历史相当于事务按顺序执行(即没有交错)的历史,那么它就是可序列化的。可以以明显的方式在不重叠的交易对上定义(部分)优先顺序。如果事务在顺序历史中的顺序与其优先顺序兼容,则历史是严格可序列化的......

...线性化可以被视为严格串行化的一种特殊情况,其中事务被限制为由应用于单个对象的单个操作组成。尽管如此,这种单操作限制具有深远的实际和形式后果,使线性化计算与可序列化的计算具有不同的风味。一个直接的实际后果是,适用于可串行化的并发控制机制通常不适用于可线性化,因为它们引入了不必要的开销并对并发性施加了不必要的限制。

References:

参考:

More Details:

更多细节:

If you really care about this, read the paper that introduced the definitions. For linearizability, that's Linearizability: A Correctness Condition for Concurrent Objects, Herlihy and Wing. It's dense, but worth the attention. Note that in the software transactional memory community, it's an open question whether linearizability is the right goal / property to aim for.

如果您真的很关心这一点,请阅读介绍定义的论文。对于线性化,这就是Linearizability: A Correctness Condition for Concurrent Objects、 Herlihy 和 Wing。它很密集,但值得关注。请注意,在软件事务内存社区中,线性化是否是正确的目标/属性是一个悬而未决的问题。

Serializabilityis about the outcome of a collection of operations/the "system" being expressible as a specific ordering ("as if execution took place in a specific order...") of all the operations. Linearizability is a property of a single subset of operations in the system... an operation/set of operations are linearizable if they appear to the other operations as if they occurred at a specific instant in (logical) time with respect to the others. The canonical paper here is Papadimitriou, The Serializability of Concurrent Database Updates.

可串行化是关于一组操作的结果/“系统”被表达为所有操作的特定顺序(“好像执行是以特定顺序发生的......”)。线性化是系统中单个操作子集的属性……如果一个操作/一组操作在其他操作中看起来好像它们发生在(逻辑)时间的特定时刻相对于其他操作,则它们是可线性化的。这里的规范论文是Papadimitriou, The Serializability of Concurrent Database Updates

Think "atomic operation" when you're thinking about "linearizable." A (set of) operations are linearizable when they (appear to) occur atomically with respect to other parts of the system. A common formulation is "provide the illusion that each operation takes effect instantaneously between its invocation and response." The formulation of linearizabilityis due to Herlihy, which emphasizes that this is a local property, vs. other kinds of sequential consistency properties like "serializability" which are global.

当您考虑“线性化”时,请考虑“原子操作”。当(一组)操作(似乎)相对于系统的其他部分原子地发生时,它们是可线性化的。一个常见的表述是“提供每个操作在其调用和响应之间立即生效的错觉”。线性化的表述是由于Herlihy,它强调这是一个局部属性,而不是其他类型的顺序一致性属性,如全局的“可串行化”。

回答by Andrejs

There's a great explanation by Peter Bailis here:

Peter Bailis 这里有一个很好的解释:

"In plain English, under linearizability, writes should appear to be instantaneous. Imprecisely, once a write completes, all later reads (where “later” is defined by wall-clock start time) should return the value of that write or the value of a later write. Once a read returns a particular value, all later reads should return that value or the value of a later write."

“简单来说,在linearizability下,写入应该是瞬时的。不准确地说,一旦写入完成,所有后续读取(其中“稍后”由挂钟开始时间定义)应返回该写入的值或稍后写入。一旦读取返回特定值,所有后续读取都应返回该值或稍后写入的值。”

"Serializabilityis a guarantee about transactions, or groups of one or more operations over one or more objects. It guarantees that the execution of a set of transactions (usually containing read and write operations) over multiple items is equivalent to some serial execution (total ordering) of the transactions."

" 可串行化是对事务的保证,或对一个或多个对象的一个​​或多个操作的组。它保证在多个项目上执行一组事务(通常包含读写操作)相当于一些串行执行(总计订单)的交易。”

回答by Stephen C

See @andersoj's answer for a clear description of the difference between serializability and linearizability.

请参阅@andersoj 的回答,以清楚地描述可串行化和可线性化之间的区别。

This is only indirectly relevant to Java concurrent programming. In general, a concurrent Java program does not need to have either a serializable or linearizable history. In the cases that do, serializability is generally sufficient for a program (Java or otherwise) for "correctness", though particular problems could require the stronger linearizability property. But either way, it is the problemthat determines the correctness requirements, not Java.

这仅与 Java 并发编程间接相关。通常,并发 Java 程序不需要具有可序列化或可线性化的历史记录。在这种情况下,可串行化对于程序(Java 或其他)的“正确性”通常是足够的,尽管特定问题可能需要更强的线性化属性。但无论哪种方式,决定正确性要求的是问题,而不是 Java。

回答by JOTN

Check Wikipedia's explanation:

检查维基百科的解释:

http://en.wikipedia.org/wiki/Linearizability#Linearizability_versus_serializability

http://en.wikipedia.org/wiki/Linearizability#Linearizability_versus_serializability

It can also be a bit confusing because we also use the term serialize to refer to converting a class into data stream for storage or network transmission.

它也可能有点令人困惑,因为我们还使用术语序列化来指代将类转换为数据流以进行存储或网络传输。

回答by Ming

A good way to understand this is to look at this problem from a database standpoint. (I know you ask for a context of java, sorry)

理解这一点的一个好方法是从数据库的角度来看这个问题。(我知道你问的是 java 的上下文,抱歉)

Assuming, you are a database. You accepts multiple transactions operating on the same object concurrentlybut you only have one single disk arm.

假设你是一个数据库。您接受同时在同一对象上操作的多个事务,但您只有一个磁盘臂

When you received multiple transactions at the same time, you will have to re-order those operations within transactions in some way so you poor disk arm can handle them one-by-one.

当您同时收到多个事务时,您将不得不以某种方式重新排序事务中的这些操作,以便您可怜的磁盘臂可以一一处理它们。

Serializable

可序列化

you have the ability re-arrange those transactions to make it looks like they happens sequentially (one by one). As you can imagine, it's not always possible if you accept arbitrary transactions (e.g. one bad transaction last 10 years). So naturally, you enforce some restrictions or conflict prevention mechanisms then you can say "I'm serializable! :)" .

您可以重新安排这些交易,使其看起来像是按顺序(一个接一个)发生。可以想象,如果您接受任意交易(例如,过去 10 年的一笔不良交易),这并不总是可行的。因此,很自然地,您强制执行一些限制或冲突预防机制,然后您可以说“我是可序列化的!:)”。

Linearizable

可线性化

Not only do you need to do what serializationneeds you to do. You also take a good look at those transactions. And try very hard to re-arrange those transactions in a sequential fashion without breaking the semantic order of transactions. As you might have noticed, semantic orderis the key. Basically, in order to claim that you are linearizable, you will have to assume/find a linearization pointfor every transactions and then order them according to the linearization point.

你不仅需要做serialization需要你做的事情。您还可以仔细查看这些交易。并且非常努力地以顺序方式重新排列这些事务,而不会破坏事务的语义顺序。正如您可能已经注意到的那样,semantic order是关键。基本上,为了声称您是linearizable,您必须linearization point为每笔交易假设/找到一个,然后根据linearization point.

Therefore, it's uncommon for a versatile RDMS database to say Hey I'm linearizable!. But, it's not uncommon if you are a Key-Value database.

因此,通用 RDMS 数据库说Hey I'm linearizable!. 但是,如果您是 Key-Value 数据库,这种情况并不少见。

e.g. As a KV database, you can say "I am linearizable!" if you can ensure a readwill always get the latest possible write. (assuming the moment of sending response for the readoperation is the linearization point) This sounds trivial, but will be a major challenge if you are a distributed KV database. Also note that serializabilitydoesn't require you to give the same guarantee.

例如,作为 KV 数据库,您可以说“我是linearizable!” 如果您可以确保 aread将始终获得最新的可能写入。(假设为read操作发送响应的时刻是linearization point)这听起来微不足道,但如果您是分布式 KV 数据库,这将是一个重大挑战。另请注意,serializability不需要您提供相同的保证。